I have the following table:
name | group |
---|---|
a | 1 |
b | 1 |
c | 2 |
d | 2 |
e | 3 |
f | 3 |
and I want to randomly re-assign the group membership such that (i) no name is assigned to its original group, and (ii) the probability of membership in each group remains the same. I am also trying to (iii) avoid all the names from one group being assigned to the same new group. In essence, I want to achieve something like this:
name | group | new.group |
---|---|---|
a | 1 | 2 |
b | 1 | 3 |
c | 2 | 1 |
d | 2 | 3 |
e | 3 | 1 |
f | 3 | 2 |
How do I do this in R?
CodePudding user response:
A base R option using sample + setdiff
transform(
df,
new.group = ave(group, group, FUN = function(x) sample(setdiff(group, x),length(x)))
)
gives, for example (the draw is random, so results vary between runs)
name group new.group
1 a 1 2
2 b 1 3
3 c 2 1
4 d 2 3
5 e 3 1
6 f 3 2
Data
> dput(df)
structure(list(name = c("a", "b", "c", "d", "e", "f"), group = c(1L,
1L, 2L, 2L, 3L, 3L)), class = "data.frame", row.names = c(NA,
-6L))
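To confirm that a draw satisfies the three conditions from the question, a small check can be run on the result. This is only a minimal sketch: `check_reassignment` is a hypothetical helper name, not a base R function, and it assumes that condition (ii) means each group keeps its original size.

```r
# Hypothetical helper: TRUE if new_group satisfies conditions (i)-(iii).
check_reassignment <- function(group, new_group) {
  # (i) nobody stays in their original group
  moved <- all(group != new_group)
  # (ii) group sizes are unchanged (assumed reading of "same probability")
  lv <- sort(unique(group))
  same_sizes <- identical(
    as.vector(table(factor(new_group, levels = lv))),
    as.vector(table(factor(group, levels = lv)))
  )
  # (iii) members of one original group do not all land in the same new group
  spread <- all(tapply(new_group, group, function(x) length(unique(x)) > 1))
  moved && same_sizes && spread
}

df <- data.frame(name = letters[1:6], group = rep(1:3, each = 2))
res <- transform(
  df,
  new.group = ave(group, group, FUN = function(x) sample(setdiff(group, x), length(x)))
)
check_reassignment(res$group, res$new.group)
```

Note that the `sample(setdiff(...), length(x))` call relies on every group having no more members than there are other groups; with larger groups it would fail because `sample` draws without replacement by default.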
CodePudding user response:
With all these restrictions, this is barely shuffling anymore. You could use the modulo operator to assign the new groups deterministically.
library(dplyr)

df %>%
  group_by(group) %>%
  # offset by the row's position within its group, wrapping around the 3 groups
  mutate(new_group = (2 + row_number() + group) %% 3 + 1)
name group new_group
<chr> <int> <dbl>
1 a 1 2
2 b 1 3
3 c 2 3
4 d 2 1
5 e 3 1
6 f 3 2
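If more randomness is wanted than the fixed modulo pattern provides, one alternative (not what the answer above does) is rejection sampling: shuffle the whole group column and retry until the conditions hold. The following is a minimal sketch under that assumption; `reshuffle_groups` and `max_tries` are hypothetical names, and condition (ii) is interpreted as keeping each group's size, which permuting the existing labels does automatically.

```r
# Hypothetical helper: permute the labels until conditions (i) and (iii) hold.
reshuffle_groups <- function(group, max_tries = 1000) {
  for (i in seq_len(max_tries)) {
    # permuting the existing labels keeps every group's size unchanged (ii)
    candidate <- sample(group)
    # (i) no one keeps their original group
    moved <- all(candidate != group)
    # (iii) each original group is spread over more than one new group
    spread <- all(tapply(candidate, group, function(x) length(unique(x)) > 1))
    if (moved && spread) return(candidate)
  }
  stop("no valid assignment found in ", max_tries, " tries")
}

df <- data.frame(name = letters[1:6], group = rep(1:3, each = 2))
set.seed(1)  # only to make the draw reproducible
df$new_group <- reshuffle_groups(df$group)
df
```

For the six-row example only a small fraction of permutations is valid, so several retries per call are normal; for larger or more constrained tables a constructive approach may be preferable.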