How to randomly assign all observations to two groups evenly

Time:09-04

For example, my data has 2000 observations. I want to create a new variable called group that randomly assigns each observation a 1 or a 2, with exactly 1000 observations in each group.

Is there a command from dplyr that can do this?

CodePudding user response:

To get the same number of each element but a random order, just do:

library(dplyr)

df %>%
  mutate(group = sample(rep(c(1, 2), times = 1000)))

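rep(c(1, 2), times = 1000) builds a vector of length 2000 containing exactly 1000 ones and 1000 twos, and sample(), called without a size argument, returns a random permutation of that vector. The counts therefore stay equal while the order is randomized.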
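If the number of rows is not known in advance, the same idea can be written without hard-coding 1000. The following is only a sketch under some assumptions: df here is a made-up stand-in for the real data, the id column is hypothetical, and the seed is set only so the result is reproducible.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed, only for reproducibility

# Hypothetical stand-in for the 2000-row data set from the question
df <- data.frame(id = 1:2000)

df <- df %>%
  # rep_len() recycles c(1, 2) to the number of rows, n();
  # sample() then shuffles that vector into a random order
  mutate(group = sample(rep_len(c(1, 2), length.out = n())))

# Count how many rows landed in each group
table(df$group)
```

With an even number of rows the two groups are exactly the same size. With an odd number of rows, rep_len() yields one extra 1, so group 1 ends up one observation larger.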
