I want to create a random binary outcome whose probability depends on a column value in a data frame. For instance, using the data below, I want persons in States "A" and "C" to have a 70% probability of the outcome (coded 1), while persons in "B" and "D" have a random probability of occurrence. Sample data code:
ID <- 1:200
State <- c("A", "B", "C", "D")
State <- sort(rep(State, 50))
df <- data.frame(ID = ID,
                 State = State)
CodePudding user response:
The rbinom function is vectorized over the probability, so you can use a different probability for the two groups. I'm assuming that by "random" probability you mean 50%. That would look like
df$draw <- with(df, rbinom(length(ID), 1, ifelse(State %in% c("A","C"), .7, .5)))
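If "random" instead means that each of B and D should get its own randomly chosen probability, one possible variation is a named lookup vector indexed by state. This is only a sketch under that assumption: the name `probs`, the uniform draw on [0, 1], and the seed are illustrative choices, not something stated in the question.

```r
set.seed(1)  # arbitrary seed, only so the example is reproducible

df <- data.frame(ID = 1:200,
                 State = rep(c("A", "B", "C", "D"), each = 50))

# Hypothetical per-state lookup: A and C are fixed at 0.7, while B and D
# each get a single probability drawn uniformly from [0, 1]
probs <- c(A = 0.7, B = runif(1), C = 0.7, D = runif(1))

# Index the lookup by state name to get one probability per row;
# as.character() guards against State being stored as a factor
df$draw <- rbinom(nrow(df), 1, probs[as.character(df$State)])
```

The lookup keeps the per-state probabilities in one place, which may be easier to extend than nested ifelse calls if more states are added.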
You can check if things worked with
with(df, tapply(draw, State, mean))
With only 200 draws (50 per state) you'll see a lot of variability, but if you run it a few times you should see that A and C average around 70% and B and D around 50%.
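To see that variability directly, the simulation can be repeated many times and the per-state means summarized. This is a minimal sketch assuming the 70%/50% setup above; the seed, the 1000 repetitions, and the names `p` and `sims` are arbitrary choices for illustration.

```r
set.seed(42)  # arbitrary seed, only so the example is reproducible

df <- data.frame(ID = 1:200,
                 State = rep(c("A", "B", "C", "D"), each = 50))

# One probability per row: 0.7 for A and C, 0.5 for B and D
p <- ifelse(df$State %in% c("A", "C"), 0.7, 0.5)

# Each column of sims holds the four per-state means from one simulated run
sims <- replicate(1000, tapply(rbinom(nrow(df), 1, p), df$State, mean))

rowMeans(sims)      # long-run averages per state
apply(sims, 1, sd)  # run-to-run spread of the mean for groups of 50
```

Averaged over many runs, the per-state means should settle near the probabilities used, while the standard deviations show how far a single run of 50 draws per state can stray.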