I am trying to generate binary variables A, B, and C, where 1 means membership in that group. With that information, I would like to create another binary variable, X, that is assigned 1 or 0 with a probability that depends on the group. I am trying to figure out the correct way to express this conditional structure. All values listed below are hypothetical; this is just a data-generating simulation exercise.
A, B, C binary indicators:
A <- c(rep(1,195), rep(0,805))
B <- c(rep(1,90), rep(0,910))
C <- c(rep(1,715), rep(0,285))
To create the binary indicator X, I tried case_when, but something is wrong. It is possible I am misunderstanding what case_when can do, since I have another function embedded in it. I understand this is probably not correct, but hopefully it illustrates what I am aiming to do.
mutate(english == case_when(A == 1 ~ rbinom(250,1,.40),
B == 1 ~ rbinom(250,1,.90),
C == 1 ~ rbinom(500,1,.10))
I know I can also use the ifelse function but I am not quite sure how to make that work based on my question -- still trying to think through it. Thank you!
CodePudding user response:
To generate a list of individuals with a unique race and then randomly assign the english variable, you could use
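library(dplyr)  # provides tibble() and case_when()

df <- tibble(race = c(rep("black", 195), rep("hispanic", 90), rep("white", 715)),
             # each rbinom() call returns 1000 draws; case_when keeps, per row, the draw from the matching branch
             english = case_when(race == "black"    ~ rbinom(1000, 1, 0.98),
                                 race == "hispanic" ~ rbinom(1000, 1, 0.40),
                                 race == "white"    ~ rbinom(1000, 1, 0.98)))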
This does not require binary columns at all. If you still want them, you could try
library(dplyr)
library(tidyr)  # provides pivot_wider()

df <- tibble(row_number = 1:1000,
             temp = 1,
             race = c(rep("black", 195), rep("hispanic", 90), rep("white", 715)),
             english = case_when(race == "black"    ~ rbinom(1000, 1, 0.98),
                                 race == "hispanic" ~ rbinom(1000, 1, 0.40),
                                 race == "white"    ~ rbinom(1000, 1, 0.98))) %>%
  # spread race into one 0/1 column per group; row_number keeps every row distinct
  pivot_wider(names_from = race, values_from = temp, values_fill = 0)
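If you would rather keep the A, B, C indicators from the question and use ifelse, one possible approach is to build a per-row probability first and then make a single rbinom call, since rbinom is vectorised over prob. This is only a sketch: it uses the probabilities from the question (.40, .90, .10), and it assumes the three groups are mutually exclusive and sum to 1000 rows, which the question's own A, B, C vectors do not guarantee. The names group, p and sim are made up for illustration. It also sidesteps the likely problem in the original attempt, where the rbinom calls returned 250 or 500 values while the data has 1000 rows.

```r
set.seed(42)  # assumption: fixed seed so the sketch is reproducible

n <- 1000
# Assumption: mutually exclusive groups, laid out like the race column above
group <- c(rep("A", 195), rep("B", 90), rep("C", 715))

# Binary indicators derived from the group label
A <- as.integer(group == "A")
B <- as.integer(group == "B")
C <- as.integer(group == "C")

# Per-row success probability (values taken from the question)
p <- ifelse(A == 1, 0.40, ifelse(B == 1, 0.90, 0.10))

# One Bernoulli draw per row; each row uses its own probability
X <- rbinom(n, size = 1, prob = p)

sim <- data.frame(A, B, C, X)
```

Because the probability is chosen per row before sampling, only one draw is made per individual, and the same pattern extends to more groups by adding further ifelse branches.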
If you have another vector
other_var <- 1000:1
that you want to attach as a column, then you could use either
df <- df %>% bind_cols(other = other_var)
or
df$other <- other_var
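Since this is a simulation exercise, it may be worth checking that the simulated rates land near the intended probabilities. The sketch below is one way to do that in base R. It regenerates the race and english vectors with the same probabilities as above rather than reusing df, and the names target and observed are hypothetical.

```r
set.seed(1)  # assumption: fixed seed for reproducibility

race <- c(rep("black", 195), rep("hispanic", 90), rep("white", 715))

# Target probabilities, the same hypothetical values used above
target <- c(black = 0.98, hispanic = 0.40, white = 0.98)

# Index the named vector by race to get one probability per row
english <- rbinom(length(race), size = 1, prob = target[race])

# Share of 1s within each group, to compare against the targets
observed <- tapply(english, race, mean)
round(cbind(target = target[names(observed)], observed), 3)
```

With a group as small as 90 rows, the observed share can differ noticeably from the target, so some deviation is expected.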