I simulated the following general "survey experiment" data:
n <- 100
df <- data.frame(
  Q1 = sample(18:90, n, replace = TRUE),        # age
  Q2 = sample(c("m", "f"), n, replace = TRUE),  # sex
  Q3 = sample(c(0, 1), n, replace = TRUE, prob = c(0.55, 0.45)),  # other general pre-treatment questions
  Q4 = sample(c(0, 1), n, replace = TRUE),
  Q5 = sample(c(0, 1), n, replace = TRUE),      # treatment
  Q6 = sample(c(0, 1), n, replace = TRUE),      # post-treatment
  Q7 = sample(c(0, 1), n, replace = TRUE),
  Q8 = sample(c(0, 1), n, replace = TRUE),
  Q9 = sample(c(0, 1), n, replace = TRUE),
  Q10 = sample(c(0, 1), n, replace = TRUE))
I'd like to randomly simulate attrition (NA values) in this data. The following question deals with a similar issue: How do I add random `NA`s into a data frame
However, I'm interested in generating data that simulates respondents who left the survey completely, so that every question after the dropout point is NA. This may look something like this:
Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
18 m 1 0 NA NA NA NA NA NA
30 f NA NA NA NA NA NA NA NA
25 f 1 0 1 0 NA NA NA NA
Thanks!
Answer:
With base R:
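# For each respondent, draw a random dropout column (Q3 to Q10)
# and set that column and every later one to NA.
for (i in seq_len(nrow(df))) {
  a <- sample(3:ncol(df), 1)
  df[i, a:ncol(df)] <- NA
}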
head(df)
gives:
Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
1 29 f 1 1 1 0 NA NA NA NA
2 59 f NA NA NA NA NA NA NA NA
3 48 m 1 0 NA NA NA NA NA NA
4 38 m 0 1 0 NA NA NA NA NA
5 30 f 1 1 0 0 NA NA NA NA
6 57 m 1 1 1 1 0 NA NA NA
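
Note that with the approach above every respondent drops out at some point, since Q10 is always set to NA. If some respondents should finish the survey, one possible variation is sketched below. It is only a sketch under assumptions that are not in the question: the names `drop_at` and `mask` are made up for illustration, a dropout value of `ncol(df) + 1` is taken to mean "completed", the dropout point is drawn uniformly, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only so the sketch is reproducible

# Rebuild a data frame shaped like the one in the question
n <- 100
df <- data.frame(
  Q1 = sample(18:90, n, replace = TRUE),       # age
  Q2 = sample(c("m", "f"), n, replace = TRUE)  # sex
)
for (q in paste0("Q", 3:10)) {
  df[[q]] <- sample(c(0, 1), n, replace = TRUE)
}

# Hypothetical dropout column per respondent;
# ncol(df) + 1 is used here to mean "answered everything"
drop_at <- sample(3:(ncol(df) + 1), n, replace = TRUE)

# col(df) holds the column index of every cell. drop_at has one value per
# row and is recycled down each column, so cell [i, j] is compared with
# drop_at[i].
mask <- col(df) >= drop_at
df[mask] <- NA

head(df)
```

Because the mask is built in one step, there is no loop over rows, and `colMeans(is.na(df))` can then be used to check the share of missing answers per question.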