My for loop below uses the elements in time_to_remove to remove those time points from a randomly chosen study. The available elements of time to remove are: 2, 3, 4.
But there is a little issue: time elements have an order. So, after sample()ing for removal, we can't end up with time = 0,1,3 or time = 0,1,2,4, etc. We should only end up with time = 0,1; time = 0,1,2; time = 0,1,2,3; or time = 0,1,2,3,4.
I wonder if such a condition can be added to my current code below?
library(tidyverse)
data <- expand_grid(study=1:4,group=1:2, time=0:4)
n = 1
time_to_remove <- unique(data$time)[-(1:2)]
unique_study <- unique(data$study)
for(i in time_to_remove) {
  studies_to_remove = sample(unique_study, size = n)
  time_to_remove = i
  unique_study <- setdiff(unique_study, studies_to_remove)
  data <- data %>%
    filter(
      !( study %in% studies_to_remove &
           time %in% time_to_remove ))
}
data %>% as.data.frame()
CodePudding user response:
I know this is not an answer, but the questioner wanted to know what the code would look like with the suggestion from my comment applied:
for(i in time_to_remove) {
  studies_to_remove = sample(unique_study, size = n)
  # time_to_remove = i
  unique_study <- setdiff(unique_study, studies_to_remove)
  data <- data %>%
    filter(
      !( study %in% studies_to_remove &
           time > i ))
}
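For reference, here is a minimal self-contained sketch of the same idea in base R, so it runs without the tidyverse. It is one possible reading of the question, not a definitive solution. It assumes that "removing time i" should drop time i and every later time point (`time >= i`), which is what produces the outcomes listed in the question. The seed, the index-based sampling, and the helper `is_prefix()` are additions for illustration only.

```r
# Base R sketch: truncate randomly chosen studies from time i onward,
# so the remaining time points of each study never contain a gap.
set.seed(1)  # arbitrary seed, only to make the run reproducible

data <- expand.grid(time = 0:4, group = 1:2, study = 1:4)
n <- 1
time_to_remove <- sort(unique(data$time))[-(1:2)]  # 2, 3, 4
unique_study <- unique(data$study)

for (i in time_to_remove) {
  # Sample positions rather than values: sample(x, ...) treats a single
  # number x as 1:x, which would misbehave if only one study were left.
  idx <- sample.int(length(unique_study), size = n)
  studies_to_remove <- unique_study[idx]
  unique_study <- setdiff(unique_study, studies_to_remove)

  # Assumption: drop time i AND everything after it for the chosen studies.
  data <- data[!(data$study %in% studies_to_remove & data$time >= i), ]
}

# Check: within every study the surviving times are exactly 0, 1, ..., max.
# `is_prefix()` is a hypothetical helper, not part of any package.
times_by_study <- lapply(split(data$time, data$study),
                         function(x) sort(unique(x)))
is_prefix <- function(x) identical(as.integer(x), 0:max(x))
stopifnot(all(vapply(times_by_study, is_prefix, logical(1))))
```

Note the difference from `time > i` in the snippet above: with `>`, time i itself is kept, so the pass with i = 4 removes nothing and no study ever ends at time = 0,1. With `>=`, the three sampled studies end at 0,1; 0,1,2; and 0,1,2,3, and the untouched study keeps 0,1,2,3,4.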