I wonder how to loop my code below to make it more functional and generalizable for other data (the current data is just a toy):
FIRST, I select a study from data using sample() and then filter() out its rows whose outcome == outcome_to_remove. This gives the datat output.
SECOND, I select a study from datat using sample() and then filter() out its rows whose outcome == outcome_to_remove2. This gives the final output.
Can we possibly loop this process?
EDIT: The only condition I would like to add to my code is that length(unique(data$study)) before and after the looping should always remain the same. That is, it shouldn't be possible for a study to lose its outcome == "A" rows in the FIRST step and its outcome == "B" rows in the SECOND step, so that the whole study gets deleted.
library(tidyr)  # expand_grid()
library(dplyr)  # filter(), %>%

(data <- expand_grid(study = 1:5, group = 1:2, outcome = c("A", "B")))
n = 1

#====-------------------- FIRST:
studies_to_remove = sample(unique(data$study), size = n)
outcome_to_remove = c("A")

datat <- data %>%
  filter(
    !( study %in% studies_to_remove &
       outcome %in% outcome_to_remove
    ))

#====------------------- SECOND:
studies_to_remove2 = sample(unique(datat$study), size = n)
outcome_to_remove2 = c("B")

datat %>%
  filter(
    !( study %in% studies_to_remove2 &
       outcome %in% outcome_to_remove2
    ))
CodePudding user response:
With the help of a for loop:
library(dplyr)  # filter(), %>%

data <- tidyr::expand_grid(study = 1:5, group = 1:2, outcome = c("A", "B"))
n = 1

set.seed(9873)
outcomes <- unique(data$outcome)
unique_study <- unique(data$study)

for(outcome_to_remove in outcomes) {
  studies_to_remove = sample(unique_study, size = n)
  # Exclude the sampled studies from later draws so no study loses every outcome
  unique_study <- setdiff(unique_study, studies_to_remove)
  cat('\nDropping study ', studies_to_remove, 'and outcome ', outcome_to_remove)
  data <- data %>%
    filter(
      !( study %in% studies_to_remove &
         outcome %in% outcome_to_remove
      ))
}
#Dropping study 3 and outcome A
#Dropping study 1 and outcome B
data
#   study group outcome
#   <int> <int> <chr>
# 1     1     1 A
# 2     1     2 A
# 3     2     1 A
# 4     2     1 B
# 5     2     2 A
# 6     2     2 B
# 7     3     1 B
# 8     3     2 B
# 9     4     1 A
#10     4     1 B
#11     4     2 A
#12     4     2 B
#13     5     1 A
#14     5     1 B
#15     5     2 A
#16     5     2 B
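
To reuse this on other data, the loop above could be wrapped in a function along these lines. This is only a sketch: the name drop_outcomes and its arguments are made up for illustration, and it assumes every study starts with at least two distinct outcomes. It samples by index because sample() treats a single number as 1:x, which could bite once only one numeric study is left to draw from. The stopifnot() at the end checks the condition from the EDIT, namely that the number of unique studies is unchanged.

```r
library(dplyr)
library(tidyr)

# Hypothetical helper (not from the original answer): for each outcome in
# `outcomes`, drop that outcome from `n` randomly chosen studies. A study is
# picked at most once, so it can lose at most one outcome.
drop_outcomes <- function(data, outcomes = unique(data$outcome), n = 1) {
  remaining <- unique(data$study)
  n_before  <- length(remaining)

  for (out in outcomes) {
    # There must be enough untouched studies left to sample from.
    stopifnot(length(remaining) >= n)

    # Index-based sampling avoids sample()'s special case for a single
    # number, where sample(5, 1) draws from 1:5 instead of returning 5.
    picked    <- remaining[sample.int(length(remaining), size = n)]
    remaining <- setdiff(remaining, picked)

    data <- data %>%
      filter(!(study %in% picked & outcome %in% out))
  }

  # The number of distinct studies must be the same before and after.
  stopifnot(n_distinct(data$study) == n_before)
  data
}

toy <- expand_grid(study = 1:5, group = 1:2, outcome = c("A", "B"))
set.seed(1)
result <- drop_outcomes(toy, n = 1)
```

If there are more outcomes than studies (times n), the first stopifnot() stops the loop instead of silently reusing a study. Whether that is the right behaviour depends on the real data.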