I have a dataset loaded into R and 8 dplyr filters defined on it. I would like to exhaustively apply all combinations of two filters to the dataset, to examine the size of the resulting groups. Currently, I have these filters stored in variables like so:
filter1 <- data %>% filter(var1 == x)
filter2 <- data %>% filter(var2 == y)
...
filter8 <- data %>% filter(varn == z)
Now I would like to store these filters in a vector, so that I can apply them in a for loop and do something like this:
filter_vec <- c(filter1, filter2, ..., filter8)
for (i in filter_vec){
  for (j in filter_vec){
    filter_intersection <- dplyr::intersect(i,j)
    print(nrow(filter_intersection))
  }
}
However, this doesn't work. The result is always a list object of length 0, indicating that all intersections are empty, when in fact they are not.
I have tried a number of things to get the code to work. To start, instead of working with filters, I tried to get the filtering functionality from functions (as suggested in this post: Can we combine 2 filters in R?) and to work with vectors of functions instead. Here I run into essentially the same problem (including "invalid subscript type 'list'" errors).
Furthermore, I tried: 1) working with "expr" variables, 2) parsing expressions using rlang and 3) passing items from the vector through "unlist" before plugging them into the dplyr function. However, none of these seems to do the trick. How can I store and apply dplyr filters from a vector in R?
CodePudding user response:
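Store the data frames in a list rather than combining them with c(). c() flattens the data frames into a single list of columns, so the loop iterates over individual columns instead of whole data frames. Then use the for loop in the following way.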
filter_vec <- list(filter1, filter2, ..., filter8)
output <- vector('list', length(filter_vec) * length(filter_vec))
k <- 0
for (i in filter_vec){
  for (j in filter_vec){
    k <- k + 1
    output[[k]] <- dplyr::intersect(i,j)
    print(nrow(output[[k]]))
  }
}
output
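The nested loop above visits every ordered pair, so each combination is counted twice and every filter is also intersected with itself. If only the unordered pairs of two distinct filters are of interest, base R's combn() can enumerate them. The following is a minimal sketch under stated assumptions: the toy `data`, its columns, and the names `filters`, `pairs` and `sizes` are made up for illustration and are not from the question. Since dplyr::intersect() drops duplicate rows, the toy data includes a unique `id` column.

```r
library(dplyr)

# Hypothetical toy data standing in for `data` in the question
data <- tibble(
  id   = 1:6,
  var1 = c("a", "a", "b", "a", "b", "a"),
  var2 = c(1, 2, 1, 1, 2, 1),
  var3 = c(TRUE, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# A named list keeps each filtered data frame intact (unlike c())
filters <- list(
  f1 = data %>% filter(var1 == "a"),
  f2 = data %>% filter(var2 == 1),
  f3 = data %>% filter(var3)
)

# Each column of `pairs` holds the names of one unordered pair of filters
pairs <- combn(names(filters), 2)

# Number of rows shared by the two filtered data frames in each pair
sizes <- apply(pairs, 2, function(p) {
  nrow(dplyr::intersect(filters[[p[1]]], filters[[p[2]]]))
})
names(sizes) <- apply(pairs, 2, paste, collapse = " & ")

print(sizes)
```

With 8 filters this gives choose(8, 2) = 28 pairs instead of the 64 iterations of the nested loop, and the names on `sizes` show which two filters each count belongs to.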