I'm trying to use any_of() within a filter() to handle variable names that may or may not be in a data frame when I run a function across it. When I do this, however, I get an error saying that any_of() must be used within a *selecting* function.
Is it possible to select the columns non-specifically like this, or do I need to create a method to name the expected column explicitly?
I've made a quick example that shows the issue and would be very interested in any workarounds or suggestions.
data("iris")
iris %>%
mutate(Sepal.Area = Sepal.Length*Sepal.Width,
Petal.Area = Petal.Length*Petal.Width) %>%
filter(if_all(starts_with("Sepal"), ~.>4),
any_of(c("Petal.Area", "Petal.Diameter"), ~.>2))
CodePudding user response:
We may have to wrap the any_of() in if_all(), or use matches() instead:
iris %>%
mutate(Sepal.Area = Sepal.Length*Sepal.Width,
Petal.Area = Petal.Length*Petal.Width) %>%
filter(if_all(starts_with("Sepal"), ~ .x > 4),
if_all(matches("Petal.Area|Petal.Diameter"), ~ .x > 2))
Or, keeping any_of() and wrapping it in if_all():
iris %>%
filter(if_all(any_of(c("Petal.Length", "hello")), ~ .x > 2),
if_all(starts_with("Sepal"), ~ .x > 4))
These are two different things. any_of() selects only the columns that are found in the dataset, without returning an error if some of the named columns are missing, whereas if_all() loops over the selected columns and returns TRUE for a row only if the condition holds for every selected column (if_any() returns TRUE if the condition holds for at least one of the selected columns). For example:
> d1 <- data.frame(col1 = 1:3, col2 = -1:1, col3 = 2:4)
# col4 is not found
> d1 %>%
filter(if_all(any_of(c("col1", "col2", "col4")), ~ .x > 0))
col1 col2 col3
1 3 1 4
> d1 %>%
filter(if_any(c("col1", "col2", "col4"), ~ .x > 0))
Error in `filter()`:
! Problem while expanding `..1 = if_any(c("col1", "col2", "col4"), ~.x > 0)`.
Caused by error in `if_any()`:
! Can't select columns that don't exist.
✖ Column `col4` doesn't exist.
> d1 %>%
filter(if_any(c("col1", "col2", "col3"), ~ .x > 0))
col1 col2 col3
1 1 -1 2
2 2 0 3
3 3 1 4
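Since the question mentions running this inside a function, here is a minimal sketch of how the any_of() wrapper could be packaged. The helper name filter_existing() and its threshold argument are hypothetical, made up for illustration, and the example reuses d1 from above. It also shows that if_any() accepts the same any_of() wrapper, which avoids the "Column `col4` doesn't exist" error.

```r
library(dplyr)

d1 <- data.frame(col1 = 1:3, col2 = -1:1, col3 = 2:4)

# Hypothetical helper (not part of dplyr): keep the rows where every column
# named in `cols` that actually exists in `df` is greater than `threshold`.
filter_existing <- function(df, cols, threshold = 0) {
  df %>%
    filter(if_all(any_of(cols), ~ .x > threshold))
}

# col4 is missing from d1, so only col1 and col2 are tested
res_all <- filter_existing(d1, c("col1", "col2", "col4"))

# if_any() with the same any_of() wrapper: missing names are skipped
res_any <- d1 %>%
  filter(if_any(any_of(c("col1", "col2", "col4")), ~ .x > 0))
```

Here res_all should hold the single row where both col1 and col2 are positive, and res_any should keep all three rows because col1 is always positive. If none of the names in cols exist, the selection is empty, so it is worth checking that case separately before relying on the result.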