Conditionally filter on multiple columns using regex to pick colnames R

Time:01-03

I have an issue similar to this thread:

Search across multiple columns with a regular expression and extract the match

Let's take the iris dataset as an example. I would like to filter rows based on the values in several columns: say, >= 4 in the columns whose names end with ".Length". My real data are much more complex than this reprex, which is why I want to select the columns with a regular expression rather than pick them one by one by index.

I tried multiple ways, including the following:

filtered <- iris %>% dplyr::filter(across(matches('.Length')>=4))

to no avail. Please help.

CodePudding user response:

Using dplyr::if_all you could do:

library(dplyr)

iris1 <- iris %>% 
  filter(if_all(matches("\\.Length$"), ~ .x >= 4))

str(iris1)
#> 'data.frame':    89 obs. of  5 variables:
#>  $ Sepal.Length: num  7 6.4 6.9 5.5 6.5 5.7 6.3 6.6 5.9 6 ...
#>  $ Sepal.Width : num  3.2 3.2 3.1 2.3 2.8 2.8 3.3 2.9 3 2.2 ...
#>  $ Petal.Length: num  4.7 4.5 4.9 4 4.6 4.5 4.7 4.6 4.2 4 ...
#>  $ Petal.Width : num  1.4 1.5 1.5 1.3 1.5 1.3 1.6 1.3 1.5 1 ...
#>  $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2 2 2 2 2 ...
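
As a hedged aside, the difference between if_all() and if_any() can be sketched on the same data. The names pattern, all_long and any_long are illustrative only, and the regex escapes the dot and anchors the end so that only names ending in ".Length" match:

```r
library(dplyr)

# Literal dot followed by "Length" at the end of the column name
pattern <- "\\.Length$"

# Keep rows where every matching column is >= 4
all_long <- iris %>%
  filter(if_all(matches(pattern), ~ .x >= 4))

# Keep rows where at least one matching column is >= 4
any_long <- iris %>%
  filter(if_any(matches(pattern), ~ .x >= 4))

nrow(all_long)
#> [1] 89
nrow(any_long)
#> [1] 150
```

Every Sepal.Length in iris is above 4, so if_any() keeps all 150 rows, while if_all() also requires Petal.Length >= 4.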

CodePudding user response:

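dplyr::if_all and if_any were introduced for exactly this situation, for use in conjunction with filter: https://www.tidyverse.org/blog/2021/02/dplyr-1-0-4-if-any/. As an alternative, across also works inside filter, though newer dplyr releases deprecate that usage in favor of if_all and if_any.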

Here we could use an anonymous function:

across(.cols = everything(), .fns = NULL, ..., .names = NULL)

where the .fns argument can be a purrr-style lambda, e.g. ~ . >= 4:

library(dplyr)
iris1 <- iris %>% 
    filter(across(ends_with('.Length'), ~ . >= 4))

> str(iris1)
'data.frame':   89 obs. of  5 variables:
 $ Sepal.Length: num  7 6.4 6.9 5.5 6.5 5.7 6.3 6.6 5.9 6 ...
 $ Sepal.Width : num  3.2 3.2 3.1 2.3 2.8 2.8 3.3 2.9 3 2.2 ...
 $ Petal.Length: num  4.7 4.5 4.9 4 4.6 4.5 4.7 4.6 4.2 4 ...
 $ Petal.Width : num  1.4 1.5 1.5 1.3 1.5 1.3 1.6 1.3 1.5 1 ...
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 2 2 2 2 2 2 2 2 2 2 ...
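
With more complex column names it may help to check which columns a selector actually picks before filtering on them. A minimal sketch on iris, where picked_loose and picked_strict are hypothetical names used only for illustration:

```r
library(dplyr)

# An unescaped "." is a regex wildcard, so ".Length" would also match
# a name such as "xLength" or "Length2" preceded by any character
picked_loose <- iris %>% select(matches(".Length")) %>% names()

# Escaping the dot and anchoring the end matches only names ending in ".Length"
picked_strict <- iris %>% select(matches("\\.Length$")) %>% names()

picked_loose
#> [1] "Sepal.Length" "Petal.Length"
picked_strict
#> [1] "Sepal.Length" "Petal.Length"
```

On iris both patterns select the same two columns; the stricter pattern only matters when other column names happen to contain "Length".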