Find a specific string with grepl across all columns in R dplyr

Time:09-10

In a huge data.frame, I am trying to search all columns for a string using dplyr in R.

I am unsure what I am doing wrong, but here is an example of what I am trying. Let's say that I am trying to find audi in mpg, that audi exists in multiple columns, and that I want to extract only the rows that contain audi.

This does not work. Any ideas?

library(tidyverse)
head(mpg)
#> # A tibble: 6 × 11
#>   manufacturer model displ  year   cyl trans      drv     cty   hwy fl    class 
#>   <chr>        <chr> <dbl> <int> <int> <chr>      <chr> <int> <int> <chr> <chr> 
#> 1 audi         a4      1.8  1999     4 auto(l5)   f        18    29 p     compa…
#> 2 audi         a4      1.8  1999     4 manual(m5) f        21    29 p     compa…
#> 3 audi         a4      2    2008     4 manual(m6) f        20    31 p     compa…
#> 4 audi         a4      2    2008     4 auto(av)   f        21    30 p     compa…
#> 5 audi         a4      2.8  1999     6 auto(l5)   f        16    26 p     compa…
#> 6 audi         a4      2.8  1999     6 manual(m5) f        18    26 p     compa…
mpg |> 
  filter(if_all(.cols = everything(), ~grepl("audi",.)))
#> # A tibble: 0 × 11
#> # … with 11 variables: manufacturer <chr>, model <chr>, displ <dbl>,
#> #   year <int>, cyl <int>, trans <chr>, drv <chr>, cty <int>, hwy <int>,
#> #   fl <chr>, class <chr>

Created on 2022-09-09 with reprex v2.0.2

CodePudding user response:

Use if_any to keep a row if any of the columns (i.e. at least one) matches the pattern. With if_all, every column would have to match the pattern, and numeric columns such as displ or year never contain "audi", so no rows are returned.

mpg |> 
  filter(if_any(.cols = everything(), ~ grepl("audi", .)))
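
To make the difference concrete, here is a minimal sketch on a made-up tibble (the `cars` data and its column names are hypothetical, not from the question). It assumes dplyr >= 1.0.4, where if_any() and if_all() are available, and R >= 4.1 for the native pipe. The last call shows one possible refinement: restricting the search to character columns with where(is.character), so numeric columns are not coerced to text and searched.

```r
library(dplyr)

# Hypothetical toy data: "audi" appears in different columns on different rows
cars <- tibble(
  make  = c("audi", "ford", "audi"),
  note  = c("audi dealer", "traded for an audi", "n/a"),
  price = c(10, 20, 30)
)

# if_any(): keep rows where at least one column matches
any_match <- cars |>
  filter(if_any(everything(), ~ grepl("audi", .x)))

# if_all(): keep rows only where every column matches, including the numeric one
all_match <- cars |>
  filter(if_all(everything(), ~ grepl("audi", .x)))

# if_all() limited to character columns: both text columns must match
chr_match <- cars |>
  filter(if_all(where(is.character), ~ grepl("audi", .x)))

nrow(any_match)
nrow(all_match)
nrow(chr_match)
```

On this toy data, if_any() keeps every row, if_all() over everything() keeps none because price never matches, and if_all() over the character columns keeps only the first row.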

CodePudding user response:

Here is a base R option:

library(ggplot2) # Load for mpg dataset
mpg[Reduce(`|`, lapply(mpg, grepl, pattern = "audi")),]
#> # A tibble: 18 × 11
#>    manufacturer model      displ  year   cyl trans drv     cty   hwy fl    class
#>    <chr>        <chr>      <dbl> <int> <int> <chr> <chr> <int> <int> <chr> <chr>
#>  1 audi         a4           1.8  1999     4 auto… f        18    29 p     comp…
#>  2 audi         a4           1.8  1999     4 manu… f        21    29 p     comp…
#>  3 audi         a4           2    2008     4 manu… f        20    31 p     comp…
#>  4 audi         a4           2    2008     4 auto… f        21    30 p     comp…
#>  5 audi         a4           2.8  1999     6 auto… f        16    26 p     comp…
#>  6 audi         a4           2.8  1999     6 manu… f        18    26 p     comp…
#>  7 audi         a4           3.1  2008     6 auto… f        18    27 p     comp…
#>  8 audi         a4 quattro   1.8  1999     4 manu… 4        18    26 p     comp…
#>  9 audi         a4 quattro   1.8  1999     4 auto… 4        16    25 p     comp…
#> 10 audi         a4 quattro   2    2008     4 manu… 4        20    28 p     comp…
#> 11 audi         a4 quattro   2    2008     4 auto… 4        19    27 p     comp…
#> 12 audi         a4 quattro   2.8  1999     6 auto… 4        15    25 p     comp…
#> 13 audi         a4 quattro   2.8  1999     6 manu… 4        17    25 p     comp…
#> 14 audi         a4 quattro   3.1  2008     6 auto… 4        17    25 p     comp…
#> 15 audi         a4 quattro   3.1  2008     6 manu… 4        15    25 p     comp…
#> 16 audi         a6 quattro   2.8  1999     6 auto… 4        15    24 p     mids…
#> 17 audi         a6 quattro   3.1  2008     6 auto… 4        17    25 p     mids…
#> 18 audi         a6 quattro   4.2  2008     8 auto… 4        16    23 p     mids…
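
For anyone wondering how the one-liner works, it can be split into steps. This is only a sketch on a made-up data frame (`df` and its columns are hypothetical): lapply() runs grepl() on each column and returns one logical vector per column, and Reduce() with `|` combines those vectors elementwise into a single row mask.

```r
# Hypothetical toy data frame (not from the question)
df <- data.frame(
  make = c("audi", "ford", "honda"),
  note = c("n/a", "was an audi", "n/a"),
  stringsAsFactors = FALSE
)

# One logical vector per column: does each cell contain "audi"?
per_column <- lapply(df, grepl, pattern = "audi")

# Elementwise OR across the per-column vectors gives one TRUE/FALSE per row
row_mask <- Reduce(`|`, per_column)

# Keep the rows where at least one column matched
hits <- df[row_mask, ]
```

Swapping `|` for `&` in Reduce() would give the "every column must match" behaviour instead.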

Created on 2022-09-09 with reprex v2.0.2
