R extract rows with a set of values in multiple columns of dataframe

Time:10-13

I've got a dataframe with 15 columns (the data are categorical). I'd like to extract rows with contradictory categories (based on a set of rules). I tried Df %>% filter_at(vars(col_1, col_2), any_vars(. %in% c(8, 1))) and it works fine for rows with category 8 or rows with category 1. The problem is that I'd like both 8 and 1 in the same row (that's the way I figure it would catch the contradictions in the dataset).

I'd appreciate any ideas on this.

CodePudding user response:

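We may combine two if_any() conditions with &, each testing one value with ==, so that a row is kept only when both 8 and 1 occur in the selected columns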

library(dplyr)
Df %>%
   filter(if_any(c(col_1, col_2), ~ .x == 8) & if_any(c(col_1, col_2), ~ .x == 1))

-output

  ID col_1 col_2
1  1     1     8

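Another option is to paste the columns together and detect the combination with a regex (this relies on single-character category codes, because the values are concatenated without a separator)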

library(stringr)
Df %>%
    filter(str_detect(str_c(col_1, col_2), "18|81"))

-output

  ID col_1 col_2
1  1     1     8
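
As a small illustration of why the paste approach depends on single-character codes, here is a base R sketch using paste0/grepl in place of str_c/str_detect. The data frame Df2 and its two-digit category 18 are made up for this example and are not from the question. It compares the pattern as written with a variant that adds a separator and anchors.

```r
# Hypothetical data with a two-digit category (18) to illustrate the pitfall
Df2 <- data.frame(ID = 1:3, col_1 = c(1, 1, 11), col_2 = c(8, 18, 8))

# No separator: rows 2 and 3 both paste to "118", which contains "18"
loose <- Df2[grepl("18|81", paste0(Df2$col_1, Df2$col_2)), ]

# Separator plus anchors: only an exact 1/8 or 8/1 pair matches
strict <- Df2[grepl("^(1_8|8_1)$", paste(Df2$col_1, Df2$col_2, sep = "_")), ]

print(loose$ID)   # all three rows match
print(strict$ID)  # only row 1 matches
```

With only single-digit categories, as in the example data of this answer, both patterns select the same rows.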

If there are more than two values to check, we may loop over them with map and combine the resulting conditions with reduce

library(purrr)
Df %>% 
  filter(map(c(1, 8), \(x) if_any(c(col_1, col_2), ~ .x == x)) %>%
    reduce(`&`))

-output

  ID col_1 col_2
1  1     1     8
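
The same idea can also be sketched in base R without extra packages. This is only one possible way to write it. The helper name has_all_values is hypothetical, and the columns are passed as a character vector of names, which is an assumption for this example.

```r
# Example data, same as in the data section of this answer
Df <- data.frame(ID = 1:5, col_1 = c(1, 2, 3, 4, 1), col_2 = c(8, 8, 3, 4, 1))

# Hypothetical helper: TRUE for rows where every value in `values`
# appears in at least one of the columns named in `cols`
has_all_values <- function(data, cols, values) {
  # One logical vector per value: does any selected column equal it?
  checks <- lapply(values, function(v) rowSums(data[cols] == v, na.rm = TRUE) > 0)
  # A row qualifies only if all per-value checks are TRUE
  Reduce(`&`, checks)
}

result <- Df[has_all_values(Df, c("col_1", "col_2"), c(1, 8)), ]
print(result)
```

Adding more values to the last argument, or more column names to the second, extends the check without changing the helper.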

data

Df <- data.frame(ID = 1:5, col_1 = c(1, 2, 3, 4, 1), col_2 = c(8, 8, 3, 4, 1))