Apologies if this has been answered before. I struggled to find an answer that helps me.
Let's say I have a data frame:
Name <- c('P1','P2;P3','P4','P5','P6;P7', "P8", "P9")
Count <- c(15,3,10,4,3,11,9)
df <- data.frame(Name, Count)
I want to keep the rows where the text in column Name matches any value in the list below:
list <- c("P1", "P2", "P6", "P9")
Note that list has fewer values than df has rows. The resulting data frame should be:
Name | Count |
---|---|
P1 | 15 |
P2;P3 | 3 |
P6;P7 | 3 |
P9 | 9 |
Whatever I try, R fails to match the values separated by semicolons and drops those rows from the result. I prefer Tidyverse-based functions, but any help would be gratefully received.
Many thanks in advance, Andy
CodePudding user response:
You could do:
library(dplyr)
df %>% filter(sapply(Name, function(x) any(stringr::str_detect(x, list))))
#> Name Count
#> 1 P1 15
#> 2 P2;P3 3
#> 3 P6;P7 3
#> 4 P9 9
Or in full tidyverse idiom:
library(tidyverse)
df %>% filter(map_lgl(Name, ~any(str_detect(.x, list))))
#> Name Count
#> 1 P1 15
#> 2 P2;P3 3
#> 3 P6;P7 3
#> 4 P9 9
As an obligatory side note, it is bad practice to call a variable list, since this clashes with the name of the base function list().
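One caveat worth flagging: str_detect() treats each value as a regular expression and matches it anywhere in the string, so "P1" would also match a name such as "P10". If you want exact matches on the semicolon-separated pieces, a sketch along these lines may be safer. The extra "P10" row and the variable name wanted are my own additions for illustration, not part of the original data.

```r
library(dplyr)
library(stringr)
library(purrr)

# Original data plus a hypothetical "P10" row to show the substring pitfall
df <- data.frame(
  Name  = c("P1", "P2;P3", "P4", "P5", "P6;P7", "P8", "P9", "P10"),
  Count = c(15, 3, 10, 4, 3, 11, 9, 1),
  stringsAsFactors = FALSE
)

# Renamed from `list` so it does not share a name with base::list()
wanted <- c("P1", "P2", "P6", "P9")

# Split each Name on ";" and keep the row if any piece is exactly in `wanted`
res <- df %>%
  filter(map_lgl(str_split(Name, ";"), ~ any(.x %in% wanted)))

print(res)
```

Because the comparison uses %in% on the split pieces, "P10" is not kept even though it contains "P1".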
CodePudding user response:
Split names and compare with the list:
df[ sapply(strsplit(df$Name, ";"), function(i) any(i %in% list)), ]
Or use grepl with the values joined by the regex OR operator, "|":
df[ grepl(paste(list, collapse = "|"), df$Name), ]
# Name Count
# 1 P1 15
# 2 P2;P3 3
# 5 P6;P7 3
# 7 P9 9
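Note that the plain grepl() pattern matches substrings, so it would also pick up a name like "P10" when searching for "P1", whereas the strsplit() version compares whole pieces. If you prefer to stay with grepl(), one possible sketch is to anchor each value to the start or end of the string or to a semicolon. The "P10" row and the names wanted and pattern are assumptions added for illustration only.

```r
# Original data plus a hypothetical "P10" row to show the difference
df <- data.frame(
  Name  = c("P1", "P2;P3", "P4", "P5", "P6;P7", "P8", "P9", "P10"),
  Count = c(15, 3, 10, 4, 3, 11, 9, 1),
  stringsAsFactors = FALSE
)
wanted <- c("P1", "P2", "P6", "P9")

# Each value must be preceded by the string start or ";" and
# followed by ";" or the string end
pattern <- paste0("(^|;)(", paste(wanted, collapse = "|"), ")(;|$)")

res <- df[grepl(pattern, df$Name), ]
print(res)
```

This assumes the values in wanted contain no regex metacharacters; if they might, they would need escaping first.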