How do I keep only the rows that contain one of a given list of strings? I don't want to call grepl() with every value hardcoded. Let's assume that I want to keep only records that contain abc or bbc or bcc or 20 more options in one of the columns, and I have x <- c("abc", "bbc", ....).
What can I do to keep only the records containing values of x in the dataframe?
CodePudding user response:
You can use %in%, which keeps the rows where v1 is exactly equal to one of the values in x:
df_out <- df[df$v1 %in% x, ]
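Or, you could form a regex alternation with the values in x and then use grepl. The ^ and $ anchors make the pattern match the whole value; drop them if the values of x may appear anywhere inside the string: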
regex <- paste0("^(?:", paste(x, collapse="|"), ")$")
df_out <- df[grepl(regex, df$v1), ]
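To make the difference between the two approaches concrete, here is a minimal sketch on made-up data. The data frame contents and the id column are assumptions for illustration; only v1 and x come from the answer above. It also assumes the values in x contain no regex metacharacters.

```r
# Hypothetical sample data; only the names v1 and x come from the thread.
df <- data.frame(
  id = 1:5,
  v1 = c("abc", "xx abc yy", "bbc", "zzz", "bcc-1"),
  stringsAsFactors = FALSE
)
x <- c("abc", "bbc", "bcc")

# Exact match: keeps rows where v1 equals an element of x.
exact <- df[df$v1 %in% x, ]

# Substring match: the same alternation without the ^ and $ anchors.
regex <- paste(x, collapse = "|")
partial <- df[grepl(regex, df$v1), ]

# If x may contain characters such as "." or "(", match each value
# literally with fixed = TRUE and combine the results with OR.
hits <- Reduce(`|`, lapply(x, function(p) grepl(p, df$v1, fixed = TRUE)))
literal <- df[hits, ]
```

Here exact keeps the rows with id 1 and 3, while partial and literal also keep the rows where a value of x appears inside a longer string.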
CodePudding user response:
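The stringi package has good functions for extracting string pattern matches:
newdat <- stringi::stri_extract_all(str, regex = pattern)
https://rdrr.io/cran/stringi/man/stri_extract.html
You can even pass the function a vector of strings as your pattern. Note that stringi then pairs the patterns with the elements of str element-wise (with recycling) rather than treating them as "any of these".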
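For filtering rows rather than extracting the matched text, stringi's detect functions may be a closer fit. The following is only a sketch under assumptions: the sample data is invented, stringi must be installed, and the values in x are taken to be free of regex metacharacters.

```r
library(stringi)  # assumes the stringi package is installed

# Hypothetical sample data for illustration.
df <- data.frame(v1 = c("abc", "xx abc yy", "zzz"), stringsAsFactors = FALSE)
x <- c("abc", "bbc", "bcc")

# One logical per row: TRUE if any value of x occurs in v1.
keep <- stri_detect_regex(df$v1, paste(x, collapse = "|"))

# drop = FALSE keeps the result a data frame even with a single column.
df_out <- df[keep, , drop = FALSE]
```

Collapsing x into one alternation avoids the element-wise recycling that would happen if x were passed directly as the pattern.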