Keep only rows containing certain string from a list of strings provided


How do I keep only the rows that contain a certain string, given a list of strings? I don't want to use grepl() and hardcode each value. Suppose I want to keep only the records that contain abc, bbc, bcc, or any of 20 more options in one of the columns, and I have x <- c("abc", "bbc", ....).

What can I do to only keep records containing values of x in the dataframe?

CodePudding user response:

You can use %in%:

df_out <- df[df$v1 %in% x, ]
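
As a minimal sketch of how this behaves, assume a toy data frame df with a character column v1 (the data below is made up for illustration). Note that %in% compares whole values, so a row is kept only when v1 is exactly equal to one of the entries in x:

```r
# Hypothetical example data; the column name v1 follows the answer above
df <- data.frame(id = 1:4,
                 v1 = c("abc", "xyz", "bcc", "abcd"),
                 stringsAsFactors = FALSE)
x <- c("abc", "bbc", "bcc")

# Keep rows whose v1 is exactly equal to one of the values in x
df_out <- df[df$v1 %in% x, ]
df_out$id  # 1 3 ("abcd" is dropped because it is not an exact match)
```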

Or, you could form a regex alternation with the values in x and then use grepl:

regex <- paste0("^(?:", paste(x, collapse="|"), ")$")
df_out <- df[grepl(regex, df$v1), ]
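
The ^ and $ anchors make that pattern match the whole value, so it selects the same rows as an exact comparison. If "containing" means the string may appear anywhere inside the column value, a sketch without the anchors might look like the following. The data is made up, and the values in x are assumed to contain no regex metacharacters:

```r
# Hypothetical example data for illustration
df <- data.frame(id = 1:4,
                 v1 = c("foo abc bar", "xyz", "bcc", "abcd"),
                 stringsAsFactors = FALSE)
x <- c("abc", "bbc", "bcc")

# Unanchored alternation: "abc|bbc|bcc" can match anywhere in the string
regex <- paste(x, collapse = "|")
df_out <- df[grepl(regex, df$v1), ]
df_out$id  # 1 3 4
```

If x could contain characters such as . or (, they would need escaping first, or each value could be checked separately with grepl(..., fixed = TRUE).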

CodePudding user response:

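The stringi package has good functions for extracting string pattern matches:

# stri_extract_all() requires a named engine argument (regex =, fixed =, coll = or charclass =);
# stri_extract_all_regex() takes the pattern positionally
newdat <- stringi::stri_extract_all_regex(str, pattern)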


You can even pass the function a character vector of strings as the pattern to match.
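
Extraction returns the matched substrings, not a row filter. For keeping rows, a detection function is probably closer to what the question asks. The sketch below assumes the stringi package is installed and uses stri_detect_regex on made-up data, with the candidate strings collapsed into a single alternation pattern:

```r
library(stringi)

# Hypothetical example data for illustration
df <- data.frame(id = 1:4,
                 v1 = c("foo abc bar", "xyz", "bcc", "abcd"),
                 stringsAsFactors = FALSE)
x <- c("abc", "bbc", "bcc")

# Collapse the candidates into one alternation pattern
pattern <- paste(x, collapse = "|")

# stri_detect_regex returns TRUE/FALSE for each element of df$v1
keep <- stri_detect_regex(df$v1, pattern)
df_out <- df[keep, ]
df_out$id  # 1 3 4
```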
