Remove some word/string from a regular expression pattern

Time:12-08

I have a vector with the following values:

list <- c("test_data", "train_data", "random_forest_output", "xgboost_output", "light_gbm_output", "all_output", "all_output_final")

I need to select all values that contain the word "output", but of the two values "all_output" and "all_output_final" I need to keep only "all_output_final". That is, I need to get a vector like this:

new_list <- c("random_forest_output", "xgboost_output", "light_gbm_output", "all_output_final")

Is it possible to implement using a regular expression?

CodePudding user response:

First off, it's misleading to other users to call a data object list. For one thing, list is an important R function, and for another, your object "list" is not an R list. That said, it's pretty easy to do this with the logical values returned by the regex function grepl, using &! to eliminate the unwanted pattern. That pattern is the exact string "all_output", which in regex terms means anchoring the start with "^" and the end of string with "$".

list[ grepl("output", list)&!grepl("^all_output$",list)]
[1] "random_forest_output" "xgboost_output"       "light_gbm_output"     "all_output_final"  

You could read that &! (and NOT) expression as "... all of the preceding matches, but none of the following matches".
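
To make the "and NOT" reading concrete, here is a minimal sketch that builds the two logical vectors separately before combining them. The names x, has_output, is_bare_all, two_step and one_step are illustrative only and do not come from the question. The last step is an assumption about what "using a regular expression" might mean if a single pattern is wanted: it relies on a negative lookahead, which base R supports only with perl = TRUE.

```r
# Same values as in the question, stored under a name that does not mask base::list
x <- c("test_data", "train_data", "random_forest_output", "xgboost_output",
       "light_gbm_output", "all_output", "all_output_final")

# TRUE wherever the string contains "output"
has_output <- grepl("output", x)

# TRUE only for the exact string "all_output" (anchored at both ends)
is_bare_all <- grepl("^all_output$", x)

# Keep elements that match the first pattern and do not match the second
two_step <- x[has_output & !is_bare_all]

# Single-pattern variant: reject the exact string "all_output" via a
# negative lookahead at the start, then require "output" somewhere
one_step <- grep("^(?!all_output$).*output", x, value = TRUE, perl = TRUE)

print(two_step)
print(identical(two_step, one_step))
```

Both approaches should select the same four values. The two-step version is arguably easier to read and to extend with further exclusions, while the lookahead version keeps everything in one pattern.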
