Filter on the first, middle, and last word of a string using the grepl function

Time:07-24

I am trying to filter rows where DEV appears as the first, middle, or last word of a string using grepl, but the filter also picks up words like HEV in addition to DEV (the intended match).

Airport_ID<-c("3001","3002","3003","3004")
Airport_Name<-c("DEV Adelaide DTSUpdated","HEV Brisbane HEV Land Airport Land ADTS",
                "DEVAST Washington INC Airport DTSUpdated","DALLAS DEVASTAirport HEV INCUpdated")
dfu<-data.frame(Airport_ID,Airport_Name)

library(dplyr)  # provides %>% and filter()

Filter_Data_F <- dfu %>%
  dplyr::filter(grepl("^DEV", Airport_Name, fixed = F) |
                grepl(" \\DEV\\ ", Airport_Name, fixed = F) |
                grepl("DEV$", Airport_Name, fixed = F))

CodePudding user response:

\\D has a special meaning in regex: it matches any character that is not a digit. So the second condition, " \\DEV\\ ", matches a space, then a non-digit character (here H), then EV and a space, which is why HEV appears in the output.
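
To see this in isolation, here is a small base R sketch that runs only the second pattern against the names from the question. The variable names airport_name, middle_hits and matched_text are made up for this illustration.

```r
# Same strings as Airport_Name in the question (base R only, no dplyr needed).
airport_name <- c("DEV Adelaide DTSUpdated",
                  "HEV Brisbane HEV Land Airport Land ADTS",
                  "DEVAST Washington INC Airport DTSUpdated",
                  "DALLAS DEVASTAirport HEV INCUpdated")

# " \\DEV\\ " reads as: space, any non-digit character, "EV", space.
middle_hits <- grepl(" \\DEV\\ ", airport_name)
print(middle_hits)

# Extract the text that actually matched, for the strings that matched.
matched_text <- regmatches(airport_name, regexpr(" \\DEV\\ ", airport_name))
print(matched_text)
```

Only the second and fourth names match, and in both the matched text is " HEV ", not DEV.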

Secondly, grepl uses fixed = FALSE by default, so you can omit that argument.

Also, I don't think you need separate grepl calls joined with |. A single grepl should do it, for example to match DEV anywhere in the string:

library(dplyr)
dfu %>%  dplyr::filter(grepl('DEV', Airport_Name))

#  Airport_ID                             Airport_Name
#1       3001                  DEV Adelaide DTSUpdated
#2       3003 DEVAST Washington INC Airport DTSUpdated
#3       3004      DALLAS DEVASTAirport HEV INCUpdated

To match DEV only as a whole word, so that DEVAST does not match, use word boundaries (\\b).

dfu %>%  dplyr::filter(grepl('\\bDEV\\b', Airport_Name))

#  Airport_ID            Airport_Name
#1       3001 DEV Adelaide DTSUpdated
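
As a rough check that \\b covers the first, middle, and last positions, the sketch below compares it with an explicit space-delimited pattern, "(^| )DEV( |$)". The strings in samples are hypothetical and not from the question's data, and the space-delimited pattern is just one possible alternative, not something the original code used.

```r
# Hypothetical names covering each position; not from the question's data.
samples <- c("DEV Adelaide",       # DEV as first word
             "Perth DEV Central",  # DEV as middle word
             "Sydney DEV",         # DEV as last word
             "HEV Brisbane",       # a different word
             "DEVAST Washington",  # DEV only as a prefix
             "Darwin DEV-2")       # DEV followed by punctuation

# Word boundaries: DEV must not touch another letter, digit, or underscore.
word_boundary <- grepl("\\bDEV\\b", samples)

# Explicit alternative: DEV must sit between spaces or the string ends.
space_delimited <- grepl("(^| )DEV( |$)", samples)

print(data.frame(samples, word_boundary, space_delimited))
```

The two patterns agree on every string except "Darwin DEV-2": \\b treats the hyphen as a boundary, while the space-delimited pattern does not, so pick whichever definition of "word" fits your data.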