Match text except comma

Time:09-09

I have a character vector of symptoms. Each element may contain several symptoms separated by commas, for example:

x <- c('throat dry, muscles a bit painful', 'throat is a bit painful', 'throat pain, chest tightness', 'throatpain')

I'd like to use grepl or another regex function to return TRUE when "throat pain" or a slight variation of it appears within a single comma-separated symptom. For the example vector above, the result should be FALSE TRUE TRUE TRUE.

Thanks.

CodePudding user response:

This works for your example. Look for "throat", then "anything but comma 0 or more times", then "pain".

library(stringr)

str_detect(x, "throat[^,]{0,}pain")

[1] FALSE  TRUE  TRUE  TRUE
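Since the question asks about grepl, the same idea can be sketched in base R without stringr. The variable names `base_hits` and `case_hits` and the second test vector are made up for illustration, and `ignore.case = TRUE` is an optional extra that the question does not ask for.

```r
x <- c('throat dry, muscles a bit painful', 'throat is a bit painful',
       'throat pain, chest tightness', 'throatpain')

# "throat", then any run of non-comma characters (possibly empty), then "pain"
base_hits <- grepl("throat[^,]*pain", x)
print(base_hits)
# [1] FALSE  TRUE  TRUE  TRUE

# Hypothetical inputs with mixed case; ignore.case = TRUE makes the match case-insensitive
case_hits <- grepl("throat[^,]*pain", c("Throat pain", "THROAT dry, pain"),
                   ignore.case = TRUE)
print(case_hits)
# [1]  TRUE FALSE
```

Here `[^,]*` is equivalent to `[^,]{0,}`; both mean "zero or more characters that are not a comma".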

CodePudding user response:

Does this work? It allows only letters and whitespace between "throat" and "pain", so a comma stops the match (no lookbehind is needed).

grepl('throat[A-Za-z\\s]*pain', x, perl = TRUE)
[1] FALSE  TRUE  TRUE  TRUE
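For comparison, a pattern that really uses a lookaround could look like the sketch below. It is not taken from the answers above: it uses a negative lookahead, `(?!,)`, before each character consumed between "throat" and "pain", which requires `perl = TRUE`. The name `lookahead_hits` is made up for illustration.

```r
x <- c('throat dry, muscles a bit painful', 'throat is a bit painful',
       'throat pain, chest tightness', 'throatpain')

# (?:(?!,).)* consumes characters one at a time, refusing any that is a comma
lookahead_hits <- grepl("throat(?:(?!,).)*pain", x, perl = TRUE)
print(lookahead_hits)
# [1] FALSE  TRUE  TRUE  TRUE
```

For a single forbidden character, the negated class `[^,]*` is simpler and gives the same result; the lookahead form is mainly useful when the thing to exclude is longer than one character.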