Regex issue in R when escaping regex special characters with str_extract

Time:06-09

I'm trying to extract the status (in this case the word "Active") from this pattern:

Status\nActive\nHometown\

I'm using this regex: https://regex101.com/r/xegX00/1, but I cannot get it to work in R with str_extract. The doubled escapes do seem weird, but I've tried every combination I can think of and none of them works. Any help appreciated!

mutate(status=str_extract(df, "(?<=Status\\\\n)(.*?)(?=\\\\)")) 

CodePudding user response:

You can use sub in base R:

x <- "Status\nActive\nHometown\n"
sub('.*Status\n(.*?)\n.*', '\\1', x)
#[1] "Active"
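
One caveat with the sub approach: when the pattern does not match, sub returns the input string unchanged instead of NA. The sketch below wraps the same pattern in a helper that returns NA for non-matching strings. The name extract_status and the second test string are made up for illustration and are not part of the original answer.

```r
# Hypothetical helper (not from the answer above): same pattern as the sub()
# call, but strings without a match yield NA instead of being returned as-is.
extract_status <- function(x) {
  pattern <- ".*Status\n(.*?)\n.*"
  ifelse(grepl(pattern, x), sub(pattern, "\\1", x), NA_character_)
}

x <- c("Status\nActive\nHometown\n", "no status here")
res <- extract_status(x)
print(res)
```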

If you want to use stringr, here is a suggestion with str_match, which avoids lookaround in the regex:

stringr::str_match(x, 'Status\n(.*)\n')[, 2]
#[1] "Active"
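
If you would rather keep str_extract with lookarounds, the amount of escaping depends on what the string actually contains. The sketch below covers both possibilities: a real newline character, and a literal backslash followed by the letter n. Which one applies to the data in the question is an assumption, and the variable names are made up for illustration.

```r
library(stringr)

# Case 1: the string contains real newline characters.
# The R string "\\n" reaches the regex engine as \n, which matches a newline.
x_newline <- "Status\nActive\nHometown\n"
res_newline <- str_extract(x_newline, "(?<=Status\\n).*?(?=\\n)")

# Case 2: the string contains a literal backslash followed by "n".
# A literal backslash needs \\ in the regex, i.e. "\\\\" in the R string.
x_literal <- "Status\\nActive\\nHometown\\"
res_literal <- str_extract(x_literal, "(?<=Status\\\\n).*?(?=\\\\)")

print(res_newline)
print(res_literal)
```

The quadruple-backslash pattern from the question only matches case 2. If the column holds real newlines, the double-backslash form in case 1 is the one to try. Also note that inside mutate, str_extract should receive the column that holds the text, not the whole data frame.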