I have the following example:
x <- "carr proc proc_ proca select procb() procth;"
pattern <- "proc"
The expected result would be
"proc" "proca" "procb" "procth"
The result could be a list or a vector.
I tried several regular expressions with stringr::str_extract_all, but could not get all the words I wanted.
CodePudding user response:
Use
pattern <- "\\bproc[[:alnum:]]*\\b"
See regex proof.
EXPLANATION
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
--------------------------------------------------------------------------------
  proc                     'proc'
--------------------------------------------------------------------------------
  [[:alnum:]]*             any character of: letters and digits (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  \b                       the boundary between a word char (\w) and
                           something that is not a word char
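For reference, here is a minimal sketch of applying this pattern to the example from the question. It uses base R with `perl = TRUE` (my choice, not something the question requires), and the variable name `hits` is only illustrative. With stringr, `str_extract_all(x, pattern)[[1]]` should give the same vector.

```r
x <- "carr proc proc_ proca select procb() procth;"
pattern <- "\\bproc[[:alnum:]]*\\b"

# gregexpr() finds every match position; regmatches() pulls out the text.
# "proc_" is skipped: "_" is a word character, so there is no \b after "proc".
hits <- regmatches(x, gregexpr(pattern, x, perl = TRUE))[[1]]
print(hits)
# [1] "proc"   "proca"  "procb"  "procth"
```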
CodePudding user response:
What about this? It splits x on runs of non-letters and then matches the pieces against pattern with agrep:
> unique(agrep(pattern, unlist(strsplit(x, "[^[:alpha:]]+")), value = TRUE))
[1] "proc" "proca" "procb" "procth"
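Note that agrep does approximate matching, so it can also pick up near misses of the pattern on other inputs. If only exact "proc..." words are wanted, a possible variant is sketched below. It assumes tokens are made of letters and underscores, and the names `tokens` and `hits` are only illustrative.

```r
x <- "carr proc proc_ proca select procb() procth;"

# Split on runs of anything that is not a letter or "_", so that "proc_"
# stays one token and can be rejected by the anchored pattern below.
tokens <- unlist(strsplit(x, "[^[:alpha:]_]+"))

# Keep only tokens that are exactly "proc" followed by zero or more letters.
hits <- unique(grep("^proc[[:alpha:]]*$", tokens, value = TRUE))
print(hits)
# [1] "proc"   "proca"  "procb"  "procth"
```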