I'd like to extract the name John Doe from the following string:
str <- 'Name: | |John Doe |'
I can do:
library(stringr)
str_extract(str,'(?<=Name: \\| \\|).*(?= \\|)')
[1] "John Doe"
But that involves typing a lot of spaces, and it doesn't work well when the number of spaces is not fixed. When I try to use a quantifier (+) instead, I get an error:
str_extract(str,'(?<=Name: \\| +\\|).*(?= +\\|)')
Error in stri_extract_first_regex(string, pattern, opts_regex = opts(pattern)) :
Look-Behind pattern matches must have a bounded maximum length. (U_REGEX_LOOK_BEHIND_LIMIT, context=`(?<=Name: \| +\|).*(?= +\|)`)
The same goes for other variants:
str_extract(str,'(?<=Name: \\|\\s+\\|).*(?=\\s+\\|)')
str_extract(str,'(?<=Name: \\|\\s{1,}\\|).*(?=\\s{1,}\\|)')
Is there a solution to this?
CodePudding user response:
How about:
First we remove Name,
then we replace all non-alphanumeric characters with a space,
and finally we str_squish it.
library(stringr)
str_squish(str_replace_all(str_remove(str, "Name"), "[^[:alnum:]]", " "))
[1] "John Doe"
CodePudding user response:
Another solution using base R:
sub("Name: \\|\\s+\\|(.*\\S)\\s+\\|", "\\1", str)
# [1] "John Doe"
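As a quick check that a capture group with \\s+ copes with a variable number of spaces, here is a minimal sketch. The `inputs` vector and the second name are hypothetical test data, not part of the original question.

```r
# Hypothetical inputs with different amounts of padding around the name
inputs <- c("Name: | |John Doe |",
            "Name: |     |Jane Roe      |")

# \\s+ matches one or more whitespace characters, so the padding width is irrelevant.
# (.*\\S) captures up to the last non-whitespace character before the trailing padding.
result <- sub("Name: \\|\\s+\\|(.*\\S)\\s+\\|", "\\1", inputs)
print(result)
```

Note that sub() returns the input unchanged when the pattern does not match, so it may be worth checking for that case separately.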
CodePudding user response:
You might also use \K to keep what has been matched so far out of the regex match.
Name: \|\h+\|\K.*?(?=\h+\|)
Explanation
Name: \|
Match Name: |
\h+\|
Match 1+ horizontal whitespace characters and |
\K
Forget what is matched so far
.*?
Match as few characters as possible
(?=\h+\|)
Positive lookahead, assert 1+ horizontal whitespace characters to the right followed by |
Example
str <- 'Name: | |John Doe |'
# regmatches() and regexpr() are base R; perl = TRUE is needed for \K and \h
regmatches(str, regexpr("Name: \\|\\h+\\|\\K.*?(?=\\h+\\|)", str, perl = TRUE))
Output
[1] "John Doe"
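If the same extraction is needed for several strings, the \K pattern could be wrapped in a small helper. This is only a sketch: `extract_name` is a hypothetical function name, and the extra input strings are made up for illustration. It returns NA where the pattern does not match, because regmatches() on its own silently drops non-matching elements.

```r
# Hypothetical helper: returns the extracted name, or NA when there is no match.
extract_name <- function(x) {
  pattern <- "Name: \\|\\h+\\|\\K.*?(?=\\h+\\|)"
  m <- regexpr(pattern, x, perl = TRUE)   # -1 marks elements without a match
  out <- rep(NA_character_, length(x))
  out[m != -1] <- regmatches(x, m)        # regmatches() returns matches only
  out
}

names_found <- extract_name(c("Name: | |John Doe |",
                              "Name: |    |Jane Roe     |",
                              "no match here"))
print(names_found)
```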