I have a character vector a.
I want to extract the number between PUBMED and \nREFERENCE, which in this case is 32634600.
I don't know how to code this using str_extract().
a = "234 4dfd 123PUBMED 32634600\nREFERENCE"
# expected output is 32634600
CodePudding user response:
Using a lookbehind with stringr:
library(stringr)
str_extract_all(a, "(?<=PUBMED )[0-9]+")
[[1]]
[1] "32634600"
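Since the question asks for str_extract() and for the number between PUBMED and \nREFERENCE, here is a minimal sketch that also checks the right-hand boundary. It assumes exactly one space after PUBMED and a newline directly before REFERENCE, as in the sample string; the variable names pmid and pmid2 are only illustrative.

```r
library(stringr)

a <- "234 4dfd 123PUBMED 32634600\nREFERENCE"

# Lookbehind requires "PUBMED " on the left; lookahead requires "\nREFERENCE" on the right.
# str_extract() returns the first match as a plain character vector (NA if nothing matches).
pmid <- str_extract(a, "(?<=PUBMED )\\d+(?=\\nREFERENCE)")

# Alternative with a capture group: column 2 of str_match() holds the first group.
# \\s* tolerates any amount of whitespace on either side of the digits.
pmid2 <- str_match(a, "PUBMED\\s*(\\d+)\\s*REFERENCE")[, 2]

print(pmid)
print(pmid2)
```

Unlike str_extract_all(), which returns a list, both calls return a character vector, so the result can be passed straight to as.numeric() if a number is needed.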
CodePudding user response:
We can use sub() here with a capture group:
a <- "234 4dfd 123PUBMED 32634600\nREFERENCE"
num <- sub(".*PUBMED\\s*(\\d+)\\s*\\bREFERENCE\\b.*", "\\1", a)
num
[1] "32634600"
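One thing to keep in mind with sub(): when the pattern does not match, it returns the input string unchanged rather than NA. If some elements may lack a PUBMED entry, a base R sketch along these lines returns NA for them instead. The second element of a and the helper name pmid are made up for illustration.

```r
# Second element is a hypothetical string with no PUBMED entry.
a <- c("234 4dfd 123PUBMED 32634600\nREFERENCE", "no id here")

# regexec() records the positions of the full match and each capture group;
# regmatches() turns them into strings (character(0) when there is no match).
m <- regmatches(a, regexec("PUBMED\\s*(\\d+)\\s*REFERENCE", a))

# Take the first capture group, or NA when the pattern did not match.
pmid <- vapply(
  m,
  function(x) if (length(x) >= 2) x[2] else NA_character_,
  character(1)
)

print(pmid)
```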