How to get the number between two characters in R

Time:09-06

I have a vector a.

I want to extract the number between PUBMED and \nREFERENCE, which in this example is 32634600.

I don't know how to code it using str_extract().

a = "234  4dfd 123PUBMED   32634600\nREFERENCE"

# expected output is 32634600

CodePudding user response:

Using a lookbehind and stringr:

library(stringr)
str_extract_all(a, "(?<=PUBMED   )[0-9]+")
[[1]]
[1] "32634600"
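
The lookbehind above hard-codes exactly three spaces after PUBMED. If the amount of whitespace can vary, a capture group may be easier to work with. This is only a sketch: it assumes the number is always followed by whitespace and then REFERENCE, and the name pubmed_id is made up for illustration.

```r
library(stringr)

a <- "234  4dfd 123PUBMED   32634600\nREFERENCE"

# str_match() returns a character matrix: column 1 holds the full match,
# column 2 holds the first capture group (the digits we want).
m <- str_match(a, "PUBMED\\s+(\\d+)\\s+REFERENCE")

# pubmed_id is a hypothetical name; it will be NA if the pattern does not match.
pubmed_id <- m[, 2]
pubmed_id
```

The result is still a character string; wrap it in as.numeric() if a number is needed.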

CodePudding user response:

We can use sub() here with a capture group:

a <- "234  4dfd 123PUBMED   32634600\nREFERENCE"
num <- sub(".*PUBMED\\s*(\\d+)\\s*\\bREFERENCE\\b.*", "\\1", a)
num

[1] "32634600"
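
One thing to watch with sub(): when the pattern does not match, it returns the input string unchanged rather than NA. If that matters, a base R alternative could look like the sketch below. The helper name extract_pubmed is hypothetical, and it assumes the same PUBMED ... REFERENCE layout as in the question.

```r
a <- "234  4dfd 123PUBMED   32634600\nREFERENCE"

# extract_pubmed() is a hypothetical helper, not part of base R.
# regexec() records the positions of the full match and of each capture group;
# regmatches() turns those positions into the matched substrings.
extract_pubmed <- function(x) {
  hits <- regmatches(x, regexec("PUBMED\\s*([0-9]+)\\s*REFERENCE", x))
  # Each element is c(full_match, group_1), or character(0) when nothing matched.
  vapply(
    hits,
    function(hit) if (length(hit) >= 2) hit[2] else NA_character_,
    character(1)
  )
}

extract_pubmed(a)
extract_pubmed("no markers here")
```

The first call gives "32634600" and the second gives NA, so a failed match is easy to detect.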