I would like to know how to extract the number on each line that begins with "specificWord":
s <-"
01-06-2021
line : 0.15
Rate : 0,30 %
specificWord: 0,14
01-06-2021
line : 2
Rate : 0,30 %
specificWord: 0,20
01-06-2021
line : 1.15
Rate : 1,05 %
specificWord: 1
"
p <- "(?<=specificWord:\\s)\\d+,\\d*"
str_match_all(s, p)
CodePudding user response:
I get good results with this expression:
(?<=specificWord:\s+)\d(,\d+)?
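The main difference from your expression is the quantifier on the whitespace character inside the positive lookbehind.
In R you need to double the backslashes, of course. Note that stringr's ICU engine requires a lookbehind to have a bounded maximum length, so the quantifier there has to be bounded, e.g. {1,100}.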
Find an interactive example of the expression here if you need to tune it before heading back to your code: regexr.com/6956g
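If the lookbehind gives you trouble once you are back in R, here is a minimal sketch of one way around it in base R. This is not the expression above but a swapped-in variant: it assumes `perl = TRUE` and uses `\K` (which drops everything matched so far) instead of a lookbehind, so the whitespace can take an unbounded quantifier. The sample string is a shortened stand-in for the data in the question.

```r
# Shortened sample input modelled on the question (an assumption, not the full data).
s <- "Rate : 0,30 %\nspecificWord: 0,14\nRate : 0,30 %\nspecificWord:   0,20\nRate : 1,05 %\nspecificWord: 1\n"

# Backslashes are doubled for the R string literal.
# \\K discards "specificWord:" and the whitespace from the reported match,
# so only the number (with optional ",digits" part) is returned.
p <- "specificWord:\\s+\\K\\d+(,\\d+)?"

matches <- regmatches(s, gregexpr(p, s, perl = TRUE))[[1]]
print(matches)
```

The optional group `(,\\d+)?` is what lets the last line, which has no decimal part, match as well.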
CodePudding user response:
You can try:
trimws(c(stringr::str_match_all(s, "(?<=specificWord:)\\s*\\d+,?\\d*")[[1]]))
#[1] "0,14" "0,20" "1"
or
x <- grep("specificWord:", strsplit(s, "\\n")[[1]], value = TRUE)
regmatches(x, regexpr("\\d+(,\\d+)?", x))
#[1] "0,14" "0,20" "1"
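Both variants return character strings with a decimal comma. If you need actual numbers (an assumption on my part, the question does not say), a small sketch of the conversion could look like this; `x` here is just the result vector from above typed in by hand:

```r
# Result of the extraction step, written out literally for a self-contained example.
x <- c("0,14", "0,20", "1")

# Swap the decimal comma for a dot, then convert to numeric.
# fixed = TRUE treats "," as a literal string rather than a regex.
nums <- as.numeric(sub(",", ".", x, fixed = TRUE))
print(nums)
```

`sub()` is enough here because each value contains at most one comma.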
CodePudding user response:
You can use str_extract_all to extract the numbers that directly follow the positive lookbehind (?<=specificWord:\\s{1,100}):
library(stringr)
unlist(str_extract_all(s, "(?<=specificWord:\\s{1,100})[\\d,]+"))
[1] "0,14" "0,20" "1"
or:
str_extract_all(s, "(?<=specificWord:\\s{1,100})[\\d,]+")[[1]]
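The bounded quantifier {1,100} is there because the lookbehind needs a maximum length. If you would rather not pick an arbitrary bound, a possible alternative (a sketch, not the approach above) is to drop the lookbehind and read the number from a capture group with str_match_all. The sample string below is a shortened stand-in for the question's data.

```r
library(stringr)

# Shortened sample input modelled on the question (an assumption, not the full data).
s <- "Rate : 0,30 %\nspecificWord: 0,14\nRate : 0,30 %\nspecificWord:   0,20\nRate : 1,05 %\nspecificWord: 1\n"

# No lookbehind: match the label and any whitespace, capture only the number.
m <- str_match_all(s, "specificWord:\\s*([\\d,]+)")[[1]]

# Column 1 holds the full match, column 2 the first capture group.
nums <- m[, 2]
print(nums)
```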