Return value from previous row in regex

Time:12-22

I am looking to return a specific group in the previous row via regex.

Suppose I have the following data, and the goal is to extract the value 90 based on the tag that appears in the line right after it.

QTY 66:90:PCE
SCC 2
DTM 45:20200416:15
QTY 66:60:PCE
SCC 3
DTM 35:20210614:2

If I were to target the value 90, I'd have to look for the SCC 2 tag, and if I were to look for the value 60, it would be the SCC 3 tag.

I got this far in an attempt to return the value 90: (?<=^QTY\ 66:)(\d+)(.*\n.*SCC\ 2.*) but it seems convoluted, and I fail to extract only Group 1. Here is the link to regex101. I am using R for the actual application. Thanks for the help!

CodePudding user response:

You can use

(?<=:)\d+(?=[^\d\r\n]*[\r\n]+.*SCC\ 2)

See the regex demo. Details:

  • (?<=:) - a : must occur immediately to the left of the current location
  • \d+ - one or more digits
  • (?=[^\d\r\n]*[\r\n]+.*SCC\ 2) - immediately to the right, there must be:
      • [^\d\r\n]* - any zero or more chars other than digits, CR and LF
      • [\r\n]+ - one or more CR or LF chars
      • .*SCC\ 2 - any text on a line up to the rightmost occurrence of SCC 2.
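To make the breakdown above concrete, here is a minimal base R sketch that builds the same lookaround pattern for any SCC number. The helper name qty_for_scc is hypothetical, and the trailing \b is an addition of mine (so that asking for SCC 2 does not also accept SCC 20); neither comes from the original pattern.

```r
# Sample input from the question, as one multi-line string
vec <- "QTY 66:90:PCE\nSCC 2\nDTM 45:20200416:15\nQTY 66:60:PCE\nSCC 3\nDTM 35:20210614:2"

# Hypothetical helper: return the quantity on the line just before "SCC <n>"
qty_for_scc <- function(text, scc) {
  # Lookbehind for ":", the digits, then a lookahead that requires the SCC tag
  # on the next line; \\b is an added safeguard against "SCC 20" matching "SCC 2"
  pattern <- sprintf("(?<=:)\\d+(?=[^\\d\\r\\n]*[\\r\\n]+.*SCC %d\\b)",
                     as.integer(scc))
  m <- regexpr(pattern, text, perl = TRUE)
  # regexpr() returns -1 when there is no match; report that as NA
  if (m[1] == -1) NA_character_ else regmatches(text, m)
}

qty_for_scc(vec, 2)  # "90"
qty_for_scc(vec, 3)  # "60"
```

Lookbehinds need a PCRE-style engine, hence perl = TRUE in regexpr().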

In R, you can use

library(stringr)
str_extract(vec, "(?<=:)\\d+(?=[^\\d\r\n]*[\r\n]+.*SCC\\ 2)")

And a couple of base R approaches with sub:

sub(".*?\\ \\d+:(\\d+)[^\r\n]*[\r\n]+[^\r\n]*SCC\\ 2.*", "\\1", vec)
sub("(?s).*?\\ \\d+:(\\d+)(?-s).*\\R.*SCC\\ 2(?s).*", "\\1", vec, perl=TRUE)

See regex 1 demo and regex 2 demo.

See the R demo online:

vec <- "QTY 66:90:PCE\nSCC 2\nDTM 45:20200416:15\nQTY 66:60:PCE\nSCC 3\nDTM 35:20210614:2"
sub(".*?\\ \\d+:(\\d+)[^\r\n]*[\r\n]+[^\r\n]*SCC\\ 2.*", "\\1", vec)
sub("(?s).*?\\ \\d+:(\\d+)(?-s).*\\R.*SCC\\ 2(?s).*", "\\1", vec, perl=TRUE)
library(stringr)
str_extract(vec, "(?<=:)\\d+(?=[^\\d\r\n]*[\r\n]+.*SCC\\ 2)")

All yield [1] "90".
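If you would rather keep a capturing group, as in the question's attempt, base R can return Group 1 on its own through regexec() and regmatches(). This is only a sketch: the simplified pattern below is my own variant, not the one from the question, and it assumes the SCC line directly follows the QTY line.

```r
# Sample input from the question
vec <- "QTY 66:90:PCE\nSCC 2\nDTM 45:20200416:15\nQTY 66:60:PCE\nSCC 3\nDTM 35:20210614:2"

# Capture the digits after "QTY 66:", then require "SCC 2" at the start of the next line
pattern <- "QTY 66:(\\d+)[^\\r\\n]*[\\r\\n]+SCC 2"

# regexec() records the positions of the full match and of every capture group
m <- regexec(pattern, vec, perl = TRUE)
groups <- regmatches(vec, m)[[1]]

# groups[1] is the whole match, groups[2] is capture group 1
qty <- groups[2]
print(qty)  # "90"
```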
