How to match characters between two occurrences of the same but random string

Time:01-30

Base string looks like:

repeatedRandomStr ABCXYZ /an/arbitrary/@#-~/sequence/of_characters=I WANT TO MATCH/repeatedRandomStr/the/rest/of/strings.etc

The things I know about this base string are:

  • ABCXYZ is constant and always present.
  • repeatedRandomStr is random, but its first occurrence is always at the beginning of the string, before ABCXYZ.

So far I looked at regex context matching, recursion and subroutines but couldn't come up with a solution myself.

My current working solution is to first determine what repeatedRandomStr is with:

^(.*)\sABCXYZ

and then use:

repeatedRandomStr\sABCXYZ\s(.*)\srepeatedRandomStr

to match what I want in $1. But this requires two separate regex queries. I want to know if this can be done in a single execution.

Answer:

In Go, whose regexp package uses RE2 syntax, there is no alternative to your approach: first extract the value before ABCXYZ, then build a second regex that matches the text between the two occurrences of that value. RE2 does not support backreferences and is not going to add them.
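The two-step approach could be sketched in Go roughly as follows. This is a minimal sketch, not a definitive implementation: extractBetween is a hypothetical helper name, and the code assumes that the text should end at the first repetition of the random prefix (lazy matching). regexp.QuoteMeta is used so that a random prefix containing regex metacharacters is still matched literally.

```go
package main

import (
	"fmt"
	"regexp"
)

// prefixRe captures everything before the first whitespace + ABCXYZ (step one).
var prefixRe = regexp.MustCompile(`^(.*?)\sABCXYZ\s`)

// extractBetween is a hypothetical helper: it finds the random prefix, then
// builds a second regex from it to capture the text up to its next occurrence.
func extractBetween(s string) (string, bool) {
	m := prefixRe.FindStringSubmatch(s)
	if m == nil || m[1] == "" {
		return "", false
	}
	// QuoteMeta escapes any regex metacharacters in the random prefix.
	quoted := regexp.QuoteMeta(m[1])
	second, err := regexp.Compile(`^` + quoted + `\sABCXYZ\s(.*?)` + quoted)
	if err != nil {
		return "", false
	}
	m2 := second.FindStringSubmatch(s)
	if m2 == nil {
		return "", false
	}
	return m2[1], true
}

func main() {
	s := "repeatedRandomStr ABCXYZ /an/arbitrary/@#-~/sequence/of_characters=I WANT TO MATCH/repeatedRandomStr/the/rest/of/strings.etc"
	got, ok := extractBetween(s)
	fmt.Printf("%q %v\n", got, ok)
}
```

Note that on the sample string the captured text includes the trailing "/" that precedes the second repeatedRandomStr.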

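If the regex flavor can be switched to PCRE or a compatible one, you can use a backreference. The first pattern below matches up to the last repetition of the leading value, the second one (with a lazy Group 2) up to the first repetition:

^(.*?)\s+ABCXYZ\s(.*)\1
^(.*?)\s+ABCXYZ\s(.*?)\1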


Details:

  • ^ - start of string
  • (.*?) - Group 1: zero or more chars other than line break chars as few as possible
  • \s+ - one or more whitespace characters
  • ABCXYZ - some constant string
  • \s - a whitespace
  • (.*) - Group 2: zero or more chars other than line break chars as many as possible (with (.*?), as few as possible)
  • \1 - the same value as in Group 1.
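Since Go's regexp package cannot run a pattern with \1, the logic of the groups listed above can be approximated by hand with the strings package. This is a swapped-in technique, not the regex itself, and only a sketch under stated assumptions: matchBetween is a hypothetical helper, the separators around ABCXYZ are assumed to be single spaces, and the first " ABCXYZ " is assumed to be the intended one (a backtracking regex engine could also try later ones).

```go
package main

import (
	"fmt"
	"strings"
)

// matchBetween is a hypothetical helper that mirrors the pattern's groups
// without a regex engine. Assumption: separators are single spaces.
func matchBetween(s string, greedy bool) (string, bool) {
	// Group 1: everything before the first " ABCXYZ ".
	prefix, rest, found := strings.Cut(s, " ABCXYZ ")
	if !found || prefix == "" {
		return "", false
	}
	// Group 2 ends where Group 1's value repeats: at the last repetition
	// for the greedy (.*) form, at the first one for the lazy (.*?) form.
	var end int
	if greedy {
		end = strings.LastIndex(rest, prefix)
	} else {
		end = strings.Index(rest, prefix)
	}
	if end < 0 {
		return "", false
	}
	return rest[:end], true
}

func main() {
	s := "id1 ABCXYZ a/id1/b/id1/c"
	g, _ := matchBetween(s, true)  // greedy: "a/id1/b/"
	l, _ := matchBetween(s, false) // lazy: "a/"
	fmt.Printf("%q %q\n", g, l)
}
```

The greedy flag makes the difference between the two patterns visible: it only matters when the leading value occurs more than once after ABCXYZ.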