Match specific string from the first 3 characters and specific string anywhere from the 4th characte

Time:02-09

I want to match a line if its first 3 characters are YZA, YZB, or YZC and the string NUM12345 appears anywhere after those first 3 characters. Can anyone help correct the pattern below?

Regex: ^YZ[ABC]\sNUM12345

YZC/GH/A1/M,KNUM12345
YZB/M,SD-GG,K*NUM12345/A2
YZA/A1/M,SD-GG,KNUM12345/A2
YZB/M,SD-GG,K*NUM12345/A2/AA
YZA/A1/M,SD-GG,KNUM12345/A2/A2A
YZW/GH/A1/M,KNUM12345
YZR/M,SD-GG,K*NUM12345/A2
YZS/A1/M,SD-GG,KNUM12345/A2
YZT/M,SD-GG,K*NUM12345/A2/AA
YZJ/A1/M,SD-GG,KNUM12345/A2/A2A

Expected result (matched lines):
YZC/GH/A1/M,KNUM12345
YZB/M,SD-GG,K*NUM12345/A2
YZA/A1/M,SD-GG,KNUM12345/A2
YZB/M,SD-GG,K*NUM12345/A2/AA
YZA/A1/M,SD-GG,KNUM12345/A2/A2A

CodePudding user response:

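Your pattern fails because \s requires a single whitespace character right after the first three characters, with NUM12345 immediately following it, and none of your lines look like that. Use .* to allow any text in between: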

^YZ[ABC].*NUM12345.*
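
As a quick check, here is a minimal Python sketch (Python is my choice; the question does not name a language or tool) that runs this pattern against the sample lines from the question. The names `SAMPLE`, `PATTERN`, and `matching_lines` are made up for this illustration.

```python
import re

# Sample lines copied from the question.
SAMPLE = """\
YZC/GH/A1/M,KNUM12345
YZB/M,SD-GG,K*NUM12345/A2
YZA/A1/M,SD-GG,KNUM12345/A2
YZB/M,SD-GG,K*NUM12345/A2/AA
YZA/A1/M,SD-GG,KNUM12345/A2/A2A
YZW/GH/A1/M,KNUM12345
YZR/M,SD-GG,K*NUM12345/A2
YZS/A1/M,SD-GG,KNUM12345/A2
YZT/M,SD-GG,K*NUM12345/A2/AA
YZJ/A1/M,SD-GG,KNUM12345/A2/A2A
"""

# re.MULTILINE makes ^ match at the start of every line,
# not only at the start of the whole string.
PATTERN = re.compile(r"^YZ[ABC].*NUM12345.*", re.MULTILINE)


def matching_lines(text):
    """Return each line that starts with YZA/YZB/YZC and has NUM12345 later on."""
    return PATTERN.findall(text)


if __name__ == "__main__":
    for line in matching_lines(SAMPLE):
        print(line)
```

Since the pattern has no capturing groups, `findall` returns the full text of each match, which here is the whole matching line.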

If you want to remove every line that does not match the pattern above, replace all matches of the following with an empty string:

^(?!YZ[ABC].*NUM12345).*\R*

Details:

  • ^YZ[ABC].*NUM12345.* - start of a line (multiline matching must be on in engines where ^ otherwise matches only at the start of the string), YZ, then A, B or C, then any zero or more chars other than line break chars, as many as possible, then NUM12345, and then the rest of the line.
  • ^(?!YZ[ABC].*NUM12345).*\R* - start of a line, then a negative lookahead that fails the match if the line matches the pattern above, then any zero or more chars other than line break chars, as many as possible (the whole line), and then zero or more line break sequences. Note that \R is not supported by every regex flavor.
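
The removal pattern can be sketched in Python as well, with one substitution: Python's built-in `re` module does not support `\R`, so this sketch uses `(?:\r?\n)*` in its place, which covers LF and CRLF line breaks only. That substitution and the function name `drop_non_matching` are my assumptions, not part of the pattern above.

```python
import re

# Same idea as ^(?!YZ[ABC].*NUM12345).*\R*, but with (?:\r?\n)* standing in
# for \R*, because Python's re module has no \R escape.
NON_MATCHING_LINE = re.compile(
    r"^(?!YZ[ABC].*NUM12345).*(?:\r?\n)*",
    re.MULTILINE,  # let ^ match at the start of every line
)


def drop_non_matching(text):
    """Delete every line that does not start with YZA/YZB/YZC followed later by NUM12345."""
    return NON_MATCHING_LINE.sub("", text)


if __name__ == "__main__":
    sample = (
        "YZC/GH/A1/M,KNUM12345\n"
        "YZW/GH/A1/M,KNUM12345\n"
        "YZA/A1/M,SD-GG,KNUM12345/A2\n"
    )
    print(drop_non_matching(sample), end="")
```

In Python specifically, filtering `text.splitlines()` with the first pattern would do the same job without a lookahead; the substitution form is shown because it mirrors the find-and-replace approach described above.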