I'm having a hard time getting a regex to do what I need.
This is the regex that I came up with:
(^([A-Z0-9]{3}[WTL])(TB)?(?!LG))
This is what I need it to do:
Capture any 3 char/number sequence from the beginning, like ABC, A2C or XYZ
Continue to capture W, T or L
Optionally capture TB if that is the following sequence
Now if the current capture is followed by LG (after W, T, L or after TB), break the whole capture and return nothing.
The last part with LG is what I'm having problems with.
Here are some example strings that I'm working with.
The | marks the point up to which I need the capture; it is not part of the original strings.
Should capture
ABCWTB|12345
ABCLTB|12345
FGHT|12345
AAAW|12345
B2BL|12345
XYZTTB|345345
Should not capture anything (these work)
ABCLLG12345
FGHTLG12345
X2ZWLG12345
Should not capture anything (these don't work)
ABCWTBLG12345
XYZTTBLG345345
F2HLTBLG345345
My current regex works for strings that don't have the optional TB, but if that is present, it matches the first 4 chars. What do I need to do to break capturing if LG is present after the optional TB?
I tried so many things to get this working. Any help with a little explanation would be greatly appreciated.
CodePudding user response:
You need to include the optional pattern in the negative lookahead and move the lookahead before the optional pattern:
^([A-Z0-9]{3}[WTL])(?!(?:TB)?LG)(TB)?
See the regex demo.
Details:
^ - start of string
[A-Z0-9]{3} - three uppercase letters or digits
[WTL] - W / T / L
(?!(?:TB)?LG) - a negative lookahead that fails the match if there is TBLG or LG immediately to the right of the current location
(TB)? - an optional TB char sequence.
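To check the pattern against the sample strings from the question, here is a minimal sketch in Python using the built-in re module. The names PATTERN, capture, SHOULD_MATCH and SHOULD_FAIL are hypothetical and introduced only for illustration; the | marker from the question is left out of the strings, since it is not part of the real input.

```python
import re

# The lookahead sits before the optional group, so it sees both "LG" and "TBLG".
PATTERN = re.compile(r"^([A-Z0-9]{3}[WTL])(?!(?:TB)?LG)(TB)?")

def capture(text):
    """Return the matched prefix, or None when the lookahead rejects the string."""
    m = PATTERN.match(text)
    return m.group(0) if m else None

# Sample strings taken from the question (hypothetical list names).
SHOULD_MATCH = ["ABCWTB12345", "ABCLTB12345", "FGHT12345",
                "AAAW12345", "B2BL12345", "XYZTTB345345"]
SHOULD_FAIL = ["ABCLLG12345", "FGHTLG12345", "X2ZWLG12345",
               "ABCWTBLG12345", "XYZTTBLG345345", "F2HLTBLG345345"]

for s in SHOULD_MATCH + SHOULD_FAIL:
    print(s, "->", capture(s))
```

Every string in SHOULD_MATCH should print its 4- or 6-character prefix, and every string in SHOULD_FAIL should print None.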
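If your regex flavor supports possessive quantifiers, you can use ^([A-Z0-9]{3}[WTL])(TB)?+(?!LG), see this regex demo. Or, an atomic group: ^([A-Z0-9]{3}[WTL])(?>(TB)?)(?!LG). Both stop the engine from backtracking into the optional TB, so the lookahead is always checked right after TB when TB is present.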
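The difference between the original pattern and the atomic-group variant can be seen in a short Python sketch. It assumes Python 3.11 or later for the atomic group, since as far as I know the re module only supports (?>...) from that version on, so that part is guarded by a version check. The names ORIGINAL, ATOMIC and first_match are hypothetical.

```python
import re
import sys

# The question's pattern: the lookahead comes after the optional group.
ORIGINAL = r"^([A-Z0-9]{3}[WTL])(TB)?(?!LG)"
# Atomic group variant: once (TB)? has matched, it cannot be given back.
ATOMIC = r"^([A-Z0-9]{3}[WTL])(?>(TB)?)(?!LG)"

def first_match(pattern, text):
    """Return the text matched at the start of the string, or None."""
    m = re.match(pattern, text)
    return m.group(0) if m else None

# With the original pattern the engine backtracks: it drops the optional TB,
# the lookahead then sees "TB" rather than "LG", and 4 chars are matched.
print(first_match(ORIGINAL, "ABCWTBLG12345"))  # ABCW

# Assumption: atomic groups need Python 3.11+, so only run this part there.
if sys.version_info >= (3, 11):
    print(first_match(ATOMIC, "ABCWTBLG12345"))  # None
    print(first_match(ATOMIC, "ABCWTB12345"))    # ABCWTB
```

The first call reproduces the behaviour described in the question (only the first 4 chars match), which is the backtracking that the lookahead-first, possessive and atomic variants each prevent.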