I need a regular expression that matches a tab character according to the following rules (the arrow marks the tab):
"—>text" does not match
"1.—>text" does not match
"1—>text" does not match
"A.—>text" does not match
"text—>text" match
That is, it should not match a tab at the beginning of the text or directly after a list item marker ([A-Z] or [0-9], optionally followed by a dot). Here is my expression:
(?<!^((?:\d+|[A-Z])(?:\.)?))\t(?!\1)
How can I fix it?
CodePudding user response:
You can use
(?<!^(?:(?:\d+|[A-Z])\.?)?)\t
See the regex demo. Details:
- (?<!^(?:(?:\d+|[A-Z])\.?)?) - a negative lookbehind that fails the match if, immediately to the left of the current location, there are:
  - ^ - start of string
  - (?:(?:\d+|[A-Z])\.?)? - an optional sequence of:
    - (?:\d+|[A-Z]) - one or more digits or an uppercase ASCII letter
    - \.? - an optional dot
- \t - a tab char.
Note that (?:\.)? is the same as \.?.
Also, a capturing group inside a negative lookbehind makes little sense: the tab can only match when the lookbehind pattern fails, so the group never captures anything for the \1 backreference to use.
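To check the pattern against the five cases from the question, here is a minimal sketch in TypeScript. It assumes a JavaScript engine with ES2018 lookbehind support (for example a recent Node.js); not every regex flavor accepts an unbounded pattern such as \d+ inside a lookbehind. The names TAB_NOT_AFTER_MARKER, hasMatchingTab and markTabs are hypothetical helpers introduced only for this illustration.

```typescript
// Pattern from the answer, passed as a string to the RegExp constructor,
// so every backslash is doubled. Without the "m" flag, ^ matches only at
// the very start of the string.
const TAB_NOT_AFTER_MARKER = "(?<!^(?:(?:\\d+|[A-Z])\\.?)?)\\t";

// Hypothetical helper: true if the line contains a tab the pattern accepts.
function hasMatchingTab(line: string): boolean {
  return new RegExp(TAB_NOT_AFTER_MARKER).test(line);
}

// Hypothetical helper: replaces every accepted tab with a visible marker.
function markTabs(line: string): string {
  return line.replace(new RegExp(TAB_NOT_AFTER_MARKER, "g"), "<TAB>");
}

// The five cases from the question, with a real tab in place of the arrow.
const samples: string[] = ["\ttext", "1.\ttext", "1\ttext", "A.\ttext", "text\ttext"];
for (const s of samples) {
  // Only the last sample should report true.
  console.log(JSON.stringify(s), hasMatchingTab(s));
}
```

With the "g" flag the same pattern can drive a replacement: markTabs("1.\ttext\tmore") leaves the tab after the "1." marker alone and replaces only the second one.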