Regex expression to match tab

Time:12-08

I need a regular expression that matches a tab character according to the following rules:

"—>text"       does not match
"1.—>text"     does not match
"1—>text"      does not match
"A.—>text"     does not match
"text—>text"   match

That is, it shouldn't match a tab at the beginning of the text, or one that follows a list item marker ([A-Z] or [0-9]) there. Here is my expression:

(?<!^((?:\d+|[A-Z])(?:\.)?))\t(?!\1)


How can I fix it?

CodePudding user response:

You can use

(?<!^(?:(?:\d+|[A-Z])\.?)?)\t

Details:

  • (?<!^(?:(?:\d+|[A-Z])\.?)?) - a negative lookbehind that fails the match if, immediately to the left of the current location, there are
    • ^ - start of string
    • (?:(?:\d+|[A-Z])\.?)? - an optional sequence of
      • (?:\d+|[A-Z]) - one or more digits or an uppercase ASCII letter
      • \.? - an optional .
  • \t - a tab char.
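
The breakdown above can be checked against the five cases from the question. The following is a minimal sketch, assuming an engine with variable-length lookbehind support (here an ES2018+ JavaScript runtime, via TypeScript) and writing the digit part as \d+ to match the "one or more digits" description. The names `tabPattern`, `matchesTab`, and `samples` are hypothetical, introduced only for this illustration.

```typescript
// Pattern from the answer, built from a string so that it is compiled at runtime.
// Assumption: the runtime supports variable-length lookbehind (ES2018+).
const tabPattern: RegExp = new RegExp("(?<!^(?:(?:\\d+|[A-Z])\\.?)?)\\t");

// Returns true when the input contains a tab that is not at the start of the
// string and not directly after a leading list marker such as "1", "1." or "A.".
function matchesTab(input: string): boolean {
  return tabPattern.test(input);
}

// Sample inputs mirroring the question ("—>" there stands for a tab character).
const samples: string[] = ["\ttext", "1.\ttext", "1\ttext", "A.\ttext", "text\ttext"];

samples.forEach((input: string) => {
  console.log(JSON.stringify(input), matchesTab(input));
});
```

Only the last sample, where the tab follows ordinary text, should report a match. No `m` flag is used, so `^` anchors at the start of the whole string, as in the description above.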

Note that (?:\.)? is the same as \.?.

Also, a capturing group inside a negative lookbehind makes little sense: the overall match can only succeed when the lookbehind pattern fails to match, so the group holds no value by the time the backreference \1 is reached.
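
The point about capturing groups can be observed directly. This is a small sketch under the assumption of an ES2018+ JavaScript runtime; other engines treat a backreference to an unset group differently, so the second result is specific to JavaScript. The names `withGroup`, `questionPattern`, and `groupIsUnset` are hypothetical.

```typescript
// The lookbehind from the question (digits written as \d+), keeping its capturing group.
const withGroup: RegExp = new RegExp("(?<!^((?:\\d+|[A-Z])\\.?))\\t");

// When the overall match succeeds, the negative lookbehind has failed to match,
// so group 1 never holds a value.
const found = withGroup.exec("text\ttext");
const groupIsUnset: boolean = found !== null && found[1] === undefined;
console.log(groupIsUnset);

// The question's full pattern adds (?!\1). In JavaScript a backreference to an
// unset group matches the empty string, so this negative lookahead always
// rejects and the pattern finds nothing.
const questionPattern: RegExp = new RegExp("(?<!^((?:\\d+|[A-Z])(?:\\.)?))\\t(?!\\1)");
console.log(questionPattern.exec("text\ttext"));
```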
