Regular expression that stops at first letter encountered

Time:05-19

I want my regex to match numbers that are 2 to 10 digits long, but to stop matching as soon as it encounters a letter.

So far I've come up with (\d{2,10})(?![a-zA-Z]), but it continues to match even after letters are encountered. The text I've been testing the regex on is 2216101225 /ROC/PL FCT DIN 24.03.2022 PL ERBICIDE, and the regex also matches 24, 03 and 2022. This is tested and intended for C#.

Can you help? Thanks.

CodePudding user response:

Another option is to anchor the pattern and to match any character except chars a-zA-Z or a newline, and then capture the 2-10 digits in a capture group.

Then get the capture group 1 value from the match.

^[^A-Za-z\r\n]*\b([0-9]{2,10})\b

Explanation

  • ^ Start of string
  • [^A-Za-z\r\n]* Optionally match chars other than a-zA-Z or a newline
  • \b([0-9]{2,10})\b Capture 2-10 digits between word boundaries in group 1

See a regex demo.
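
To show how the group 1 value can be read in C#, here is a minimal sketch. It assumes C# 9 or later (top-level statements), and the helper name NumberBeforeFirstLetter is made up for illustration; only the pattern itself comes from the answer above.

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from the answer above; group 1 holds the captured digits.
var pattern = new Regex(@"^[^A-Za-z\r\n]*\b([0-9]{2,10})\b");

// Hypothetical helper: returns the group 1 value, or an empty string when nothing matches.
string NumberBeforeFirstLetter(string input)
{
    Match m = pattern.Match(input);
    return m.Success ? m.Groups[1].Value : string.Empty;
}

string text = "2216101225 /ROC/PL FCT DIN 24.03.2022 PL ERBICIDE";
Console.WriteLine(NumberBeforeFirstLetter(text)); // 2216101225
```

Because the prefix [^A-Za-z\r\n]* is greedy, this pattern yields a single match: if several numbers appear before the first letter, group 1 holds the last one (for example, 34 for the input 12 34 abc 56).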


Note that in .NET, \d matches any Unicode decimal digit rather than only 0-9, so [0-9] is the safer choice when only ASCII digits should match.

CodePudding user response:

You can use either of the following .NET regexes:

(?<=^\P{L}*)(?<!\d)\d{2,10}(?!\d)
(?<=^[^a-zA-Z]*)(?<!\d)\d{2,10}(?!\d)

See the regex demo. Details:

  • (?<=^\P{L}*) - there must be no letters from the current position till the start of string ((?<=^[^a-zA-Z]*) only supports ASCII letters)
  • (?<!\d) - no digit immediately on the left is allowed.
  • \d{2,10} - two to ten digits
  • (?!\d) - no digit immediately on the right is allowed.
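
As a rough illustration of how the lookbehind version could be used to collect every number that appears before the first letter, here is a sketch in C#. It assumes C# 9 or later (top-level statements), and the helper name NumbersBeforeFirstLetter is hypothetical; the pattern is the first one from this answer.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

// First pattern from this answer: no letter may occur between the start of the string and the match.
var pattern = new Regex(@"(?<=^\P{L}*)(?<!\d)\d{2,10}(?!\d)");

// Hypothetical helper: returns all matched numbers in order of appearance.
string[] NumbersBeforeFirstLetter(string input) =>
    pattern.Matches(input).Cast<Match>().Select(m => m.Value).ToArray();

string text = "2216101225 /ROC/PL FCT DIN 24.03.2022 PL ERBICIDE";
Console.WriteLine(string.Join(", ", NumbersBeforeFirstLetter(text))); // 2216101225
```

Unlike an anchored pattern with a capture group, this one can return several matches, since each match is only checked for the absence of letters to its left. The variable-length lookbehind is a .NET feature and is not supported by every regex engine.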