I want my regex to stop matching numbers that are 2 to 10 digits long once it encounters a letter.
So far I've come up with this:
(\d{2,10})(?![a-zA-Z])
But it continues to match even after letters are encountered.
This is the text I've been testing the regex on:
2216101225 /ROC/PL FCT DIN 24.03.2022 PL ERBICIDE
The regex also matches 24, 03 and 2022, which I don't want.
This is tested and intended for C#.
Can you help? Thanks.
CodePudding user response:
One option is to anchor the pattern at the start of the string, match any characters other than a-zA-Z or a newline, and then capture the 2-10 digits in a capture group.
Then get the capture group 1 value from the match.
^[^A-Za-z\r\n]*\b([0-9]{2,10})\b
Explanation
^ - Start of string
[^A-Za-z\r\n]* - Optionally match chars other than a-zA-Z or a newline
\b([0-9]{2,10})\b - Capture 2-10 digits between word boundaries in group 1
See a regex demo.
Note that in .NET, \d matches any Unicode decimal digit, not only 0-9 (unless RegexOptions.ECMAScript is set), which is why the pattern uses [0-9].
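A minimal C# sketch of reading the capture group 1 value from that pattern. The class and method names (FirstNumberDemo, ExtractNumber) are hypothetical and only there for illustration; the pattern itself is the one from this answer.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FirstNumberDemo
{
    // Hypothetical helper: returns the group 1 value, or an empty string
    // when the input has no 2-10 digit number before its first letter.
    public static string ExtractNumber(string input)
    {
        Match m = Regex.Match(input, @"^[^A-Za-z\r\n]*\b([0-9]{2,10})\b");
        return m.Success ? m.Groups[1].Value : string.Empty;
    }

    public static void Main()
    {
        string text = "2216101225 /ROC/PL FCT DIN 24.03.2022 PL ERBICIDE";
        Console.WriteLine(ExtractNumber(text)); // prints 2216101225
    }
}
```

Because the pattern is anchored with ^, it yields at most one match per string. If several numbers precede the first letter, the greedy [^A-Za-z\r\n]* means group 1 holds the last of them.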
CodePudding user response:
You can use either of the following .NET regexes:
(?<=^\P{L}*)(?<!\d)\d{2,10}(?!\d)
(?<=^[^a-zA-Z]*)(?<!\d)\d{2,10}(?!\d)
See the regex demo. Details:
(?<=^\P{L}*) - there must be no letters between the start of the string and the current position (the (?<=^[^a-zA-Z]*) variant only rules out ASCII letters)
(?<!\d) - no digit immediately on the left is allowed
\d{2,10} - two to ten digits
(?!\d) - no digit immediately on the right is allowed.
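A short C# sketch of using the lookbehind version with Regex.Matches, which collects every qualifying number that appears before the first letter. The class and method names (LookbehindDemo, NumbersBeforeFirstLetter) are hypothetical. Note that this relies on .NET's support for variable-length lookbehind, which many other regex engines lack.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LookbehindDemo
{
    // Hypothetical helper: returns all 2-10 digit numbers that occur
    // before the first letter in the input.
    public static List<string> NumbersBeforeFirstLetter(string input)
    {
        var result = new List<string>();
        foreach (Match m in Regex.Matches(input, @"(?<=^\P{L}*)(?<!\d)\d{2,10}(?!\d)"))
        {
            result.Add(m.Value);
        }
        return result;
    }

    public static void Main()
    {
        string text = "2216101225 /ROC/PL FCT DIN 24.03.2022 PL ERBICIDE";
        // prints 2216101225
        Console.WriteLine(string.Join(", ", NumbersBeforeFirstLetter(text)));
    }
}
```

Unlike the anchored pattern with a capture group, each match here is the number itself, so no group lookup is needed and multiple numbers before the first letter are all returned.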