Write a regular expression for matching digits

There is a string s = 'kjlj lkj3 444 2345 add56fg', and I wonder how to match only numbers of length 1 to 3. So the number '2345' should not be returned, and the result should be ['3', '444', '56']. My first approach was the expression r'\d{1,3}', but it returns ['3', '444', '234', '5', '56']. Then I came up with the idea of extracting all digit runs first and keeping only those with len <= 3: r'\d+' -> ['3', '444', '2345', '56'], then filter by len <= 3. That works, but I wonder whether it is possible to achieve this using only a regex.
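For reference, the two-step workaround described in the question could look roughly like the sketch below. It assumes the first pattern is r'\d+' (one or more digits); the variable names runs and short_runs are only illustrative.

```python
import re

s = 'kjlj lkj3 444 2345 add56fg'

# Step 1: collect every maximal run of consecutive digits.
runs = re.findall(r'\d+', s)

# Step 2: keep only the runs that are at most three digits long.
short_runs = [run for run in runs if len(run) <= 3]

print(runs)        # ['3', '444', '2345', '56']
print(short_runs)  # ['3', '444', '56']
```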

CodePudding user response：

r'\d{1,3}' doesn't work because nothing in it ensures that there is no digit immediately before or after the match, so it happily matches '234' and then '5' inside '2345'.

You should use a negative lookbehind and a negative lookahead to make sure you don't just capture part of the number.

(?<!\d)\d{1,3}(?!\d)

In Python:

import re

s = 'kjlj lkj3 444 2345 add56fg'

data = re.findall(r'(?<!\d)\d{1,3}(?!\d)', s)
print(data)  # ['3', '444', '56']
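As a side check, here is a small sketch of why lookarounds are used here rather than word boundaries. With \b, digits glued to letters (as in 'lkj3' and 'add56fg') are not matched, because there is no word boundary between a letter and a digit. The variable names are only illustrative.

```python
import re

s = 'kjlj lkj3 444 2345 add56fg'

# \b needs a word/non-word transition, so '3' and '56' are missed:
# they are attached to letters on at least one side.
with_boundaries = re.findall(r'\b\d{1,3}\b', s)

# The lookarounds only forbid neighbouring digits, so letters are fine.
with_lookarounds = re.findall(r'(?<!\d)\d{1,3}(?!\d)', s)

print(with_boundaries)   # ['444']
print(with_lookarounds)  # ['3', '444', '56']
```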

CodePudding user response：

Well, this looks like a use case for lookarounds.

You need to match \d{1,3} that is neither preceded nor followed by another digit. The former condition is expressed with a negative lookbehind and the latter with a negative lookahead.

Both are zero-width assertions: they don't add any characters to the match itself, but they do affect where a match is allowed to start and end.

(?<!\d)\d{1,3}(?!\d)
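To see the zero-width behaviour in practice, a minimal sketch using re.finditer can print the span of each match. The spans cover only the digits, which suggests the lookarounds consume nothing. The names pattern and matches are only illustrative.

```python
import re

s = 'kjlj lkj3 444 2345 add56fg'
pattern = re.compile(r'(?<!\d)\d{1,3}(?!\d)')

# Pair each matched text with its (start, end) position in s.
matches = [(m.group(), m.span()) for m in pattern.finditer(s)]

for text, (start, end) in matches:
    # The span length equals the number of digits matched:
    # the lookbehind and lookahead add no characters to the match.
    print(text, (start, end), end - start == len(text))

# The assertions also succeed at the edges of a string, where there is
# no neighbouring character at all.
print(pattern.findall('123'))   # ['123']
print(pattern.findall('1234'))  # []
```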