regex: don't match number preceded by certain character

Time:05-17

The following code extracts the first sequence of digits that appears in a string:

num = re.findall(r'^\D*(\d+)', string)

I'd like to change the regular expression so that it doesn't match numbers preceded by v or V.

Example:

string = 'foobarv2_34 423_wd'
Output: '34'

Answer:

If you need to get the first match, you need to use re.search, not re.findall.

In this case, you can use a simpler regular expression like (?<!v)\d+ with re.I:

import re
m = re.search(r'(?<!v)\d+', 'foobarv2_34 423_wd', re.I)
if m:
    print(m.group()) # => 34

See the Python demo.

Details

  • (?<!v) - a negative lookbehind that fails the match if there is a v (or V since re.I is used) immediately to the left of the current location
  • \d+ - one or more digits.
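
As a minimal sketch of the lookbehind approach, the helper below wraps the pattern in a function that returns the first qualifying digit run, or None if there is none. The names first_number, PATTERN and STRICT are hypothetical and not part of the original answer. The last two calls show a caveat: the lookbehind inspects only one character, so in a multi-digit number such as v234 the match can start at the second digit. Adding \d to the lookbehind class is one possible way to avoid that, assuming the whole number after v should be skipped.

```python
import re

# Pattern from the answer: digits not immediately preceded by v or V.
PATTERN = re.compile(r'(?<!v)\d+', re.I)

# Assumed variant: also refuse to start a match in the middle of a digit run.
STRICT = re.compile(r'(?<![v\d])\d+', re.I)


def first_number(text, pattern=PATTERN):
    """Return the first digit run accepted by `pattern`, or None."""
    m = pattern.search(text)
    return m.group() if m else None


print(first_number('foobarv2_34 423_wd'))  # 34
print(first_number('foobarV2'))            # None
print(first_number('v234 56'))             # 34 (starts inside 234)
print(first_number('v234 56', STRICT))     # 56
```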

If you cannot use re.search for some reason, you can use

^.*?(?<!v)(\d+)

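See this regex demo. Note that \D* (zero or more non-digits) is replaced with .*?, which matches zero or more characters other than line break characters, as few as possible (with re.S or re.DOTALL, it also matches line breaks). The change is needed because \D* cannot move past a digit, so it could never skip the digits that follow v, whereas .*? can skip them and stop at the first digit run that is not preceded by v.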

More details:

  • ^ - start of string
  • .*? - zero or more chars other than line break chars as few as possible
  • (?<!v) - a negative lookbehind that fails the match if there is a v (or V since re.I is used) immediately to the left of the current location
  • (\d+) - Group 1: one or more digits.
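
The re.findall variant can be sketched as follows. Because the pattern is anchored with ^ and has one capturing group, the result is a list holding at most one string (the Group 1 capture). The multi-line sample string is an illustrative assumption, included only to show the effect of re.S mentioned above.

```python
import re

pattern = r'^.*?(?<!v)(\d+)'

# Single-line input from the question.
single = re.findall(pattern, 'foobarv2_34 423_wd', re.I)
print(single)  # ['34']

# Hypothetical multi-line input: the only qualifying digits are on line 2.
text = 'abc\nv1 77'

# Without re.S, .*? cannot cross the line break, so nothing is found.
no_dotall = re.findall(pattern, text, re.I)
print(no_dotall)  # []

# With re.S, .*? also matches the line break and reaches the second line.
with_dotall = re.findall(pattern, text, re.I | re.S)
print(with_dotall)  # ['77']
```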