I want to use regex findall to parse an HTML page. Using: (\d{4,9})$(?<!#\*), I am able to exclude items that end in # or *, but I also want to capture the numbers from items that end in other characters. Below is an example of what I am trying to achieve.
Input string:
test: 11111###
test: 222222
test: 3333333<br>
Expected output:
["222222", "3333333"]
CodePudding user response:
You can use
:\s*(\d+)(?![#*\d])
Details:
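: - a colon
\s* - zero or more whitespace characters
(\d+) - Group 1: one or more digits
(?![#*\d]) - a negative lookahead that fails the match if there is a #, a *, or a digit immediately to the right of the current location. Including \d in the lookahead stops the regex from backtracking to a shorter run of digits when the full run is followed by # or *.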
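As a minimal sketch of how this pattern might be used with re.findall in Python, here it is applied to the sample input from the question. The function name extract_numbers and the variable names are made up for illustration; only the pattern itself comes from the answer above.

```python
import re

# Pattern from the answer: a colon, optional whitespace, then a run of
# digits that is NOT followed by '#', '*', or another digit.
PATTERN = re.compile(r":\s*(\d+)(?![#*\d])")


def extract_numbers(text):
    """Return the captured digit runs (Group 1) for every match in text.

    extract_numbers is a hypothetical helper name used for this example.
    """
    # With exactly one capturing group, findall returns a list of the
    # group's contents rather than the full matches.
    return PATTERN.findall(text)


sample = "test: 11111###\ntest: 222222\ntest: 3333333<br>"
print(extract_numbers(sample))  # ['222222', '3333333']
```

If the 4 to 9 digit limit from the original attempt still matters, replacing \d+ with \d{4,9} inside the group should keep the same lookahead behaviour, though that variant is an assumption and is not part of the answer above.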