Python re.findall issue

Time:04-29

I cannot for the life of me see where the issue is, so I'm asking SO.

The first regex below correctly does not match 'arctic' unless it is followed by one of the required words, and it does match 'arctic ice', which is what I want. However, re.findall() returns only 'arctic', not 'arctic ice' as I would have expected. The second regex behaves as I would expect. I have a dozen regular expressions and the arctic one is the only one giving me issues, but I can't see the typo/mistake.

>>> import re
>>> r = re.compile(r'\b(:?polar|arc?tic|ant[-_ ]*arc?tic)\w*[-_ ]*(?:ice|shelf|temper|warm|cool)\w*')
>>> r.findall("arctic")
[]
>>> r.findall("this will test if arctic matches here")
[]
>>> r.findall("this will test if arctic ice matches here")
['arctic']
>>> r = re.compile(r'\b(?:global|environ|weather|ocean|sea|atmos|historic|season)\w*\W*(?:warm\w*|cool\w*|temper\w*|heat\w*|hot[estr]*\b|cold[estr]*\b)')
>>> r.findall("this will test if global warming matches here")
['global warming']

CodePudding user response:

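Never mind. (:? should have been (?:. With (:? the parentheses form an ordinary capturing group that starts with an optional literal colon, and when a pattern contains one capturing group, re.findall() returns only that group's text instead of the whole match.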
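
To see the difference side by side, here is a small sketch that runs the pattern from the question with and without the typo. The variable names (text, buggy, fixed, and so on) are made up for illustration; the patterns and the test string come from the question.

```python
import re

text = "this will test if arctic ice matches here"

# Pattern as posted: "(:?" opens a *capturing* group whose first
# alternative begins with an optional literal colon.
buggy = re.compile(
    r'\b(:?polar|arc?tic|ant[-_ ]*arc?tic)\w*[-_ ]*(?:ice|shelf|temper|warm|cool)\w*'
)

# Corrected pattern: "(?:" opens a non-capturing group.
fixed = re.compile(
    r'\b(?:polar|arc?tic|ant[-_ ]*arc?tic)\w*[-_ ]*(?:ice|shelf|temper|warm|cool)\w*'
)

# With one capturing group, findall returns only that group's text.
buggy_result = buggy.findall(text)

# With no capturing groups, findall returns the whole match.
fixed_result = fixed.findall(text)

# The buggy pattern still matches the full phrase; group(0) shows it.
full_match = buggy.search(text).group(0)

print(buggy_result)  # ['arctic']
print(fixed_result)  # ['arctic ice']
print(full_match)    # arctic ice
```

So the typo'd regex was matching the right text all along; only what findall reports changed.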

CodePudding user response:

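The notation for a non-capturing group is (?:...), not (:?...). Alternatively, wrap the part you want returned in a single capturing group: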

r = re.compile(r'\b((?:polar|arc?tic|ant[-_ ]*arc?tic)\w*[-_ ]*(?:ice|shelf|temper|warm|cool))\w*')
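
One thing to watch with the outer capturing group: the trailing \w* sits outside it, so findall drops any word ending after the keyword. The sketch below compares that pattern with a fully non-capturing one on a sample string ("arctic icebergs") that is my own, not from the question. The names sample, outer_group and no_group are also illustrative.

```python
import re

sample = "we studied arctic icebergs last year"

# Pattern from this answer: one outer capturing group, trailing \w* outside it.
outer_group = re.compile(
    r'\b((?:polar|arc?tic|ant[-_ ]*arc?tic)\w*[-_ ]*(?:ice|shelf|temper|warm|cool))\w*'
)

# Same pattern with only non-capturing groups.
no_group = re.compile(
    r'\b(?:polar|arc?tic|ant[-_ ]*arc?tic)\w*[-_ ]*(?:ice|shelf|temper|warm|cool)\w*'
)

# findall returns group 1 only, so the "bergs" suffix is cut off.
outer_result = outer_group.findall(sample)

# findall returns the whole match, suffix included.
whole_result = no_group.findall(sample)

print(outer_result)  # ['arctic ice']
print(whole_result)  # ['arctic icebergs']
```

Which one is preferable depends on whether you want the keyword stem or the full matched word.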