Capture list of targets after cue

Time:02-10

I am trying to create an expression that will retrieve a target or list of targets after a cue. I can currently get the first one, but I seem to be doing something wrong to get the repetitions. This is what I have so far:

import regex

text = "Some text cue: target, target, target and target. Other text."

expression = regex.compile(
fr"""
(?:cue)  # cue before targets. non capture (?:)
(?:.*?)  # text before the match. non capture (?:), as short as possible (?)
(target)
""",
regex.VERBOSE,
)

matches = regex.findall(
    expression,
    text,
    overlapped=True,
)

I have tried (target,\s)* but that is not working.

(Note that the real-use case will be more involved, where targets are actually a collection of strings, etc.)

The ideal output should be:

["target", "target", "target", "target"]

CodePudding user response:

You can use

import regex

text = "Some text cue: target, target, target and target. Other text."

expression = regex.compile(fr"""
cue:  # cue before targets
(?:\s*(?:(?:,|\band\b)\s*)?(?P<targets>target))+  # one or more targets, each optionally preceded by "," or "and"
""",
regex.VERBOSE,
)

match = regex.search(expression, text)
if match:
    print(match.captures("targets"))

# => ['target', 'target', 'target', 'target']

See the Python demo.

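The cue:(?:\s*(?:(?:,|\band\b)\s*)?(?P<targets>target))+ regex matches

  • cue: - the literal string cue:
  • (?:\s*(?:(?:,|\band\b)\s*)?(?P<targets>target))+ - one or more sequences of
    • \s* - zero or more whitespace characters
    • (?:(?:,|\band\b)\s*)? - an optional sequence of a comma or the whole word and, followed by zero or more whitespace characters
    • (?P<targets>target) - Group "targets" matching target

Because the group is repeated, match.captures("targets") returns every value the group matched, whereas match.group("targets") would return only the last one.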
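If the third-party regex module is not an option, a similar result can be sketched with the standard re module in two steps, since re has no captures(): first match the whole run of items after the cue, then pull the individual targets out of that run. This is only a sketch under assumptions. The helper name targets_after_cue and its parameters are hypothetical, and it assumes the targets are plain strings separated by commas or the word "and", as in the question.

```python
import re


def targets_after_cue(text, cue, targets):
    """Return every target in the list that directly follows `cue` (hypothetical helper)."""
    # Try longer alternatives first so a short target cannot cut a longer one short.
    alternatives = sorted(targets, key=len, reverse=True)
    target = "(?:" + "|".join(re.escape(t) for t in alternatives) + ")"
    # Same separator logic as the answer: an optional comma or the word "and".
    item = rf"\s*(?:(?:,|\band\b)\s*)?{target}"
    # Step 1: find the cue and capture the whole run of items after it.
    run = re.search(rf"{re.escape(cue)}((?:{item})+)", text)
    if run is None:
        return []
    # Step 2: extract the individual targets from the captured run.
    return re.findall(target, run.group(1))


text = "Some text cue: target, target, target and target. Other text."
print(targets_after_cue(text, "cue:", ["target"]))
# => ['target', 'target', 'target', 'target']
```

Passing a list such as ["apple", "apple pie", "pear"] covers the case where the targets are a collection of strings. Note that step 2 searches the separators as well, so a target that is itself a substring of "and" would need extra care.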