Regex capture group that precedes or follows another capture group

Time:10-27

I'm trying to create a capture group that could precede or follow another capture group.

Given:

TAKE 4 MG BY MOUTH
INHALE 14 PUFFS
4 PUFFS INHALE

Wanted:

qty unit  rte
--- ----  ---
4   MG    BY MOUTH
14  PUFFS INHALE
4   PUFFS INHALE

My attempt, (?:(?'qty'\d+)\s(?'unit'(PUFFS|MG))).*(?'rte'(BY MOUTH|INHALE)), works only when the rte follows the qty/unit group. What is this concept called? A "look-around"?

Example: https://regex101.com/r/IRTYgU/1

CodePudding user response:

You can use

^(?=.*(?'rte'BY MOUTH|INHALE)).*\b(?'qty'\d+)\s(?'unit'PUFFS|MG)

See the regex demo.

Details:

  • ^ - start of string
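  • (?=.*(?'rte'BY MOUTH|INHALE)) - a positive lookahead: after any zero or more chars other than line break chars, there must be either BY MOUTH or INHALE, which is captured in Group "rte". A lookahead consumes no text, so the rest of the pattern still starts matching at the beginning of the string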
  • .* - any zero or more chars other than line break chars as many as possible
  • \b - a word boundary (to match the digits as a full number)
  • (?'qty'\d+) - Group "qty": one or more digits
  • \s - a whitespace
  • (?'unit'PUFFS|MG) - Group "unit": PUFFS or MG
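
To try this outside a regex tester, here is a minimal sketch of the same pattern ported to Python's re module. It assumes Python's (?P<name>...) syntax in place of (?'name'...); the function name parse_sig is hypothetical and only for illustration.

```python
import re

# Python port of the pattern above: (?'name'...) is written (?P<name>...).
SIG_PATTERN = re.compile(
    r"^(?=.*(?P<rte>BY MOUTH|INHALE))"        # lookahead: capture rte anywhere on the line
    r".*\b(?P<qty>\d+)\s(?P<unit>PUFFS|MG)"   # then find qty and unit
)


def parse_sig(line):
    """Return (qty, unit, rte) for one line, or None if it does not match."""
    m = SIG_PATTERN.search(line)
    if m is None:
        return None
    return (m.group("qty"), m.group("unit"), m.group("rte"))


for line in ["TAKE 4 MG BY MOUTH", "INHALE 14 PUFFS", "4 PUFFS INHALE"]:
    print(parse_sig(line))
# ('4', 'MG', 'BY MOUTH')
# ('14', 'PUFFS', 'INHALE')
# ('4', 'PUFFS', 'INHALE')
```

Because the rte group sits inside the lookahead, it is filled in no matter whether the route appears before or after the qty/unit pair. The \b is what keeps qty at 14 rather than 4 in the second line.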

CodePudding user response:

You may use this regex with a lookahead that contains a capture group:

^(?=.*\b(?'rte'BY MOUTH|INHALE))(?:\w+\s+)?(?'qty'\d+)\s+(?'unit'PUFFS|MG)

RegEx Demo

Breakdown:

  • ^: Start
  • (?=.*\b(?'rte'BY MOUTH|INHALE)): Lookahead to make sure that line contains BY MOUTH or INHALE somewhere after start and we also capture this in capture group rte.
  • (?:\w+\s+)?: Optionally match a word followed by 1+ whitespaces
  • (?'qty'\d+): Capture group qty to match 1+ digits
  • \s+: match 1+ whitespaces
  • (?'unit'PUFFS|MG): Capture group unit to match PUFFS or MG
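
As a rough check of this second pattern, the sketch below ports it to Python's re module and runs it over the three sample lines from the question. It assumes Python's (?P<name>...) group syntax; the name parse_all and the column widths are hypothetical choices for illustration.

```python
import re

# Python port of the pattern above: (?'name'...) is written (?P<name>...).
LINE_PATTERN = re.compile(
    r"^(?=.*\b(?P<rte>BY MOUTH|INHALE))"   # lookahead: capture rte anywhere on the line
    r"(?:\w+\s+)?"                         # optional single leading word, e.g. TAKE
    r"(?P<qty>\d+)\s+(?P<unit>PUFFS|MG)"   # qty and unit
)


def parse_all(text):
    """Return a list of (qty, unit, rte) tuples, one per matching line."""
    rows = []
    for line in text.splitlines():
        m = LINE_PATTERN.match(line)
        if m:
            rows.append((m.group("qty"), m.group("unit"), m.group("rte")))
    return rows


SAMPLE = """TAKE 4 MG BY MOUTH
INHALE 14 PUFFS
4 PUFFS INHALE"""

for qty, unit, rte in parse_all(SAMPLE):
    print(f"{qty:<3} {unit:<5} {rte}")
# 4   MG    BY MOUTH
# 14  PUFFS INHALE
# 4   PUFFS INHALE
```

Note that (?:\w+\s+)? allows at most one word before the quantity, so a line such as PLEASE TAKE 4 MG BY MOUTH is skipped by this pattern, whereas a pattern using .* before the quantity would still match it.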