I'm trying to create a capture group that could precede or follow another capture group.
Given:
TAKE 4 MG BY MOUTH
INHALE 14 PUFFS
4 PUFFS INHALE
Wanted:
qty unit  rte
--- ----- --------
4   MG    BY MOUTH
14  PUFFS INHALE
4   PUFFS INHALE
My attempt,
(?:(?'qty'\d+)\s(?'unit'(PUFFS|MG))).*(?'rte'(BY MOUTH|INHALE))
works only when the rte group follows the qty/unit group. What is this concept called? A "look-around"?
Example: https://regex101.com/r/IRTYgU/1
CodePudding user response:
You can use
^(?=.*(?'rte'BY MOUTH|INHALE)).*\b(?'qty'\d+)\s(?'unit'PUFFS|MG)
See the regex demo.
Details:
^ - start of string
(?=.*(?'rte'BY MOUTH|INHALE)) - after any zero or more chars other than line break chars, as many as possible, there must be either BY MOUTH or INHALE (Group "rte")
.* - any zero or more chars other than line break chars, as many as possible
\b - a word boundary (to match the digits as a full number)
(?'qty'\d+) - Group "qty": one or more digits
\s - a whitespace
(?'unit'PUFFS|MG) - Group "unit": PUFFS or MG
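To try the lookahead approach outside regex101, here is a minimal Python sketch. It assumes Python's re module, which spells named groups as (?P<name>...) rather than (?'name'...); the helper name parse_sig is made up for illustration and each input line is matched on its own.

```python
import re

# Same pattern as above, with named groups rewritten in Python's
# (?P<name>...) syntax. The lookahead captures the route without
# consuming any text, so qty/unit can appear before or after it.
PATTERN = re.compile(
    r"^(?=.*(?P<rte>BY MOUTH|INHALE))"
    r".*\b(?P<qty>\d+)\s(?P<unit>PUFFS|MG)"
)


def parse_sig(line):
    """Return (qty, unit, rte) for one line, or None if it does not match."""
    m = PATTERN.match(line)
    if m is None:
        return None
    return m.group("qty"), m.group("unit"), m.group("rte")


for line in ["TAKE 4 MG BY MOUTH", "INHALE 14 PUFFS", "4 PUFFS INHALE"]:
    print(parse_sig(line))
# ('4', 'MG', 'BY MOUTH')
# ('14', 'PUFFS', 'INHALE')
# ('4', 'PUFFS', 'INHALE')
```

The \b matters for the second line: without it, the greedy .* would leave only the last digit for the qty group, giving 4 instead of 14.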
CodePudding user response:
You may use this regex with a lookahead that contains a capture group:
^(?=.*\b(?'rte'BY MOUTH|INHALE))(?:\w+\s+)?(?'qty'\d+)\s+(?'unit'PUFFS|MG)
Breakdown:
^: Start
(?=.*\b(?'rte'BY MOUTH|INHALE)): Lookahead to make sure that the line contains BY MOUTH or INHALE somewhere after the start; this is also captured in capture group rte.
(?:\w+\s+)?: Optionally match a word followed by 1+ whitespaces
(?'qty'\d+): Capture group qty to match 1+ digits
\s+: match 1+ whitespaces
(?'unit'PUFFS|MG): Capture group unit to match PUFFS or MG
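A rough Python sketch of this stricter variant, again assuming the re module's (?P<name>...) group syntax. The name parse_anchored and the extra input "PLEASE TAKE 4 MG BY MOUTH" are hypothetical, added only to show where this pattern is more restrictive.

```python
import re

# The optional (?:\w+\s+)? allows at most one word before the quantity,
# so the quantity must be the first or second token on the line.
ANCHORED = re.compile(
    r"^(?=.*\b(?P<rte>BY MOUTH|INHALE))"
    r"(?:\w+\s+)?(?P<qty>\d+)\s+(?P<unit>PUFFS|MG)"
)


def parse_anchored(line):
    """Return a dict of the named groups, or None if the line does not match."""
    m = ANCHORED.match(line)
    return m.groupdict() if m else None


print(parse_anchored("INHALE 14 PUFFS"))
# {'rte': 'INHALE', 'qty': '14', 'unit': 'PUFFS'}

# Hypothetical input with two words before the quantity: no match here.
print(parse_anchored("PLEASE TAKE 4 MG BY MOUTH"))
# None
```

Whether that strictness is wanted depends on the data: it rejects lines with unexpected leading text instead of scanning past it with .* as the first answer does.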