I'm struggling with a regex case. Engine: .NET Core 5.
Example test:
50y old guy living a 5th avenue with weight 50,5kg
I would like to split this into a "description" group and a "weight" group.
My expectation was that the following regex would capture a general description group (greedy) and that the optional group marked with the ? quantifier (also greedy) would pick up the weight:
(?<Description>[A-Za-z0-9 ×,\.-]*)((?<weight>([0-9,.]+)(?<weightUnit>kg|G)))?.*?(?<heartbeat>([0-9,.]*))bpm
But I'm missing something, as the Description group takes the complete text and doesn't seem to exclude the weight and heartbeat (or at least those parts aren't filtered out of it).
In addition, I use the ? quantifier after the weight group because some strings contain the weight and some don't, and there are still groups after it (the heartbeat, for example, which has the same issue). Making the Description group lazy (*?) solves the heartbeat problem but not the weight problem, probably because of the zero-or-one ? quantifier, which is needed because the group isn't always there.
https://regex101.com/r/NHaMXi/1
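The behaviour described above can be reproduced outside .NET. What follows is only a minimal sketch in Python, under a few assumptions: the digit class in the weight group is taken to end in a + quantifier, named groups are spelled (?P<name>...) because Python does not accept (?<name>...), and the trailing 72bpm in the sample string is an invented value added so that the bpm part has something to match.

```python
import re

# The pattern from the question, translated to Python's (?P<name>...) syntax.
# Assumption: the digit class in the weight group carries a + quantifier.
ORIGINAL = re.compile(
    r"(?P<Description>[A-Za-z0-9 ×,\.-]*)"
    r"((?P<weight>([0-9,.]+)(?P<weightUnit>kg|G)))?"
    r".*?(?P<heartbeat>([0-9,.]*))bpm"
)

# Hypothetical input: the sample sentence plus an invented heartbeat value.
SAMPLE = "50y old guy living a 5th avenue with weight 50,5kg 72bpm"

match = ORIGINAL.search(SAMPLE)
print(repr(match.group("Description")))  # everything up to and including "72"
print(match.group("weight"))             # None: the optional group matched nothing
print(repr(match.group("heartbeat")))    # '': the digits were already consumed
```

With this input the greedy character class first swallows the whole string and then gives back only as many characters as are needed for the literal bpm to match. The optional weight group and the heartbeat group are both allowed to match nothing, so the engine never has a reason to backtrack further.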
CodePudding user response:
Add a negative lookahead, (?![0-9,.]+(kg|G)|[0-9,.]*bpm), to the Description match to prevent it from consuming the weight or the bpm value:
(?<Description>((?![0-9,.]+(kg|G)|[0-9,.]*bpm)[A-Za-z0-9 ×,.-])*)((?<weight>[0-9,.]+)(?<weightUnit>kg|G))?.*?(?<heartbeat>[0-9,.]*)bpm
See live demo.
I also cleaned up some unnecessary escapes and brackets.
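As a quick check outside regex101, here is a minimal Python sketch of the pattern with the lookahead. It is an illustration and not .NET code: Python needs (?P<name>...) for named groups, the digit classes before the unit are assumed to end in +, the helper name parse is made up for this example, and the 72bpm value in both test strings is invented.

```python
import re

# Pattern with the negative lookahead, in Python's (?P<name>...) syntax.
# Assumption: the digit classes before the unit carry a + quantifier.
FIXED = re.compile(
    r"(?P<Description>((?![0-9,.]+(kg|G)|[0-9,.]*bpm)[A-Za-z0-9 ×,.-])*)"
    r"((?P<weight>[0-9,.]+)(?P<weightUnit>kg|G))?"
    r".*?(?P<heartbeat>[0-9,.]*)bpm"
)


def parse(text):
    """Hypothetical helper: return the named groups as a dict, or None."""
    m = FIXED.search(text)
    return m.groupdict() if m else None


# Hypothetical inputs: one with a weight, one without.
with_weight = parse("50y old guy living a 5th avenue with weight 50,5kg 72bpm")
without_weight = parse("50y old guy living a 5th avenue 72bpm")

print(with_weight)     # weight '50,5', weightUnit 'kg', heartbeat '72'
print(without_weight)  # weight and weightUnit are None, heartbeat '72'
```

The lookahead is checked before every single character of the description, so the description stops at the first position where a weight or a bpm value begins. It keeps the space in front of that value, so trim the result if that matters.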
Side note: the symbol for gram is g, not G, but I left that as is in case your input is expected to contain G.