Regex with word not in range

Time:10-23

I'm struggling with a regex case (engine: .NET 5).

Example test:

50y old guy living a 5th avenue with weight 50,5kg

I would like to split this into a "description" group and a "weight" group.

I expected the following regex to capture a general description group (greedy), and the optional group (the ? quantifier, also greedy) to pick up the weight:

(?<Description>[A-Za-z0-9 ×,\.-]*)((?<weight>([0-9,.]+)(?<weightUnit>kg|G)))?.*?(?<heartbeat>([0-9,.]*))bpm

But I'm missing something: the description group consumes the complete text and doesn't leave the weight and heartbeat for the later groups (or those aren't filtered out).

In addition, I put the ? quantifier after the weight group because some strings contain a weight and others don't, and there are still groups after it (the heartbeat, for example, which has the same issue). Making the description group lazy (*?) solves the heartbeat problem but not the weight issue, probably because of the zero-or-one ? quantifier, which is needed as the group isn't always present.
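To make the behaviour described above reproducible, here is a minimal C# sketch comparing the greedy and lazy variants of the description group. The class and member names (GreedyRepro, Capture) are made up for illustration, the input is the example text with an assumed "72bpm" reading appended (the pattern requires bpm to match at all), and the weight class is assumed to be followed by +.

```csharp
using System;
using System.Text.RegularExpressions;

public static class GreedyRepro
{
    // Everything after the Description group
    // (assumption: "+" follows the weight character class).
    private const string Rest =
        @"((?<weight>([0-9,.]+)(?<weightUnit>kg|G)))?.*?(?<heartbeat>([0-9,.]*))bpm";

    public const string Greedy = @"(?<Description>[A-Za-z0-9 ×,\.-]*)" + Rest;
    public const string Lazy = @"(?<Description>[A-Za-z0-9 ×,\.-]*?)" + Rest;

    // Hypothetical input: the example text plus an assumed heartbeat reading.
    public const string Input =
        "50y old guy living a 5th avenue with weight 50,5kg 72bpm";

    public static (string Description, string Weight, string Heartbeat) Capture(
        string pattern, string input)
    {
        Match m = Regex.Match(input, pattern);
        // Groups that did not take part in the match return an empty Value.
        return (m.Groups["Description"].Value,
                m.Groups["weight"].Value,
                m.Groups["heartbeat"].Value);
    }

    public static void Main()
    {
        // Greedy: Description runs through "50,5kg 72"; weight and heartbeat stay empty.
        Console.WriteLine(Capture(Greedy, Input));
        // Lazy: heartbeat is "72", but Description and weight stay empty.
        Console.WriteLine(Capture(Lazy, Input));
    }
}
```

In both variants the optional weight group is tried at a position where no weight starts, so the engine skips it and lets .*? absorb the rest.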

https://regex101.com/r/NHaMXi/1

CodePudding user response:

Add a negative lookahead (?![0-9,.]+(kg|G)|[0-9,.]*bpm) before each character of the description match to prevent it from consuming the weight or the bpm value:

(?<Description>((?![0-9,.]+(kg|G)|[0-9,.]*bpm)[A-Za-z0-9 ×,.-])*)((?<weight>[0-9,.]+)(?<weightUnit>kg|G))?.*?(?<heartbeat>[0-9,.]*)bpm
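As an illustration, here is a minimal C# sketch of how this pattern could be applied, once to a string with a weight and once to a string without. The names WeightRegexDemo and Parse are hypothetical, both inputs carry an assumed "72bpm" reading so the mandatory bpm part can match, and the weight class is assumed to be followed by +.

```csharp
using System;
using System.Text.RegularExpressions;

public static class WeightRegexDemo
{
    // Description stops at the first position where a weight or a bpm value starts
    // (assumption: "+" follows the weight character class).
    private static readonly Regex Pattern = new Regex(
        @"(?<Description>((?![0-9,.]+(kg|G)|[0-9,.]*bpm)[A-Za-z0-9 ×,.-])*)" +
        @"((?<weight>[0-9,.]+)(?<weightUnit>kg|G))?" +
        @".*?(?<heartbeat>[0-9,.]*)bpm");

    public static (string Description, string Weight, string Unit, string Heartbeat) Parse(
        string input)
    {
        Match m = Pattern.Match(input);
        if (!m.Success)
        {
            throw new FormatException("Input does not contain a bpm value.");
        }

        // Optional groups that did not participate return an empty Value.
        return (m.Groups["Description"].Value.Trim(),
                m.Groups["weight"].Value,
                m.Groups["weightUnit"].Value,
                m.Groups["heartbeat"].Value);
    }

    public static void Main()
    {
        // Weight present: weight "50,5", unit "kg", heartbeat "72".
        Console.WriteLine(Parse("50y old guy living a 5th avenue with weight 50,5kg 72bpm"));
        // Weight absent: weight and unit are empty, heartbeat "72".
        Console.WriteLine(Parse("50y old guy living a 5th avenue 72bpm"));
    }
}
```

Because the lookahead is checked before every description character, the description group cannot run past the start of a weight or bpm value, which leaves those for the later groups.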

See live demo.

I also cleaned up some unnecessary escapes and brackets.


Side note: the symbol for gram is g, not G, but I left that as is in case your input is expected to contain G.
