Regex should not match if special character found anywhere in the string

Time:08-18

Please help me!

I am parsing strings that contain weights. The catch: some strings contain a range (see the third example below), which I consider an ambiguous value and do not want to match at all.

Examples:

1.0kg - should return group(1)='1.0', group(2)='kg'
400.00g - should return group(1)='400.00', group(2)='g'
100-800g - right now returns group(1)='800', group(2)='g', but should not return match!

Regex I am using right now is:

r"([\d.,]+)(g|kg)"

How can I modify it so that the third line does not return a match?

Right now I check whether the string contains '-' before applying the regex, but I would like to know how to do this with the regex pattern alone, without extra if-else statements.
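For reference, the workaround described above might look roughly like the following. This is only a sketch of the setup in the question: the function name parse_weight is hypothetical, and the pattern is assumed to use a + quantifier after the character class.

```python
import re

# Pattern from the question: a run of digits, dots, or commas, then the unit.
WEIGHT_RE = re.compile(r"([\d.,]+)(g|kg)")

def parse_weight(text):
    """Return (number, unit), or None for ranges and non-matching strings.

    Hypothetical helper, not part of the original question.
    """
    if "-" in text:  # explicit pre-check: treat any range as ambiguous
        return None
    match = WEIGHT_RE.search(text)
    return (match.group(1), match.group(2)) if match else None

for sample in ("1.0kg", "400.00g", "100-800g"):
    print(sample, "->", parse_weight(sample))
```

Without the pre-check, WEIGHT_RE.search("100-800g") finds a match starting at "800", which is the behaviour the question wants to avoid.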

CodePudding user response:

You may use the following regex pattern:

(?<!-)\b\d+(?:\.\d+)?\wg

This pattern excludes numbers that are immediately preceded by a dash, while still requiring a word boundary on the left of the matched number.

Explanation:

  • (?<!-) asserts that a hyphen does not precede the number (eliminates 100-800g)
  • \b still requires a word boundary before the number
  • \d+ matches the integer part
  • (?:\.\d+)? matches an optional decimal component
  • \w matches a single word character in front of the 'g', such as the 'k' in 'kg'
  • g matches 'g' for grams

Here is a working demo.
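As a rough illustration in Python, the sketch below tries the pattern from this answer (with + quantifiers assumed after each \d) next to a variant that adds the two capture groups the question asks for. The variant, GROUPED_RE, and the helper parse_weight_regex_only are assumptions made for this example and are not part of the answer.

```python
import re

# The answer's pattern, assuming "+" quantifiers; it has no capture groups.
ANSWER_RE = re.compile(r"(?<!-)\b\d+(?:\.\d+)?\wg")

# Hypothetical variant (not from the answer): same lookbehind and word
# boundary, but with two capture groups and an explicit kg|g alternation.
GROUPED_RE = re.compile(r"(?<!-)\b(\d+(?:\.\d+)?)(kg|g)\b")

def parse_weight_regex_only(text):
    """Return (number, unit) or None, with no separate check for '-'."""
    match = GROUPED_RE.search(text)
    return match.groups() if match else None

for sample in ("1.0kg", "400.00g", "100-800g", "5g"):
    answer_match = ANSWER_RE.search(sample)
    whole = answer_match.group(0) if answer_match else None
    print(sample, "| answer:", whole, "| grouped:", parse_weight_regex_only(sample))
```

Because \w must consume one character before the final g, the answer's pattern matches a plain gram value such as 400.00g only by backtracking so that \w takes the last digit, and it does not match a single-digit value such as 5g. The grouped variant avoids this by spelling out the units. Neither pattern handles ranges written with spaces or a different dash character.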
