Regex with different results

Time:09-30

I am experimenting with a regex and need your help.

Here is my pattern and my test:

Regex : \b\d{4}[\.\-/\\ _]{0,2}\d{2,4}(/\d{2,4}){0,}

String : ... chemical tank 6211-10/20/30/40 and other equipment ...

Result : 6211-10/20/30/40

What I expect:

-6211-10
-6211-10/20
-6211-10/20/30
-6211-10/20/30/40

I found something interesting, (?:/\d{2,4}), which gives me /20, /30 and /40, but I don't know how to combine several regexes into one pattern.

The number of tags can vary, for example 6311-22/42 or 6158-47/84/85/86/87/88/89.

Thank you in advance!

CodePudding user response:

You can test your regular expressions online, for example at regex101.com.

I have saved your expression and corrected it here:
https://regex101.com/r/YX1X0f/1

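You have to escape the / in your regular expression wherever / is used as the pattern delimiter (as on regex101 by default). In flavors without delimiters, such as Python or .NET, the escape is optional: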
\b\d{4}[\.\-/\\ _]{0,2}\d{2,4}(\/\d{2,4}){0,}

This matches all your scenarios (also see link above).
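Whether the / needs escaping depends on the regex flavor. As a small check, assuming Python (the question does not name a language), the standard re module accepts both forms, and both return only the single longest match rather than the four prefixes the question asks for:

```python
import re

s = "... chemical tank 6211-10/20/30/40 and other equipment ..."

# The original pattern and the variant with an escaped slash.
unescaped = r"\b\d{4}[\.\-/\\ _]{0,2}\d{2,4}(/\d{2,4}){0,}"
escaped = r"\b\d{4}[\.\-/\\ _]{0,2}\d{2,4}(\/\d{2,4}){0,}"

# group(0) is the whole match; both patterns match the same text.
full_unescaped = re.search(unescaped, s).group(0)
full_escaped = re.search(escaped, s).group(0)
print(full_unescaped)
print(full_escaped)
```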

CodePudding user response:

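What you might do is use the regex module from PyPI (install it with pip install regex) with a lookbehind and a capture group to get the values. The standard re module cannot be used here, because it only supports fixed-width lookbehinds.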

\b(?<=(\d{4}[./\\ _-]{0,2}\d{2,4}(?:\/\d{2,4})*))
  • \b A word boundary
  • (?<= Positive lookbehind to assert what to the left is
    • ( Capture group 1 (this value will be returned by findall)
      • \d{4}[./\\ _-]{0,2}\d{2,4} Match the pattern with the digits and 0-2 occurrences of the character class
      • (?:\/\d{2,4})* Optionally repeat / and 2-4 digits
    • ) Close group 1
  • ) Close the lookbehind
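The lookbehind above is variable-width, which is why a third-party module is suggested. A minimal sketch of what happens with the standard library instead (the variable name compiled_ok is only illustrative):

```python
import re

# Same pattern as above; the lookbehind can match a varying number of characters.
pattern = r"\b(?<=(\d{4}[./\\ _-]{0,2}\d{2,4}(?:/\d{2,4})*))"

try:
    re.compile(pattern)
    compiled_ok = True
except re.error as exc:
    # The standard re module only accepts fixed-width lookbehinds.
    compiled_ok = False
    print("re rejected the pattern:", exc)
```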


For example:

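import regex as re  # third-party module: pip install regex

s = "... chemical tank 6211-10/20/30/40 and other equipment ..."
# Variable-width lookbehind with a capture group inside it.
pattern = r"\b(?<=(\d{4}[./\\ _-]{0,2}\d{2,4}(?:/\d{2,4})*))"
# findall returns capture group 1 for every position where the lookbehind succeeds.
print(re.findall(pattern, s))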

Output

['6211-10', '6211-10/20', '6211-10/20/30', '6211-10/20/30/40']
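If installing a third-party module is not an option, a different technique gives the same list for this sample: match the whole tag once with the standard re module, then build the prefixes in a loop. This is only a sketch, not the approach from the answer above; the helper name tag_prefixes is hypothetical, and edge cases may behave differently from the lookbehind version.

```python
import re

# Group 1: base tag such as "6211-10"; group 2: zero or more "/NN" suffixes.
TAG = re.compile(r"\b(\d{4}[./\\ _-]{0,2}\d{2,4})((?:/\d{2,4})*)")


def tag_prefixes(text):
    """Return each base tag followed by its cumulative "/NN" extensions."""
    results = []
    for match in TAG.finditer(text):
        current = match.group(1)
        results.append(current)
        # Append one "/NN" suffix at a time to build the longer prefixes.
        for suffix in re.findall(r"/\d{2,4}", match.group(2)):
            current += suffix
            results.append(current)
    return results


s = "... chemical tank 6211-10/20/30/40 and other equipment ..."
print(tag_prefixes(s))
```

Capturing the base and the suffixes separately avoids splitting on /, which would be ambiguous because / is also allowed as the separator after the first four digits.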