I am trying some regex coding and I need your help.
Here is my code and my test:
Regex : \b\d{4}[\.\-/\\ _]{0,2}\d{2,4}(/\d{2,4}){0,}
String : ... chemical tank 6211-10/20/30/40 and other equipment ...
Result : 6211-10/20/30/40
What I expect:
-6211-10
-6211-10/20
-6211-10/20/30
-6211-10/20/30/40
I found something interesting:
(?:/\d{2,4})
which gave me /20, /30 and /40, but I don't know how to combine several regexes into one pattern.
The number of tags can change, for example 6311-22/42 or 6158-47/84/85/86/87/88/89.
Thank you in advance!
CodePudding user response:
You can test your regular expressions online, for example at regex101.com.
I have saved your expression and corrected it here:
https://regex101.com/r/YX1X0f/1
You have to escape the / in your regular expression wherever the regex flavor uses it as a delimiter (in Python's re module the escape is optional):
\b\d{4}[\.\-/\\ _]{0,2}\d{2,4}(\/\d{2,4}){0,}
This matches all your scenarios (also see link above).
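A note for Python users: the pattern above returns each tag whole (6211-10/20/30/40), not the intermediate prefixes the question asks for. As a minimal sketch using only the standard library, one way to get them is to capture the head and the /NN tail separately and then accumulate the pieces. The names TAG and expand_tags are hypothetical, chosen for this illustration.

```python
import re
from itertools import accumulate

# Group 1 (head): four digits, up to two separators, then 2-4 digits.
# Group 2 (tail): zero or more "/NN" suffixes, captured as one string.
TAG = re.compile(r"\b(\d{4}[.\-/\\ _]{0,2}\d{2,4})((?:/\d{2,4})*)")

def expand_tags(text):
    """Return every cumulative prefix of each tag found in text."""
    results = []
    for head, tail in TAG.findall(text):
        suffixes = re.findall(r"/\d{2,4}", tail)
        # accumulate concatenates step by step: head, head+s1, head+s1+s2, ...
        results.extend(accumulate([head] + suffixes))
    return results

s = "... chemical tank 6211-10/20/30/40 and other equipment ..."
print(expand_tags(s))
```

Keeping the head and tail in separate groups means a / used as the first separator (it is in the character class) is not confused with the / that starts a suffix.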
CodePudding user response:
What you might do is use the regex PyPI module (install using pip install regex) with a lookbehind and a capture group to get the values.
\b(?<=(\d{4}[./\\ _-]{0,2}\d{2,4}(?:\/\d{2,4})*))
\b : A word boundary
(?<= : Positive lookbehind to assert what to the left is
( : Capture group 1, this value will be returned by findall
\d{4}[./\\ _-]{0,2}\d{2,4} : Match the pattern with the digits and 0-2 occurrences of the character class
(?:\/\d{2,4})* : Optionally repeat / and 2-4 digits
) : Close group 1
) : Close the lookbehind
See a regex demo or a Python demo.
For example
import regex as re
s = "... chemical tank 6211-10/20/30/40 and other equipment ..."
pattern = r"\b(?<=(\d{4}[./\\ _-]{0,2}\d{2,4}(?:/\d{2,4})*))"
print(re.findall(pattern, s))
Output
['6211-10', '6211-10/20', '6211-10/20/30', '6211-10/20/30/40']
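Python's built-in re module rejects this pattern, because it only accepts fixed-width lookbehinds. If installing the regex module is not an option, a rough stdlib-only sketch of the same idea is to reverse the string, so that the variable-length lookbehind becomes a lookahead over the mirrored pattern. The names MIRRORED and prefixes_via_reverse, and the extra trailing \b, are my own assumptions and not part of the answer above.

```python
import re

# Mirror image of the lookbehind pattern: on the reversed text the "/NN"
# groups come first, then the 2-4 digits, the separators and the 4 digits.
# The final \b (an addition) requires the 4-digit part to start at a word boundary.
MIRRORED = re.compile(r"\b(?=((?:\d{2,4}/)*\d{2,4}[./\\ _-]{0,2}\d{4}\b))")

def prefixes_via_reverse(text):
    """Return tag prefixes by matching a lookahead on the reversed text."""
    hits = MIRRORED.findall(text[::-1])
    # Un-reverse each hit and restore left-to-right order.
    return [hit[::-1] for hit in reversed(hits)]

s = "... chemical tank 6211-10/20/30/40 and other equipment ..."
print(prefixes_via_reverse(s))
```

For the sample string this should print the same four prefixes as the output above. The lookahead is zero-width, so findall can report overlapping matches that all end at the same place in the reversed text, which is what the lookbehind does in the forward direction.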