Extract content from regex python

Time:10-28

I'm trying to write a regex (using re in Python) that is generic enough to extract the substring "cam21090005" from any of the following strings:

"nºcam21090005", "cam21090005", or "n°cam21090005" (note that "º" != "°").

Any idea how I could do that? Maybe something in the spirit of r"(\w\W)*(?P<x>\w*)" ...

Also, the input string does not necessarily start with the ID; there can be words before and after it. Ideally the regex would find the ID inside a string such as "word1 word2 n°cam21090005 word3 word4".

CodePudding user response:

Try this:

import re

ss = '"nºcam21090005" "cam21090005" "n°cam21090005" "cam21090005"'
# Optional "n" plus any one character, then capture word characters ending in 8 digits
pattern = r'(?:n.)?(\w+\d{8})'
print(re.findall(pattern, ss))

Bear in mind that this assumes the numeric part of the ID is always 8 digits long (the {8} quantifier).

See here for a breakdown of the RegEx
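
As a rough sketch of how this pattern behaves on the sentence-style input from the question, the snippet below wraps it in a small helper. The name extract_ids is hypothetical, and the 8-digit assumption from the answer still applies.

```python
import re

# Pattern from the answer above: an optional "n" followed by any one character,
# then the ID itself (word characters ending in 8 digits).
ID_PATTERN = re.compile(r'(?:n.)?(\w+\d{8})')


def extract_ids(text):
    """Return every ID-like substring found in text (hypothetical helper)."""
    # findall returns the contents of the single capturing group for each match
    return ID_PATTERN.findall(text)


print(extract_ids("word1 word2 n°cam21090005 word3 word4"))  # ['cam21090005']
print(extract_ids("nºcam21090005"))                          # ['cam21090005']
```

Words such as "word1" are skipped because they do not end in 8 digits, so no extra handling of the surrounding text should be needed.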

CodePudding user response:

I think this one works: r"(\w+\s)*(?:n.)?(\s)*(?P<x>\w+)"
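
For illustration, here is one way the pattern above might be used with re.match and the named group x. The helper name extract_named is an assumption, not part of the answer. The last call shows a case worth checking: in Python 3, "º" counts as a word character for \w while "°" does not, so the two prefixes can behave differently inside a sentence.

```python
import re

# Pattern from the answer above; the ID is captured in the named group "x".
NAMED_PATTERN = re.compile(r"(\w+\s)*(?:n.)?(\s)*(?P<x>\w+)")


def extract_named(text):
    """Return what group "x" captured, or None if nothing matched (hypothetical helper)."""
    m = NAMED_PATTERN.match(text)
    return m.group("x") if m else None


print(extract_named("word1 word2 n°cam21090005 word3 word4"))  # cam21090005
print(extract_named("nºcam21090005"))                          # cam21090005

# "nºcam21090005 " is consumed as an ordinary word by (\w+\s)*,
# so the named group ends up on the last word instead of the ID.
print(extract_named("word1 word2 nºcam21090005 word3 word4"))  # word4
```

If the "º" form can appear mid-sentence, a pattern anchored on the digits, like the one in the first answer, may be the safer choice.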
