I am writing a regex in Python where the string that needs to be matched is this:
entity("entity1")
I want to check whether a string is written in this format and extract the string inside the double quotes, for example entity1 here.
Note that anything can appear inside those double quotes and it should still match. For example:
entity("wow")
entity(" 1 2 3 - 5")
entity("!323d de462")
should all match.
I have tried using \entity(\"(.*?)\")\) as the regex pattern, but that doesn't work. Any ideas?
CodePudding user response:
Your attempt was very close. The correct regex is entity\("(.*?)"\). You don't need the leading \ since you don't want to escape the e, and you had an extra, erroneous ) at the end.
Extracting the value is possible using .group, as usual:
import re
foo = 'entity("entity1")'
match = re.search(r'entity\("(.*?)"\)', foo)
if match:
    print(match.group(1))
outputs
entity1
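As a minimal sketch of the same idea applied to the examples from the question: the helper name extract_entity and the constant ENTITY_RE are made up for illustration, and using re.fullmatch assumes the whole string has to be in the entity("...") format rather than merely contain it.

```python
import re

# Hypothetical constant: the pattern from this answer, compiled once.
# The raw string keeps the backslashes intact for the regex engine.
ENTITY_RE = re.compile(r'entity\("(.*?)"\)')


def extract_entity(text):
    """Return the quoted content if text is exactly entity("..."), else None."""
    # fullmatch (an assumption here) rejects strings with extra text around the call.
    match = ENTITY_RE.fullmatch(text)
    return match.group(1) if match else None


samples = ['entity("wow")', 'entity(" 1 2 3 - 5")', 'entity("!323d de462")', 'entity(wow)']
for sample in samples:
    print(repr(sample), '->', repr(extract_entity(sample)))
```

If the pattern may appear anywhere inside a longer string, swapping fullmatch for search, as in the snippet above, is probably the better fit.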
CodePudding user response:
Just to add some insight, you can use re.DEBUG to check your regex:
import re
# \" is the same as " and is fixed by black
# \e won't compile at all and will throw an exception when you try to use it in e.g. re.search
# Raw strings (r'...') keep Python from treating \( and \) as string escapes
p = r'entity("(.*?)"\)\)'
try:
    re.compile(p, re.DEBUG)
except re.error as exc:
    print(exc)  # missing ), unterminated subpattern at position 6
# Adding "\" before first bracket
p = r'entity\("(.*?)"\)\)'
re.compile(p, re.DEBUG)
(...)
49. LITERAL 0x22 ('"')
51. LITERAL 0x29 (')')
53. LITERAL 0x29 (')') # so expecting two brackets at the end, something is wrong!
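A rough way to run the same check without reading the DEBUG dump is to catch re.error and print its message. This is only a sketch: the helper name explain is hypothetical, and the exact wording of the error messages may differ between Python versions.

```python
import re


def explain(pattern):
    """Hypothetical helper: report whether a pattern compiles, and why not."""
    try:
        re.compile(pattern)
    except re.error as exc:
        return f"invalid: {exc}"
    return "compiles"


# The pattern from the question, the intermediate attempt, and the working one.
for pattern in [r'\entity(\"(.*?)\")\)', r'entity("(.*?)"\)\)', r'entity\("(.*?)"\)']:
    print(pattern, '->', explain(pattern))
```

Note that a pattern can compile and still be wrong, as with the doubled \) above, so the DEBUG output remains useful for spotting that kind of mistake.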
To extract the value, drop the extra \) and use group:
p = r'entity\("(.*?)"\)'  # extra \) removed
match = re.search(p, 'entity("entity1")')
if match:
    print(match.group(1))