I am trying to grab fary_trigger_post in the code below using a regex. However, I don't understand why the match always includes " at the end, which I don't expect.
Any ideas or suggestions?
import re

re.match(
    r'-instance[ "\']*(.+)[ "\']*$',
    '-instance "fary_trigger_post" '.strip(),
    flags=re.S).group(1)
# returns 'fary_trigger_post"'
Thank you.
CodePudding user response:
The (.+) is greedy and consumes every character up to the end of the input, and because the trailing [ "\']* is allowed to match nothing, the regex never has to give the closing quote back. If you modified your input to include characters after the final double quote (e.g. '-instance "fary_trigger_post" asdf'), you would find the double quote and the remaining characters in the output (e.g. fary_trigger_post" asdf). Instead of .+ you should try [^"\']+ to capture all characters except the quotes. This should return what you expect.
re.match(r'-instance[ "\']*([^"\']+)[ "\'].*$', '-instance "fary_trigger_post" '.strip(), flags=re.S).group(1)
Also note that I changed the end of the expression to .*, which matches any characters that follow the closing quote.
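To compare the two behaviours side by side, here is a minimal sketch that runs the greedy pattern from the question and the negated-class pattern against both inputs mentioned above. The helper name capture and the constant names are illustrative only, not part of the original code.

```python
import re

# Pattern from the question: greedy (.+) followed by an optional tail.
GREEDY = r'-instance[ "\']*(.+)[ "\']*$'
# Pattern from this answer: the group cannot contain quote characters.
NEGATED = r'-instance[ "\']*([^"\']+)[ "\'].*$'

def capture(pattern, text):
    """Return group 1 of the match, or None if the pattern does not match."""
    m = re.match(pattern, text.strip(), flags=re.S)
    return m.group(1) if m else None

plain = '-instance "fary_trigger_post" '
trailing = '-instance "fary_trigger_post" asdf'

print(capture(GREEDY, plain))      # fary_trigger_post"
print(capture(GREEDY, trailing))   # fary_trigger_post" asdf
print(capture(NEGATED, plain))     # fary_trigger_post
print(capture(NEGATED, trailing))  # fary_trigger_post
```

The negated class stops at the first quote it meets, so the result no longer depends on what follows the closing quote.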
CodePudding user response:
Here's what I'd use in your matching string, but it's hard to provide a better answer without knowing all your cases:
r'-instance\s+"(.+)"\s*$'
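A quick check of this pattern, as a sketch. The second input is a hypothetical case, not taken from the question, added to show why the other cases matter: (.+) is still greedy, so if another quoted value appears later in the string, the group runs up to the last quote.

```python
import re

PATTERN = r'-instance\s+"(.+)"\s*$'

# Input from the question: the explicit closing quote is kept out of the group.
m = re.match(PATTERN, '-instance "fary_trigger_post" ')
print(m.group(1))  # fary_trigger_post

# Hypothetical input with a second quoted value: the group extends to the LAST quote.
m2 = re.match(PATTERN, '-instance "a" -other "b"')
print(m2.group(1))  # a" -other "b
```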
CodePudding user response:
When you ask for group 1 (i.e. (.+)), the regex follows this match to the end of the string, because . (any character) can match one or more times and a greedy quantifier takes as many characters as it can. I would suggest using the following pattern:
'-instance[ "\']*(.+)["\'] *$'
This forces the regex to match the closing quote and any trailing spaces separately, so they are not included in group 1.
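As a rough illustration of that backtracking, the sketch below applies the pattern to the input from the question. It also tries a hypothetical unquoted input, which is an assumption of mine rather than a case from the question: because the pattern demands a quote after the group, that input does not match at all.

```python
import re

PATTERN = r'-instance[ "\']*(.+)["\'] *$'

# (.+) first runs to the end, then backtracks until a quote and spaces can follow.
quoted = re.match(PATTERN, '-instance "fary_trigger_post" ', flags=re.S)
print(quoted.group(1))  # fary_trigger_post

# Hypothetical unquoted input: no closing quote is available, so there is no match.
unquoted = re.match(PATTERN, '-instance fary_trigger_post', flags=re.S)
print(unquoted)  # None
```

If unquoted values can occur in your data, check for None before calling group(1).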