I am attempting to implement a conditional statement within regex, applied via the pandas.Series.str.extractall method. Given the reading I've done here, this seems like a pretty easy problem to solve, but I am still getting stuck...
I have the following regex in the Pythex tester:
(a)(?(1)b|c)
As I understand it, (a) is my first test group. The conditional block (?(1)b|c) should attempt to match "b" if my first test group is a match, or else it will attempt to match "c". The results I am hoping for are as follows:
- "b" = No Match
- "ab" = Match
- "c" = Match
- "ac" = No Match
The (a)(?(1)b|c) statement achieves the first, second, and fourth results, but it misses the third ("c" should match)... Any tips?
Thank you!
CodePudding user response:
I am not seeing how your code matches up with the docs.
It supplies id 1, fine.
Will try to match with yes-pattern if the group with given id ... exists, ...
For example,
(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)
Notice how the example's group 1 is optional, it matches or it doesn't and then we move on for more matching.
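To see that documented behavior concretely, here is a minimal sketch using Python's re module with the pattern from the docs. The sample strings are the ones the docs mention; the variable names are just mine for illustration.

```python
import re

# Pattern from the Python re docs: group 1 (the "<") is optional,
# so the conditional (?(1)>|$) genuinely has two possible outcomes.
pattern = re.compile(r"(<)?(\w+@\w+(?:\.\w+)+)(?(1)>|$)")

samples = ["<user@host.com>", "user@host.com", "<user@host.com", "user@host.com>"]

# re.match anchors at the start of the string, as in the docs' description.
results = {s: pattern.match(s) is not None for s in samples}

for s, matched in results.items():
    print(f"{s!r}: {'match' if matched else 'no match'}")
```

The first two strings match and the last two do not, because the closing ">" is required exactly when the opening "<" was captured.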
In your (a) expression the "a" is non-optional. I would expect that group 1 always exists at the point that we're evaluating the conditional, which makes it not especially conditional.
Start with (a)? to improve matters.
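A quick way to check this is to run both variants against the four test strings from the question. This is only a sketch: it uses re.fullmatch so that the whole string has to match, which is my assumption about what "Match" means in the question, and full_matches is a helper name I made up.

```python
import re

inputs = ["b", "ab", "c", "ac"]

def full_matches(pattern, strings):
    """Return the strings that the pattern matches in their entirety."""
    compiled = re.compile(pattern)
    return [s for s in strings if compiled.fullmatch(s)]

# Group 1 is mandatory, so the conditional always takes the "b" branch.
mandatory = full_matches(r"(a)(?(1)b|c)", inputs)

# Group 1 is optional, so the "c" branch is reachable when "a" is absent.
optional = full_matches(r"(a)?(?(1)b|c)", inputs)

print(mandatory)
print(optional)
```

Note that functions which look for a match anywhere in the string, such as re.search, would still find the "c" inside "ac" with the (a)? version, because the engine can start matching at the second character. I'd expect extractall to behave the same way, so some anchoring or word boundaries are probably still needed.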
CodePudding user response:
To get the matches, you don't need a conditional.
"If a, then also match b; otherwise match c" can be written as:
\b(?:ab|c)\b
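As a rough check, the pattern can be run against the four strings from the question with plain re (the dictionary name found is just for illustration):

```python
import re

# Alternation with word boundaries: "ab" or "c" as a whole word.
pattern = re.compile(r"\b(?:ab|c)\b")

inputs = ["b", "ab", "c", "ac"]

# findall scans the whole string, similar in spirit to extractall.
found = {s: pattern.findall(s) for s in inputs}

for s, matches in found.items():
    print(f"{s!r}: {matches}")
```

Only "ab" and "c" produce a match; "ac" is rejected because there is no word boundary between "a" and "c". As far as I know, pandas' str.extractall requires at least one capture group, so there you would likely need the capturing form \b(ab|c)\b instead.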