I would like to write a regex that fulfils the following requirements:
Test case | Test string | Is Valid |
---|---|---|
1 | board add 0/1 aaa | True |
2 | board add 0/1 xxx | False |
3 | EMPTY_STRING | True |
4 | (whitespace only) | True |
5 | board add 0/2 aaa | True |
Then I decided to build the regex in Python by making use of the conditional syntax
(?(xxx)YES-PATTERN|NO-PATTERN)
I came up with the following:
(board add 0/1)?(?(1) (aaa|bbb))
- If (board add 0/1) exists, we check whether it is followed by aaa or bbb
- If (board add 0/1) does not exist, we let it pass
But the regex above does not work as expected: it fails on test case 2. Does anyone know how to fix it?
You can check my regex at the following URL:
https://regex101.com/r/M8UEsb/1
CodePudding user response:
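Your pattern only matches the 1 in 0/1, so the 0/2 in test case 5 is not covered.
The bigger problem is that group 1 is optional, the if/else clause only tests for group 1, and the pattern is unanchored. When (aaa|bbb) fails after board add 0/1, the engine backtracks, skips group 1, and the conditional then succeeds with an empty match. Such an empty match can occur at any position, which is why test case 2 still matches.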
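To see the problem in isolation, here is a minimal sketch that runs the pattern from the question against test case 2. The variable names are only for illustration, and it assumes re.search is used, as in an unanchored check:

```python
import re

# The pattern from the question, unchanged.
question_pattern = re.compile(r"(board add 0/1)?(?(1) (aaa|bbb))")

m = question_pattern.search("board add 0/1 xxx")

# A match object is returned even though " xxx" is not an allowed suffix:
# group 1 ends up unset and the overall match is the empty string.
print(m is not None)      # True
print(repr(m.group(0)))   # ''
print(m.group(1))         # None
```

Because the match is empty rather than absent, a truthiness check on the result reports the string as valid.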
Instead of using if/else, you could anchor the pattern and make the whole command an optional group that allows all the valid strings:
^[^\S\n]*(?:board add 0/[12] (?:aaa|bbb))?$
Explanation
- ^                   Start of string
- [^\S\n]*            Match optional spaces without newlines
- (?:                 Non-capturing group
- board add 0/[12]    Match the string ending on either 1 or 2
- (?:aaa|bbb)         Match one of the alternatives
- )?                  Close the group and make it optional
- $                   End of string
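If it helps readability, the same pattern can be written with re.VERBOSE so the explanation lives next to each part. This is only a sketch: the names PATTERN and is_valid are made up for illustration, and literal spaces are written as [ ] because verbose mode ignores unescaped whitespace outside character classes.

```python
import re

PATTERN = re.compile(
    r"""
    ^                            # start of string
    [^\S\n]*                     # optional whitespace, but no newlines
    (?:                          # non-capturing group
        board[ ]add[ ]0/[12][ ]  # the command ending in 1 or 2, then a space
        (?:aaa|bbb)              # one of the allowed suffixes
    )?                           # the whole group is optional
    $                            # end of string
    """,
    re.VERBOSE,
)


def is_valid(s):
    """Hypothetical helper: True if s satisfies the pattern above."""
    return PATTERN.match(s) is not None
```

It should behave the same as the one-line version for the five test strings.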
Example
import re

strings = ["board add 0/1 aaa", "board add 0/1 xxx", "", " ", "board add 0/2 aaa"]

for s in strings:
    m = re.match(r"^[^\S\n]*(?:board add 0/[12] (?:aaa|bbb))?$", s)
    print(f"'{s}' ==> {bool(m)}")
Output
'board add 0/1 aaa' ==> True
'board add 0/1 xxx' ==> False
'' ==> True
' ' ==> True
'board add 0/2 aaa' ==> True
CodePudding user response:
This pattern checks for aaa or bbb if the string starts with board add 0/1 or board add 0/2. If neither board add 0/1 nor board add 0/2 occurs anywhere in the string, it passes.
^(?:(?!board add 0/[12]).)*$|board add 0/[12] (?:aaa|bbb)
Regex Explanation
- ^                   Start of a string
- (?:                 Non-capturing group
- (?!                 Negative lookahead assertion - assert that the following regex does not match
- board add 0/[12]    Match board add 0/1 or board add 0/2
- )                   Close lookahead
- .                   Any character except newline
- )                   Close non-capturing group
- *                   The previous match can be matched zero or more times
- $                   End of a string
- |                   OR. If the whole previous pattern did not match then check the next
- board add 0/[12]    Match board add 0/1 or board add 0/2
- (?:                 Non-capturing group
- aaa|bbb             Match aaa or bbb
- )                   Close non-capturing group
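As a sketch, the same pattern can also be spelled out with re.VERBOSE. The name PATTERN_2 is only illustrative, and literal spaces are written as [ ] because verbose mode ignores unescaped whitespace outside character classes.

```python
import re

PATTERN_2 = re.compile(
    r"""
    ^(?:                           # from the start of the string
        (?!board[ ]add[ ]0/[12])   # the command must not begin at this position
        .                          # consume one character (not a newline)
    )*$                            # repeat until the end of the string
    |                              # OR
    board[ ]add[ ]0/[12][ ]        # the command followed by a space
    (?:aaa|bbb)                    # and one of the allowed suffixes
    """,
    re.VERBOSE,
)

# The second alternative has no "$", so re.match accepts trailing text,
# while re.fullmatch rejects it.
print(bool(PATTERN_2.match("board add 0/1 aaa zzz")))      # True
print(bool(PATTERN_2.fullmatch("board add 0/1 aaa zzz")))  # False
```

If trailing text after aaa or bbb should be rejected, re.fullmatch (or a $ on the second alternative) is one way to tighten it.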
See the demo
Python Example
import re

strings = [
    'board add 0/1 aaa',
    'board add 0/1 xxx',
    '',
    ' ',
    'board add 0/2 aaa'
]

for string in strings:
    print(bool(re.match(r'^(?:(?!board add 0/[12]).)*$|board add 0/[12] (?:aaa|bbb)', string)))
Output
True
False
True
True
True