I have this regex ("^[-A-Z0-9-[O]]{1,8}$") coming from a client requirement (normally it should not be changed).
But it doesn't work in Python (it works in C).
from re import search
var = "MY01C0DE"
regex = "^[-A-Z0-9-[O]]{1,8}$"
print(search(regex, var))
This prints None.
But if I change the regex to "^[-A-NP-Z0-9]{1,8}$", it works.
from re import search
var = "MY01C0DE"
regex = "^[-A-NP-Z0-9]{1,8}$"
print(search(regex, var))
So basically the -[O] part doesn't work in Python, if I understand correctly. But I have checked, and this regex works in C.
Is there any way to make this way of excluding characters (-[O]) work in Python as well?
CodePudding user response:
You can use the PyPI regex module, which supports .NET-like character class subtraction when the V1 flag is passed:
from regex import search, V1
var = "MY01C0DE"
regex = "^[-A-Z0-9-[O]]{1,8}$"
print(search(regex, var, V1))
# => <regex.Match object; span=(0, 8), match='MY01C0DE'>
See the Python demo.
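If adding a third-party dependency is not an option, one possible workaround with the standard re module is to keep the full A-Z range and reject the letter O with a negative lookahead before each character. This is only a sketch: it changes the pattern text (which the requirement may not allow), and the names PATTERN and is_valid_code are made up for illustration.

```python
import re

# Hypothetical names, not from the original post.
# (?!O) fails if the next character is "O"; otherwise one character
# from the set -, A-Z, 0-9 is consumed. The group repeats 1 to 8 times.
PATTERN = re.compile(r"^(?:(?!O)[-A-Z0-9]){1,8}$")


def is_valid_code(value):
    """Return True for 1-8 characters from -, A-Z, 0-9, excluding the letter O."""
    return PATTERN.search(value) is not None


print(is_valid_code("MY01C0DE"))  # True: the zeros are digits, not the letter O
print(is_valid_code("MYCODE"))    # False: contains the letter O
```

This should accept the same strings as "^[-A-NP-Z0-9]{1,8}$" from the question while keeping the excluded character visible as a separate piece of the pattern.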
Check the "Nested sets and set operations" section of the regex module documentation:
For example, the pattern [[a-z]--[aeiou]] is treated in the version 0 behaviour (simple sets, compatible with the re module) as:
- Set containing “[” and the letters “a” to “z”
- Literal “--”
- Set containing letters “a”, “e”, “i”, “o”, “u”
- Literal “]”
but in the version 1 behaviour (nested sets, enhanced behaviour) as:
- Set which is:
  - Set containing the letters “a” to “z”
  - but excluding:
    - Set containing the letters “a”, “e”, “i”, “o”, “u”
Version 0 behaviour: only simple sets are supported.
Version 1 behaviour: nested sets and set operations are supported.
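The simple-set reading also explains the None from the question. As a rough illustration using only the standard re module, the pattern from the question appears to be parsed as one character from the set -, A-Z, 0-9, "[" and "O" (closed by the first "]"), followed by one to eight literal "]" characters. The test strings below are made up to show this.

```python
import re

pattern = r"^[-A-Z0-9-[O]]{1,8}$"

# With simple sets, the character class ends at the first "]", so the
# pattern means: one character from the class, then 1 to 8 literal "]".
print(re.search(pattern, "MY01C0DE"))         # None: no "]" in the input
print(re.search(pattern, "O]]") is not None)  # True: "O" is in the class
print(re.search(pattern, "[]") is not None)   # True: "[" is in the class too
```

So re does not reject the pattern; it just matches something quite different from what the .NET-style subtraction intends.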