Does Python, like vim, allow a flag such as case-insensitivity to be applied inline to just a portion of a pattern? Here would be an example:
re.search(r'he\cllo', string)
with \c being the case-insensitive inline indicator. Or is it all or nothing in Python with the re.I flag?
CodePudding user response:
Python has an atypical way of implementing the case-insensitive inline modifier.
- It can be enabled globally, using (?i). This applies to the entire string...
  - ...except that it may be selectively disabled for a group: (?-i:...).
  - There is no "disable global" flag.
- It can be enabled selectively for a group: (?i:...).
  - Anything outside of the group is not affected when the flag is applied inside the group.
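On the re.I part of the question: as far as I can tell, passing re.I as an argument behaves like a leading (?i), so a (?-i:...) group can still switch it off for part of the pattern. A minimal sketch (the variable names are only for illustration):

```python
import re

q = "HeLLo WorLD"

# re.I passed as an argument acts like a leading (?i) ...
with_argument = re.match(r"he(?-i:LL)o\sworld", q, re.I)
with_inline = re.match(r"(?i)he(?-i:LL)o\sworld", q)

# ... and (?-i:...) still makes its group case-sensitive.
strict_group = re.match(r"he(?-i:ll)o\sworld", q, re.I)

print(with_argument is not None)  # True
print(with_inline is not None)    # True
print(strict_group is None)       # True: 'll' != 'LL' inside the group
```

So re.I is not strictly all or nothing; it only sets the default for the parts of the pattern that don't override it.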
Here are some examples.
Using the global flag:
import re
q = "HeLLo WorLD"
re.match(r"(?i)he(?-i:LL)o\swoRld", q) # this matches
re.match(r"(?i)he(?-i:ll)o\swoRld", q) # this doesn't match, since 'LL' != 'll'
re.match(r"(?i)he(?-i:Ll)o\swoRld", q) # nor does this, 'LL' != 'Ll'
This can be done multiple times. Only the characters enclosed in the groups will be treated as case-sensitive:
q="HeLLo WorLD"
re.match(r"(?i)he(?-i:LL)o\swo(?-i:r)ld", q) # this matches: 'LL' == 'LL' and 'r' == 'r'
re.match(r"(?i)he(?-i:LL)o\swo(?-i:R)ld", q) # but this doesn't, 'LL' == 'LL' but 'R' != 'r'
The global flag can be placed anywhere in the pattern, but anywhere other than the front is deprecated and yields a DeprecationWarning. As of Python 3.8 it does still work and follows the same rules; Python 3.11 and later reject it with an error.
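Since this depends on the interpreter version, a small check can report what your own Python does with a misplaced global flag. This is only a sketch; global_flag_status is a made-up helper name, not part of the re module:

```python
import re
import warnings

def global_flag_status(pattern):
    """Report how this interpreter treats `pattern`: 'ok', 'deprecated' or 'error'."""
    re.purge()  # clear the compile cache so a cached pattern can't hide the warning
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        try:
            re.compile(pattern)
        except re.error:
            return "error"
    if any(issubclass(w.category, DeprecationWarning) for w in caught):
        return "deprecated"
    return "ok"

print(global_flag_status(r"(?i)hello"))  # flag at the front
print(global_flag_status(r"he(?i)llo"))  # flag in the middle, result depends on the version
```

Either way, keeping (?i) at the very front, or using a scoped group instead, avoids the problem.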
Here is the non-global method:
q="HeLLo WorLD"
re.match(r"(?i:h)eLLo\sWorLD", q) # matches, since only enabled for 'h'
re.match(r"(?i:h)eLLo\sworld", q) # doesn't match: flag only applies to the group
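Applied to the pattern from the question, the scoped group is probably the closest thing to what was asked for. A rough sketch, where ci is a hypothetical helper of my own and not something the re module provides:

```python
import re

def ci(literal):
    """Wrap a literal string in a case-insensitive group (hypothetical helper)."""
    return "(?i:" + re.escape(literal) + ")"

pattern = "he" + ci("llo")  # builds he(?i:llo)

print(re.search(pattern, "heLLO") is not None)  # True: the 'llo' part ignores case
print(re.search(pattern, "HeLLO") is None)      # True: 'he' is still case-sensitive
```

re.escape is only there so the helper stays safe if the literal contains regex metacharacters.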
Some combinations are redundant, but the engine handles them fine:
q="HeLLo WorLD"
re.match(r"(?i:h)e(?-i:ll)o\sWorLD", q) # this fails; disabling the flag is redundant
re.match(r"(?i)(?i:h)e(?-i:LL)o\sworld", q) # this matches, but enabling the flag in the first group is redundant, since it's enabled globally
Note: tested on Python 3.8. Older versions may behave differently; the scoped (?i:...) and (?-i:...) groups were only added in Python 3.6.