How to extract the last three chars from a word if they are uppercase?
a = "aaaAAA"
b = "bbbbBBB"
c = "ccc CCC"
d = "dddddDDD"
e = "eeeEEEE"
My function:
import re

def get_three(value):
    search = re.search("[A-Z]{3}$", value)
    if search:
        return search.group(0)
    return "NONE"
It returns:
AAA
BBB
CCC
DDD
EEE
but should be:
AAA
BBB
CCC
DDD
NONE
because there is EEEE at the end, not EEE.
CodePudding user response:
You can use a negative lookbehind:
(?<![A-Z])[A-Z]{3}$
Details:
(?<![A-Z]) - a negative lookbehind that fails the match if there is an uppercase letter immediately to the left of the current location
[A-Z]{3} - three uppercase letters
$ - end of string.
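As a minimal sketch, here is the pattern above dropped into the function from the question. The function name and the "NONE" fallback come from the question; the `PATTERN` constant and the `samples` list are only illustrative names.

```python
import re

# Exactly three ASCII uppercase letters at the end of the string,
# not preceded by another uppercase letter.
PATTERN = re.compile(r"(?<![A-Z])[A-Z]{3}$")

def get_three(value):
    search = PATTERN.search(value)
    if search:
        return search.group(0)
    return "NONE"

samples = ["aaaAAA", "bbbbBBB", "ccc CCC", "dddddDDD", "eeeEEEE"]
for s in samples:
    print(get_three(s))
```

With the sample strings from the question this prints AAA, BBB, CCC, DDD and NONE, since the fourth E in "eeeEEEE" makes the lookbehind fail.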
If you need to support any Unicode uppercase letters:
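import sys
# Character class of every code point that str.isupper() reports as uppercase
# (sys.maxunicode + 1 so that the last code point is included in the range)
pLu = '[{}]'.format("".join([chr(i) for i in range(sys.maxunicode + 1) if chr(i).isupper()]))
pattern = fr'(?<!{pLu}){pLu}{{3}}$'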
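A possible way to use the Unicode-aware pattern is sketched below. The function name `get_three_unicode` is hypothetical, and the pattern is compiled once at import time because building the character class walks every code point, which takes a noticeable moment.

```python
import re
import sys

# Character class of every code point that str.isupper() reports as uppercase.
pLu = "[{}]".format(
    "".join(chr(i) for i in range(sys.maxunicode + 1) if chr(i).isupper())
)
# Compile once and reuse: the class is large, so rebuilding it per call is wasteful.
UNICODE_PATTERN = re.compile(fr"(?<!{pLu}){pLu}{{3}}$")

def get_three_unicode(value):  # hypothetical helper name
    search = UNICODE_PATTERN.search(value)
    if search:
        return search.group(0)
    return "NONE"

print(get_three_unicode("éééÉÉÉ"))
print(get_three_unicode("éééÉÉÉÉ"))
```

The first call should return the three trailing uppercase letters, while the second falls back to "NONE" because four uppercase letters end the string.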