How to extract last three chars from word if they are uppercase?

Time:11-11


a = "aaaAAA"
b = "bbbbBBB"
c = "ccc CCC"
d = "dddddDDD"
e = "eeeEEEE"

My function:

import re

def get_three(value):
    # Search for three uppercase ASCII letters at the end of the string.
    search = re.search("[A-Z]{3}$", value)

    if search:
        return search.group(0)

    return "NONE"

It returns:

AAA
BBB
CCC
DDD
EEE

but should be:

AAA
BBB
CCC
DDD
NONE

because "eeeEEEE" ends with four uppercase letters (EEEE), not exactly three.

CodePudding user response:

You can use a negative lookbehind:

(?<![A-Z])[A-Z]{3}$


Details:

  • (?<![A-Z]) - a negative lookbehind that fails the match if there is an uppercase letter immediately to the left of the current location
  • [A-Z]{3} - three uppercase letters
  • $ - end of string (in Python, this also matches just before a trailing newline).
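
As a minimal sketch, the lookbehind pattern can be dropped into the function from the question. The constant name THREE_UPPER is an illustrative choice, not something from the original post:

```python
import re

# Three uppercase ASCII letters at the end of the string,
# not preceded by another uppercase letter.
THREE_UPPER = re.compile(r"(?<![A-Z])[A-Z]{3}$")

def get_three(value):
    match = THREE_UPPER.search(value)
    return match.group(0) if match else "NONE"

# Sample inputs taken from the question.
for word in ["aaaAAA", "bbbbBBB", "ccc CCC", "dddddDDD", "eeeEEEE"]:
    print(get_three(word))
# Prints AAA, BBB, CCC, DDD, NONE (one per line).
```

The last input returns NONE because the E just before the final EEE makes the lookbehind fail.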

If you need to support any Unicode uppercase letters:

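import sys

# Build a character class from every code point that str.isupper() reports as uppercase.
pLu = '[{}]'.format("".join([chr(i) for i in range(sys.maxunicode + 1) if chr(i).isupper()]))
# Same lookbehind pattern as above, with the Unicode class in place of [A-Z].
pattern = fr'(?<!{pLu}){pLu}{{3}}$'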
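
A rough usage sketch for the Unicode variant follows. It assumes the character class is built once at import time and reused. The names upper_class, UNICODE_THREE_UPPER and get_three_unicode are hypothetical, chosen only for this example:

```python
import re
import sys

# Character class of every code point that str.isupper() reports as uppercase.
# Building it scans the whole Unicode range, so do it once and reuse the result.
upper_class = "[{}]".format(
    "".join(chr(i) for i in range(sys.maxunicode + 1) if chr(i).isupper())
)
UNICODE_THREE_UPPER = re.compile(rf"(?<!{upper_class}){upper_class}{{3}}$")

def get_three_unicode(value):  # hypothetical helper name
    match = UNICODE_THREE_UPPER.search(value)
    return match.group(0) if match else "NONE"

print(get_three_unicode("\u00e9\u00e9\u00e9\u00c9\u00c9\u00c9"))        # three É at the end
print(get_three_unicode("\u0436\u0436\u0436\u0416\u0416\u0416\u0416"))  # four Ж at the end
```

The first call prints the three trailing É characters. The second prints NONE, because four uppercase Cyrillic letters precede the end of the string.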