pandas apply function to series that checks if the first n characters match a predefined string value

I am trying to apply a function to a pandas series that checks the first 3 as well as the first 2 characters of the values in the series.

If either matches, the matched prefix (the first 3 or the first 2 characters) needs to be replaced with '0'; the rest of the characters remain the same.

The original dtype was 'O' (object). I have tried converting it to 'string' but still can't get this to work.

Sample data looks like so:

012xxxxxxx
 27xxxxxxxx
011xxxxxxx
27xxxxxxxx
etc...

The condition I am evaluating: if the first 3 characters == ' 27', replace ' 27' with '0'; otherwise, if the first 2 characters == '27', replace '27' with '0'.

I have the following apply method but the values aren't being updated.

def normalize_number(num):
       
    if num[:3] == ' 27':
        # num.str.replace(num[:3], '0') ## First Method
        return '0' + num[4:] ## Second Method
    else:
        return num
        
    if num[:2] == '27':
        # num.str.replace(num[:2], '0') 
        return '0' + num[3:]
    else:
        return num

df['number'].apply(normalize_number)

What am I missing here?

CodePudding user response：

It looks like you should use a regex here. If the string starts with 27, optionally preceded by a space, replace that prefix with 0:

df['number2'] = df['number'].str.replace(r'^\ ?27', '0', regex=True)

output:

        number     number2
0   012xxxxxxx  012xxxxxxx
1   27xxxxxxxx   0xxxxxxxx
2   011xxxxxxx  011xxxxxxx
3   27xxxxxxxx   0xxxxxxxx
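
For reference, here is a self-contained sketch that reproduces the output above. It assumes the sample values from the question, with the leading space kept in the second entry; the DataFrame construction is only for illustration.

```python
import pandas as pd

# Sample values from the question; the second entry starts with a space.
df = pd.DataFrame(
    {'number': ['012xxxxxxx', ' 27xxxxxxxx', '011xxxxxxx', '27xxxxxxxx']}
)

# Replace a leading "27", optionally preceded by one space, with "0".
# The raw string keeps the backslash intact for the regex engine.
df['number2'] = df['number'].str.replace(r'^\ ?27', '0', regex=True)

# str.replace returns a new Series; the original column is left untouched.
print(df)
```

To overwrite the original column rather than add number2, assign the result back to df['number'].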

why your approach failed

Your approach failed because the first if/else returns in both branches, so the function exits before the second check is ever reached. Use elif instead:

def normalize_number(num):
    if num[:3] == ' 27':
        return '0' + num[3:]  # drop the 3-character prefix ' 27'
    elif num[:2] == '27':
        return '0' + num[2:]  # drop the 2-character prefix '27'
    else:
        return num
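
To make the early return concrete, the sketch below runs the structure of the question's function next to an elif version on plain strings. The names original and fixed are chosen here for illustration only, and the slices in fixed assume the goal stated in the question (replace the matched prefix and keep everything after it).

```python
def original(num):
    # Same structure as the question: the first if/else returns either way.
    if num[:3] == ' 27':
        return '0' + num[4:]
    else:
        return num  # every input without the ' 27' prefix exits here
    # Nothing below this point can ever run.
    if num[:2] == '27':
        return '0' + num[3:]
    else:
        return num


def fixed(num):
    # Check the longer prefix first, then fall through to the shorter one.
    if num[:3] == ' 27':
        return '0' + num[3:]
    elif num[:2] == '27':
        return '0' + num[2:]
    return num


value = '27xxxxxxxx'
unchanged = original(value)  # the '27' branch is never reached
replaced = fixed(value)
print(repr(unchanged), repr(replaced))
```

With pandas, note that apply returns a new Series rather than modifying the column in place, so the result needs to be assigned back, for example df['number'] = df['number'].apply(fixed).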

NB. Prefer the regex approach above: it is shorter and will usually be more efficient than a row-wise apply.

regex

^      # match start of string
\      # match a literal space (the backslash escapes the space that follows it)
?      # make the previous match (the " ") optional
27     # match literal 27
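
The breakdown above can be checked with Python's built-in re module. This is a small sketch; the extra test strings ('0127xxxxxx' and the one with two leading spaces) are made up here to show what the anchor and the optional space do and do not match.

```python
import re

# Same pattern as in the answer; "\ " is just an escaped space.
pattern = re.compile(r'^\ ?27')

samples = [
    '27xxxxxxxx',   # starts with 27: replaced
    ' 27xxxxxxxx',  # one leading space, then 27: replaced
    '012xxxxxxx',   # no 27 at the start: unchanged
    '0127xxxxxx',   # 27 appears later, but ^ anchors at the start: unchanged
    '  27xxxxxx',   # two leading spaces; only one is optional: unchanged
]

results = [pattern.sub('0', s) for s in samples]
for before, after in zip(samples, results):
    print(repr(before), '->', repr(after))
```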
