Loop to replace the value in a column based on a condition

Time:09-21

I am looking to replace the value in a column depending on whether it starts with a number or not. The loop below returns NaNs for each of the values.

My desired output is:

novalue (first 7 characters)
novalue
one (Extract what is in parentheses)
two
    import re

    import pandas as pd

    df = pd.DataFrame({'VALUE': ['novalue1', 'novalue2', '22n(one)', '22n(two)',
                                 'completed', 'none']})

    for row in df.iterrows():
        try:
            value = row[1].VALUE
            if re.search("^[a-zA-Z]", value) is not None:
                # starts with a letter: keep the leading characters
                df['VALUE'] = df['VALUE'].str[:6]
            else:
                # otherwise: extract the text inside the parentheses
                df['VALUE'] = df['VALUE'].str.extract(r'\((.*)\)', expand=False)
        except Exception:
            print(value, ' : Unsuccessful')

    df

CodePudding user response:

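The problem is that df['VALUE'] inside the loop refers to the whole column, but you only want to change the current row. Every iteration therefore overwrites all rows: the first pass truncates every value, so by the time the extract branch runs, the closing parenthesis is already gone, nothing matches, and the whole column becomes NaN. You can use np.where instead, which picks the result row by row from two vectorized expressions.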

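import numpy as np
# Starts with a letter: keep the first 7 characters; otherwise take the text inside the parentheses.
df['out'] = np.where(df['VALUE'].str[0].str.isalpha(), df['VALUE'].str[:7], df['VALUE'].str.extract(r'\((.*)\)', expand=False))
print(df)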

       VALUE      out
0   novalue1  novalue
1   novalue2  novalue
2   22n(one)      one
3   22n(two)      two
4  completed  complet
5       none     none
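
If you prefer to keep the row-by-row logic of the original loop, the same rule can be written as a small function and applied with Series.map. This is only a sketch: clean_value is a hypothetical helper name, and returning the value unchanged when it has no parentheses is an assumption (the np.where version would give NaN in that case).

```python
import re

import pandas as pd


def clean_value(value: str) -> str:
    """Hypothetical helper: apply the rule from the question to one string."""
    if re.match(r"[a-zA-Z]", value):
        # Starts with a letter: keep the first 7 characters.
        return value[:7]
    # Otherwise take the text inside the parentheses, if there is any.
    match = re.search(r"\((.*)\)", value)
    # Assumption: leave the value unchanged when no parentheses are found.
    return match.group(1) if match else value


df = pd.DataFrame({'VALUE': ['novalue1', 'novalue2', '22n(one)', '22n(two)',
                             'completed', 'none']})

# map calls clean_value once per row, so each row only affects itself.
df['out'] = df['VALUE'].map(clean_value)
print(df)
```

This is slower than np.where on large frames because it runs Python code per row, but it keeps the condition and both branches in one readable place.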