Replacing only certain values of a column based on condition of another column

Time:02-18

I have df below as:

name                flag 
company night      night
company day          day
dark night         night
night day both     night
night day both     night

How can I change the flag column to say both for all rows in the name column where the word both exists?

Desired output:

name                flag
company night      night
company day          day
dark night         night
night day both      both
night day both      both

I tried the two methods below, but both pick up the first matching word in each row and don't map the applicable rows to both.

method1:

r = '(both|night|day)'

c = dict(both = 'Both', night='Night', day='Day')

dfc['Identifier'] = dfc['NAME'].str.lower().str.extract(r, expand=False).map(c)

    

method2:

conditions = [dfc["NAME"].str.lower().str.contains("night"), 
               dfc["NAME"].str.lower().str.contains("day"),
               dfc["NAME"].str.lower().str.contains("both")]

values = [ 'night', 'day', 'both']
    
dfc["Identifier"] = np.select(conditions, values, default=np.nan)

Thanks for help

CodePudding user response:

You can use str.contains to create a boolean Series and use it as the condition in np.where to assign values to the "flag" column:

import numpy as np
df['flag'] = np.where(df['name'].str.contains('both'), 'both', df['flag'])

Another option is to use loc instead:

df.loc[df['name'].str.contains('both'), 'flag'] = 'both'

Output:

             name   flag
0   company night  night
1     company day    day
2      dark night  night
3  night day both   both
4  night day both   both
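
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example frame from the question and applies both options. The case=False and na=False arguments are optional additions, not part of the answer above: they make the match case-insensitive and treat missing names as non-matches. The names mask and flag_where are just illustrative.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "name": ["company night", "company day", "dark night",
             "night day both", "night day both"],
    "flag": ["night", "day", "night", "night", "night"],
})

# Boolean mask: True where "name" contains "both".
# case=False ignores capitalisation; na=False treats missing names as no match.
mask = df["name"].str.contains("both", case=False, na=False)

# Option 1: np.where keeps the existing flag wherever the mask is False.
flag_where = np.where(mask, "both", df["flag"])

# Option 2: loc overwrites only the matching rows, in place.
df.loc[mask, "flag"] = "both"

print(df)
```

Both options give the same flags here. np.where builds a whole new array, while loc only touches the rows selected by the mask.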

CodePudding user response:

Another method is to use a list comprehension:

df['flag'] = ['both' if 'both' in y else x for x, y in zip(df['flag'], df['name'])]
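
As a side note on why the attempts in the question did not work: str.extract returns the first match found in each string, and np.select returns the choice for the first condition that is True, so "night day both" resolves to night in both cases. Below is a rough sketch of the np.select attempt with "both" checked first. The series name names and the "unknown" default are placeholders chosen for this sketch, not taken from the question.

```python
import numpy as np
import pandas as pd

names = pd.Series(["company night", "company day", "dark night",
                   "night day both", "night day both"]).str.lower()

# np.select returns the choice for the FIRST condition that is True,
# so the most specific keyword ("both") has to come first.
conditions = [
    names.str.contains("both"),
    names.str.contains("night"),
    names.str.contains("day"),
]
values = ["both", "night", "day"]

# A string default avoids mixing string choices with the float np.nan;
# "unknown" is a placeholder for rows that match none of the keywords.
identifier = np.select(conditions, values, default="unknown")
print(identifier.tolist())
```

The same reordering idea applies to the regex in method 1 only partly: alternation order does not help there, because the leftmost match in the string still wins, so a separate check for "both" is the simpler route.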