Pandas column value replace using a dictionary with case insensitive match

Time:10-12

I have a replacement dictionary and the following requirements:

Replace the values in the pandas DataFrame using replace_dict, matching the keys case-insensitively. In addition, if a value ends with '.' followed by one or more zeros, strip that suffix (for example, '2.00000' becomes '2').

import pandas as pd
replace_dict = {('True', 'Yes'): 1, ('False', 'No'): 0, '.0': ''}
df = pd.DataFrame(data = ['True','False', 'Yes', 2.0, '2.00000'])

CodePudding user response:

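We can use select from numpy in this case. Note that each condition is an exact, case-sensitive comparison, so the '.0' condition does not match 2.0 or '2.00000':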

import numpy as np


condlist = [df[0] == 'True', 
            df[0] == 'Yes', 
            df[0] == 'False', 
            df[0] == 'No', 
            df[0] == '.0']

choicelist = [1,
              1,
              0,
              0,
              '']            

df['new_vals'] = np.select(condlist, choicelist, default=np.nan)

Output:

    0       new_vals
0   True    1
1   False   0
2   Yes     1
3   2.0     nan
4   2.00000 nan
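
The conditions above could be made case-insensitive by comparing against a lower-cased string copy of the column. This is only a sketch: the name lowered is introduced here for illustration, and the trailing-zero requirement is not handled, so unmatched rows still end up as NaN.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(data=['True', 'False', 'Yes', 2.0, '2.00000'])

# Lower-case a string view of the column so 'TRUE', 'true' and 'True' all match.
# 'lowered' is a hypothetical helper name, not part of the original answer.
lowered = df[0].astype(str).str.lower()

condlist = [lowered.isin(['true', 'yes']),
            lowered.isin(['false', 'no'])]
choicelist = [1, 0]

# Rows that match neither condition get NaN, which makes the new column float.
df['new_vals'] = np.select(condlist, choicelist, default=np.nan)
```

Keeping only numeric choices avoids mixing integers and strings in the same result array.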

CodePudding user response:

Try using pandas.DataFrame.replace (or Series.replace for a single column).

Replace each tuple key with one lower-case entry per value, so that every key maps to a single value:

Input:

    col1
0   True
1   False
2   Yes
3   2.0
4   2.00000

Script:

df['col1'] = df['col1'].astype(str).str.lower()
replace_dict = {'true': 1, 'yes': 1, 'false': 0, 'no': 0, '.0': ''}
df['col1'] = df['col1'].replace(replace_dict)
df

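Output (replace with a dictionary matches whole values only, so the '.0' entry leaves 2.0 and 2.00000 unchanged):

    col1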
0   1
1   0
2   1
3   2.0
4   2.00000
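
If the tuple-keyed dictionary from the question should remain the source of truth, the flat lower-case dictionary could be derived from it instead of being typed by hand. A minimal sketch, where flat_dict is a name introduced here for illustration:

```python
replace_dict = {('True', 'Yes'): 1, ('False', 'No'): 0, '.0': ''}

# Expand every tuple key into one lower-case key per element.
# 'flat_dict' is a hypothetical name, not part of the original answer.
flat_dict = {}
for key, value in replace_dict.items():
    keys = key if isinstance(key, tuple) else (key,)
    for k in keys:
        flat_dict[k.lower()] = value

print(flat_dict)
```

The result has the same entries as the hand-written dictionary in the script above.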

If you don't want to lower-case rows that have no match in the dictionary, you can try this:

Input:

    col1
0   True
1   False
2   Yes
3   2.0
4   2.00000
5   Hey I AM not relevant!

Script:

replace_dict = {'true': 1, 'yes': 1, 'false': 0, 'no': 0, '.0': ''}
mask_relevant_rows = df['col1'].astype(str).str.lower().isin(replace_dict.keys())
df.loc[mask_relevant_rows, 'col1'] = df.loc[mask_relevant_rows, 'col1'].astype(str).str.lower().replace(replace_dict)

Output:

    col1
0   1
1   0
2   1
3   2.0
4   2.00000
5   Hey I AM not relevant!
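
Neither script above strips the trailing zeros that the question asks about. One possible way to cover both requirements is a per-value function that first tries a case-insensitive lookup and otherwise removes a trailing '.' followed by zeros with a regular expression. This is a sketch under assumptions: the names mapping and normalize are made up for illustration, and stripped values are left as strings rather than converted to numbers.

```python
import re

import pandas as pd

df = pd.DataFrame({'col1': ['True', 'False', 'Yes', 2.0, '2.00000',
                            'Hey I AM not relevant!']})

# Hypothetical lower-case lookup table for the case-insensitive match.
mapping = {'true': 1, 'yes': 1, 'false': 0, 'no': 0}


def normalize(value):
    """Map known keys case-insensitively, else strip a trailing '.0...' suffix."""
    text = str(value)
    key = text.lower()
    if key in mapping:
        return mapping[key]
    # Remove '.' followed by one or more zeros at the end of the string only.
    return re.sub(r'\.0+$', '', text)


df['col1'] = df['col1'].map(normalize)
print(df)
```

Rows without a match and without a trailing '.0' suffix keep their original text and casing.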

Hope it helps
