Home > database >  Replace values in dataframe using second dataframe as value map
Replace values in dataframe using second dataframe as value map

Time:11-24

Given the dataframe

df = pd.DataFrame({
    'a': [1, 2, 3, 4], 
    'b': [1, 2, 3, 4]
})

   a  b
0  1  1
1  2  2
2  3  3
3  4  4

For each column ('a', 'b'), I want to replace the values based upon the following dataframe:

replace_value_map  = pd.DataFrame({
    'column_to_map': ['a', 'a', 'b', 'b'], 
    'to_replace':    [1,    2,   1,   3 ], 
    'replace_with':  ['x', 'y', 'z', 'x'], 

})

  column_to_map  to_replace replace_with
0             a           1            x
1             a           2            y
2             b           1            z
3             b           3            x

Giving:

  • For column a: replace 1 with 'x' and 2 with 'y'
  • For column b: replace 1 with 'z' and 3 with 'x' This is the expected result:
df_expected = pd.DataFrame({
    'a': ['x','y', 3, 4], 
    'b': ['z', 2,'x',4]
})

   a  b
0  x  z
1  y  2
2  3  x
3  4  4

CodePudding user response:

You can make your replace map into dict , then just replace it

map_d = replace_value_map.groupby('column_to_map').apply(lambda x : x.set_index('to_replace')['replace_with'].to_dict()).to_dict()
df = df.replace(map_d)
Out[88]: 
   a  b
0  x  z
1  y  2
2  3  x
3  4  4
  • Related