My dataframe:
import pandas as pd
import numpy as np
df = pd.DataFrame({'c1': [10, 11, 12, 13], 'c2': [100, 110, 120, 130], 'c3': [100, 110, 120, 130], 'c4': ['A', np.nan, np.nan, 'B']})
I need to replace row c2 and c3 from another dataframe using column 'c4'
replacer df:
df_replacer = pd.DataFrame({'c2': [11, 22], 'c3': [99, 299], 'c4': ['A', 'B']})
Below is how I am doing: (Is there a cleaner way to do?)
df = df.merge(df_replacer, on=['c4'], how='left')
df.loc[~df.c4.isna(), 'c2_x'] = df['c2_y']
df.loc[~df.c4.isna(), 'c3_x'] = df['c3_y']
df = df.rename({'c2_x': 'c2', 'c3_x':'c3'}, axis=1)
df = df[['c1', 'c2', 'c3', 'c4']]
CodePudding user response:
I don't see another way to do it without using the merge, maybe you could do something like this :
df = df.merge(df_replacer, on='c4', how='left', suffixes=('', '_replacer'))
df['c2'] = np.where(df['c2_replacer'].notnull(), df['c2_replacer'], df['c2'])
df['c3'] = np.where(df['c3_replacer'].notnull(), df['c3_replacer'], df['c3'])
df = df.drop(['c2_replacer', 'c3_replacer'], axis=1)
CodePudding user response:
# list of columns to update
cols=['c2', 'c3']
# set the index on column to use for matching the two DF
df.set_index('c4', inplace=True)
df_replacer.set_index('c4', inplace=True)
# use update to replace value in DF
df.update(df_replacer[cols] )
# reset the index
df.reset_index()
c4 c1 c2 c3
0 A 10 11.0 99.0
1 NaN 11 22.0 299.0
2 NaN 12 120.0 120.0
3 B 13 22.0 299.0