I want to merge two DataFrames, where the second one contains rows whose values should replace the matching rows in the first one.
import pandas as pd

df4 = pd.DataFrame({'a': ['red', 'green', 'yellow', 'blue'], 'b': [1, 5, 6, 7], 'c': [1, 7, 8, 9]})
df5 = pd.DataFrame({'a': 'red', 'b': 44, 'c': 55}, index=[0])
print(pd.merge(df4, df5, how='left', on='a'))
Output:
        a  b_x  c_x   b_y   c_y
0     red    1    1  44.0  55.0
1   green    5    7   NaN   NaN
2  yellow    6    8   NaN   NaN
3    blue    7    9   NaN   NaN
Expected output:
        a   b   c
0     red  44  55
1   green   5   7
2  yellow   6   8
3    blue   7   9
CodePudding user response:
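Replace any '-' placeholders with np.nan, then use combine_first. Call it on df5 so that df5's values take priority and the remaining rows are filled in from df4. Note that combine_first aligns on the index, not on column 'a':
import numpy as np
df4 = df4.replace('-', np.nan)
print(df5.combine_first(df4))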
prints:
        a     b     c
0     red  44.0  55.0
1   green   5.0   7.0
2  yellow   6.0   8.0
3    blue   7.0   9.0
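If the rows in your two frames do not share the same index positions, a variation is to align on column 'a' instead. This is only a sketch and assumes 'a' is unique in both frames; the names left, right, combined and result are mine, not from the question.

```python
import pandas as pd

df4 = pd.DataFrame({'a': ['red', 'green', 'yellow', 'blue'],
                    'b': [1, 5, 6, 7],
                    'c': [1, 7, 8, 9]})
df5 = pd.DataFrame({'a': 'red', 'b': 44, 'c': 55}, index=[0])

# Index both frames by the key column so alignment is by colour,
# not by row position (assumes 'a' is unique in each frame).
left = df4.set_index('a')
right = df5.set_index('a')

# Values from `right` take priority; missing rows come from `left`.
combined = right.combine_first(left)

# The union of the two indexes may come back reordered,
# so restore the row order of df4 before resetting the index.
result = combined.reindex(left.index).reset_index()

# Alignment can upcast to float; cast back since no NaN remains here.
result[['b', 'c']] = result[['b', 'c']].astype('int64')
print(result)
```

The final astype call is only safe because every row ends up with a value; drop it if some cells can stay missing.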
CodePudding user response:
Concatenate and drop duplicates by column 'a'.
print(pd.concat([df5, df4]).drop_duplicates(['a'], keep='first'))
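One thing to watch: after the concat, the rows from df5 come first, which only matches the original order here because 'red' is already the first row. A possible way to keep df4's order in the general case is sketched below. The patch frame is a made-up example (it is not in the question), and the sketch assumes 'a' is unique in df4.

```python
import pandas as pd

df4 = pd.DataFrame({'a': ['red', 'green', 'yellow', 'blue'],
                    'b': [1, 5, 6, 7],
                    'c': [1, 7, 8, 9]})

# Hypothetical replacement row for a key that is NOT first in df4.
patch = pd.DataFrame({'a': ['yellow'], 'b': [60], 'c': [80]})

# Put the replacement rows first so keep='first' prefers them.
deduped = pd.concat([patch, df4]).drop_duplicates(subset='a', keep='first')

# A left merge on the key column preserves the key order of df4.
ordered = df4[['a']].merge(deduped, on='a', how='left')
print(ordered)
```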
CodePudding user response:
You can use DataFrame.update, which modifies df4 in place and aligns on the index:
df4.update(df5)
Output:
>>> df4
        a     b     c
0     red  44.0  55.0
1   green     5     7
2  yellow     6     8
3    blue     7     9
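Because update matches rows by index label, it only works as-is when the row to replace sits at the same index in both frames. A minimal sketch that matches on column 'a' instead is below. It assumes 'a' is unique in df4, and the name result is mine.

```python
import pandas as pd

df4 = pd.DataFrame({'a': ['red', 'green', 'yellow', 'blue'],
                    'b': [1, 5, 6, 7],
                    'c': [1, 7, 8, 9]})
df5 = pd.DataFrame({'a': 'red', 'b': 44, 'c': 55}, index=[0])

# Work on a copy indexed by the key column so df4 itself is untouched.
result = df4.set_index('a')

# update() works in place and returns None; rows are matched on the
# 'a' labels, and only non-NaN values from df5 overwrite existing ones.
result.update(df5.set_index('a'))

# Turn the key back into an ordinary column.
result = result.reset_index()
print(result)
```

Depending on the pandas version, the updated columns may come back as float rather than int, so cast them afterwards if the dtype matters.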