I have two data sets. I need to concatenate them and replace the values of one column in the dataframe with the values of another column in the other dataframe.
Example data:
df1
date col1 col2 col3
2020-03-27 NaN 0 NaN
2020-03-28 1 NaN NaN
2020-03-29 2 4 2
2020-03-30 3 3 NaN
2020-03-31 3 2 NaN
2020-04-01 NaN 2 NaN
df2
date col3
2020-03-27 NaN
2020-03-28 1
2020-03-29 2
2020-03-30 3
2020-03-31 NaN
As a result of the union, I want to get the following result:
df_res
date col1 col2 col3
2020-03-27 NaN 0 NaN
2020-03-28 1 NaN 1
2020-03-29 2 4 2
2020-03-30 3 3 3
2020-03-31 3 2 NaN
2020-04-01 NaN 2 NaN
I tried pandas update
but it didn't work for me.
Worked best: df_res = pd.merge(df1, df2, how='left', on='date')
, but it creates stores two versions of the columns:
df_res
date col1 col2 col3_x col3_y
2020-03-27 NaN 0 NaN NaN
2020-03-28 1 NaN NaN 1
2020-03-29 2 4 2 2
2020-03-30 3 3 NaN 3
2020-03-31 3 2 NaN NaN
2020-04-01 NaN 2 NaN NaN
How can this merger be done?
CodePudding user response:
You can use pd.DataFrame.combine_first
Combine two DataFrame objects by filling null values in one DataFrame with non-null values from other DataFrame.
Doing:
df1 = df1.set_index('date')
df2 = df2.set_index('date')
df = df1.combine_first(df2)
print(df)
Output:
col1 col2 col3
date
2020-03-27 NaN 0.0 NaN
2020-03-28 1.0 NaN 1.0
2020-03-29 2.0 4.0 2.0
2020-03-30 3.0 3.0 3.0
2020-03-31 3.0 2.0 NaN
2020-04-01 NaN 2.0 NaN
CodePudding user response:
After merging in df_res, you can do this maybe;
df_res["col3"] = df_res.apply(lambda x: x["col3_x"] if pd.isna(x["col3_y"]) else x["col3_x"],axis=1)
CodePudding user response:
Another possible solution, which is based on the idea of first using date
as index in both dataframes and then assigning col3
of df2
to col3
of df1
:
out = df1.set_index('date')
df2 = df2.set_index('date')
out['col3'] = df2['col3']
out = out.reset_index()
Output:
date col1 col2 col3
0 2020-03-27 NaN 0.0 NaN
1 2020-03-28 1.0 NaN 1.0
2 2020-03-29 2.0 4.0 2.0
3 2020-03-30 3.0 3.0 3.0
4 2020-03-31 3.0 2.0 NaN
5 2020-04-01 NaN 2.0 NaN