I want to do a left join like below, but I'm getting date1_x
, date1_y
and date2_x
and date2_y
. What I want is to replace the column if it exists.
df1 = df1.merge(df2, on='id', how='left')
df1
id, date1, val1, date2
1, '2021-03-10', 'XEP', '2021-04-10'
df2
id, date1, date2
1, '2021-09-03', '2021-09-04'
2, '2021-09-05', '2021-09-06'
3, '2021-09-07', '2021-09-08'
expected resulting df:
1, '2021-09-03', 'XEP' , '2021-09-04'
I tried the code below
df11 = df1.set_index('id')
df22 = df2.set_index('id')
df11.update(df22)
But I noticed that my column id
disappears from my columns.
CodePudding user response:
You can use combine_first
dropna
df1 = df2.combine_first(df1).dropna()[df1.columns]
Output:
id date1 val1 date2
0 1 '2021-09-03' 'XEP' '2021-09-04'
CodePudding user response:
just add this one line it will work.
df11 = df1.set_index('id')
df22 = df2.set_index('id')
df11.update(df22)
df11.reset_index(inplace=True)