avoid _y and _x in pandas dataframe-CodePudding

I want to do a left join like below, but I'm getting date1_x, date1_y and date2_x and date2_y. What I want is to replace the column if it exists.

df1 = df1.merge(df2, on='id', how='left')

df1

id, date1, val1,  date2

1,  '2021-03-10', 'XEP', '2021-04-10'

df2

 id,    date1,      date2
 1,  '2021-09-03', '2021-09-04'
 2,  '2021-09-05', '2021-09-06'
 3,  '2021-09-07', '2021-09-08'

expected resulting df:

1,  '2021-09-03', 'XEP' , '2021-09-04'

I tried the code below

df11 = df1.set_index('id')
df22 = df2.set_index('id') 
df11.update(df22)

But I noticed that my column id disappears from my columns.

CodePudding user response：

You can use combine_first dropna

df1 = df2.combine_first(df1).dropna()[df1.columns]

Output:

   id        date1   val1         date2
0   1  '2021-09-03'  'XEP'  '2021-09-04'

CodePudding user response：

just add this one line it will work.

df11 = df1.set_index('id')
df22 = df2.set_index('id') 
df11.update(df22)
df11.reset_index(inplace=True)