I have two data frames, as below:
Data Frame 1
Data Frame 2
I would like to merge these two data frames into something like the one below:
I tried to use pd.merge and join, as below:
frames = pd.merge(df1, df2, how='outer', on=['apple_id','apple_wgt_colour', 'apple_wgt_no_colour'])
But the result looks like this:
Can anyone help?
CodePudding user response:
You can do this with concat() and groupby(). Because you want to sum the corresponding values of apple_wgt_colour and apple_wgt_no_colour, finish with agg() to sum them.
First concatenate the two dataframes, then group by apple_id to aggregate the two columns, apple_wgt_colour and apple_wgt_no_colour.
import pandas as pd

# Build the two example dataframes from the question.
df1 = pd.DataFrame(
{
'apple_id': [1, 2, 3],
'apple_wgt_1': [9, 16, 8],
'apple_wgt_colour': [9, 6, 8],
'apple_wgt_no_colour': [0, 10, 13],
}
)
df2 = pd.DataFrame(
{
'apple_id': [1, 2, 3],
'apple_wgt_2': [9, 16, 8],
'apple_wgt_colour': [9, 6, 8],
'apple_wgt_no_colour': [0, 10, 13],
}
)
print(df1)
print(df2)
apple_id apple_wgt_1 apple_wgt_colour apple_wgt_no_colour
0 1 9 9 0
1 2 16 6 10
2 3 8 8 13
apple_id apple_wgt_2 apple_wgt_colour apple_wgt_no_colour
0 1 9 9 0
1 2 16 6 10
2 3 8 8 13
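The following code produces the result you want:
# Pass the string 'sum' rather than the builtin sum, which newer pandas versions warn about.
frames = pd.concat([df1, df2]).groupby('apple_id', as_index=False).agg('sum')
# Reorder the columns as required.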
frames = frames[['apple_id', 'apple_wgt_1', 'apple_wgt_2', 'apple_wgt_colour', 'apple_wgt_no_colour']]
print(frames)
apple_id apple_wgt_1 apple_wgt_2 apple_wgt_colour apple_wgt_no_colour
0 1 9.0 9.0 18 0
1 2 16.0 16.0 12 20
2 3 8.0 8.0 16 26
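Note that apple_wgt_1 and apple_wgt_2 come out as floats above because concat() fills the column that is missing from each frame with NaN. If that matters, one possible alternative is to merge on apple_id only and add the shared columns yourself. This is only a sketch under the assumption that the data looks like the example frames above; the names shared and merged and the suffixes '_1' and '_2' are picked here for illustration and are not from the question.

```python
import pandas as pd

# Same example data as above.
df1 = pd.DataFrame({
    'apple_id': [1, 2, 3],
    'apple_wgt_1': [9, 16, 8],
    'apple_wgt_colour': [9, 6, 8],
    'apple_wgt_no_colour': [0, 10, 13],
})
df2 = pd.DataFrame({
    'apple_id': [1, 2, 3],
    'apple_wgt_2': [9, 16, 8],
    'apple_wgt_colour': [9, 6, 8],
    'apple_wgt_no_colour': [0, 10, 13],
})

# Columns present in both frames that should be summed (assumed list).
shared = ['apple_wgt_colour', 'apple_wgt_no_colour']

# Merge on the key only; the overlapping columns get the suffixes _1 and _2.
merged = pd.merge(df1, df2, on='apple_id', how='outer', suffixes=('_1', '_2'))

for col in shared:
    # fillna(0) makes an id that is missing from one frame contribute 0.
    merged[col] = merged[col + '_1'].fillna(0) + merged[col + '_2'].fillna(0)
    merged = merged.drop(columns=[col + '_1', col + '_2'])

print(merged)
```

With the example data, every apple_id appears in both frames, so the merge introduces no NaN values and the sums match the concat()/groupby() result. If an id is missing from one frame, the outer merge will still introduce NaN in the affected columns.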