my first data frame:
import pandas as pd
df1 = pd.DataFrame({'CONTRACT':['Tom', 'nick', 'krish', 'jack'],
'buy_qty':[20, 40, 50, 60],
'buy_val':[100,120,140,160],
'factor':[15,20,25,30],
})
CONTRACT buy_qty buy_val factor
0 Tom 20 100 96
1 nick 40 110 93
2 krish 50 140 91
3 jack 60 160 83
second data frame:
df2 = pd.DataFrame({'CONTRACT': ['Tom', 'nick', 'amit', 'joy'],
'buy_qty': [5, 6, 15, 30],
'buy_val': [15, 20, 35, 16],
'factor': [67, 89, 76, 69]}
df2:
CONTRACT buy_qty buy_val factor
0 Tom 5 15 67
1 nick 6 20 89
2 amit 15 35 76
3 joy 30 16 69
I want dataframe Like this (all CONTRACT values of df2 and uncommon CONTRACT values of df1, by adding common CONTRACTS's buy_qty and buy_val and factor will be from df2(latest data frame) ...)-
CONTRACT buy_qty buy_val factor
0 Tom 25* 115* 67
1 nick 46* 130* 89
2 krish 50 140 91
4 jack 60 160 83
2 amit 15 35 76
3 joy 30 16 69
* denotes added values
I tried like this-
final = pd.concat([df2, df1]).add(df1['buy_qty','buy_val'],df2['buy_qty','buy_val']).drop_duplicates(subset=["CONTRACT"]).reset_index(drop=True)
but it's not working fine.
Can anyone suggest a better way?
CodePudding user response:
Unlike your previous question, here you have to groupby
because you want to aggregate
some values on your columns:
buy_qty
:sum
buy_val
:sum
factor
:first
*
* first
because you want to keep the values from df2 and this is the first dataframe on pd.concat
.
out = (pd.concat([df2, df1]).groupby('CONTRACT', as_index=False)
.agg({'buy_qty': 'sum', 'buy_val': 'sum', 'factor': 'first'})
print(out)
# Output
CONTRACT buy_qty buy_val factor
0 Tom 25 115 67
1 amit 15 35 76
2 jack 60 160 83
3 joy 30 16 69
4 krish 50 140 91
5 nick 46 130 89
CodePudding user response:
You can use add
with fill_value=0
:
(df1
.set_index('CONTRACT')
.add(df2.set_index('CONTRACT'), fill_value=0)
.reset_index()
)
output:
CONTRACT buy_qty buy_val factor
0 Tom 25.0 115.0 163.0
1 amit 15.0 35.0 76.0
2 jack 60.0 160.0 83.0
3 joy 30.0 16.0 69.0
4 krish 50.0 140.0 91.0
5 nick 46.0 130.0 182.0
CodePudding user response:
Try:
df3=pd.concat([df1,df2])
df3.groupby('CONTRACT').sum()
result into:
buy_qty buy_val factor
CONTRACT
Tom 40 200 30
amit 50 140 25
jack 60 160 30
joy 60 160 30
krish 50 140 25
nick 80 240 40