I have a dataframe as follows. I'm attempting to sum the values in the Total column, for each date, for each unique pair from columns P_buy and P_sell.
-------- ---------- ------- --------- -------- ---------- -----------------
| Index | Date | Type | Quantity| P_buy | P_sell | Total |
-------- ---------- ------- --------- -------- ---------- -----------------
| 0 | 1/1/2020 | 1 | 10 | 1 | 1 | 10 |
| 1 | 1/1/2020 | 1 | 10 | 2 | 1 | 20 |
| 2 | 1/1/2020 | 2 | 20 | 3 | 1 | 25 |
| 3 | 1/1/2020 | 2 | 20 | 4 | 1 | 20 |
| 4 | 2/1/2020 | 3 | 30 | 1 | 1 | 35 |
| 5 | 2/1/2020 | 3 | 30 | 2 | 1 | 30 |
| 6 | 2/1/2020 | 1 | 40 | 3 | 1 | 45 |
| 7 | 2/1/2020 | 1 | 40 | 4 | 1 | 40 |
| 8 | 3/1/2020 | 2 | 50 | 1 | 1 | 55 |
| 9 | 3/1/2020 | 2 | 50 | 2 | 1 | 53 |
-------- ---------- ------- --------- -------- ---------- -----------------
My desired output would be as follows: Where for each combination of unique P_buy/P_sell pairs, I'm receiving a sum of the total at each date.
-------- ---------- ------- ---------
| P_buy | P_sell | Total |
-------- ---------- ------- ---------
| 1 | 1 | 100 |
| 2 | 1 | 103 |
| 3 | 1 | 70 |
-------- ---------- ------- ---------
My attempts have been using the groupby function, but I haven't been able to successfully implement.
CodePudding user response:
# use a groupby on the desired columns and sum the total
df.groupby(['P_buy','P_sell'], as_index=False)['Total'].sum()
P_buy P_sell Total
0 1 1 100
1 2 1 103
2 3 1 70
3 4 1 60