I'm looking to find an alternate way to group "total" groupings. Below is over simplified example of what I'm currently doing to create a total grouping. Currently, I'm creating a separate "total" data frame and merging it with the original one. Is there a better way to approach this problem?
import pandas as pd
df = pd.DataFrame({'new_ren':['new', 'ren', 'new', 'new', 'ren'],
'dt':['2022-01-01', '2022-01-01', '2022-02-01', '2022-02-01', '2022-03-01'],
'ct':[1, 3, 4, 1, 2]})
# create total grouping
df_tot = df.groupby('dt')['ct'].sum().reset_index()
df_tot['new_ren'] = 'total'
# combined both dataframes
pd.concat([df, df_tot])
CodePudding user response:
There's nothing wrong with your approach (I'd say, it's the most intuitive and readable for your current problem) but here's an alternative way using merge
to get the same output:
out = df.merge(df.groupby('dt', as_index=False).sum('ct').assign(new_ren='total'), how='outer')
Output:
new_ren dt ct
0 new 2022-01-01 1
1 ren 2022-01-01 3
2 new 2022-02-01 4
3 new 2022-02-01 1
4 ren 2022-03-01 2
5 total 2022-01-01 4
6 total 2022-02-01 5
7 total 2022-03-01 2