Home > Software engineering >  Nested groupby using Pandas
Nested groupby using Pandas

Time:11-04

I would like to sum the energy values for US, China and Japan and label this 'group1' Then groupby date, country, type and sum the energy values.

Data

We are grouping by date, and type and taking the sum of these specific countries: US, China and Japan - renaming this combination as group1

date        country    type energy
8/1/2022    US         aa   10
8/1/2022    US         aa   11
8/1/2022    China      bb   50
8/1/2022    Japan      bb   20
10/1/2022   Australia  bb   5
                    

Desired

date        country   type  energy
8/1/2022    group1    aa    21
8/1/2022    group1    bb    70
10/1/2022   Australia bb    5

Doing

df.groupby(['country','date', 'type'], as_index=False).agg({'energy': sum})

The above script performs the groupby and sum perfectly, but unsure of how to condense certain categories into a group before doing this step.

Any suggestion is appreciated

CodePudding user response:

If you want to combine them, how about just filter first and change the value to group1 and then do the groupby?

df.loc[df['country'].isin(['US', 'China', 'Japan']), 'country'] = 'group1'
df.groupby(['date', 'type', 'country'], as_index=False, sort=False).agg({'energy': sum})
        date type    country  energy
0   8/1/2022   aa     group1      21
1   8/1/2022   bb     group1      70
2  10/1/2022   bb  Australia       5

CodePudding user response:

# define a dictionary to group the countries
d={'US': 'Group-1',
   'China':'Group-1',
   'Japan' :'Group-1'}

# create a group column, based on mapping
# keeping a separate column, to avoid losing original values
# it can very well be a country
df['group']=df['country'].map(d).fillna(df['country'])

# do a groupby
out= df.groupby(['group','date', 'type'], as_index=False).agg({'energy': sum})




    group           date    type    energy
0   Australia   10/1/2022   bb      5
1   Group-1     8/1/2022    aa     21
2   Group-1     8/1/2022    bb     70
  • Related