Column sum in pandas groupby

Time:12-31

Below is the dataframe

Skill   Category    Location    Market  Type  Count
Java    Cat1        Europe      Tier1   A     2
Java    Cat1        Europe      Tier1   B     1
Java    Cat1        Europe      Tier1   C     1
Java    Cat2        Asia        Tier2   D     1
Java    Cat3        Asia        Tier1   E     1

Below is the intended output dataframe

Skill   Category    Location    Market  Type  Count  Sum_Market
Java    Cat1        Europe      Tier1   A     2      4
Java    Cat1        Europe      Tier1   B     1      4
Java    Cat1        Europe      Tier1   C     1      4
Java    Cat2        Asia        Tier2   D     1      1
Java    Cat3        Asia        Tier1   E     1      1

Problem statement: Sum_Market should hold the total of Count within each combination of Skill, Category and Location, repeated on every row that belongs to that combination. Below is my attempt:

df.groupby(['Skill','Category','Location','Market','Type'])['Count'].sum()

CodePudding user response:

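Aggregate per group (leaving Type out of the keys, otherwise every group contains a single row), then merge the result back onto the original dataframe:

df.merge(
    df.groupby(['Skill','Category','Location'])['Count'].sum().rename('Sum_Market').reset_index()
)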
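For anyone who wants to try the merge approach end to end, here is a minimal sketch that rebuilds the sample dataframe from the question. It assumes the grouping keys are Skill, Category and Location only; the names `keys`, `sums` and `out` are just illustrative, not from the original post.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Skill': ['Java'] * 5,
    'Category': ['Cat1', 'Cat1', 'Cat1', 'Cat2', 'Cat3'],
    'Location': ['Europe', 'Europe', 'Europe', 'Asia', 'Asia'],
    'Market': ['Tier1', 'Tier1', 'Tier1', 'Tier2', 'Tier1'],
    'Type': ['A', 'B', 'C', 'D', 'E'],
    'Count': [2, 1, 1, 1, 1],
})

# Assumed grouping keys (Type is left out so rows A, B, C share one group)
keys = ['Skill', 'Category', 'Location']

# One row per group, with the group total in a column named Sum_Market
sums = df.groupby(keys)['Count'].sum().rename('Sum_Market').reset_index()

# A left merge on the keys copies each group total onto every matching row
out = df.merge(sums, on=keys, how='left')
print(out)
```

Passing `on=keys` explicitly is optional here, since merge defaults to the columns the two frames share, but it makes the join condition easier to read.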

CodePudding user response:

Use transform, which returns one value per original row instead of one row per group:

df['Sum_Market'] = df.groupby(['Skill','Category','Location'])['Count'].transform('sum')

OUTPUT

  Skill Category Location Market Type  Count  Sum_Market
0  Java     Cat1   Europe  Tier1    A      2           4
1  Java     Cat1   Europe  Tier1    B      1           4
2  Java     Cat1   Europe  Tier1    C      1           4
3  Java     Cat2     Asia  Tier2    D      1           1
4  Java     Cat3     Asia  Tier1    E      1           1
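To see why transform fits here, the sketch below compares it with a plain sum on the sample data from the question. The variable names `keys` and `per_group` are only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    'Skill': ['Java'] * 5,
    'Category': ['Cat1', 'Cat1', 'Cat1', 'Cat2', 'Cat3'],
    'Location': ['Europe', 'Europe', 'Europe', 'Asia', 'Asia'],
    'Market': ['Tier1', 'Tier1', 'Tier1', 'Tier2', 'Tier1'],
    'Type': ['A', 'B', 'C', 'D', 'E'],
    'Count': [2, 1, 1, 1, 1],
})

keys = ['Skill', 'Category', 'Location']

# sum() collapses the data to one row per group
per_group = df.groupby(keys)['Count'].sum()

# transform('sum') keeps the original index, so it can be assigned as a column
df['Sum_Market'] = df.groupby(keys)['Count'].transform('sum')

print(len(per_group), len(df))
print(df)
```

Because the transformed result shares the index of df, no merge or reset_index is needed.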