How to get only observations with maximum values after using groupby.sum?

Time:12-11

Sample data:

import pandas as pd

df = pd.DataFrame({
    'Company': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Model': ['A1', 'A2', 'A1', 'A3', 'A1', 'A2', 'A2', 'A3'],
    'Units_sold': [55, 67, 58, 72, 52, 64, 68, 83]
})

After using groupby with the sum function:

df.groupby(['Company', 'Model'])['Units_sold'].agg('sum')

I get the following output:

Company  Model
A        A1       113
         A2        67
         A3        72
B        A1        52
         A2       132
         A3        83

I only want to keep, for each company, the observation where the summed Units_sold is the maximum. That is, the expected output should be:

Company  Model
A        A1       113
B        A2       132

CodePudding user response:

Use this code:

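# Sum units per (Company, Model); the result is a Series with a two-level index
totals = df.groupby(['Company', 'Model'])['Units_sold'].agg('sum')
# For each company, idxmax gives the (Company, Model) label of the largest total
result = totals.loc[totals.groupby('Company').idxmax()]
print(result)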

Output will be:

Company  Model
A        A1       113
B        A2       132
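
For anyone who wants to see the intermediate steps, here is a self-contained sketch of the same idea. The names totals, best_labels, best, flat and best_flat are only illustrative and are not from the answer above. It also shows a variant that flattens the index first, which may be easier to work with if you want Company and Model back as ordinary columns. Note that idxmax returns the first label when two models tie for the maximum, so only one row per company is kept.

```python
import pandas as pd

df = pd.DataFrame({
    'Company': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Model': ['A1', 'A2', 'A1', 'A3', 'A1', 'A2', 'A2', 'A3'],
    'Units_sold': [55, 67, 58, 72, 52, 64, 68, 83]
})

# Step 1: total units per (Company, Model); a Series with a MultiIndex.
totals = df.groupby(['Company', 'Model'])['Units_sold'].sum()

# Step 2: within each company, find the index label of the largest total.
# Each value is a (Company, Model) tuple.
best_labels = totals.groupby(level='Company').idxmax()

# Step 3: select those labels from the totals.
best = totals.loc[best_labels.tolist()]
print(best)

# Variant: flatten to a regular DataFrame, then pick the row with the
# largest total in each company by its integer row label.
flat = totals.reset_index()
best_flat = flat.loc[flat.groupby('Company')['Units_sold'].idxmax()]
print(best_flat)
```

The first result keeps the (Company, Model) index from the groupby, while the variant returns a DataFrame with Company, Model and Units_sold as columns.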