Sample data:
import pandas as pd

df = pd.DataFrame({
    'Company': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Model': ['A1', 'A2', 'A1', 'A3', 'A1', 'A2', 'A2', 'A3'],
    'Units_sold': [55, 67, 58, 72, 52, 64, 68, 83]
})
After grouping by Company and Model and summing Units_sold:
df.groupby(['Company', 'Model'])['Units_sold'].agg('sum')
I get the following output:
Company  Model
A        A1       113
         A2        67
         A3        72
B        A1        52
         A2       132
         A3        83
For each Company, I only want to keep the Model whose total Units_sold
is the maximum. That is, the expected output should be:
Company  Model
A        A1       113
B        A2       132
Answer:
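Sum per Company and Model first, then use idxmax within each Company to pick the label of the largest total:
totals = df.groupby(['Company', 'Model'])['Units_sold'].sum()
# idxmax gives the (Company, Model) label of the largest total in each Company
best = totals.loc[totals.groupby(level='Company').idxmax()]
print(best)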
Output will be:
Company  Model
A        A1       113
B        A2       132
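
If a flat DataFrame is easier to work with than a Series with a two-level index, the same idea can be sketched as below. This is only one possible variant, not the only way to do it; the names `totals`, `best_rows` and `result` are made up for illustration, and the sample data is copied from the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Company': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
    'Model': ['A1', 'A2', 'A1', 'A3', 'A1', 'A2', 'A2', 'A3'],
    'Units_sold': [55, 67, 58, 72, 52, 64, 68, 83],
})

# Total units per (Company, Model), kept as ordinary columns
totals = df.groupby(['Company', 'Model'], as_index=False)['Units_sold'].sum()

# For each Company, the row label in `totals` that holds the largest total
best_rows = totals.groupby('Company')['Units_sold'].idxmax()

# Select those rows and renumber them from 0
result = totals.loc[best_rows].reset_index(drop=True)
print(result)
```

Note that idxmax returns the first occurrence of the maximum, so if two models of one company tie on total Units_sold, only the first of them is kept.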