I have this data:
data = {'name':['dave','bob','barry','bob','dave','barry'],
'age':[35,30,41,50,44,53],
'weight':[144,158,150,187,250,197]}
I want to grab the heaviest person by weight for each name, so barry should appear only once, with an age of 53 and a weight of 197, since that barry row is heavier than the 41-year-old barry.
I can get part of it working, but the weight column is not returned. Here's the full code:
import pandas as pd
data = {'name':['dave','bob','barry','bob','dave','barry'],
'age':[35,30,41,50,44,53],
'weight':[144,158,150,187,250,197]}
df = pd.DataFrame(data)
print(df.groupby('name')['age'].max())
So, I get back this as my output:
name
barry    53
bob      50
dave     44
Name: age, dtype: int64
I have tried this:
print(df.groupby('name')['age'].max())('weight')
but it doesn't work. I need the columns in this order: name, age, weight.
Thanks in advance
Answer:
One method:
df.loc[df.groupby('name')['weight'].idxmax()]
Output:
    name  age  weight
5  barry   53     197
3    bob   50     187
4   dave   44     250
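To see why this works, it can help to split the one-liner into two steps. The sketch below uses the data from the question; the names heaviest_idx and result are only illustrative, and the reset_index(drop=True) call is an optional extra that assumes you don't need the original row labels.

```python
import pandas as pd

data = {'name': ['dave', 'bob', 'barry', 'bob', 'dave', 'barry'],
        'age': [35, 30, 41, 50, 44, 53],
        'weight': [144, 158, 150, 187, 250, 197]}
df = pd.DataFrame(data)

# Step 1: for each name, find the index label of the row with the largest weight
heaviest_idx = df.groupby('name')['weight'].idxmax()

# Step 2: select those whole rows, so name, age and weight all come back together
result = df.loc[heaviest_idx]

# Optional: renumber the rows 0..n-1 instead of keeping the original labels
result = result.reset_index(drop=True)
print(result)
```

Because df.loc returns complete rows, the columns keep their original order (name, age, weight), and the age shown belongs to the heaviest row rather than being the maximum age for that name.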
Alternative:
df.sort_values('weight', ascending=False).groupby('name').head(1)
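# Or: df.sort_values('weight', ascending=False).groupby('name', as_index=False).first()
# (this variant returns the rows ordered by name with a fresh 0..n-1 index)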
Output:
    name  age  weight
4   dave   44     250
5  barry   53     197
3    bob   50     187
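The two sort-based variants return the same people but shape the result differently. Here is a small side-by-side sketch on the question's data; ordered, top_rows and top_first are just illustrative names.

```python
import pandas as pd

data = {'name': ['dave', 'bob', 'barry', 'bob', 'dave', 'barry'],
        'age': [35, 30, 41, 50, 44, 53],
        'weight': [144, 158, 150, 187, 250, 197]}
df = pd.DataFrame(data)

# Heaviest rows first, so the first row seen for each name is its heaviest
ordered = df.sort_values('weight', ascending=False)

# head(1): keeps the original index labels and the weight-descending order
top_rows = ordered.groupby('name').head(1)

# first(): one row per name, ordered by name, with a fresh 0..n-1 index
top_first = ordered.groupby('name', as_index=False).first()

print(top_rows)
print(top_first)
```

One caveat worth keeping in mind: first() takes the first non-null value in each column separately, so if the data contained missing values it could combine values from different rows. With complete data like this, both variants agree.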