Let's say I have a pandas dataframe:
brand | category | size |
---|---|---|
nike | sneaker | 9 |
adidas | boots | 11 |
nike | boots | 9 |
There could be more than 100 brands, and some brands could have more categories than others. How do I get a table that groups the rows by brand using pandas? The first column (the index) should be the brand, the second should list the categories belonging to that brand, and, if possible, a third should give the mean size for each brand.
brand | category | size |
---|---|---|
nike | sneaker | 10.5 |
 | boots | |
adidas | boots | 11 |
CodePudding user response:
Maybe there is a small error in the size in your example (the mean for nike is 9, not 10.5), but a solution might be:
df.groupby(['brand'], as_index=False).agg({'category': list, 'size': 'mean'})
Output:
brand category size
0 adidas [boots] 11.0
1 nike [sneaker, boots] 9.0
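For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the example dataframe from the question and applies the groupby call above. The second part is an assumption about what the asker wanted: it keeps one row per brand/category pair, puts both in the index, and repeats the per-brand mean size on each row, which is closer to the layout shown in the question. The names `summary` and `by_brand` are only illustrative.

```python
import pandas as pd

# Example data taken from the question.
df = pd.DataFrame({
    "brand": ["nike", "adidas", "nike"],
    "category": ["sneaker", "boots", "boots"],
    "size": [9, 11, 9],
})

# One row per brand: categories collected into a list, size averaged.
summary = df.groupby("brand", as_index=False).agg({"category": list, "size": "mean"})
print(summary)

# Alternative layout (an assumption about the desired shape):
# brand and category form a two-level index, and the per-brand
# mean size is repeated on every row of that brand.
by_brand = (
    df.assign(size=df.groupby("brand")["size"].transform("mean"))
      .set_index(["brand", "category"])
      .sort_index()
)
print(by_brand)
```

The list version is compact when you only need a lookup per brand. The indexed version may be easier to work with if you later filter or join on category.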