Hi I need to get average prices of the top four brands, find the top four brands by the combined number of the categories, there are just three categories. the top four is not who has more money that is who has more products combine.
this is the dataframe:
brand price category
Apple 499.0 phone
Huawei 200.0 phone
Apple 150.0 headphones
Samsung 800.0 phone
Apple 55.0 accesory
xiomi 25.0 accesory
Huawei 140.0 headphones
this is the result I need, with brands as index with the avarage price of the four brands, with the price round in 2 decimals
phone headphones accesory
brand_1 XXXX XXXX XXXX
brand_2 XXXX XXXX XXXX
brand_3 XXXX XXXX XXXX
brand_4 XXXX XXXX XXXX
I'm beginner so I don't have too much experience, I hope you can help me, thank you
CodePudding user response:
Try:
# get top 4 brands by size:
top_4 = df.groupby("brand").size().nlargest(4)
# pivot the table, format it and get top_4 brands:
df_out = (
df.pivot_table(
index="brand",
columns="category",
values="price",
aggfunc="mean",
fill_value=0,
)
.apply(lambda x: ["{:.2f}".format(v) for v in x])
.loc[top_4.index]
)
print(df_out)
Prints:
category accesory headphones phone
brand
Apple 55.00 150.00 499.00
Huawei 0.00 140.00 200.00
Samsung 0.00 0.00 800.00
xiomi 25.00 0.00 0.00