Using python data frames, I want to find the volume of different products and get them together in one line. The basic is like this:
Week | Product | Volume |
---|---|---|
1 | Drink | 5 |
2 | Drink | 5 |
1 | Burger | 3 |
2 | Burger | 2 |
3 | Drink | 10 |
3 | Burger | 4 |
And I would like the output to be like this. Where the volume of the product of that week is obvious, but I don't know how else to visualize this. The goal is to see if in the week where we had a promotion on drinks and sold 10, the sale of burgers also went up.
The only thing is this is divided over 4 stores, 2 products and 5 variants per product.
Week | Product | Volume | Volume Burger | Volume Drink |
---|---|---|---|---|
1 | Drink | 5 | 3 | 5 |
2 | Drink | 5 | 2 | 5 |
1 | Burger | 3 | 3 | 5 |
2 | Burger | 2 | 2 | 5 |
3 | Drink | 10 | 4 | 10 |
3 | Burger | 4 | 10 | 4 |
What I tried:
First tried with groupby
Product/Week and use shift. But this does not work with multiple products.
Then tried something like this:
data['Volume Burgers'] = data['Volume'].loc[('Product' == 'Burger') & ('Week' == 'Week')]
CodePudding user response:
IIUC, use a combination of pivot
and merge
:
df.merge(df.pivot(index='Week', columns='Product', values='Volume')
.add_prefix('Volume '),
left_on='Week', right_index=True)
output:
Week Product Volume Volume Burger Volume Drink
0 1 Drink 5 3 5
2 1 Burger 3 3 5
1 2 Drink 5 2 5
3 2 Burger 2 2 5
4 3 Drink 10 4 10
5 3 Burger 4 4 10