I've written a code which will groupby and get the difference. Then dividing the difference and SUM_PROD_SELL_QTY with shifting the row by 1.
Below is the code
df = pd.DataFrame({'PRODUCT_NAME':['HYDT300ML','HYDT300ML', 'NEW', 'HYDT300ML', 'NEW', 'HYDT300ML','HYDT300ML'],
'WEEK_NUMBER':[1, 1, 1, 2, 2, 1, 3],
'PROMO_DEPTH':[0, 0.3, 0.3, 0.3, 0.3, 0.5, 0.3],
'SUM_PROD_SELL_QTY':[6, 29, 11, 10, 19, 20, 45]})
df_groupby = df.groupby(['PRODUCT_NAME', 'WEEK_NUMBER', 'PROMO_DEPTH'])['SUM_PROD_SELL_QTY'].sum().reset_index()
# Groupby the PRODUCT_NAME, WEEK_NUMBER, PROMO_DEPTH and get the difference in SUM_PROD_SELL_QTY
df_groupby['difference'] = df_groupby.groupby(['PRODUCT_NAME', 'PROMO_DEPTH'])['SUM_PROD_SELL_QTY'].diff()
# Create a uplift and downlift values with it's precentages with the reference of 29-10=19 where 19 has to be divided by it's previous SUM_PROD_SELL_QTY value i.e.29, then 19/29, in the same way for every row.
df_groupby['uplift_downlift_percentage'] = df_groupby['difference'] / df_groupby['SUM_PROD_SELL_QTY'].shift(1)
df_groupby
But the output it's returning is different. Below is the output
PRODUCT_NAME WEEK_NUMBER PROMO_DEPTH SUM_PROD_SELL_QTY difference uplift_downlift_percentage
0 HYDT300ML 1 0.0 6 NaN NaN
1 HYDT300ML 1 0.3 29 NaN NaN
2 HYDT300ML 1 0.5 20 NaN NaN
3 HYDT300ML 2 0.3 10 -19.0 -0.950000
4 HYDT300ML 3 0.3 45 35.0 3.500000
5 NEW 1 0.3 11 NaN NaN
6 NEW 2 0.3 19 8.0 0.727273
I'm trying to get the division value of 19/29 because it comes under same PROMO_DEPTH which is 0.3 and same PRODUCT_NAME which is HYDT300ML. But in the above output it's taking -19 from difference and 20 from SUM_PROD_SELL_QTY. How to groupby on PRODUCT_NAME, WEEK_NUMBER and PROMO_DEPTH? The Expected output is below
PRODUCT_NAME WEEK_NUMBER PROMO_DEPTH SUM_PROD_SELL_QTY difference uplift_downlift_percentage
0 HYDT300ML 1 0.0 6 NaN NaN
1 HYDT300ML 1 0.3 29 NaN NaN
2 HYDT300ML 1 0.5 20 NaN NaN
3 HYDT300ML 2 0.3 10 -19.0 -0.655
4 HYDT300ML 3 0.3 45 35.0 3.500000
5 NEW 1 0.3 11 NaN NaN
6 NEW 2 0.3 19 8.0 0.727273
Can anyone please help me with this?
CodePudding user response:
Your shift should occur group-wise:
df_groupby['uplift_downlift_percentage'] = df_groupby['difference']/df_groupby.groupby(['PRODUCT_NAME', 'PROMO_DEPTH'])['SUM_PROD_SELL_QTY'].shift(1)