I have the following dataframe for which I'm trying to compute a weighted rolling average:
import pandas as pd
df = pd.DataFrame({'player_ID': {0: 123,
1: 123,
2: 123,
3: 123,
4: 123,
5: 456,
6: 456,
7: 456,
8: 456,
9: 456},
'hole_sg': {0: 3.14,
1: 2.70,
2: 5.20,
3: -0.02,
4: 2.09,
5: -2.92,
6: -2.01,
7: 3.02,
8: -0.72,
9: -1.77}})
This array indicates the weights that I want to apply to the rolling averages:
import numpy as np
weights = np.array([0.66342043, 0.6983373 , 0.73509189])
The code that I used to calculate weighted averages (shown below) works when I don't group, but I want to be able to group by 'player_ID'
df['rolling'] = df.groupby('player_ID')['hole_sg'].rolling(3).apply(lambda x: (np.dot(x, weights)/weights).sum())
When I try to group I get the following error:
TypeError: incompatible index of inserted column with frame index
Does anyone have any suggestions on how to fix this? If everything worked properly, this is the result I would get:
CodePudding user response:
When you groupby
and use rolling
you get a MultiIndex. To align with the original DataFrame, you can use:
df["rolling"] = df.groupby('player_ID')['hole_sg'].rolling(3).apply(lambda x: (np.dot(x, weights))/weights.sum()).droplevel(0)
>>> df
player_ID hole_sg rolling
0 123 3.14 NaN
1 123 2.70 NaN
2 123 5.20 3.715635
3 123 -0.02 2.579053
4 123 2.09 2.371253
5 456 -2.92 NaN
6 456 -2.01 NaN
7 456 3.02 -0.534549
8 456 -0.72 0.117432
9 456 -1.77 0.095197