I aim to make a pivot table with a rolling average over two days. I am using pivot_table() with aggfunc='mean' and therefore I get the average on each day. Additionally, I would need to adjust the mean function to also take the previous day into the calculation of the average. Here is some sample data:
df = pd.DataFrame({
'Date':['2021-01-01', '2021-01-01', '2021-01-01', '2021-01-02', '2021-01-02', '2021-01-02', '2021-01-02', '2021-01-03', '2021-01-03', '2021-01-03'],
'Name':['Tim', 'Tim', 'Ben', 'Leo', 'Tim', 'Ben', 'Leo', 'Leo', 'Ben', 'Tim'],
'Ratings':[9.0, 8.0, 5.0, 3.0, 'NaN', 'NaN', 6, 5, 3, 5]})
At the moment I only know how to perform a pivot table for the mean on each day which looks like this:
df.pivot_table(
values='Ratings', index='Date', columns='Name',
fill_value=0, aggfunc='mean')
But I try to figure out a way to get a moving average of two days which would look like this:
df = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03'],
'Tim':[8.5, 8.5, 5],
'Ben':[5, 5, 3],
'Leo':['NaN', 4.5, 4.66],})
Thanks a lot for your help :)
CodePudding user response:
First aggregate sum
and count
s for MultiIndex
, use rolling sum
for sum of sums and sum of counts and last for mean use division:
df['Date'] = pd.to_datetime(df['Date'])
df = df.pivot_table(
values='Ratings', index='Date', columns='Name',
aggfunc=['sum','count'])
df = df.rolling('2d').sum()
df = df.xs('sum', axis=1, level=0).div(df.xs('count', axis=1, level=0))
print (df)
Name Ben Leo Tim
Date
2021-01-01 5.0 NaN 8.5
2021-01-02 5.0 4.500000 8.5
2021-01-03 3.0 4.666667 5.0