This is a variation of this question, only with a new column, Track
:
Date | Jockey ID | Position | Track |
---|---|---|---|
23-12-2018 | 4340 | 1 | Sha Tin |
25-11-2018 | 4340 | 5 | Sha Tin |
19-12-2018 | 4340 | 10 | Happy Valley |
01-01-2019 | 4340 | 3 | Happy Valley |
18-10-2017 | 8443 | 1 | Sha Tin |
18-02-2018 | 8443 | 6 | Sha Tin |
12-05-2018 | 8443 | 7 | Happy Valley |
What I am looking for is a way to compute the rolling mean final position, in the last 1000 days, on a specifically on Sha Tin. The final result:
Date | Jockey ID | Position | Track | Mean Position |
---|---|---|---|---|
23-12-2018 | 4340 | 1 | Sha Tin | 1 (1/1) |
25-11-2018 | 4340 | 5 | Sha Tin | 3 (1 5)/2 |
19-12-2018 | 4340 | 10 | Happy Valley | |
01-01-2019 | 4340 | 3 | Happy Valley | |
18-10-2017 | 8443 | 1 | Sha Tin | 1 (1/1) |
18-02-2018 | 8443 | 6 | Sha Tin | 3.5 (1 6)/2 |
12-05-2018 | 8443 | 7 | Happy Valley |
CodePudding user response:
Add Track
column to groupby
:
df['new'] = (df.set_index('Date')
.groupby(['Jockey ID','Track', pd.Grouper(freq='1000D')])['Position']
.transform(lambda x: x.expanding().mean())
.to_numpy())
print (df)
If need filter is possible, but output is different, for not matched are NaN
s:
mask = df['Track'].eq('Sha Tin')
df.loc[mask, 'new'] = (df[mask].set_index('Date')
.groupby(['Jockey ID','Track', pd.Grouper(freq='1000D')])['Position']
.transform(lambda x: x.expanding().mean())
.to_numpy())
print (df)