I have the following data frame:
Track | Surface | HorseId | FGrating | New FGrating |
---|---|---|---|---|
Sha Tin | Grass | 1736 | 110 | -1 |
Sha Tin | Grass | 1736 | 124 | -1 |
Sha Tin | Grass | 1736 | 118 | -1 |
Happy Valley | Grass | 1736 | 117 | -1 |
Sha Tin | Grass | 13973 | 144 | -1 |
Sha Tin | Grass | 13973 | 137 | -1 |
I want to fill the New FGrating
column with the last FGrating of every horse only if the track is Sha Tin and the surface is Grass. The rest of the columns can remain -1. The result that I am looking for is this:
Track | Surface | HorseId | FGrating | New FGrating |
---|---|---|---|---|
Sha Tin | Grass | 1736 | 110 | -1 |
Sha Tin | Grass | 1736 | 124 | 110 |
Sha Tin | Grass | 1736 | 118 | 124 |
Happy Valley | Grass | 1736 | 117 | -1 |
Sha Tin | Grass | 13973 | 144 | -1 |
Sha Tin | Grass | 13973 | 137 | 144 |
For this, I tried to extract the required data intro a dataframe called temp
. The code I used is this:
temp = featured_data.loc[(featured_data.Track == 'Sha Tin') & (featured_data.Surface == 'Gress')]
temp = temp[['HorseId', 'FGrating']]
temp = temp.groupby('HorseId')
temp['New FGrating'] = temp['FGrating'].apply(lambda x: x.shift(1))
The only problem this code has is that at the temp=temp.groupby('HorseId')
line, the entire 'HorseId' column disappears.
What am I doing wrong? How can I solve this problem?
CodePudding user response:
I can't check it myself at the moment but I think you can try it like this:
temp = featured_data.loc[(featured_data.Track == 'Sha Tin') & (featured_data.Surface == 'Grass')]
temp = temp[['HorseId', 'FGrating']]
temp['New FGrating'] = temp.groupby('HorseId')['FGrating'].apply(lambda x: x.shift(1))
I think it is only a typo but you filtered for 'Gress' instead of 'Grass in your code example