How can I determine the last value of a field given some predefined conditions using groupby?-CodePudding

I have the following data frame:

Track	Surface	HorseId	FGrating	New FGrating
Sha Tin	Grass	1736	110	-1
Sha Tin	Grass	1736	124	-1
Sha Tin	Grass	1736	118	-1
Happy Valley	Grass	1736	117	-1
Sha Tin	Grass	13973	144	-1
Sha Tin	Grass	13973	137	-1

I want to fill the New FGrating column with the last FGrating of every horse only if the track is Sha Tin and the surface is Grass. The rest of the columns can remain -1. The result that I am looking for is this:

Track	Surface	HorseId	FGrating	New FGrating
Sha Tin	Grass	1736	110	-1
Sha Tin	Grass	1736	124	110
Sha Tin	Grass	1736	118	124
Happy Valley	Grass	1736	117	-1
Sha Tin	Grass	13973	144	-1
Sha Tin	Grass	13973	137	144

For this, I tried to extract the required data intro a dataframe called temp. The code I used is this:

temp = featured_data.loc[(featured_data.Track == 'Sha Tin') & (featured_data.Surface == 'Gress')]
temp = temp[['HorseId', 'FGrating']]
temp = temp.groupby('HorseId')
temp['New FGrating'] = temp['FGrating'].apply(lambda x: x.shift(1))

The only problem this code has is that at the temp=temp.groupby('HorseId') line, the entire 'HorseId' column disappears.

What am I doing wrong? How can I solve this problem?

CodePudding user response：

I can't check it myself at the moment but I think you can try it like this:

temp = featured_data.loc[(featured_data.Track == 'Sha Tin') & (featured_data.Surface == 'Grass')]
temp = temp[['HorseId', 'FGrating']]
temp['New FGrating'] = temp.groupby('HorseId')['FGrating'].apply(lambda x: x.shift(1))

I think it is only a typo but you filtered for 'Gress' instead of 'Grass in your code example