I have the following DataFrame:

Date	Track	Surface	FGrating	HorseId	Last FGrating at Sha Tin, grass
2017-09-03	Sha Tin	Grass	110	1736
2017-09-16	Sha Tin	Grass	124	1736	110
2017-10-14	Sha Tin	Grass	118	1736	124
2017-11-11	Sha Tin	Grass	107	1736	118
2018-03-28	Happy Valley	Grass	117	1736
2018-04-11	Happy Valley	Grass	114	1736
2018-09-22	Sha Tin	Grass	124	1736	107

As you can see, there are spaces for the Last FGrating at Sha Tin, grass. These spaces have two causes:

There is no last FGrating (the first record);
There is a last FGrating for this track, but it was not written when the race for the same horse was on another track (Happy Valley grass, in this table).

So far, I managed to do only the filling part, according to this code:

def return_mask_and_text_from_tracks(data, track_no, metric):
    if track_no == 0:  # Sha Tin - grass
        mask = (data.Track == 'Sha Tin') & (data.Surface == 'Grass')
        text = str(metric)   ' at Sha-Tin Grass'
    if track_no == 1:  # Sha Tin - dirt
        mask = (data.Track == 'Sha Tin') & (data.Surface == 'Dirt')
        text = str(metric)   ' at Sha-Tin Dirt'
    if track_no == 2:  # Happy Valley - grass
        mask = (data.Track == 'Happy Valley') & (data.Surface == 'Grass')
        text = str(metric)   ' at Happy Valley Grass'
    return mask, text

def compute_last_fgrating(data, mask=''):
    if len(mask) == 0:
        return data.groupby('HorseId')['FGrating'].apply(lambda x: x.shift(1))
    else:
        return data.loc[mask][['HorseId', 'FGrating']].groupby('HorseId')['FGrating'].apply(lambda x: x.shift(1))

for i in range(3):
    mask, text = return_mask_and_text_from_tracks(featured_data, i, 'Last FGrating')
    featured_data[text] = compute_last_fgrating(featured_data, mask=mask)

I am looking for ways to fill the rightmost column with the following values:

0, when no last FGrating exist (the first record, in this case)
The last FGrating on Sha Tin grass on the records where the race was on other tracks, for the same horse.

The final table should look like this:

Date	Track	Surface	FGrating	HorseId	Last FGrating at Sha Tin, grass
2017-09-03	Sha Tin	Grass	110	1736	0 (no previous FGrating exists)
2017-09-16	Sha Tin	Grass	124	1736	110
2017-10-14	Sha Tin	Grass	118	1736	124
2017-11-11	Sha Tin	Grass	107	1736	118
2018-03-28	Happy Valley	Grass	117	1736	118 (the lastFGrating on Sha Tin grass at this point)
2018-04-11	Happy Valley	Grass	114	1736	118
2018-09-22	Sha Tin	Grass	124	1736	107

How can I do it?

CodePudding user response：

Here you only need

df["Last FGrating at Sha Tin, grass"] = df["Last FGrating at Sha Tin, grass"].fillna(method="ffill").fillna(0)

Explanation

The first fillna fill NaN with the latest value available (method="ffill") then you have left only one NaN on your first row and you can fill it with 0.