I am trying to filter a pandas DataFrame from the fastf1 package. I want to retrieve 'long runs', i.e. stretches of several consecutive laps with similar lap times.
To do this, I want to find every 'section' of the filtered DataFrame in which LapNumber increases by exactly 1 from one row to the next.
I have already filtered out most of the irrelevant lap times, which leaves:
print(fp_d1[['LapNumber', 'LapTime']])
Output:
    LapNumber                 LapTime
1           2  0 days 00:01:25.230000
3           4  0 days 00:01:44.087000
4           5  0 days 00:01:23.449000
6           7  0 days 00:01:23.234000
8           9  0 days 00:01:22.853000
9          10  0 days 00:01:33.581000
11         12  0 days 00:01:22.840000
12         13  0 days 00:01:40.480000
14         15  0 days 00:01:26.013000
15         16  0 days 00:01:25.739000
16         17  0 days 00:01:25.621000
17         18  0 days 00:01:25.750000
18         19  0 days 00:01:25.681000
19         20  0 days 00:01:25.556000
20         21  0 days 00:01:25.832000
21         22  0 days 00:01:25.669000
22         23  0 days 00:01:25.450000
23         24  0 days 00:01:25.408000
24         25  0 days 00:01:25.694000
From here I would like to write a function that, in this case, returns only laps 15-25.
Any help would be appreciated. If more information is needed, please let me know.
CodePudding user response:
Here is one possible solution.
I added the condition that a run must contain more than 2 consecutive rows to appear in the output:
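# df is the filtered DataFrame (fp_d1 in the question)
# True where LapNumber is exactly 1 more than in the previous row
condition = df['LapNumber'].diff().eq(1)
# Increment the group label wherever the flag differs from the next row's flag
mask = condition.ne(condition.shift(-1)).cumsum()
print(mask)
# Keep only rows whose group contains more than 2 rows
out = df[df.groupby(mask)['LapNumber'].transform('count') > 2]
print(out)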
Output:
    LapNumber                 LapTime
14         15  0 days 00:01:26.013000
15         16  0 days 00:01:25.739000
16         17  0 days 00:01:25.621000
17         18  0 days 00:01:25.750000
18         19  0 days 00:01:25.681000
19         20  0 days 00:01:25.556000
20         21  0 days 00:01:25.832000
21         22  0 days 00:01:25.669000
22         23  0 days 00:01:25.450000
23         24  0 days 00:01:25.408000
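Note that the output above stops at lap 24. Because of the shift(-1), the last row of the frame ends up with a group label of its own, so lap 25 is dropped. As a possible variant (a sketch, not the code from the answer above), runs can instead be labelled with diff().ne(1).cumsum(), which starts a new label whenever LapNumber does not follow directly from the previous row. The function name long_runs and the parameters min_len and col are hypothetical names chosen for illustration, and the sample frame contains only the lap numbers from the question:

```python
import pandas as pd


def long_runs(df, min_len=3, col="LapNumber"):
    """Return rows of df that sit in a run of at least min_len consecutive values of col.

    Hypothetical helper: assumes df is already sorted by col.
    """
    # A new run starts wherever the value is not exactly 1 more than the previous row's.
    run_id = df[col].diff().ne(1).cumsum()
    # Number of rows in the run each row belongs to, aligned with df's index.
    run_len = df.groupby(run_id)[col].transform("count")
    return df[run_len >= min_len]


# Lap numbers from the question; lap times are left out for brevity.
laps = pd.DataFrame({"LapNumber": [2, 4, 5, 7, 9, 10, 12, 13, *range(15, 26)]})
result = long_runs(laps)
print(result["LapNumber"].tolist())
# -> [15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25]
```

With min_len=3 this matches the "more than 2 rows" condition used above. Lowering it to 2 would also return the short pairs 4-5, 9-10 and 12-13.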