If I have the dataframe:
| | time_index | values |
|---:|-------------:|---------:|
| 0 | 1 | 21 |
| 1 | 2 | 5 |
| 2 | 3 | 25 |
| 3 | 4 | 0 |
| 4 | 5 | 4 |
| 5 | 6 | 13 |
| 6 | 7 | 20 |
| 7 | 8 | 2 |
| 8 | 9 | 15 |
| 9 | 10 | 21 |
I want to take all subsets of 3 consecutive rows, moving in increments of one, so the first iteration takes indices [0,1,2], the second [1,2,3], and so on. Applying this to the values column, I would like to check whether the middle value is the max of the subset and, if so, flag it in another column.
Iterations:
- Values: [21,5,25], max(values) == 5? False => ignore.
- Values: [5,25,0], max(values) == 25? True => add flag in new column.
I have the feeling that this has to do with a rolling window but I am not sure how to go about it.
CodePudding user response:
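To make rolling window calculations, use the rolling method with window=3. Passing center=True labels each window by its middle row, so the result lines up with the value being tested.
Then you can apply the logic to each window using agg with a custom function.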
# vals.iat[1] is the middle value of each 3-row window
df['is_max'] = (
    df['values'].rolling(window=3, center=True)
    .agg(lambda vals: vals.iat[1] == vals.max())
    .astype('boolean')
)
>>> df
   time_index  values  is_max
0           1      21    <NA>
1           2       5   False
2           3      25    True
3           4       0   False
4           5       4   False
5           6      13   False
6           7      20    True
7           8       2   False
8           9      15   False
9          10      21    <NA>
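If the custom function turns out to be slow on a large frame, the same flag can probably be computed without a Python-level callback. The sketch below is an alternative, not the approach above: it compares each value with the centered rolling max, which should be equivalent because the middle value equals the window max exactly when it is the max. The names window_max and is_max are only illustrative, and the frame is rebuilt here from the question's data so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "time_index": range(1, 11),
    "values": [21, 5, 25, 0, 4, 13, 20, 2, 15, 21],
})

# Max of each centered 3-row window; NaN on the first and last row,
# where the window is incomplete.
window_max = df["values"].rolling(window=3, center=True).max()

# A row is flagged when its own value equals the max of its window.
is_max = df["values"].eq(window_max).astype("boolean")

# Keep the edge rows as <NA> instead of False, matching the output above.
df["is_max"] = is_max.mask(window_max.isna())

print(df)
```

As with the agg version, a tie counts as a max: if the middle value is equal to a neighbour and both are the largest in the window, the row is flagged.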