Let's say I have a dataframe:
index | x |
---|---|
0 | 5 |
1 | 20 |
2 | 25 |
3 | 50 |
4 | 10 |
5 | 11 |
6 | 18 |
7 | 28 |
8 | 22 |
9 | 55 |
10 | 40 |
I want to get the previous 5 values for every row in column x, so the expected output would be:
index | x | expected_output |
---|---|---|
0 | 5 | null null null null null |
1 | 20 | null null null null 5 |
2 | 25 | null null null 5 20 |
3 | 50 | null null 5 20 25 |
4 | 10 | null 5 20 25 50 |
5 | 11 | 5 20 25 50 10 |
6 | 18 | 20 25 50 10 11 |
7 | 28 | 25 50 10 11 18 |
8 | 22 | 50 10 11 18 28 |
9 | 55 | 10 11 18 28 22 |
10 | 40 | 11 18 28 22 55 |
I have tried this with iloc:
for index, row in df.iterrows():
    print(index, row['x'], df.iloc[-6:-1, 2].values)
Is there a similar iloc-based approach that can solve this?
CodePudding user response:
Use strides, after prepending the missing values:
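import numpy as np

n = 5
# Prepend n NaNs so that every row has n "previous" slots
x = np.concatenate([[np.nan] * n, df['x']])

def rolling_window(a, window):
    # View `a` as overlapping windows of length `window` without copying
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

# Drop the last window: it ends with the final value of x and matches no row
df['expected_output'] = rolling_window(x, n)[:-1].tolist()
print(df)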
x expected_output
index
0 5 [nan, nan, nan, nan, nan]
1 20 [nan, nan, nan, nan, 5.0]
2 25 [nan, nan, nan, 5.0, 20.0]
3 50 [nan, nan, 5.0, 20.0, 25.0]
4 10 [nan, 5.0, 20.0, 25.0, 50.0]
5 11 [5.0, 20.0, 25.0, 50.0, 10.0]
6 18 [20.0, 25.0, 50.0, 10.0, 11.0]
7 28 [25.0, 50.0, 10.0, 11.0, 18.0]
8 22 [50.0, 10.0, 11.0, 18.0, 28.0]
9 55 [10.0, 11.0, 18.0, 28.0, 22.0]
10 40 [11.0, 18.0, 28.0, 22.0, 55.0]
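If your NumPy is 1.20 or newer, the same windows can probably be built without a hand-written stride helper, using `sliding_window_view`. This is only a sketch of that alternative, not part of the answer above: the sample dataframe is rebuilt from the question, and the names `padded` and `windows` are made up for illustration.

```python
import numpy as np
import pandas as pd
from numpy.lib.stride_tricks import sliding_window_view

# Sample data copied from the question
df = pd.DataFrame({"x": [5, 20, 25, 50, 10, 11, 18, 28, 22, 55, 40]})
n = 5

# Prepend n NaNs so the first rows still get n "previous" slots
padded = np.concatenate([np.full(n, np.nan), df["x"].to_numpy(dtype=float)])

# Each row is a read-only view of n consecutive values; drop the last
# window, which ends with the final value of x and matches no row
windows = sliding_window_view(padded, n)[:-1]

df["expected_output"] = windows.tolist()
print(df)
```

The padding forces a float dtype, so the values come out as `5.0`, `20.0`, and so on, the same as with the `as_strided` version.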