Have the following dataframe. How do i perform a rolling 3 window quantile(0.4) which combines values from 2 columns?
### Sample Dataframe
np.random.seed(0) # Freeze randomness
a = pd.DataFrame(np.random.randint(1,10,size=(20, 1)), columns=list('A'))
b = pd.DataFrame(np.random.randint(50,90,size=(20, 1)), columns=list('B'))
df = pd.concat([a,b], axis=1)
df
A | B | quantile_AB (expected ans) | |
---|---|---|---|
0 | 6 | 75 | NaN |
1 | 1 | 63 | NaN |
2 | 4 | 58 | 6.0 |
3 | 4 | 59 | 4.0 |
40th percentile of (6,1,4,75,63,58) should give me 6.0.
Below formula gives me the rolling quantile for 2 columns separately.
df.rolling(3)[['A','B']].quantile(0.4)
CodePudding user response:
IIUC, use numpy
and sliding_window_view
:
from numpy.lib.stride_tricks import sliding_window_view
m = df[['A', 'B']].to_numpy()
W = 3
N = m.shape[1]
Q = 0.4
q = np.quantile(np.reshape(sliding_window_view(m, (W, N)), (-1, W*N)), q=Q, axis=1)
df['quantile_AB'] = pd.Series(q, index=df.index[N:])
Output:
>>> df
A B quantile_AB
0 6 75 NaN
1 1 63 NaN
2 4 58 6.0
3 4 59 4.0
4 8 70 8.0
5 4 66 8.0
6 6 55 8.0
7 3 65 6.0
8 5 50 6.0
9 8 68 8.0
10 7 85 8.0
11 9 74 9.0
12 9 79 9.0
13 2 69 9.0
14 7 69 9.0
15 8 64 8.0
16 8 89 8.0
17 9 82 9.0
18 2 51 9.0
19 6 59 9.0
CodePudding user response:
Use stack and with rolling quantile
df.stack(dropna=False).rolling(window=3*len(df.columns)).\
quantile(0.4)[cols-1::cols].reset_index(-1, drop=True)
Dataframe
A B
0 6 75
1 1 63
2 4 58
3 4 59
Output:
0 NaN
1 NaN
2 6.0
3 4.0
dtype: float64