I have a time series dataframe of energy consumption.
How can I find the section or window of least variance? For example, with a window size of three, how can I get the indices 3, 4, 5?
index | time | energy |
---|---|---|
0 | 2021-04-21 16:00:00 | 14 |
1 | 2021-04-21 17:00:00 | 87 |
2 | 2021-04-21 18:00:00 | 3 |
3 | 2021-04-21 19:00:00 | 349 |
4 | 2021-04-21 20:00:00 | 355 |
5 | 2021-04-21 21:00:00 | 350 |
6 | 2021-04-21 22:00:00 | 21 |
I can do this by iterating through the rows, but there is probably a better Pandas way of doing this, right?
CodePudding user response:
Use Series.rolling with Rolling.var, then get the index of the minimal value with Series.idxmin, and finally get the 3 indices by indexing:
N = 3
idx = df['energy'].rolling(N).var().idxmin()
pos = df.index.get_loc(idx) + 1
out = df.index[pos - N:pos].tolist()
print (out)
[3, 4, 5]
If the DataFrame has the default RangeIndex, idx is already the position, so this simplifies to:
out = df.index[idx - N + 1:idx + 1].tolist()
print (out)
[3, 4, 5]
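For completeness, here is a self-contained sketch of the same approach that can be run as-is. It rebuilds the sample data from the question, and wraps the steps in a helper named least_variance_window, which is a hypothetical name chosen for illustration and not a pandas function. It assumes the index labels are unique, so that get_loc returns a single integer position.

```python
import pandas as pd

# Sample data from the question (hourly readings).
df = pd.DataFrame({
    "time": pd.date_range("2021-04-21 16:00:00", periods=7,
                          freq=pd.Timedelta(hours=1)),
    "energy": [14, 87, 3, 349, 355, 350, 21],
})


def least_variance_window(series, n):
    """Return the index labels of the n consecutive rows with the smallest variance.

    Hypothetical helper for illustration; assumes unique index labels.
    """
    # rolling(n).var() labels each window by its LAST row, so idxmin()
    # gives the label of the final row of the best window (NaNs are skipped).
    end_label = series.rolling(n).var().idxmin()
    # Convert that label to a position and make it an exclusive slice end.
    end_pos = series.index.get_loc(end_label) + 1
    return series.index[end_pos - n:end_pos].tolist()


out = least_variance_window(df["energy"], 3)
print(out)

# The same helper also works with a non-default index, e.g. the time column.
by_time = least_variance_window(df.set_index("time")["energy"], 3)
print(by_time)
```

The first call prints [3, 4, 5]; the second returns the three timestamps from 19:00 to 21:00. Going through get_loc is what makes the helper independent of the index type.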