I would like to resample a pandas Series without a datetime index, by group of n
Since I googled the exact title of this question and didn't find an answer, I guess my only problem is wording the question properly, so it's likely a duplicate (if so, I'll delete it).
Also, I've been trying with Series.rolling and Series.groupby, but I didn't achieve the intended result.
I could do it with a dirty loop but I assume a cleaner, more canonical, and more performant way exists.
import pandas as pd
s = pd.Series([1,2,3,4,11,22,33,44,444,555,666])
# 0 1
# 1 2
# 2 3
# 3 4
# 4 11
# 5 22
# 6 33
# 7 44
# 8 444
# 9 555
# 10 666
# dtype: int64
# What I want:
s.resampleByGroupOfN(4).max().reset_index()  # hypothetical method, not part of pandas
# 0 4
# 1 44
# 2 666
Edit: The solution below proposed by Derek O works like a charm. In case you're like me (i.e., often too lazy to read the comments), Umar.H pointed out a more compact version of it: s.groupby(s.index // 4).max()
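For anyone who wants to run the compact version end to end, here is a minimal sketch on the sample data from the question. It assumes the Series still has its default 0..n-1 integer index; the variable name `result` is only for illustration.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 11, 22, 33, 44, 444, 555, 666])

# Integer-dividing the default index 0..10 by 4 gives the group labels
# 0,0,0,0,1,1,1,1,2,2,2, which groupby accepts directly as keys.
result = s.groupby(s.index // 4).max()
print(result.tolist())  # [4, 44, 666]
```

Unlike assigning to s.index, this leaves the original Series untouched.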
Answer:
What you can do is replace the index of your series with repeated group labels, and then apply a groupby.
For example:
s.index = [x//4 for x in s.index]
The index of s becomes: Int64Index([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2], dtype='int64') (pandas 2.0 and later display this as Index([...], dtype='int64')).
Then apply the groupby:
>>> s.groupby(level=0).max()
0 4
1 44
2 666
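The approach above relies on the index being the default 0, 1, 2, ... and overwrites it. If the Series has some other index (strings, for instance), one possible variant is to build the group labels from row positions instead. This is only a sketch: the helper name group_every_n and the letter index are made up for illustration, not something from pandas.

```python
import numpy as np
import pandas as pd


def group_every_n(series: pd.Series, n: int):
    """Group consecutive rows into chunks of n by position (hypothetical helper)."""
    # Row positions 0..len-1, integer-divided by n, give one label per chunk.
    positions = np.arange(len(series)) // n
    return series.groupby(positions)


# Same values as in the question, but with a non-numeric index (assumed for the example).
s = pd.Series([1, 2, 3, 4, 11, 22, 33, 44, 444, 555, 666], index=list("abcdefghijk"))

chunk_max = group_every_n(s, 4).max()
print(chunk_max.tolist())  # [4, 44, 666]
```

Since the helper returns the groupby object, any other aggregation (.sum(), .mean(), .agg(...)) can be used in place of .max(), and the last chunk is simply shorter when the length is not a multiple of n.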