Resample pandas Series by group

Time:12-01

I would like to resample a pandas Series without a datetime index, in groups of n consecutive values.

Since I googled the exact title of this question and didn't find an answer, I guess I only have an issue with wording my question properly, so it's likely this question is a duplicate (if so, I'll delete it).

Also, I've been trying with Series.rolling and Series.groupby, but I didn't achieve the intended result.

I could do it with a dirty loop but I assume a cleaner, more canonical, and more performant way exists.

import pandas as pd

s = pd.Series([1,2,3,4,11,22,33,44,444,555,666])
# 0       1
# 1       2
# 2       3
# 3       4
# 4      11
# 5      22
# 6      33
# 7      44
# 8     444
# 9     555
# 10    666
# dtype: int64

# What I want (resampleByGroupOfN is a made-up method, not a real pandas API):
s.resampleByGroupOfN(4).max().reset_index()
# 0       4
# 1       44
# 2       666

Edit: The solution below, proposed by Derek O, works like a charm. Just in case you're like me (i.e. often too lazy to read the comments), Umar.H pointed out a more compact version of it: s.groupby(s.index // 4).max()
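For reference, here is the compact version run against the sample data from the question. It is only a small check, and it assumes the Series still has its default 0, 1, 2, ... index, since the group ids are derived from the index labels.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4, 11, 22, 33, 44, 444, 555, 666])

# Integer division maps index labels 0-3 to group 0, 4-7 to group 1, 8-10 to group 2.
result = s.groupby(s.index // 4).max()

print(result.tolist())  # [4, 44, 666]
```

The last group holds only three values (444, 555, 666), which groupby handles without any padding.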

CodePudding user response:

What you can do is replace the index of your series with repeated values, one per group, and then apply a groupby on that index. This relies on the series having the default 0, 1, 2, ... index.

For example:

s.index = [x//4 for x in s.index]

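The index of s becomes: Int64Index([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2], dtype='int64') (pandas 2.0 and later display this as Index([0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2], dtype='int64'), since Int64Index was removed)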

Then apply the groupby:

>>> s.groupby(level=0).max()
0      4
1     44
2    666
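If you would rather not overwrite the index, or the series does not have a plain 0..N-1 index, one possible variant is to build the group ids from row positions instead of index labels. The sketch below is an assumption on my part rather than part of the answer above: the helper name aggregate_by_chunks and its how parameter are made up for illustration, and the letter index is only there to show that the labels are ignored.

```python
import numpy as np
import pandas as pd


def aggregate_by_chunks(s, n, how="max"):
    """Aggregate every n consecutive values of s, ignoring its index labels.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Row positions 0..len(s)-1, integer-divided by n, give one group id per chunk.
    groups = np.arange(len(s)) // n
    return s.groupby(groups).agg(how)


# Same values as in the question, but with a non-numeric index.
s = pd.Series([1, 2, 3, 4, 11, 22, 33, 44, 444, 555, 666], index=list("abcdefghijk"))

out = aggregate_by_chunks(s, 4)
print(out.tolist())  # [4, 44, 666]
```

The original series is left untouched, and passing a different aggregation name such as "sum" or "mean" as how should work the same way.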