Create a new DataFrame by processing every N rows of another DataFrame

Time:11-11

I want to process every N rows of a DataFrame separately.
If my data has 15 rows indexed from 0 to 14, I want to process the rows with index 0 to 3, 4 to 7, 8 to 11, and 12 to 14.
For example, say that for every 4 rows I want sum(A) and mean(B):

Index A B
0 4 4
1 7 9
2 9 3
3 0 4
4 7 9
5 9 2
6 3 0
7 7 4
8 7 2
9 1 6

The resulting DataFrame should be

Index A B
0 20 5
1 26 3.75
2 8 4

TLDR: how can I make DataFrame.apply take multiple rows at a time instead of a single row?

CodePudding user response:

Use GroupBy.agg, grouping by the index integer-divided by 4 (so rows 0-3 get label 0, rows 4-7 get label 1, and so on):

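# option 1: df has the default RangeIndex (0, 1, 2, ...)
df = df.groupby(df.index // 4).agg({'A':'sum', 'B':'mean'})

# option 2: any index, group by row position instead (needs: import numpy as np)
df = df.groupby(np.arange(len(df.index)) // 4).agg({'A':'sum', 'B':'mean'})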
print (df)
    A     B
0  20  5.00
1  26  3.75
2   8  4.00
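
For anyone who wants to try this end to end, here is a minimal self-contained sketch using the sample data from the question. The names n, labels, result, df_str and result_str are just illustrative choices, and the letter index is made up to show the positional variant working on a non-numeric index.

```python
import numpy as np
import pandas as pd

# Sample data from the question (10 rows, default RangeIndex).
df = pd.DataFrame({
    "A": [4, 7, 9, 0, 7, 9, 3, 7, 7, 1],
    "B": [4, 9, 3, 4, 9, 2, 0, 4, 2, 6],
})

n = 4  # chunk size (illustrative name, not from the original answer)

# Positional group labels: rows 0-3 -> 0, rows 4-7 -> 1, rows 8-9 -> 2.
labels = np.arange(len(df)) // n

# One aggregation per column: sum of A and mean of B within each chunk.
result = df.groupby(labels).agg({"A": "sum", "B": "mean"})
print(result)

# Same data with a made-up, non-numeric index. df.index // n would not
# work here, but the positional labels do not depend on the index.
df_str = df.set_index(pd.Index(list("abcdefghij")))
result_str = df_str.groupby(np.arange(len(df_str)) // n).agg(
    {"A": "sum", "B": "mean"}
)
print(result_str)
```

The last chunk is simply shorter when the row count is not a multiple of n, so the final group here aggregates only rows 8 and 9.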