I have a large pandas dataframe. I want to average the first 12 rows, then the next 12 rows, and so on. I wrote a for loop for this task:
df_list = []
for i in range(0, len(df), 12):
    print(i, i + 12)
    df_list.append(df.iloc[i:i + 12].mean())
pd.concat(df_list, axis=1).T
Is there an efficient way to do this without a for loop?
CodePudding user response:
You can floor-divide the index by N (12 in your case), group the dataframe by the quotient, and then call mean on these groups. This assumes the dataframe has the default integer index (0, 1, 2, ...):
# Random dataframe of shape 120,4
>>> df=pd.DataFrame(np.random.randint(10,100,(120,4)), columns=list('ABCD'))
>>> df.groupby(df.index//12).mean()
           A          B          C          D
0  49.416667  52.583333  63.833333  47.833333
1  60.166667  61.666667  53.750000  34.583333
2  49.916667  54.500000  50.583333  64.750000
3  51.333333  51.333333  56.333333  60.916667
4  51.250000  51.166667  50.750000  50.333333
5  56.333333  50.916667  51.416667  59.750000
6  53.750000  57.000000  45.916667  59.250000
7  48.583333  59.750000  49.250000  50.750000
8  53.750000  48.750000  51.583333  68.000000
9  54.916667  48.916667  57.833333  43.333333
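As a quick sanity check, the grouped mean can be compared against the loop from the question. This is only a minimal sketch: the fixed seed and the helper name loop_mean are assumptions made for illustration, not part of the original code.

```python
import numpy as np
import pandas as pd

# Fixed seed chosen only so the example is reproducible (an assumption).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(10, 100, (120, 4)), columns=list("ABCD"))

def loop_mean(frame, n=12):
    """Hypothetical helper: the loop from the question, averaging n rows at a time."""
    parts = [frame.iloc[i:i + n].mean() for i in range(0, len(frame), n)]
    # Each part is a Series of column means; stack them as columns, then transpose.
    return pd.concat(parts, axis=1).T

# Vectorised version: rows 0-11 get label 0, rows 12-23 get label 1, and so on.
grouped = df.groupby(df.index // 12).mean()
looped = loop_mean(df)

# Both approaches should agree up to floating-point rounding.
print(np.allclose(grouped.to_numpy(), looped.to_numpy()))
```

Both results have one row per block of 12, so a 120-row dataframe yields 10 rows.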
CodePudding user response:
I believe you want to split your dataframe into separate chunks of 12 rows. You can then use np.arange inside groupby to take the mean of each chunk:
df.groupby(np.arange(len(df)) // 12).mean()
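Because np.arange(len(df)) counts row positions rather than index labels, this form should also work when the index is not 0, 1, 2, ... (for example after filtering, or with dates as the index). The sketch below uses a hypothetical 30-row frame named df2 with a date index; it also shows that a final, shorter chunk is averaged over the rows it has.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a non-default index: 30 rows labelled by date.
idx = pd.date_range("2021-01-01", periods=30, freq="D")
df2 = pd.DataFrame({"A": np.arange(30, dtype=float)}, index=idx)

# Positions 0..29 floor-divided by 12 give the labels 0 (12 rows),
# 1 (12 rows) and 2 (the remaining 6 rows).
chunk_means = df2.groupby(np.arange(len(df2)) // 12).mean()
print(chunk_means)
```

Here the three group means are 5.5, 17.5 and 26.5, the last one computed from only 6 rows.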