Iterating over rows to find mean of a data frame in Python

I have a dataframe of 100 random numbers and I would like to find the mean as follows:

mean0 should be the mean of rows 0, 5, 10, ...

mean1 should be the mean of rows 1, 6, 11, 16, ...

and so on, up to mean4, which should be the mean of rows 4, 9, 14, ...

So far, I am able to find the mean0 but I am not able to figure out a way to iterate the process in order to obtain the remaining means.

My code is as follows:

import numpy as np
import pandas as pd
import csv

data = np.random.randint(1, 100, size=100)
df = pd.DataFrame(data)

print(df)

df.to_csv('example.csv', index=False)

df1 = df[::5]
print("Every 5th row is:\n", df1)

df2 = df1.mean()
print(df2)

CodePudding user response：

Since df[::5] is equivalent to df[0::5], you can use df[1::5], df[2::5], df[3::5], and df[4::5] to select the remaining groups of rows, and then take the mean of each one with df[i::5].mean().

This is not explicitly shown in the pandas documentation examples, but it works the same way as list slicing with [start:stop:step].
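
To avoid writing the five slices by hand, the same idea can be put in a loop. This is only a minimal sketch: the seeded generator and the names step and means are my own choices for illustration, not part of the question's code.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(1, 100, size=100))

step = 5

# df[i::step] keeps rows i, i + 5, i + 10, ...; [0] picks the single column.
means = [df[i::step][0].mean() for i in range(step)]

for i, m in enumerate(means):
    print(f"mean{i} = {m}")
```

Here means[0] corresponds to mean0, means[1] to mean1, and so on.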

CodePudding user response：

I would use the underlying numpy array:

df[0].to_numpy().reshape(-1, 5).mean(0)

output: array([40.8 , 52.75, 43.2 , 55.05, 47.45])
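
A short sketch of why the reshape works, assuming the same single-column frame as in the question. reshape(-1, 5) lays the values out five per row, so column i ends up holding rows i, i + 5, i + 10, ... and mean(0) averages down each column. The names values, step and col_means are just illustrative, and the groupby line is an alternative I am adding for comparison, not something from the answers above.

```python
import numpy as np
import pandas as pd

# Seeded generator so the example is reproducible (the seed is arbitrary).
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(1, 100, size=100))

values = df[0].to_numpy()
step = 5

# reshape(-1, step) raises an error unless the length is a multiple of step.
assert len(values) % step == 0

# Row k of the reshaped array holds original rows 5k .. 5k + 4, so
# column i collects rows i, i + 5, i + 10, ...
col_means = values.reshape(-1, step).mean(axis=0)

# Cross-check against the slicing approach.
slice_means = np.array([values[i::step].mean() for i in range(step)])
print(np.allclose(col_means, slice_means))  # True

# Alternative that does not need the length to be a multiple of step:
# group rows by their position modulo step.
grouped = df.groupby(np.arange(len(df)) % step)[0].mean()
print(np.allclose(grouped.to_numpy(), col_means))  # True
```

If the number of rows is not a multiple of 5, the reshape fails, while the slicing loop and the groupby variant should still work.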