Calculating the mean of each DataFrame column

Time:04-30

I have to write a function, column_means, that calculates the mean of each column of a DataFrame and returns a list of the means. I'm not allowed to use the built-in .mean() method, so I'm implementing the general formula for the mean: sum(x_i) / number of elements.

This is my code:

df = pd.DataFrame({'a':[1,2,3], 'b': [4,5,6]})
def column_means(df):
    means  = [] 
    for i,n in zip(df.columns,  df.shape[0]):
        means [n] = sum(df[i])/ df.shape[0]
    return means

It doesn't work as intended. Could you please help me and tell me what my mistakes are?

Thank you in advance.

CodePudding user response:

You are passing an int to zip: df.shape[0] returns a single integer (the number of rows), not an iterable.
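
The point above can be checked with a small sketch. It assumes the DataFrame from the question and also shows a second problem in the original code: assigning to means[n] on an empty list fails, which is why append is needed. The variable names zip_error and index_error are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6]})

n_rows = df.shape[0]  # a single integer (3 here), not something zip can iterate

# zip() needs iterables, so passing the row count raises a TypeError.
try:
    list(zip(df.columns, n_rows))
    zip_error = None
except TypeError as exc:
    zip_error = str(exc)  # exact wording depends on the Python version

# Assigning to an index of an empty list raises an IndexError.
means = []
try:
    means[0] = 2.0
    index_error = False
except IndexError:
    index_error = True

print(zip_error)
print(index_error)
```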

So you can simply do the following:

def column_means(df):
    means = []
    for i in df.columns:
        means.append(sum(df[i]) / df.shape[0])
    return means

And if you want the mean as an integer instead of a float, you can use floor division: sum(df[i]) // df.shape[0]. Note that this rounds down rather than to the nearest integer.
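
To make the difference between the two division operators concrete, here is a minimal sketch. The column values are made up for illustration (they are not from the question) so that the mean is not a whole number.

```python
import pandas as pd

# Illustrative data: column 'a' sums to 8, so its mean is 8/3.
df = pd.DataFrame({'a': [1, 3, 4], 'b': [4, 5, 6]})
n_rows = df.shape[0]

true_mean = sum(df['a']) / n_rows      # true division, gives a float
floored_mean = sum(df['a']) // n_rows  # floor division, drops the fraction
rounded_mean = round(true_mean)        # nearest integer, for comparison

print(true_mean, floored_mean, rounded_mean)
```

If the nearest integer is what you actually want, round on the true mean is probably closer to the intent than floor division.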

I hope this answers your question.

CodePudding user response:

Do you want the mean of each column? You have to be careful if they don't have the exact same length:

import pandas as pd

df = pd.DataFrame({'a':[1,2,3], 'b': [4,5,6]})

def column_means(df):
    means = []
    for name in df.columns:
        means.append(sum(df[name]) / len(df[name]))
    return means

print(column_means(df))

If you are allowed to, you can also use the mean method of the pandas DataFrame, which returns a Series rather than a list:

df.mean()
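
One way to read the warning about columns of different lengths: columns in a DataFrame always have the same number of rows, so in practice the difference shows up as missing values (NaN). The sketch below assumes that reading. The helper name column_means_skip_missing is hypothetical, and the data is made up for illustration.

```python
import math
import pandas as pd

# Column 'a' has only two real values; None becomes NaN.
df = pd.DataFrame({'a': [1.0, 2.0, None], 'b': [4.0, 5.0, 6.0]})

def column_means_skip_missing(df):
    """Mean of each column over its non-missing values only (hypothetical helper)."""
    means = []
    for name in df.columns:
        values = df[name].dropna()  # drop NaN entries before summing
        if len(values) == 0:
            means.append(float('nan'))  # no data in this column
        else:
            means.append(sum(values) / len(values))
    return means

# The plain formula propagates NaN through the sum.
naive = [sum(df[name]) / len(df[name]) for name in df.columns]

print(naive)
print(column_means_skip_missing(df))
```

The plain sum/len version yields NaN for any column containing a missing value, while the helper averages only the values that are present.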

CodePudding user response:

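Change the first df.shape[0] to df.index, and change the assignment line to use append. Note that zip stops at the shorter of its two inputs, so this covers every column only when the DataFrame has at least as many rows as columns.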

def column_means(df):
    means = []
    for i, n in zip(df.columns, df.index):
        means.append(sum(df[i]) / df.shape[0])
    return means
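
A caveat with pairing columns and index in zip is that zip stops at the shorter input. The sketch below uses a made-up DataFrame with more columns than rows to show the effect. The names column_means_zip and column_means_all are hypothetical and used only for comparison.

```python
import pandas as pd

# Illustrative data: 2 rows but 3 columns.
wide = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'c': [5, 6]})

def column_means_zip(df):
    """The zip-based loop from the answer above."""
    means = []
    for i, n in zip(df.columns, df.index):  # stops after len(df.index) pairs
        means.append(sum(df[i]) / df.shape[0])
    return means

def column_means_all(df):
    """Loop over the columns only, so no column is skipped."""
    return [sum(df[name]) / df.shape[0] for name in df.columns]

print(column_means_zip(wide))
print(column_means_all(wide))
```

With the DataFrame from the question (3 rows, 2 columns) both versions give the same result. The difference only appears when there are fewer rows than columns.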

CodePudding user response:

If the only thing you're not allowed to use is the df.mean() function, then you could do:

def column_means(df):
    return df.sum(axis=0).div(df.shape[0]).to_list()

Sum over the columns, divide the result by the number of rows, and convert it to a list.
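
One detail worth checking with this one-liner: DataFrame.sum skips NaN by default, while df.shape[0] counts every row. The sketch below, on made-up data with one missing value, compares dividing by the row count with dividing by df.count(), which counts only non-missing entries per column. Which one is wanted depends on how missing values should be treated, so take this as an illustration rather than a correction.

```python
import pandas as pd

# Illustrative data: column 'a' has one missing value.
df = pd.DataFrame({'a': [1.0, 2.0, None], 'b': [4.0, 5.0, 6.0]})

# Divide each column sum by the total number of rows.
by_rows = df.sum(axis=0).div(df.shape[0]).to_list()

# Divide each column sum by that column's number of non-missing values.
by_count = df.sum(axis=0).div(df.count()).to_list()

print(by_rows)
print(by_count)
```

Without missing values the two results are identical.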
