Calculate mean from multiple columns

Time:11-26

I have 12 columns filled with wages. I want a single mean calculated over the whole dataset, but my output is 12 separate means, one per column. This is how my df looks:

   Month 1  Month 2  Month 3  Month 4  ...  Month 9  Month 10  Month 11  Month 12
0  1429.97  2816.61  2123.29  2123.29  ...  2816.61   2816.61   1429.97   1776.63
1  3499.53  3326.20  3499.53  2112.89  ...  1939.56   2806.21   2632.88   2459.55
2  2599.95  3119.94  3813.26  3466.60  ...  3466.60   3466.60   2946.61   2946.61
3  2599.95  2946.61  3466.60  2773.28  ...  2253.29   3119.94   1906.63   2773.28

I used this code to calculate the mean:

mean = df.mean()

Do I have to convert these 12 columns into one column, or is there another way to calculate a single mean?

CodePudding user response:

Just call mean again to get the mean of those 12 column means. This matches the overall mean as long as every column has the same number of non-missing values:

df.mean().mean()
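
A minimal sketch of when this shortcut stops matching, using made-up wage figures (the two-column frame below is hypothetical, not the asker's data). If one column has a missing entry, its remaining values carry more weight in the mean of means, so the result drifts away from the mean over all cells:

```python
import numpy as np
import pandas as pd

# Hypothetical wage data; the second column has one missing entry.
df = pd.DataFrame({
    "Month 1": [1000.0, 2000.0, 3000.0],
    "Month 2": [4000.0, np.nan, 6000.0],
})

# Column means are 2000.0 and 5000.0, so their average is 3500.0.
mean_of_means = df.mean().mean()

# Mean over the five cells that are present: 16000.0 / 5 = 3200.0.
overall_mean = np.nanmean(df.to_numpy())

print(mean_of_means)
print(overall_mean)
```

With complete columns of equal length, as in the question, both numbers coincide.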

CodePudding user response:

Use numpy.mean after converting the values to a 2D array:

import numpy as np

mean = np.mean(df.to_numpy())
print(mean)
2914.254166666667

Or use DataFrame.melt:

mean = df.melt()['value'].mean()
print(mean)
2914.254166666666
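
One practical difference between these two options, sketched with a small hypothetical frame (the values are invented for illustration): np.mean propagates NaN, whereas the pandas mean skips missing values by default. np.nanmean is the NumPy function that ignores them:

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one missing wage.
df = pd.DataFrame({"Month 1": [1000.0, 2000.0], "Month 2": [np.nan, 6000.0]})

values = df.to_numpy()
plain = np.mean(values)             # NaN, because one cell is missing
ignoring_nan = np.nanmean(values)   # mean of the three present values
melted = df.melt()["value"].mean()  # pandas skips NaN by default

print(plain, ignoring_nan, melted)
```

If the data has no missing cells, as the question's excerpt suggests, the three results agree.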

CodePudding user response:

You can also use stack:

df.stack().mean()

Suppose you have this dataframe:

>>> df
    A   B   C   D   E   F   G   H
0  60   1  59  25   8  27  34  43
1  81  48  32  30  60   3  90  22
2  66  15  21   5  23  36  83  46
3  56  42  14  86  41  64  89  56
4  28  53  89  89  52  13  12  39
5  64   7   2  16  91  46  74  35
6  81  81  27  67  26  80  19  35
7  56   8  17  39  63   6  34  26
8  56  25  26  39  37  14  41  27
9  41  56  68  38  57  23  36   8
>>> df.stack().mean()
41.6625
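
As a cross-check, here is a sketch that rebuilds the frame above and runs all four approaches from this thread on it. The `results` dictionary and its labels are only for illustration. The comparison uses math.isclose rather than `==` because different summation orders can change the last floating-point digit:

```python
import math
import pandas as pd

# The 10x8 frame shown above, rebuilt column by column.
df = pd.DataFrame({
    "A": [60, 81, 66, 56, 28, 64, 81, 56, 56, 41],
    "B": [1, 48, 15, 42, 53, 7, 81, 8, 25, 56],
    "C": [59, 32, 21, 14, 89, 2, 27, 17, 26, 68],
    "D": [25, 30, 5, 86, 89, 16, 67, 39, 39, 38],
    "E": [8, 60, 23, 41, 52, 91, 26, 63, 37, 57],
    "F": [27, 3, 36, 64, 13, 46, 80, 6, 14, 23],
    "G": [34, 90, 83, 89, 12, 74, 19, 34, 41, 36],
    "H": [43, 22, 46, 56, 39, 35, 35, 26, 27, 8],
})

# One entry per approach suggested in the answers.
results = {
    "mean of column means": df.mean().mean(),
    "numpy on 2D array": df.to_numpy().mean(),
    "melt": df.melt()["value"].mean(),
    "stack": df.stack().mean(),
}

for label, value in results.items():
    print(f"{label}: {value}")

# All approaches should agree with the value printed above.
print(all(math.isclose(v, 41.6625) for v in results.values()))
```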