pandas: how to get the mean value of a column based on the index

Time:09-17

I have a DataFrame that you can reproduce by running this code:

import pandas as pd
from io import StringIO

df = """
 Model      Final
f0901        4
f0901        2
rf0902       2
rf0902       5
rf0902       3
indi0902     4
indi0902     3
indi0902     2
indi0902     1

"""
df = pd.read_csv(StringIO(df.strip()), sep=r'\s+')
df.set_index("Model", inplace = True)

The output is:

         Final
Model   
f0901       4
f0901       2
rf0902      2
rf0902      5
rf0902      3
indi0902    4
indi0902    3
indi0902    2
indi0902    1

How can I get the mean value of 'Final' for each 'Model', and then add it as a 'mean' column so that every row carries the mean of its model?

The output should be:

    Model     Final     mean
0   f0901       4       3.00
1   f0901       2       3.00
2   rf0902      2       3.33
3   rf0902      5       3.33
4   rf0902      3       3.33
5   indi0902    4       2.50
6   indi0902    3       2.50
7   indi0902    2       2.50
8   indi0902    1       2.50

Answer:

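Group by the index level, use groupby.transform('mean') to broadcast each group's mean back to every row of that group, assign the result as a new column, and then call reset_index to turn 'Model' back into a regular column: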

out = (df.assign(mean=df.groupby(level=0)['Final'].transform('mean'))
         .reset_index()
       )

Output:

      Model  Final      mean
0     f0901      4  3.000000
1     f0901      2  3.000000
2    rf0902      2  3.333333
3    rf0902      5  3.333333
4    rf0902      3  3.333333
5  indi0902      4  2.500000
6  indi0902      3  2.500000
7  indi0902      2  2.500000
8  indi0902      1  2.500000
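
For comparison, here is a minimal self-contained sketch that rebuilds the sample data and contrasts a plain groupby mean (one value per model) with transform('mean') (one value per row). The names raw, per_model and row_means are only illustrative and do not come from the question, and sep=r'\s+' is assumed to be the separator the original code intended.

```python
import pandas as pd
from io import StringIO

# Sample data from the question (the name `raw` is illustrative)
raw = """
Model      Final
f0901        4
f0901        2
rf0902       2
rf0902       5
rf0902       3
indi0902     4
indi0902     3
indi0902     2
indi0902     1
"""

# Split columns on runs of whitespace, then use 'Model' as the index
df = pd.read_csv(StringIO(raw.strip()), sep=r"\s+").set_index("Model")

# Aggregation: collapses each group to a single value (3 rows here)
per_model = df.groupby(level=0, sort=False)["Final"].mean()

# Transform: keeps the original shape, repeating each group's mean per row
row_means = df.groupby(level=0)["Final"].transform("mean")

# Attach the per-row means and restore 'Model' as an ordinary column
out = df.assign(mean=row_means).reset_index()
print(per_model)
print(out)
```

The aggregated result is handy when only one mean per model is needed, while the transformed result aligns with the original rows and can be assigned directly as a column.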