How to get mean of selected rows with another column's values in pandas

Time:11-07

I have a dataframe with NAME, YEAR, and WFR columns (reproduced in the setup below).

I want to take the mean of WFR over the years 2009-2015 for each NAME and assign that mean to every row of the same NAME, whatever its year. Any ideas? Thanks.

Setup:

import pandas as pd

data = {'NAME': ['A', 'B', 'B', 'B', 'B', 'C', 'C', 'D', 'D', 'D'],
        'YEAR': [2017, 2009, 2011, 2017, 2018, 2010, 2018, 2014, 2015, 2016],
        'WFR': [20, 50, 80, 60, 90, 10, 30, 40, 55, 45]}
df = pd.DataFrame(data)

Answer:

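Filter the rows to the years 2009-2015 (between is inclusive on both ends by default), take the mean of WFR per NAME with groupby(...).mean(), then map that mean back onto every row by NAME. A name with no rows in that range, such as A, gets NaN: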

tmp = df.loc[df['YEAR'].between(2009, 2015)].groupby('NAME')['WFR'].mean()
df['MEAN'] = df['NAME'].map(tmp)

Output:

>>> df
  NAME  YEAR  WFR  MEAN
0    A  2017   20   NaN
1    B  2009   50  65.0
2    B  2011   80  65.0
3    B  2017   60  65.0
4    B  2018   90  65.0
5    C  2010   10  10.0
6    C  2018   30  10.0
7    D  2014   40  47.5
8    D  2015   55  47.5
9    D  2016   45  47.5

>>> tmp
NAME
B    65.0
C    10.0
D    47.5
Name: WFR, dtype: float64
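
If you would rather skip the intermediate tmp Series, the same result can probably be had in one step by masking the out-of-range values and broadcasting the group mean with transform. This is only a sketch of an alternative, not something from the answer above; the in_range name is made up for illustration, and the data is the same setup as in the question.

```python
import pandas as pd

data = {'NAME': ['A', 'B', 'B', 'B', 'B', 'C', 'C', 'D', 'D', 'D'],
        'YEAR': [2017, 2009, 2011, 2017, 2018, 2010, 2018, 2014, 2015, 2016],
        'WFR': [20, 50, 80, 60, 90, 10, 30, 40, 55, 45]}
df = pd.DataFrame(data)

# Boolean mask of rows whose YEAR is in 2009-2015 (inclusive on both ends).
in_range = df['YEAR'].between(2009, 2015)

# Replace out-of-range WFR values with NaN, then broadcast each NAME's mean
# to all of its rows; mean skips NaN, so only 2009-2015 values contribute.
df['MEAN'] = df['WFR'].where(in_range).groupby(df['NAME']).transform('mean')

print(df)
```

As with map, a NAME that has no rows in the range ends up with NaN, because its group contains only masked values.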
