Select the number of elements in each group before calculating the mean

Time:04-22

I have a dataframe df with a single column of monthly float data. The index holds dates in 'start of month' format. Sample data:

import numpy as np
import pandas as pd

np.random.seed(167)
rng = pd.date_range('2000-01-01', periods=200, freq='MS')

# Random walk sampled on the first day of each month
df = pd.DataFrame(
    {"x": np.cumsum([np.random.uniform(-0.01, 0.01) for _ in range(200)])},
    index=rng,
)

I can obtain a dataframe with the monthly averages (i.e. twelve values, one for each month of the year) with this:

df.groupby(df.index.to_series().dt.month).mean()

But I want the monthly averages over only the last few years, for example the last three. I am trying this:

df.groupby(df.index.to_series().dt.month).apply(lambda x: x.iloc[-3:]).mean()

but it returns one single value rather than the desired dataframe with twelve values, one for each month of the year.

CodePudding user response:

First filter with DataFrame.last, then group by the month periods from DatetimeIndex.to_period instead of by month numbers:

df = df.last('3Y')

df = df.groupby(df.index.to_period('m')).mean()

print(df)
                x
2014-01  0.033955
2014-02  0.024319
2014-03  0.021346
2014-04  0.029669
2014-05  0.032332
2014-06  0.029898
2014-07  0.031526
2014-08  0.031103
2014-09  0.035407
2014-10  0.036398
2014-11  0.027093
2014-12  0.027330
2015-01  0.022826
2015-02  0.023622
2015-03  0.023993
2015-04  0.029287
2015-05  0.036700
2015-06  0.037833
2015-07  0.028914
2015-08  0.031578
2015-09  0.025939
2015-10  0.020513
2015-11  0.013072
2015-12  0.006051
2016-01  0.010619
2016-02  0.008403
2016-03  0.017979
2016-04  0.017292
2016-05  0.026308
2016-06  0.032746
2016-07  0.033926
2016-08  0.041456
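
As a side note, DataFrame.last is deprecated in newer pandas releases (2.1 onward, to my knowledge), so the same filter may need to be written by hand. A minimal sketch, assuming "last three years" means everything from January two calendar years before the latest observation, which matches the 2014-01 start in the output above. The names start and recent are only illustrative:

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question
np.random.seed(167)
rng = pd.date_range('2000-01-01', periods=200, freq='MS')
df = pd.DataFrame(
    {"x": np.cumsum([np.random.uniform(-0.01, 0.01) for _ in range(200)])},
    index=rng,
)

# Assumed cutoff: 1 January, two calendar years before the last date in the index
start = pd.Timestamp(year=df.index.max().year - 2, month=1, day=1)

# Boolean mask on the DatetimeIndex instead of DataFrame.last('3Y')
recent = df.loc[df.index >= start]
print(recent.index.min(), recent.index.max(), len(recent))
```

If a rolling 36-month window is wanted instead, the cutoff would have to be computed differently, for example with pd.DateOffset.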

Or, if a datetime index (month-end dates) is acceptable:

df = df.last('3Y').resample('m').mean()
print(df)
                   x
2014-01-31  0.033955
2014-02-28  0.024319
2014-03-31  0.021346
2014-04-30  0.029669
2014-05-31  0.032332
2014-06-30  0.029898
2014-07-31  0.031526
2014-08-31  0.031103
2014-09-30  0.035407
2014-10-31  0.036398
2014-11-30  0.027093
2014-12-31  0.027330
2015-01-31  0.022826
2015-02-28  0.023622
2015-03-31  0.023993
2015-04-30  0.029287
2015-05-31  0.036700
2015-06-30  0.037833
2015-07-31  0.028914
2015-08-31  0.031578
2015-09-30  0.025939
2015-10-31  0.020513
2015-11-30  0.013072
2015-12-31  0.006051
2016-01-31  0.010619
2016-02-29  0.008403
2016-03-31  0.017979
2016-04-30  0.017292
2016-05-31  0.026308
2016-06-30  0.032746
2016-07-31  0.033926
2016-08-31  0.041456
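
Both outputs above keep one row per calendar month per year. If the goal is the twelve values the question asks for (one per month of the year, averaged over the last three observations of that month), the mean has to be taken per group. The attempt in the question calls .mean() after apply, so it averages all 36 selected rows at once and returns a single number. A possible sketch, not necessarily what the answerer had in mind, with last3 and result as illustrative names:

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question
np.random.seed(167)
rng = pd.date_range('2000-01-01', periods=200, freq='MS')
df = pd.DataFrame(
    {"x": np.cumsum([np.random.uniform(-0.01, 0.01) for _ in range(200)])},
    index=rng,
)

# Keep the last three rows of each calendar month (1..12); original index is preserved
last3 = df.groupby(df.index.month).tail(3)

# Average those rows per calendar month: twelve rows, indexed 1..12
result = last3.groupby(last3.index.month).mean()
print(result)
```

Because the sample data ends in 2016-08, months 1 to 8 are averaged over 2014 to 2016, while months 9 to 12 are averaged over 2013 to 2015.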