Create some features based on the average growth rate of y for the month over the past few years

Time:11-08

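Assume we have a dataset df (the original post linked to a download and showed the data as an image). I want to create features from the average growth rate of y for the same month over the past few years.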

For example, for September 2022, y_agr_last2 = ((1 + 3.85/100)*(1 + 1.81/100))^(1/2) - 1, and y_agr_last3 = ((1 + 3.85/100)*(1 + 1.81/100)*(1 + 1.6/100))^(1/3) - 1.
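As a quick sanity check of the arithmetic above, the two values can be computed in plain Python. This is only a sketch: the helper name avg_growth is made up for illustration, and the three rates are the ones quoted in the example.

```python
# Growth rates (in percent) for September of the previous three years,
# as quoted in the example above, most recent first.
rates = [3.85, 1.81, 1.6]

def avg_growth(rates_pct):
    """Geometric-mean growth rate of a list of percentage rates (hypothetical helper)."""
    product = 1.0
    for r in rates_pct:
        product *= 1 + r / 100
    return product ** (1 / len(rates_pct)) - 1

y_agr_last2 = avg_growth(rates[:2])  # last two years
y_agr_last3 = avg_growth(rates)      # last three years
print(y_agr_last2, y_agr_last3)
```

Both results should agree, to about six decimal places, with the y_agr_last2 and y_agr_last3 values reported for 2022/9/30.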

The code I use is as follows, which is relatively repetitive and trivial:

import math

# y for the same month 1, 2 and 3 years earlier (monthly data, so 12 rows per year)
df['y_shift12'] = df['y'].shift(12)
df['y_shift24'] = df['y'].shift(24)
df['y_shift36'] = df['y'].shift(36)
# geometric mean of the growth factors, minus 1
df['y_agr_last2'] = pow(((1 + df['y_shift12']/100) * (1 + df['y_shift24']/100)), 1/2) - 1
df['y_agr_last3'] = pow(((1 + df['y_shift12']/100) * (1 + df['y_shift24']/100) * (1 + df['y_shift36']/100)), 1/3) - 1

df.drop(['y_shift12', 'y_shift24', 'y_shift36'], axis=1, inplace=True)
df

How can the desired result be achieved more concisely?

References:

Create some features based on the mean of y for the month over the past few years

CodePudding user response:

Here is one way to generalise it:

import functools
import operator

num_yrs = 3

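for n in range(1, num_yrs + 1):
  # y for the same month n years earlier
  df[f"y_shift{n*12}"] = df["y"].shift(n*12)
  # geometric mean of the growth factors over the last n years, minus 1
  df[f"y_agr_last{n}"] = pow(functools.reduce(operator.mul, [1 + df[f"y_shift{i*12}"]/100 for i in range(1, n + 1)], 1), 1/n) - 1

# drop the helper columns and the trivial 1-year feature
df = df.drop(["y_agr_last1"] + [f"y_shift{n*12}" for n in range(1, num_yrs + 1)], axis=1)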

Output:

          date      y        x1        x2  y_agr_last2  y_agr_last3
0    2018/1/31 -13.80  1.943216  3.135839          NaN          NaN
1    2018/2/28 -14.50  0.732108  0.375121          NaN          NaN
...
22  2019/11/30   4.00 -0.273262 -0.021146          NaN          NaN
23  2019/12/31   7.60  1.538851  1.903968          NaN          NaN
24   2020/1/31 -11.34  2.858537  3.268478    -0.077615          NaN
25   2020/2/29 -34.20 -1.246915 -0.883807    -0.249940          NaN
26   2020/3/31  46.50 -4.213756 -4.670146     0.221816          NaN
...
33  2020/10/31  -1.00  1.967062  1.860070    -0.035569          NaN
34  2020/11/30  12.99  2.302166  2.092842     0.041998          NaN
35  2020/12/31   5.54  3.814303  5.611199     0.030017          NaN
36   2021/1/31  -6.41  4.205601  4.948924    -0.064546    -0.089701
37   2021/2/28 -22.38  4.185913  3.569100    -0.342000    -0.281975
38   2021/3/31  17.64  5.370519  3.130884     0.465000     0.298025
...
54   2022/7/31   0.80 -6.259455 -6.716896     0.057217     0.052793
55   2022/8/31  -5.30  1.302754  1.412277     0.015121    -0.000492
56   2022/9/30    NaN -2.876968 -3.785964     0.028249     0.024150
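
A possible variation on the loop above keeps a running product of the shifted growth factors, so no helper columns have to be created and dropped. This is a minimal sketch, not the answerer's code: the function name add_avg_growth_features and its parameters col, max_years and period are hypothetical, and the demo frame is synthetic data rather than the dataset from the question.

```python
import pandas as pd

def add_avg_growth_features(df, col="y", max_years=3, period=12):
    """Return a copy of df with {col}_agr_last2 .. {col}_agr_last{max_years} added."""
    out = df.copy()
    factor = 1 + out[col] / 100                 # growth factor per row
    product = pd.Series(1.0, index=out.index)   # running product of shifted factors
    for n in range(1, max_years + 1):
        # multiply in the factor from n years (n * period rows) earlier
        product = product * factor.shift(n * period)
        if n >= 2:                              # skip the trivial 1-year feature
            out[f"{col}_agr_last{n}"] = product ** (1 / n) - 1
    return out

# Synthetic demo: 48 months with a constant 10% rate (not the question's data)
demo = pd.DataFrame({"y": [10.0] * 48})
result = add_avg_growth_features(demo)
print(result.iloc[[23, 24, 35, 36]])
```

With this constant series, y_agr_last2 is NaN for the first 24 rows and about 0.10 afterwards, and y_agr_last3 starts at row 36, which mirrors the NaN pattern in the output above. The function works on a copy, so the caller's frame is left unchanged.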