Python: Dynamically calculate rolling returns over different frequencies-CodePudding

Consider a DataFrame with multiple columns as follows:

data = [[99330,12,122],[1123,1230,1287],[123,101,812739],[1143,1230123,252],[234,342,4546],[2445,3453,3457],[7897,8657,5675],[46,5675,453],[76,484,3735],[363,93,4568],[385,568,367],[458,846,4847],[574,45747,658468],[57457,46534,4675]]
df1 = pd.DataFrame(data, index=['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04',
                       '2022-01-05', '2022-01-06', '2022-01-07', '2022-01-08',
                       '2022-01-09', '2022-01-10', '2022-01-11', '2022-01-12',
                       '2022-01-13', '2022-01-14'], 
          columns=['col_A', 'col_B', 'col_C'])
df1.index = pd.to_datetime(df1.index)
df1: 
            col_A   col_B   col_C
2022-01-01  99330   12      122
2022-01-02  1123    1230    1287
2022-01-03  123     101     812739
2022-01-04  1143    1230123 252
2022-01-05  234     342     4546
2022-01-06  2445    3453    3457
2022-01-07  7897    8657    5675
2022-01-08  46      5675    453
2022-01-09  76      484     3735
2022-01-10  363     93      4568
2022-01-11  385     568     367
2022-01-12  458     846     4847
2022-01-13  574     45747   658468
2022-01-14  57457   46534   4675

Is there a way to write a loop so I can calculate the rolling returns on a daily ('1D'), weekly ('1W'), monthly ('1M') and six monthly ('6M') basis?

EDIT: Here is my attempt at calculating the rolling return on a daily and weekly basis:

periodicity_dict = {'1D':'daily', '1W':'weekly'}
df_columns = df1.columns
for key in periodicity_dict:
    for col in df_columns:
        df1[col '_rolling']= np.nan
        for i in pd.date_range(start=df1[col].first_valid_index(), end=df1[col].last_valid_index(), freq=key):
                df1[col '_rolling'].iloc[i] = (df1[col].iloc[i] - df[col].iloc[i-'1W'])/df[col].iloc[i-'1W']

CodePudding user response：

You can use shift to shift your index by a certain time period. For instance you can shift everything one day with:

df1.shift(freq="1D").add_suffix("_1D")

This will then be something like:

            col_A_1D  col_B_1D  col_C_1D
2022-01-02     99330        12       122
2022-01-03      1123      1230      1287
2022-01-04       123       101    812739
2022-01-05      1143   1230123       252
2022-01-06       234       342      4546

You can then add the new columns to the existing data:

df1.merge(df1.shift(freq="1D").add_suffix("_1D"), how="left", left_index=True, right_index=True)

            col_A    col_B   col_C  col_A_1D   col_B_1D  col_C_1D
2022-01-01  99330       12     122       NaN        NaN       NaN
2022-01-02   1123     1230    1287   99330.0       12.0     122.0
2022-01-03    123      101  812739    1123.0     1230.0    1287.0
2022-01-04   1143  1230123     252     123.0      101.0  812739.0
2022-01-05    234      342    4546    1143.0  1230123.0     252.0

And then just calculate e.g. (df1["col_A"] - df1["col_A_1D"]) / df1["col_A_1D"]. This will then result in:

2022-01-01          NaN
2022-01-02    -0.988694
2022-01-03    -0.890472
2022-01-04     8.292683
2022-01-05    -0.795276

You can do this for all the required columns and time shifts in the same way. For instance:

initial_cols = ["col_A", "col_B", "col_C"]
shifted_cols = [f"{c}_1D" for c in initial_cols]
for i, s in zip(initial_cols, shifted_cols):
    df1[f"{i}_rolling"] = (df1[i] - df1[s]) / df1[s]

This will then result in:

            col_A    col_B   col_C  col_A_1D   col_B_1D  col_C_1D  col_A_rolling  col_B_rolling  col_C_rolling
2022-01-01  99330       12     122       NaN        NaN       NaN            NaN            NaN            NaN
2022-01-02   1123     1230    1287   99330.0       12.0     122.0      -0.988694     101.500000       9.549180
2022-01-03    123      101  812739    1123.0     1230.0    1287.0      -0.890472      -0.917886     630.498834
2022-01-04   1143  1230123     252     123.0      101.0  812739.0       8.292683   12178.435644      -0.999690
2022-01-05    234      342    4546    1143.0  1230123.0     252.0      -0.795276      -0.999722      17.039683

So to answer the main question:

Is there a way to write a loop so I can calculate the rolling returns on a daily ('1D'), weekly ('1W'), monthly ('1M') and six monthly ('6M') basis?

Yes, but there is also a way to do it without a loop :)

CodePudding user response：

pct_change does the shifting math for you, but you would have to do it one window at a time.

windows = ["1D", "7D"]
for window in windows:
    df1 = pd.merge(
        df1,
        (
            df1[["col_A", "col_B", "col_C"]]
            .pct_change(freq=window)
            .add_suffix(f"_rolling_{window}")
        ),
        left_index=True,
        right_index=True,
    )