Cumulative return over x days for specific dates

I have a DataFrame df1 with daily gross returns (1 + r), and I am trying to calculate cumulative returns over 4 days for specific dates from a list called dates and assign them to a new DataFrame df2. Is there an efficient way to do that with pandas?

df1:
            Return
Date    
2022-01-01  1.03
2022-01-02  0.98
2022-01-03  1.01
2022-01-04  0.99
2022-01-05  1.01
2022-01-06  0.93
2022-01-07  0.96
2022-01-08  0.92
2022-01-09  1.03
2022-01-10  1.05
2022-01-11  1.08
2022-01-12  1.02

dates = ['2022-01-02', '2022-01-05', '2022-01-09', '2022-01-10']

I want df2 to hold the cumulative return over the 4 days starting at each of the specific dates. As an example, for 2022-01-02: 0.9897 = 0.98 * 1.01 * 0.99 * 1.01

            CumReturn
Date    
2022-01-02  0.9897
2022-01-05  0.8296
2022-01-09  1.1914
2022-01-10  1.1567
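
For reference, the expected values above can be reproduced with a plain loop. This is only a minimal sketch of the definition, not an efficient solution: it rebuilds df1 from the data in the question, and the names window, cum and df2_reference are made up for illustration.

```python
import pandas as pd

# Sample data from the question: daily gross returns (1 + r).
returns = [1.03, 0.98, 1.01, 0.99, 1.01, 0.93, 0.96, 0.92, 1.03, 1.05, 1.08, 1.02]
df1 = pd.DataFrame(
    {"Return": returns},
    index=pd.date_range("2022-01-01", periods=len(returns), freq="D", name="Date"),
)
dates = ['2022-01-02', '2022-01-05', '2022-01-09', '2022-01-10']
window = 4  # number of days to compound, starting at each date (illustrative name)

cum = {}
for d in dates:
    start = df1.index.get_loc(pd.Timestamp(d))  # integer position of the start date
    # iloc slicing stops at the end of the data, so a date close to the end
    # simply compounds the rows that remain (3 rows for 2022-01-10).
    cum[d] = df1["Return"].iloc[start:start + window].prod()

df2_reference = pd.DataFrame({"CumReturn": pd.Series(cum)})
df2_reference.index.name = "Date"
print(df2_reference.round(4))
```

Any vectorised answer can be checked against df2_reference.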

Thanks a lot for your suggestion!

CodePudding user response：

You can use pd.cut and GroupBy.prod to multiply the returns within the interval that starts at each listed date:

g = pd.cut(pd.to_datetime(df1.index),
           bins=pd.to_datetime(dates + ['2199-01-01']),
           labels=dates,
           right=False)

out = df1.groupby(g)['Return'].prod()

output:

2022-01-02    0.979902
2022-01-05    0.829590
2022-01-09    1.030000
2022-01-10    1.156680
Name: Return, dtype: float64
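
The values for 2022-01-02 and 2022-01-09 in this output differ from the 4-day products asked for in the question. A quick way to see why is to count how many rows fall into each bin. The sketch below rebuilds the sample data and assumes the bins are meant to be dates + ['2199-01-01']; the name sizes is illustrative, and observed=True is passed only to keep the grouping explicit for a categorical key.

```python
import pandas as pd

# Sample data from the question.
idx = pd.to_datetime([f"2022-01-{day:02d}" for day in range(1, 13)])
df1 = pd.DataFrame(
    {"Return": [1.03, 0.98, 1.01, 0.99, 1.01, 0.93, 0.96, 0.92, 1.03, 1.05, 1.08, 1.02]},
    index=idx,
)
df1.index.name = "Date"
dates = ['2022-01-02', '2022-01-05', '2022-01-09', '2022-01-10']

# Each row is assigned to the interval [date, next date); rows before the
# first listed date fall outside every bin and are dropped by groupby.
g = pd.cut(pd.to_datetime(df1.index),
           bins=pd.to_datetime(dates + ['2199-01-01']),
           labels=dates,
           right=False)

sizes = df1.groupby(g, observed=True)['Return'].size()
print(sizes)
```

The bins hold 3, 4, 1 and 3 rows, so the window length follows the gap to the next listed date rather than a fixed 4 days. The result only matches the requested output where that gap happens to be 4 days, or where the data ends first.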

CodePudding user response：

You can compute a forward-looking rolling product by reversing the series, applying a rolling window of size 4 (or whatever you need), and reversing the result back. With min_periods=1, a value is computed even if the window holds fewer than the target of 4 rows, which seems to be necessary here (otherwise, 2022-01-10 would be NaN because only 3 values remain from that date on).

df2 = df1.Return[::-1].rolling(4, min_periods=1).agg(lambda x: x.prod())[::-1].loc[dates]

This yields the following Series:

>>> df2
Date
2022-01-02    0.989701
2022-01-05    0.829590
2022-01-09    1.191380
2022-01-10    1.156680
Name: Return, dtype: float64

To turn it into a DataFrame, use df2.to_frame('CumReturn').
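
As a possible alternative to reversing the series twice, pandas ships a forward-looking window indexer, pandas.api.indexers.FixedForwardWindowIndexer. The sketch below is a swapped-in variant rather than the method above: it sums log returns over the forward window and exponentiates, which assumes every gross return is strictly positive. The names indexer, log_sum, forward_prod and df2_forward are illustrative.

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import FixedForwardWindowIndexer

# Sample data from the question.
returns = [1.03, 0.98, 1.01, 0.99, 1.01, 0.93, 0.96, 0.92, 1.03, 1.05, 1.08, 1.02]
df1 = pd.DataFrame(
    {"Return": returns},
    index=pd.date_range("2022-01-01", periods=len(returns), freq="D", name="Date"),
)
dates = ['2022-01-02', '2022-01-05', '2022-01-09', '2022-01-10']

# Window covering the current row and the 3 rows after it.
indexer = FixedForwardWindowIndexer(window_size=4)

# A product of positive numbers equals exp(sum of logs); min_periods=1 keeps
# the truncated windows at the end of the data instead of returning NaN.
log_sum = np.log(df1["Return"]).rolling(window=indexer, min_periods=1).sum()
forward_prod = np.exp(log_sum)

# Select the requested dates and name the column as in the question.
df2_forward = forward_prod.loc[pd.to_datetime(dates)].to_frame("CumReturn")
print(df2_forward.round(4))
```

Because of the log/exp round trip, the results can differ from a direct product in the last few floating-point digits, so compare with a tolerance rather than exact equality.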