I have a dataframe with quarterly returns of financial entities and I want to calculate 1-, 3-, 5- and 10-year annualized returns. The formula for annualized returns is:
R = product(1 + r)^(4/N) - 1
where r are the quarterly returns of an entity and N is the number of quarters.
for example 3-year annualized return is:
R_3yr = product(1 + r)^(4/12) - 1 = ((1 + r1)*(1 + r2)*(1 + r3)*...*(1 + r12))^(1/3) - 1
where r1, r2, r3 ... r12 are the quarterly returns of the previous 11 quarters plus the current quarter.
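As a quick sanity check of the formula, here is a minimal sketch; the uniform 2% quarterly return is made up for illustration:

```python
import numpy as np

# Hypothetical input: 12 quarters of a constant 2% quarterly return
r = np.full(12, 0.02)

# R = product(1 + r)^(4/N) - 1, with N = 12 quarters (3 years)
R_3yr = np.prod(1 + r) ** (4 / len(r)) - 1

# With constant returns this reduces to 1.02**4 - 1 ≈ 0.0824
print(R_3yr)
```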
I wrote code that produces the right results but is very slow because it loops through each row of the dataframe. The code below is an extract covering the 1-year and 3-year annualized returns (I applied the same concept for 5-, 7-, 10-, 15- and 20-year returns). r_qrt is the field with the quarterly returns.
import pandas as pd
import numpy as np

# create dataframe where I append the results
df_final = pd.DataFrame()
columns = ['Date', 'Entity', 'r_qrt', 'R_1yr', 'R_3yr']

# loop through the dataframe
for row in df.itertuples():
    R_1yr = np.nan  # 1-year annualized return
    R_3yr = np.nan  # 3-year annualized return
    # Calculate 1 YR Annualized Return
    date_previous_period = row.Date + pd.DateOffset(years=-1)
    temp_table = df.loc[(df['Date'] > date_previous_period) &
                        (df['Date'] <= row.Date) &
                        (df['Entity'] == row.Entity)]
    if temp_table['r_qrt'].count() >= 4:
        b = (1 + temp_table.r_qrt)[-4:].product()
        R_1yr = b - 1  # exponent 4/4 = 1, so no root needed
    # Calculate 3 YR Annualized Return
    date_previous_period = row.Date + pd.DateOffset(years=-3)
    temp_table = df.loc[(df['Date'] > date_previous_period) &
                        (df['Date'] <= row.Date) &
                        (df['Entity'] == row.Entity)]
    if temp_table['r_qrt'].count() >= 12:
        b = (1 + temp_table.r_qrt)[-12:].product()
        R_3yr = b ** (1 / 3) - 1
    d = [row.Date, row.Entity, row.r_qrt, R_1yr, R_3yr]
    # note: DataFrame.append is deprecated in recent pandas; pd.concat is the replacement
    df_final = df_final.append(pd.Series(d, index=columns), ignore_index=True)
df_final looks as below (only the 1-year return results are shown for space reasons):
Date | Entity | r_qrt | R_1yr |
---|---|---|---|
2015-03-31 | A | 0.035719 | NaN |
2015-06-30 | A | 0.031417 | NaN |
2015-09-30 | A | 0.030872 | NaN |
2015-12-31 | A | 0.029147 | 0.133335 |
2016-03-31 | A | 0.022100 | 0.118432 |
2016-06-30 | A | 0.020329 | 0.106408 |
2016-09-30 | A | 0.017676 | 0.092245 |
2016-12-31 | A | 0.017304 | 0.079676 |
2015-03-31 | B | 0.034705 | NaN |
2015-06-30 | B | 0.037772 | NaN |
2015-09-30 | B | 0.036726 | NaN |
2015-12-31 | B | 0.031889 | 0.148724 |
2016-03-31 | B | 0.029567 | 0.143020 |
2016-06-30 | B | 0.028958 | 0.133312 |
2016-09-30 | B | 0.028890 | 0.124746 |
2016-12-31 | B | 0.030389 | 0.123110 |
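For reference, the 1-year column for entity A can be reproduced directly from the table's r_qrt values with a rolling 4-quarter product; the series below is just entity A's rows re-typed:

```python
import numpy as np
import pandas as pd

# Entity A's quarterly returns, copied from the table above
r_qrt = pd.Series([0.035719, 0.031417, 0.030872, 0.029147,
                   0.022100, 0.020329, 0.017676, 0.017304])

# 1-year annualized return: product of (1 + r) over the last 4 quarters, minus 1
R_1yr = (1 + r_qrt).rolling(4).apply(np.prod, raw=True) - 1

print(R_1yr.round(6))  # fourth value ≈ 0.133335, as in the table
```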
I am sure there is a more efficient way to run the same calculations, but I have not been able to find it. For large dataframes with long time series and many entities, my code takes more than 2 hours.
Thanks
CodePudding user response:
See https://www.investopedia.com/terms/a/annualized-total-return.asp for the definition of annualized return.
data = [3, 7, 5, 12, 1]  # yearly returns in percent

def annualize_rate(data):
    accum = 1
    for item in data:
        accum *= 1 + item / 100
    return pow(accum, 1 / len(data)) - 1

print(annualize_rate(data))
Output:
0.05533402290765199
2015 (entities A and B):
data=[0.133335,0.148724]
print(annualize_rate(data))
Output:
0.001410292043902306
2016 (entities A and B):
data=[0.079676,0.123110]
print(annualize_rate(data))
Output:
0.0010139064424810051
You can store each year's annualized value, then use pct_change to get a 3-year result:
data = [0.05, 0.06, 0.07]
df = pd.DataFrame({'Annualized': data})
df['Percent_Change'] = df['Annualized'].pct_change().fillna(0)
amount = 1
returns_plus_one = df['Percent_Change'] + 1
cumulative_return = returns_plus_one.cumprod()
df['Cumulative'] = cumulative_return.mul(amount)
# store the rolling mean itself; chaining .plot() here would store an Axes object instead
df['2item'] = df['Cumulative'].rolling(window=2).mean()
print(df)
CodePudding user response:
For future reference of other users, this is the new version of the code that I implemented following Golden Lion's suggestion:
def compoundfunct(arr):
    return np.prod(1 + arr) ** (4 / len(arr)) - 1

# 1-yr annualized return
df["R_1Yr"] = df.groupby('Entity')['r_qrt'].rolling(4).apply(compoundfunct).groupby('Entity').shift(0).reset_index().set_index('level_1').drop('Entity', axis=1)

# 3-yr annualized return
df["R_3Yr"] = df.groupby('Entity')['r_qrt'].rolling(12).apply(compoundfunct).groupby('Entity').shift(0).reset_index().set_index('level_1').drop('Entity', axis=1)
The previous code took 36.4 sec on a dataframe of 5,640 rows. The new code is more than 10x faster, taking 2.8 sec.
One caveat of this new code is that rows must be sorted by group (Entity in my case) and date before running the calculations; otherwise the results can be wrong.
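To guard against that ordering pitfall, one option is to sort explicitly before the rolling calculation. The toy frame below (made-up constant returns, deliberately shuffled dates) is only for illustration, and the reset_index(level=0, drop=True) realignment is an alternative to the level_1 chain above:

```python
import numpy as np
import pandas as pd

def compoundfunct(arr):
    return np.prod(1 + arr) ** (4 / len(arr)) - 1

# Toy frame: two entities, rows deliberately out of order, constant 2% returns
df = pd.DataFrame({
    'Entity': ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B'],
    'Date': pd.to_datetime(['2015-06-30', '2015-03-31', '2015-03-31', '2015-06-30',
                            '2015-12-31', '2015-09-30', '2015-09-30', '2015-12-31']),
    'r_qrt': [0.02] * 8,
})

# Sort by Entity and Date so each rolling window covers consecutive quarters
df = df.sort_values(['Entity', 'Date']).reset_index(drop=True)

R = df.groupby('Entity')['r_qrt'].rolling(4).apply(compoundfunct, raw=True)
df['R_1yr'] = R.reset_index(level=0, drop=True)  # drop the Entity level to realign
print(df)
```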
Thanks, S.