Home > Software design >  Calculate Weekly Percent Change in Timeseries data using Pandas for Missing Data in Dataframe
Calculate Weekly Percent Change in Timeseries data using Pandas for Missing Data in Dataframe

Time:08-12

I am having Financial Data,trying to calculate percent change in values between two consecutive thursday. Sometime due to Holiday's on Thursday the weekly data for this week is absent, so I want to calculate that week pct_change from Thursday to Wednesday, as Thursday data is not present in dataframe.

Reproducible Code-

# !pip install investpy
import pandas as pd
import investpy
from datetime import datetime
df = investpy.get_index_historical_data(index="Nifty 50",country="India",from_date=("23/03/2022"),to_date= "23/04/2022")
df['weekday'] = df.index.day_name()
df = df.loc[:, ['Close', 'weekday']]
df.tail(10)

Output-

              Close      weekday
Date        
2022-04-07  17639.55    Thursday
2022-04-08  17784.35    Friday
2022-04-11  17674.95    Monday
2022-04-12  17530.30    Tuesday
2022-04-13  17475.65    Wednesday
2022-04-18  17173.65    Monday
2022-04-19  16958.65    Tuesday
2022-04-20  17136.55    Wednesday
2022-04-21  17392.60    Thursday
2022-04-22  17171.95    Friday

In df.tail(10), 14-Apr-2022 date is missing as it's holiday, so in that case I want to calculate pct_change between Thursday to Wednesday.

Code I used previously to calculate pct_returns

weekly_pct_change = df.loc[df['weekday'] == 'Thursday']
weekly_pct_change['pct_change']= np.log(1 weekly_pct_change['Close'].pct_change())*100
weekly_pct_change

Output-

              Close      weekday    pct_change
Date            
2022-03-24  17222.75    Thursday    NaN
2022-03-31  17464.75    Thursday    1.395338
2022-04-07  17639.55    Thursday    0.995898
2022-04-21  17392.60    Thursday    -1.409871

CodePudding user response:

Onde idea is add missing datetimes by DataFrame.asfreq with method='ffill' and then reassign names of days by DatetimeIndex.day_name:

df1 = df.asfreq('B', method='ffill')
df1['weekday'] = df1.index.day_name()
print (df1.tail(10))
               Close    weekday
Date                           
2022-04-11  17674.95     Monday
2022-04-12  17530.30    Tuesday
2022-04-13  17475.65  Wednesday
2022-04-14  17475.65   Thursday
2022-04-15  17475.65     Friday
2022-04-18  17173.65     Monday
2022-04-19  16958.65    Tuesday
2022-04-20  17136.55  Wednesday
2022-04-21  17392.60   Thursday
2022-04-22  17171.95     Friday

weekly_pct_change = df1.loc[df1['weekday'] == 'Thursday'].copy()
weekly_pct_change['pct_change']= np.log(1 weekly_pct_change['Close'].pct_change())*100
print(weekly_pct_change)
               Close   weekday  pct_change
Date                                      
2022-03-24  17222.75  Thursday         NaN
2022-03-31  17464.75  Thursday    1.395338
2022-04-07  17639.55  Thursday    0.995898
2022-04-14  17475.65  Thursday   -0.933506
2022-04-21  17392.60  Thursday   -0.476366
  • Related