I have a CSV file which I have imported as follows:
import pandas as pd
ps0pyc = pd.read_csv(r'/Users/swapnilgupta/Desktop/fend/p0.csv')
ps0pyc['Date'] = pd.to_datetime(ps0pyc['Date'], dayfirst=True)
ps0pyc
Date PORTVAL
0 2013-01-03 17.133585
1 2013-01-04 17.130434
2 2013-01-07 17.396581
3 2013-01-08 17.308323
4 2013-01-09 17.475933
... ... ...
2262 2021-12-28 205.214555
2263 2021-12-29 204.076193
2264 2021-12-30 203.615507
2265 2021-12-31 201.143990
2266 2022-01-03 204.867302
2267 rows × 2 columns
It is a time series dataframe of stock data with roughly 252 trading days per year, covering 2013 to 2022. I am trying to apply the time series module of PyCaret to it. The only problem I encounter is that PyCaret doesn't support modeling daily data with missing values, and my dataset has only the 252 trading days per year rather than a continuous 365/366 days.
What is an alternative solution, and how should I use data with gaps like this in the PyCaret time series module?
CodePudding user response:
Set the Date column as the index of your dataframe
ps0pyc.set_index('Date', inplace=True)
Create a new continuous daily index for the period covered by the data
new_idx = pd.date_range(ps0pyc.index.min(), ps0pyc.index.max(), freq='D')
Reindex your dataframe to the newly created index
ps0pyc = ps0pyc.reindex(new_idx, fill_value=0)
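Filling with 0 puts artificial zero prices on the non-trading days, so for price data you will usually want to forward fill or back fill instead. Replace the fill_value=0 line with a plain reindex, so the new rows are NaN, and then fill them:
ps0pyc = ps0pyc.reindex(new_idx).ffill()
#or
ps0pyc = ps0pyc.reindex(new_idx).bfill()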
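Putting the steps together, here is a minimal sketch of the reindex-and-forward-fill approach. The five-row dataframe is a hypothetical stand-in for p0.csv (values copied from the head of the question's output), and the names `df`, `full_idx` and `filled` are only for illustration.

```python
import pandas as pd

# Hypothetical sample standing in for p0.csv: trading days only,
# so the weekend of 5-6 January 2013 is missing.
df = pd.DataFrame(
    {
        "Date": ["03/01/2013", "04/01/2013", "07/01/2013",
                 "08/01/2013", "09/01/2013"],
        "PORTVAL": [17.133585, 17.130434, 17.396581, 17.308323, 17.475933],
    }
)
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)
df = df.set_index("Date")

# Gap-free daily index spanning only the observed period.
full_idx = pd.date_range(df.index.min(), df.index.max(), freq="D")

# Reindex without fill_value so the new rows are NaN, then carry the
# last known value forward into the non-trading days.
filled = df.reindex(full_idx).ffill()
filled.index.name = "Date"

print(filled)
```

The result has one row per calendar day and no missing values, which is the shape the question says PyCaret needs. Forward fill is a modeling choice, not the only option: it assumes the portfolio value stays flat over weekends and holidays.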