Home > Enterprise >  back filling and forward filling pandas to datetime quarterlies
back filling and forward filling pandas to datetime quarterlies

Time:10-24

I'm trying trying to populate the quarterly expiration date based on an expiration date. I'm able to identify the quarterly date when it's on the expiration, but similar as to how I did the monthly expire. I want the quarterly date to back fill and forward fill to the next quarterly expire.

I know I can use libraries outside of Pandas, but is there a way to do this in a pandas dataframe?

import pandas as pd
import pandas.tseries.offsets as offsets
import numpy as np
from datetime import date

expire = ['2022-10-24','2022-10-26','2022-10-28','2022-10-31','2022-11-02','2022-11-04','2022-11-07','2022-11-09','2022-11-11','2022-11-14',
'2022-11-16','2022-11-18','2022-11-21','2022-11-23','2022-11-25','2022-12-02','2022-12-16','2022-12-30','2023-01-20','2023-02-17','2023-03-17',
'2023-03-31','2023-06-16','2023-06-30','2023-09-15','2023-09-29','2023-12-15','2024-01-19','2024-06-21','2024-12-20','2025-01-17']

quarterlies = ['2022-12-30','2023-03-31','2023-06-30','2023-09-29','2023-12-29','2024-03-28','2024-06-28','2024-09-30','2024-12-31','2025-03-31',
'2025-06-30','2025-09-30','2025-12-31']

date_df = pd.DataFrame(expire)
date_df.columns = ["expiration_date"]
date_df = date_df.set_index(date_df["expiration_date"])
date_df["expiration_date"] = pd.to_datetime(date_df["expiration_date"], format="%Y-%m-%d")
date_df.index.name = "expiration_date_idx"

all_third_fridays = pd.DataFrame(pd.date_range(date_df["expiration_date"].min(),
                      date_df["expiration_date"].max() pd.tseries.offsets.Day(30),
                      freq="WOM-3FRI"),
                      columns=["monthly_exp"])
              
date_df = pd.merge_asof(date_df, all_third_fridays, left_on="expiration_date", right_on="monthly_exp", direction="forward")

df_quarterly = pd.DataFrame(quarterlies)
df_quarterly.columns = ['quarterly_expire']
date_df['quarterly_expire'] = date_df['expiration_date'].isin(df_quarterly['quarterly_expire'])
date_df['quarterly_exp'] = date_df['expiration_date']   offsets.QuarterEnd() - offsets.Week(weekday=4) #not accurate, so I had to use list
date_df.loc[(date_df['quarterly_expire'] == True), 'quarterly_expiration'] = date_df['expiration_date']
#date_df["quarterly_expiration"].bfill(axis =df_quarterly['quarterly_expiration']) series error

print(date_df)

The bfill would work for all of the options except the ones highlighted toward the bottom where they need to forward fill to the next quarterlies date.

However, I'm getting a the following error and I'm having trouble understanding why? This still doesn't solve the full problem illustrated by the yellow highlights.

ValueError: No axis named quarterly_expiration for object type Series

This is the output that I'm getting:

   expiration_date monthly_exp  quarterly_expire quarterly_exp quarterly_expiration
0       2022-10-24  2022-11-18             False    2022-12-30                  NaT
1       2022-10-26  2022-11-18             False    2022-12-30                  NaT
2       2022-10-28  2022-11-18             False    2022-12-30                  NaT
3       2022-10-31  2022-11-18             False    2022-12-30                  NaT
4       2022-11-02  2022-11-18             False    2022-12-30                  NaT
5       2022-11-04  2022-11-18             False    2022-12-30                  NaT
6       2022-11-07  2022-11-18             False    2022-12-30                  NaT
7       2022-11-09  2022-11-18             False    2022-12-30                  NaT
8       2022-11-11  2022-11-18             False    2022-12-30                  NaT
9       2022-11-14  2022-11-18             False    2022-12-30                  NaT
10      2022-11-16  2022-11-18             False    2022-12-30                  NaT
11      2022-11-18  2022-11-18             False    2022-12-30                  NaT
12      2022-11-21  2022-12-16             False    2022-12-30                  NaT
13      2022-11-23  2022-12-16             False    2022-12-30                  NaT
14      2022-11-25  2022-12-16             False    2022-12-30                  NaT
16      2022-12-16  2022-12-16             False    2022-12-30                  NaT
17      2022-12-30  2023-01-20              True    2022-12-30           2022-12-30
18      2023-01-20  2023-01-20             False    2023-03-24                  NaT
19      2023-02-17  2023-02-17             False    2023-03-24                  NaT
20      2023-03-17  2023-03-17             False    2023-03-24                  NaT
21      2023-03-31  2023-04-21              True    2023-06-23           2023-03-31
22      2023-06-16  2023-06-16             False    2023-06-23                  NaT
23      2023-06-30  2023-07-21              True    2023-09-29           2023-06-30
24      2023-09-15  2023-09-15             False    2023-09-29                  NaT
25      2023-09-29  2023-10-20              True    2023-09-29           2023-09-29
26      2023-12-15  2023-12-15             False    2023-12-29                  NaT
27      2024-01-19  2024-01-19             False    2024-03-29                  NaT
28      2024-06-21  2024-06-21             False    2024-06-28                  NaT
29      2024-12-20  2024-12-20             False    2024-12-27                  NaT
30      2025-01-17  2025-01-17             False    2025-03-28                  NaT

This is the output I'm shooting for:

enter image description here

Perhaps there is a better way to handle this problem that I'm not thinking of. I'm open to suggestions.

CodePudding user response:

BQuarterEnd offset should do what you need.

date_df.assign(quarterly_expiration=lambda x: x['expiration_date']   pd.tseries.offsets.BQuarterEnd())
  • Related