I have a simple definition which finds the third friday of the month. I use this function to populate the dataframe for the third fridays and that part works fine.
The trouble I'm having is finding the third friday for an expiration_date that doesn't fall on a third friday.
This is my code simplified:
import pandas as pd
def is_third_friday(d):
return d.weekday() == 4 and 15 <= d.day <= 21
x = ['09/23/2022','09/26/2022','09/28/2022','09/30/2022','10/3/2022','10/5/2022',
'10/7/2022','10/10/2022','10/12/2022','10/14/2022','10/17/2022','10/19/2022','10/21/2022',
'10/24/2022','10/26/2022','10/28/2022','11/4/2022','11/18/2022','12/16/2022','12/30/2022',
'01/20/2023','03/17/2023','03/31/2023','06/16/2023','06/30/2023','09/15/2023','12/15/2023',
'01/19/2024','06/21/2024','12/20/2024','01/17/2025']
df = pd.DataFrame(x)
df.rename( columns={0 :'expiration_date'}, inplace=True )
df['expiration_date']= pd.to_datetime(df['expiration_date'])
expiration_date = df['expiration_date']
df["is_third_friday"] = [is_third_friday(x) for x in expiration_date]
third_fridays = df.loc[df['is_third_friday'] == True]
df["current_monthly_exp"] = third_fridays['expiration_date'].min()
df["monthly_exp"] = third_fridays[['expiration_date']]
df.to_csv(path_or_buf = f'C:/Data/Date Dataframe.csv',index=False)
What I'm looking for is any expiration_date that is prior to the monthly expire, I want to populate the dataframe with that monthly expire. If it's past the monthly expire date I want to populate the dataframe with the following monthly expire.
I thought I'd be able to use a new dataframe with only the monthly expires as a lookup table and do a timedelta, but when you look at 4/21/2023 and 7/21/2023 these dates don't exist in that dataframe.
This is my current output:
This is the output I'm seeking:
I was thinking I could handle this problem with something like:
date_df["monthly_exp"][0][::-1].expanding().min()[::-1]
But, it wouldn't solve for the 4/21/2023 and 7/21/2023 problem. Additionally, Pandas won't let you do this in a datetime dataframe.
>>> df = pd.DataFrame([1, nan,2,nan,nan,nan,4])
>>> df
0
0 1.0
1 NaN
2 2.0
3 NaN
4 NaN
5 NaN
6 4.0
>>> df["b"] = df[0][::-1].expanding().min()[::-1]
>>> df
0 b
0 1.0 1.0
1 NaN 2.0
2 2.0 2.0
3 NaN 4.0
4 NaN 4.0
5 NaN 4.0
6 4.0 4.0
I've also tried something like the following in many different forms with little luck:
if df['is_third_friday'].any() == True:
df["monthly_exp"] = third_fridays[['expiration_date']]
else:
df["monthly_exp"] = third_fridays[['expiration_date']].shift(third_fridays)
Any suggestions to get me in the right direction would be appreciated. I've been stuck on this problem for sometime.
CodePudding user response:
You could add these additional lines of code (to replace df["monthly_exp"] = third_fridays[['expiration_date']]
:
# DataFrame of fridays from minimum expiration_date to 30 days after last
fri_3s = pd.DataFrame(pd.date_range(df["expiration_date"].min(),
df["expiration_date"].max() pd.tseries.offsets.Day(30),
freq="W-FRI"),
columns=["monthly_exp"])
# only keep those that are between 15th and 21st (as your function did)
fri_3s = fri_3s[fri_3s.monthly_exp.dt.day.between(15, 21)]
# merge_asof to get next third friday
df = pd.merge_asof(df, fri_3s, left_on="expiration_date", right_on="monthly_exp", direction="forward")
This creates a second DataFrame with the 3rd Fridays, and then by merging with merge_asof
returns the next of these from the expiration_date
.
And to simplify your date_df["monthly_exp"][0][::-1].expanding().min()[::-1]
and use it for datetime, you could instead write df["monthly_exp"].bfill()
(which backward fills). As you mentioned, this will only include Fridays that exist in your DataFrame already, so creating a list of the possible Fridays might be the easiest way.