I would like to flag every row whose date falls within 1 week before or 1 week after a specific holiday with 1, and with 0 otherwise.
What's the best way to do so? Below is my code, which only flags New Year's Day itself with hol_new_year = 1. What I want is for all 3 rows to have hol_new_year = 1 (since they all fall within 1 week before or after New Year's Day).
Note: I would like the code to work for any holiday (e.g. Thanksgiving, Easter, etc.).
Thank you!
# Import pandas, NumPy (needed for np.where), and the holidays package
import numpy as np
import pandas as pd
import holidays
# Creating the dataframe
df = pd.DataFrame({'Date': ['1/1/2019', '1/5/2019', '12/28/2018'],
                   'Event': ['Music', 'Poetry', 'Theatre'],
                   'Cost': [10000, 5000, 15000]})
df['newDate'] = pd.to_datetime(df['Date'], format='%m/%d/%Y')
# Custom holiday calendar containing only New Year's Day
new_year = holidays.HolidayBase()
new_year.append({"2018-01-01": "New Year's Day",
                 "2019-01-01": "New Year's Day"})
# Check each date individually; this flags only dates that are exactly a holiday
df['hol_new_year'] = np.where(df['newDate'].apply(lambda d: d in new_year), 1, 0)
CodePudding user response:
You can use pandas' time series offsets:
ye = pd.tseries.offsets.YearEnd()
yb = pd.tseries.offsets.YearBegin()
d = pd.to_timedelta('1w')
s = df['newDate']
df['hol_new_year'] = (s.between(s-ye-d, s-ye+d)
                     |s.between(s+yb-d, s+yb+d)
                     ).astype(int)
Output (row 3 is an extra row, not in the question's data, showing a date outside the window):
         Date    Event   Cost    newDate  hol_new_year
0    1/1/2019    Music  10000 2019-01-01             1
1    1/5/2019   Poetry   5000 2019-01-05             1
2  12/28/2018  Theatre  15000 2018-12-28             1
3   1/15/2021       SO      0 2021-01-15             0
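
The offsets above are specific to New Year's Day. For other holidays (Thanksgiving, Easter, etc.), one possible approach is to compare every row date against an explicit list of holiday dates. The following is only a sketch under assumptions: `flag_near_holiday` is a hypothetical helper name, not part of pandas or the holidays package, and the holiday dates are typed in by hand here. They could probably also be taken from the keys of a populated holidays object such as `new_year` in the question.

```python
import numpy as np
import pandas as pd

def flag_near_holiday(dates, holiday_dates, window="7D"):
    """Hypothetical helper: 1 where a date is within `window` of any holiday, else 0."""
    d = pd.to_datetime(pd.Series(dates)).to_numpy()
    h = pd.to_datetime(list(holiday_dates)).to_numpy()
    # Pairwise absolute distance between every row date and every holiday date
    diff = np.abs(d[:, None] - h[None, :])
    limit = pd.Timedelta(window).to_timedelta64()
    # A row is flagged if at least one holiday lies inside the window (inclusive)
    return (diff <= limit).any(axis=1).astype(int)

df = pd.DataFrame({"Date": ["1/1/2019", "1/5/2019", "12/28/2018", "1/15/2021"]})
df["newDate"] = pd.to_datetime(df["Date"], format="%m/%d/%Y")

# Assumed holiday list for illustration; replace with the dates of any holiday
new_year_days = ["2018-01-01", "2019-01-01", "2021-01-01"]
df["hol_new_year"] = flag_near_holiday(df["newDate"], new_year_days)
print(df)
```

The comparison builds a rows-by-holidays matrix, which is fine for a few hundred holiday dates but may use a lot of memory on very large frames.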