How can I override the BDay offset so that it recognizes Juneteenth? I'm using pandas 1.4.2.
from datetime import date
from pandas.tseries.offsets import BDay
date(2022, 6, 17) + BDay(1)
CodePudding user response:
We need to tell pandas about holidays. BusinessDay cannot handle holidays, so we need to replace it with CustomBusinessDay and tell it which holidays we have.
pandas ships with some holiday calendars. I'm going to assume US federal holidays based on your question:
from datetime import date
import pandas.tseries.holiday
holiday_dates = (pandas.tseries.holiday.USFederalHolidayCalendar()
.holidays(start="2022-01-01", end="2023-01-01"))
# holiday_dates output:
# DatetimeIndex(['2022-01-17', '2022-02-21', '2022-05-30', '2022-06-20',
# '2022-07-04', '2022-09-05', '2022-10-10', '2022-11-11',
# '2022-11-24', '2022-12-26'],
# dtype='datetime64[ns]', freq=None)
So pandas knows some holidays.
If you use pandas 1.4 or later, Juneteenth is included in USFederalHolidayCalendar (it is the '2022-06-20' entry above), so you can pass these dates to CustomBusinessDay directly:
from pandas.tseries.offsets import CustomBusinessDay
date(2022, 6, 17) + CustomBusinessDay(1, holidays=holiday_dates)
# Timestamp('2022-06-21 00:00:00')
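As a variation on the above, CustomBusinessDay also accepts a calendar argument, so the offset can take its holidays from the calendar object itself instead of a precomputed date range. A minimal sketch, assuming pandas 1.4 or later (where the built-in US federal calendar includes Juneteenth); the variable names us_bday and next_day are only illustrative:

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar
from pandas.tseries.offsets import CustomBusinessDay

# Business-day offset that skips weekends and every holiday in the
# US federal calendar (no explicit start/end range needed).
us_bday = CustomBusinessDay(calendar=USFederalHolidayCalendar())

# Friday before the observed Juneteenth holiday (Monday, 2022-06-20).
next_day = pd.Timestamp("2022-06-17") + us_bday
print(next_day)  # 2022-06-21 00:00:00 on pandas 1.4 or later
```

This avoids having to pick a start and end date that cover every date you will ever offset.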
With older Pandas, we can let it know about the holiday like the following.
Juneteenth is June 19th, but in 2022 that date is a Sunday, so the holiday is observed on Monday, June 20th. US federal holidays that fall on a Saturday are observed on the preceding Friday, and those that fall on a Sunday are observed on the following Monday; pandas provides this rule as nearest_workday.
This is the long but proper way to add it as a holiday:
import pandas.tseries.holiday
from pandas.tseries.holiday import nearest_workday
juneteenth = (pandas.tseries.holiday.Holiday("Juneteenth",
month=6, day=19, observance=nearest_workday))
# Holiday: Juneteenth (month=6, day=19, observance=<function nearest_workday at 0x...>)
# Now compute the actual dates the holiday will fall on in 2022
# (can be any date or year range)
juneteenth_dates = juneteenth.dates(date(2022,1,1), date(2023,1,1))
from pandas.tseries.offsets import CustomBusinessDay
# Now compute the offset we wanted
date(2022, 6, 17) + CustomBusinessDay(1, holidays=juneteenth_dates)
# Timestamp('2022-06-21 00:00:00')
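Note that passing only the Juneteenth dates means the other federal holidays are not skipped. One way to get both on an older pandas is a small calendar class that extends the built-in rules. This is a sketch under assumptions rather than a definitive implementation: the class name USFederalWithJuneteenth is hypothetical, the rule uses the nearest_workday observance (Saturday moves to Friday, Sunday moves to Monday, the usual federal convention), and start_date is my assumption for when the holiday first applies.

```python
import pandas as pd
from pandas.tseries.holiday import (
    AbstractHolidayCalendar,
    Holiday,
    USFederalHolidayCalendar,
    nearest_workday,
)
from pandas.tseries.offsets import CustomBusinessDay

# Illustrative rule: June 19th, moved to Friday if it falls on a Saturday
# and to Monday if it falls on a Sunday. start_date is an assumption that
# keeps the rule from applying to years before the holiday existed.
juneteenth_rule = Holiday(
    "Juneteenth",
    month=6,
    day=19,
    start_date="2021-06-18",
    observance=nearest_workday,
)


class USFederalWithJuneteenth(AbstractHolidayCalendar):
    """US federal holidays plus Juneteenth (hypothetical helper class)."""

    # Drop any built-in Juneteenth rule first so newer pandas versions
    # do not end up with the holiday listed twice.
    rules = [
        rule for rule in USFederalHolidayCalendar.rules
        if "Juneteenth" not in rule.name
    ] + [juneteenth_rule]


us_bday = CustomBusinessDay(calendar=USFederalWithJuneteenth())
next_day = pd.Timestamp("2022-06-17") + us_bday
print(next_day)  # 2022-06-21 00:00:00
```

Because the built-in Juneteenth rule is filtered out before the custom one is appended, the same class should behave the same way whether or not the installed pandas already knows the holiday.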