Any suggestions on how to speed up this little code?
The code works fine, it's just too slow.
df['business_day'] = df['date'].apply(lambda x: isbday(x, holidays=holidays.US()))
My goal is a column that flags (True or False) whether a given date is a business day, taking US holidays into account.
CodePudding user response:
You're constructing the holidays.US() instance again and again, once for every row you check. Try constructing it just once, e.g.:
us_holidays = holidays.US()
df['business_day'] = df['date'].apply(lambda x: isbday(x, holidays=us_holidays))
I'm not sure how much of an improvement you'll get time-wise but it'll definitely be more efficient than the way you currently have it.
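To make the "build once, reuse" idea concrete, here is a minimal standard-library sketch. The set US_HOLIDAYS_2011 and the helper is_business_day are hypothetical stand-ins for holidays.US() and isbday, not the real APIs, and the set lists only one holiday for illustration.

```python
from datetime import date

# Hypothetical stand-in for holidays.US(): built ONCE, outside the per-row loop.
# Illustrative subset only; a real calendar would contain every holiday.
US_HOLIDAYS_2011 = {date(2011, 7, 4)}

def is_business_day(d, holiday_set):
    """Return True if d falls on Monday-Friday and is not in holiday_set."""
    return d.weekday() < 5 and d not in holiday_set

# 2011-07-03 through 2011-07-09, the same dates used elsewhere in this thread.
dates = [date(2011, 7, day) for day in range(3, 10)]

# The holiday set is passed in rather than rebuilt on every call.
flags = [is_business_day(d, US_HOLIDAYS_2011) for d in dates]
print(flags)
```

The per-row work here is one weekday check and one set lookup, so the cost of building the calendar is paid a single time.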
CodePudding user response:
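You can use np.is_busday, which checks the whole column in one vectorized call:
import holidays
import numpy as np

# Assuming the date column has dtype datetime64 (required for the .dt accessor)
hol = holidays.US(years=range(df['date'].dt.year.min(), df['date'].dt.year.max() + 1))
df['business_day'] = np.is_busday(df['date'].astype(str).tolist(),
                                  weekmask='1111100', holidays=list(hol.keys()))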
Output:
>>> df
        date  business_day
0 2011-07-03         False
1 2011-07-04         False
2 2011-07-05          True
3 2011-07-06          True
4 2011-07-07          True
5 2011-07-08          True
6 2011-07-09         False
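For anyone who wants to try np.is_busday without pandas or the holidays package, here is a small self-contained sketch. The hard-coded holiday_dates array is an assumption for illustration (a single holiday), not the output of holidays.US().

```python
import numpy as np

# One week of dates as a datetime64[D] array (the end date is exclusive).
dates = np.arange("2011-07-03", "2011-07-10", dtype="datetime64[D]")

# Assumed holiday list for illustration; in practice this would come from
# a holiday calendar covering every year present in the data.
holiday_dates = np.array(["2011-07-04"], dtype="datetime64[D]")

# weekmask runs Monday..Sunday; '1111100' marks Mon-Fri as business days.
flags = np.is_busday(dates, weekmask="1111100", holidays=holiday_dates)
print(flags.tolist())
```

With a timezone-naive pandas datetime column, df['date'].values.astype('datetime64[D]') should give an array in the same form, which would avoid the round trip through strings.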
CodePudding user response:
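Install swifter:
pip install swifter
Then import it and call apply through the swifter accessor, passing the same function you pass to apply now:
import swifter
df['business_day'] = df['date'].swifter.apply(lambda x: isbday(x, holidays=holidays.US()))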