I have the following problem: I would like to calculate the number of business days between two dates. Example:
import numpy as np
import pandas as pd
pokus = {"start_date" : "2022-01-01 10:00:00" , "end_date" : "2022-01-01 17:00:00" }
df = pd.DataFrame(pokus, index=[0])
cas_df["bus_days"] = np.busday_count(pd.to_datetime(df["start_date"]) , pd.to_datetime(df["end_date"]))
This raises a confusing error:
Traceback (most recent call last):
File "/home/vojta/.local/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3251, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-17-8910d714721b>", line 3, in <module>
cas_df["bus_days"] = np.busday_count(pd.to_datetime(df["start_date"]) , pd.to_datetime(df["end_date"]))
File "<__array_function__ internals>", line 180, in busday_count
TypeError: Iterator operand 0 dtype could not be cast from dtype('<M8[ns]') to dtype('<M8[D]') according to the rule 'safe'
How can I fix this, please? Thanks.
CodePudding user response:
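np.busday_count needs inputs that can be safely cast to datetime64[D], but pandas DataFrames and Series store datetimes as datetime64[ns] by default and cannot hold datetime64[D], as explained in this answer. Casting from nanoseconds to days would drop the time of day, which is why NumPy refuses to do it implicitly under the 'safe' rule.
So what we can do is convert the start and end date columns to a NumPy array of type datetime64[D] and then pass these values to np.busday_count: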
# parse the string columns first, then truncate to whole days
days = df[['start_date', 'end_date']].apply(pd.to_datetime).to_numpy().astype('datetime64[D]')
cas_df["bus_days"] = np.busday_count(days[:, 0], days[:, 1])
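For reference, here is a minimal end-to-end sketch of the same idea. It assumes the columns start out as strings, as in the question; the helper name to_days and the extra first row are made up for illustration, and the result is written back to df rather than cas_df.

```python
import numpy as np
import pandas as pd

# Sample data: the second row is the one from the question,
# the first row is a hypothetical Monday-to-Monday range.
df = pd.DataFrame({
    "start_date": ["2022-01-03 10:00:00", "2022-01-01 10:00:00"],
    "end_date": ["2022-01-10 17:00:00", "2022-01-01 17:00:00"],
})

def to_days(col):
    # Hypothetical helper: parse to pandas datetimes, then
    # truncate to day resolution, dropping the time of day.
    return pd.to_datetime(col).to_numpy().astype("datetime64[D]")

df["bus_days"] = np.busday_count(to_days(df["start_date"]),
                                 to_days(df["end_date"]))
print(df)
```

The row from the question comes out as 0, because both timestamps fall on the same day (a Saturday) and the time of day is discarded by the cast.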
CodePudding user response:
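Try converting each column with pd.to_datetime and then casting the underlying NumPy values to datetime64[D]:
start = pd.to_datetime(df["start_date"]).values.astype('datetime64[D]')
end = pd.to_datetime(df["end_date"]).values.astype('datetime64[D]')
cas_df["bus_days"] = np.busday_count(start, end)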
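One thing to keep in mind once this works: np.busday_count includes the start date but excludes the end date. The small sketch below shows that, together with the holidays parameter. The dates are only an illustration, and adding one day to the end date is just one possible way to get an inclusive count.

```python
import numpy as np

start = np.datetime64("2022-01-03")  # a Monday
end = np.datetime64("2022-01-07")    # the Friday of the same week

# The end date itself is not counted: Mon, Tue, Wed, Thu
exclusive = np.busday_count(start, end)

# Shift the end by one day to count the Friday as well
inclusive = np.busday_count(start, end + np.timedelta64(1, "D"))

# Treat Wednesday as a holiday so it is skipped
with_holiday = np.busday_count(start, end, holidays=["2022-01-05"])

print(exclusive, inclusive, with_holiday)
```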