Home > Enterprise >  get difference of two dates using np.busday_count in pandas
get difference of two dates using np.busday_count in pandas

Time:08-28

Let's say i have a dataframe like this:

     date_1         date_2
0  2022-08-01     2022-08-05
1  2022-08-20        NaN
2    NaN             NaN

I want to have another column which tells the difference in business days and have a daframe like this (in case date_2 is empty, it will be compared to today's date (2022-08-28)):

     date_1         date_2          diff
0  2022-08-01     2022-08-05          4
1  2022-08-20        NaN              5
2    NaN             NaN              Empty

I tried to use this one:

df["diff"] = df.apply(
lambda x: np.busday_count(x.date_1, x.date_2) if (x.date_1 != '' and x.date_2 != '') else (np.busday_count(x.date_1, np.datetime64('today')) if (x.date_1 != '' and x.date_2 == '') else ''), axis=1)

but im getting this error:

Iterator operand 0 dtype could not be cast from dtype('<M8[us]') to dtype('<M8[D]') according to the rule 'safe'

Any idea how to get the desired dataframe?

CodePudding user response:

I think you just need to coerce the types:

def busday_diff(x):
    if pd.isna(x.date_1) and pd.isna(x.date_2):
        return ''
    date2_to_use = pd.Timestamp("today") if  pd.isna(x.date_2) else x.date_2
    return np.busday_count(np.datetime64(x.date_1, "D"), np.datetime64(date2_to_use, "D"))  

df["diff"] = df.apply( lambda x: busday_diff(x) , axis=1)

I opened up the lambda for readability also

CodePudding user response:

I want to have another column which tells the difference in days

If you just want days:

df.assign(
    diff=lambda df: (
        pd.to_datetime(df["date_2"]).fillna(pd.Timestamp.now())
        - pd.to_datetime(df["date_1"])
    ).dt.days
)

which outputs

       date_1      date_2  diff
0  2022-08-01  2022-08-05   4.0
1  2022-08-20         NaN   8.0
2         NaN         NaN   NaN

EDIT: it was later clarified that they want business days, so please refer to the other answer instead

  • Related