I am calculating a field like this:
f['days'] = np.busday_count(
    pd.to_datetime(f['Start time']).values.astype('datetime64[D]'),
    pd.to_datetime(f['Stop time']).values.astype('datetime64[D]'))
However, the columns f['Start time'] and f['Stop time'] contain NaT values. I tried f['Start time'] is pd.NaT, but I don't know how to incorporate that check into the code above.
CodePudding user response:
You should read up on indexing.
Here I'm creating a boolean index that is True where neither start nor stop time is missing (NaN, NaT, None, etc.). The .isna() method returns a Series with True for missing values. ~ is the bitwise NOT operator, so it flips each mask to mark valid dates as True. & is the bitwise AND operator, so combining the two masks gives True only where both dates are valid. You can then filter every Series you work on with this index.
valid_dates_index = (~f['Stop time'].isna()) & (~f['Start time'].isna())
f.loc[valid_dates_index, 'days'] = np.busday_count(
    pd.to_datetime(f.loc[valid_dates_index, 'Start time']).values.astype('datetime64[D]'),
    pd.to_datetime(f.loc[valid_dates_index, 'Stop time']).values.astype('datetime64[D]'))

Note the use of f.loc[mask, 'days'] for the assignment: chained indexing like f['days'][mask] = ... may write to a copy and raise a SettingWithCopyWarning.
You could also drop the rows containing NA values with f.dropna(axis=0, how='any').
Btw NumPy offers NaN-aware variants of many reductions (np.nansum, np.nanmean, etc.), but np.busday_count is not one of them, so masking out the NaT rows as above is necessary.
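Putting it all together, here is a minimal runnable sketch of the masking approach. The frame f and its values are made up for illustration; only rows where both dates are present get a business-day count, the rest stay NaN.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one row has a missing stop time, one a missing start time.
f = pd.DataFrame({
    'Start time': ['2023-01-02', '2023-01-09', None],
    'Stop time':  ['2023-01-06', None, '2023-01-20'],
})

# True only where both dates are present.
valid_dates_index = (~f['Stop time'].isna()) & (~f['Start time'].isna())

# Pre-fill with NaN so invalid rows have a well-defined value.
f['days'] = np.nan

# Compute business days only for the valid rows, assigning via .loc.
f.loc[valid_dates_index, 'days'] = np.busday_count(
    pd.to_datetime(f.loc[valid_dates_index, 'Start time']).values.astype('datetime64[D]'),
    pd.to_datetime(f.loc[valid_dates_index, 'Stop time']).values.astype('datetime64[D]'))

print(f)
```

For the first row (Mon 2023-01-02 to Fri 2023-01-06) this yields 4, since np.busday_count counts weekdays in the half-open interval [start, stop); the other two rows remain NaN.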