How to tell if a pandas date time difference is null?

Time:01-04

I need to fill in missing dates in a pandas DataFrame. The DataFrame holds weekly sales data for multiple items. I am looping through each item to check for missing weeks, with the intention of adding a row for each missing week with sales set to 0 and all other information copied down.

I use the following code to find the missing dates:

pd.date_range(start="2017-01-13", end="2022-12-16", freq = "W-SAT").difference(df_['week_date'])
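To illustrate what that call returns, here is a minimal sketch on made-up data. The sample rows and the shortened date range are assumptions for illustration only; the column name week_date follows the question.

```python
import pandas as pd

# Hypothetical sample for a single item: the week of 2017-01-28 has no row.
df_ = pd.DataFrame({
    "week_date": pd.to_datetime(["2017-01-14", "2017-01-21", "2017-02-04"]),
    "sales": [5, 3, 7],
})

# Every Saturday in the (shortened, illustrative) date range.
expected_weeks = pd.date_range(start="2017-01-13", end="2017-02-04", freq="W-SAT")

# Dates that are in the expected calendar but not in the data.
missing_weeks = expected_weeks.difference(df_["week_date"])
print(missing_weeks)
```

The result is itself a DatetimeIndex, so when nothing is missing it is simply an index with zero elements rather than a NaN value.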

While I can print the missing dates and search manually for the few items that are missing sales weeks, I have not found a way to do this programmatically.

I tried

for item in df['ord_base7'].unique():
    df_ = df[df['ord_base7'] == item]
    if pd.date_range(start="2017-01-13", end="2022-12-16", freq = "W-SAT").difference(df_['week_date']).isnan() == True:
        pass
    else:
        print(item, pd.date_range(start="2017-01-13", end="2022-12-16", freq = "W-SAT").difference(df_['week_date']))

That yielded the error:

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_55320/2582723605.py in <module>
      1 for item in df['ord_base7'].unique():
      2     df_ = df[df['ord_base7'] == item]
----> 3     if pd.date_range(start="2017-01-13", end="2022-12-16", freq = "W-SAT").difference(df_['week_date']).isnan() == True:
      4         pass
      5     else:

AttributeError: 'DatetimeIndex' object has no attribute 'isnan'

How can I program a way to see if there are no dates missing so those items can be passed over?

CodePudding user response:

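Looping over a pandas DataFrame is usually inefficient. To replace NaN values, use .fillna() and pass in the value you want in their place. Note that this only fills NaN in rows that already exist; it does not add rows for weeks that are absent from the data: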

df['week_date'].fillna(0)
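A small check of what .fillna() does when a week has no row at all may help here. The sample values below are hypothetical; the point of the sketch is only that the row count does not change.

```python
import pandas as pd

# Hypothetical sample: the 2017-01-21 week is absent, and one sales value is NaN.
df = pd.DataFrame({
    "week_date": pd.to_datetime(["2017-01-14", "2017-01-28"]),
    "sales": [2.0, None],
})

# fillna returns a new object; the original df is left unchanged.
filled = df.fillna({"sales": 0})

print(len(df), len(filled))       # same number of rows before and after
print(filled["sales"].tolist())   # the NaN in the existing row is replaced
```

So .fillna() handles NaN inside existing rows, while detecting weeks with no row still needs a comparison against the expected calendar.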

CodePudding user response:

Never mind, I just tried the following and it worked.

# Build the expected weekly calendar once instead of on every iteration.
expected_weeks = pd.date_range(start="2017-01-13", end="2022-12-16", freq="W-SAT")

for item in df['ord_base7'].unique():
    df_ = df[df['ord_base7'] == item]
    missing_weeks = expected_weeks.difference(df_['week_date'])
    # .empty is True when the index has no elements, i.e. no weeks are missing.
    if not missing_weeks.empty:
        print(item, missing_weeks)

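The .empty attribute is how to do this with a DatetimeIndex: it is True when the index has no elements, which here means no dates are missing for that item.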
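For the original goal of filling the gaps, one possible approach is to reindex each item on the full weekly calendar. This is a sketch under assumptions, not the asker's method: the sample rows, the sales column name, and the shortened date range are made up for illustration, while ord_base7 and week_date follow the question.

```python
import pandas as pd

# Hypothetical sample: item "B" is missing the week of 2017-01-21.
df = pd.DataFrame({
    "ord_base7": ["A", "A", "A", "B", "B"],
    "week_date": pd.to_datetime(
        ["2017-01-14", "2017-01-21", "2017-01-28", "2017-01-14", "2017-01-28"]
    ),
    "sales": [5, 3, 7, 2, 4],
})

expected_weeks = pd.date_range(start="2017-01-13", end="2017-01-28", freq="W-SAT")

filled_parts = []
for item, group in df.groupby("ord_base7"):
    missing = expected_weeks.difference(group["week_date"])
    if missing.empty:
        # Nothing to fill for this item; keep its rows as they are.
        filled_parts.append(group)
        continue
    # Reindex on the full calendar; weeks with no data become NaN rows.
    full = group.set_index("week_date").reindex(expected_weeks)
    full["sales"] = full["sales"].fillna(0).astype("int64")
    full["ord_base7"] = item
    full = full.rename_axis("week_date").reset_index()
    filled_parts.append(full)

filled = pd.concat(filled_parts, ignore_index=True)
print(filled)
```

If the real data has further descriptive columns that should be copied down, calling .ffill() on those columns after the reindex is one option, assuming the first week of each item is present.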
