Checking if all the dates between min and max dates present in a pandas dataframe? [duplicate]

Time:09-23

I have a data frame as follows.

pd.DataFrame({"Date":["09-01-2021","09-02-2021","09-02-2021","09-04-2021"]})

How do I check whether all the dates between the min and max dates are present in the Date column of the data frame?

Here "09-03-2021" is missing, so the expected output is ["09-03-2021"], i.e. a list of all missing dates.

CodePudding user response:

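You can use the date_range function to generate every date between the min and max, then check whether the unique values in your dataframe match the generated range. Convert the column to datetime first; otherwise the strings are compared against timestamps and the check is always False:

import pandas as pd
df = pd.DataFrame({"Date":["09-01-2021","09-02-2021","09-02-2021","09-04-2021"]})
# parse the month-day-year strings so they compare as dates, not text
df['Date'] = pd.to_datetime(df['Date'], format="%m-%d-%Y")
rng = pd.date_range(df['Date'].min(), df['Date'].max())
print(set(rng) == set(df['Date']))
# prints False, because 09-03-2021 is missing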
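
A variation on the same idea, as a minimal sketch: instead of comparing sets, Index.isin can test each date of the generated range against the column. The helper name all_dates_present and the "%m-%d-%Y" format are assumptions based on the sample data, not part of the original answer.

```python
import pandas as pd

def all_dates_present(dates: pd.Series) -> bool:
    """Return True if every calendar day between min and max occurs in `dates`."""
    # Daily range spanning the earliest and latest date in the column
    full_range = pd.date_range(dates.min(), dates.max(), freq="D")
    # isin yields one boolean per day in the range; all() requires every day to be found
    return bool(full_range.isin(dates).all())

df = pd.DataFrame({"Date": ["09-01-2021", "09-02-2021", "09-02-2021", "09-04-2021"]})
# Assumption: the strings are month-day-year, as the question's expected output implies
parsed = pd.to_datetime(df["Date"], format="%m-%d-%Y")
print(all_dates_present(parsed))  # False, since 09-03-2021 is absent
```

Duplicate rows such as the repeated 09-02-2021 do not affect the result, since only membership is tested.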

CodePudding user response:

Question 1

How do I check all the dates between max and min dates in the date column in a data frame?

import pandas as pd
df = pd.DataFrame({"Date":["09-01-2021","09-02-2021","09-02-2021","09-04-2021"]})
df['Date'] = pd.to_datetime(df['Date'])
print(df.loc[(df['Date'] > min(df['Date'])) & (df['Date'] < max(df['Date']))])
        Date
1 2021-09-02
2 2021-09-02

EDIT

Question 2

How do I list all the missing dates?

Reference (from the comments): the related question "check for any missing dates in the index".

print(pd.date_range(start = min(df['Date']), end = max(df['Date'])).difference(df['Date']))

>>> DatetimeIndex(['2021-09-03'], dtype='datetime64[ns]', freq=None)

If the dataframe is already sorted by date, you can use the first and last rows instead of min and max:

print(pd.date_range(start = df.iloc[0]['Date'], end = df.iloc[-1]['Date']).difference(df['Date']))
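
The question asks for a plain list of strings rather than a DatetimeIndex. A minimal sketch of that last step, assuming the dates are month-day-year ("%m-%d-%Y") as the sample data suggests; the helper name missing_dates is made up for illustration:

```python
import pandas as pd

def missing_dates(dates: pd.Series, fmt: str = "%m-%d-%Y") -> list:
    """List the days between min and max that do not occur in `dates`, as strings."""
    # Parse the strings with an explicit format to avoid day/month ambiguity
    parsed = pd.to_datetime(dates, format=fmt)
    full_range = pd.date_range(parsed.min(), parsed.max(), freq="D")
    # Days in the full range that never appear in the column
    missing = full_range.difference(pd.DatetimeIndex(parsed))
    # Format back to the original string layout and return a plain Python list
    return missing.strftime(fmt).tolist()

df = pd.DataFrame({"Date": ["09-01-2021", "09-02-2021", "09-02-2021", "09-04-2021"]})
print(missing_dates(df["Date"]))  # ['09-03-2021']
```

Because the column is parsed inside the function, it can be called on the original string column directly.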