I have a dataframe ordered by date (ascending), and I would like to find the dates that are missing from it:
max_date = df['date'].tail(1)
min_date = df['date'].head(1)
missing_dates = pd.date_range( start= min_date, end= max_date).difference(df.index)
When I pass the dates as variables to pd.date_range, I get the following error:
Cannot convert input [0 2021-07-31 Name: date, dtype: datetime64[ns]] of type <class 'pandas.core.series.Series'> to Timestamp
If I modify the missing_dates line and hard-code the dates:
missing_dates = pd.date_range( start= '2021-07-31', end= '2022-04-19').difference(df.index)
Then the code outputs the missing dates. How can I pass the variable to the pd.date_range?
CodePudding user response:
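df['date'].tail(1) returns a one-element Series rather than a scalar, which is why pd.date_range cannot convert it to a Timestamp. You can call item() on it to get the single value it holds: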
max_date = df['date'].tail(1).item()
min_date = df['date'].head(1).item()
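For reference, here is a minimal end-to-end sketch of the fix. The sample dataframe is made up for illustration, and it assumes the dates live in the 'date' column rather than in the index, so the comparison is done against that column instead of df.index. It also shows .iloc[0] and .iloc[-1] as an alternative way to get scalars.

```python
import pandas as pd

# Hypothetical sample data: daily dates with 2021-08-02 and 2021-08-04 missing.
df = pd.DataFrame(
    {"date": pd.to_datetime(["2021-07-31", "2021-08-01", "2021-08-03", "2021-08-05"])}
)

# head(1)/tail(1) return one-element Series; item() unwraps them to Timestamps.
min_date = df["date"].head(1).item()
max_date = df["date"].tail(1).item()

# Alternative: positional indexing returns the scalar directly.
first_date = df["date"].iloc[0]
last_date = df["date"].iloc[-1]

# Assumption: dates are stored in a column, so compare against that column
# (use df.index instead if the dates are the dataframe's index).
full_range = pd.date_range(start=min_date, end=max_date)
missing_dates = full_range.difference(pd.DatetimeIndex(df["date"]))

print(missing_dates)
```

If the dataframe might not be sorted, df['date'].min() and df['date'].max() also return scalars and do not depend on row order.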