I have a list like this:
dates = [
    datetime.date(2014, 11, 24),
    datetime.date(2014, 11, 25),
    datetime.date(2014, 11, 26),
    # datetime.date(2014, 11, 27),  # This one is missing
    datetime.date(2014, 11, 28),
    datetime.date(2014, 11, 29),
    datetime.date(2014, 11, 30),
    datetime.date(2014, 12, 1)]
I'm trying to find the missing dates between the start and end date, with this expression:
date_set = {dates[0] + timedelta(x) for x in range((dates[-1] - dates[0]).days)}
Strangely enough, it throws an error: it can't access the dates variable. But this expression runs fine:
date_set = {date(2015, 2, 11) + timedelta(x) for x in range((dates[-1] - dates[0]).days)}
I wrote an expression that does what I wanted:
def find_missing_dates(dates: list[date]) -> list[date]:
    """Find the missing dates in a list of dates (that should already be sorted)."""
    date_set = {(first_date + timedelta(x)) for first_date, x in zip([dates[0]] * len(dates), range((dates[-1] - dates[0]).days))}
    missing = sorted(date_set - set(dates))
    return missing
It's an ugly expression and forced me to fill a second list with the same variable. Does anyone have a cleaner expression?
CodePudding user response:
If your dates list is sorted, you just need to iterate over it and add the dates that fall between consecutive entries to a new list. Here is a possible one-line solution, which I already posted in a comment:
from datetime import date, timedelta
dates = [
date(2014, 11, 24), date(2014, 11, 25), date(2014, 11, 26),
date(2014, 11, 28), date(2014, 11, 29), date(2014, 11, 30),
date(2014, 12, 1)
]
missing = [d + timedelta(days=j) for i, d in enumerate(dates[:-1], 1) for j in range(1, (dates[i] - d).days)]
You can do it using regular for loops:
from datetime import date, timedelta
dates = [
date(2014, 11, 24), date(2014, 11, 25), date(2014, 11, 26),
date(2014, 11, 28), date(2014, 11, 29), date(2014, 11, 30),
date(2014, 12, 1)
]
missing = []
# enumerate starts at 1, so next_index always points at the date after current_date
for next_index, current_date in enumerate(dates[:-1], 1):
    for days_diff in range(1, (dates[next_index] - current_date).days):
        missing.append(current_date + timedelta(days=days_diff))
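The same pairwise walk can also be written with zip instead of index arithmetic. This is only a minimal sketch, assuming the list is already sorted in ascending order; the function name missing_between is made up for illustration and does not appear in the question.

```python
from datetime import date, timedelta


def missing_between(sorted_dates):
    """Return the dates that fall strictly between consecutive entries.

    Hypothetical helper; assumes sorted_dates is in ascending order.
    """
    missing = []
    # zip pairs each date with its successor, so no index lookups are needed.
    for current, following in zip(sorted_dates, sorted_dates[1:]):
        gap = (following - current).days
        # range(1, gap) is empty when the two dates are adjacent (gap == 1).
        missing.extend(current + timedelta(days=offset) for offset in range(1, gap))
    return missing


dates = [
    date(2014, 11, 24), date(2014, 11, 25), date(2014, 11, 26),
    date(2014, 11, 28), date(2014, 11, 29), date(2014, 11, 30),
    date(2014, 12, 1),
]
print(missing_between(dates))  # [datetime.date(2014, 11, 27)]
```

Because it only looks at neighbouring pairs, this does work proportional to the number of missing days rather than to the full span checked against the list.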
CodePudding user response:
Something like the code below: find the min and max dates, then loop from min to max and collect every date that is not in the list.
from datetime import timedelta, date
dates = [
    date(2014, 11, 21),
    date(2014, 11, 24),
    date(2014, 11, 25),
    date(2014, 11, 26),
    date(2014, 11, 27),
    date(2014, 11, 28),
    date(2014, 11, 29),
    date(2014, 11, 30),
    date(2014, 12, 1)
]
_min = min(dates)
_max = max(dates)
missing = []
while _min < _max:
    if _min not in dates:
        missing.append(_min)
    # advance one day; plain assignment would replace the date with a timedelta
    _min += timedelta(days=1)
print(missing)
Output:
[datetime.date(2014, 11, 22), datetime.date(2014, 11, 23)]
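For longer lists, the `not in dates` check scans the whole list on every iteration. A possible variation, sketched here under the assumption that the dates are hashable date objects, is to build a set first. The function name missing_dates_via_set is hypothetical.

```python
from datetime import date, timedelta


def missing_dates_via_set(dates):
    """Return every date between min(dates) and max(dates) that is not in dates.

    Hypothetical helper; the input does not need to be sorted.
    """
    if not dates:
        return []
    present = set(dates)  # set membership checks do not scan the whole list
    start, end = min(present), max(present)
    # Walk each day from start up to (but not including) end.
    return [
        start + timedelta(days=offset)
        for offset in range((end - start).days)
        if start + timedelta(days=offset) not in present
    ]


dates = [
    date(2014, 11, 21), date(2014, 11, 24), date(2014, 11, 25),
    date(2014, 11, 26), date(2014, 11, 27), date(2014, 11, 28),
    date(2014, 11, 29), date(2014, 11, 30), date(2014, 12, 1),
]
print(missing_dates_via_set(dates))
# [datetime.date(2014, 11, 22), datetime.date(2014, 11, 23)]
```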