Python - Given a list of dates (as strings), how do we return only those that fall within the last 365 days?

Given the following list of strings:

from datetime import datetime
import numpy as np


strings = ['Nov 1 2021', 'Oct 25 2021', 'Oct 18 2021', 'Oct 11 2021', 'Oct 4 2021', 'Sep 27 2021',
                'Sep 20 2021', 'Aug 24 2021', 'Aug 16 2021', 'Aug 9 2021', 'Aug 2 2021', 'Jul 26 2021',
                'Jun 28 2021', 'Jun 21 2021', 'Jun 14 2021', 'Jun 7 2021', 'May 24 2021', 'May 10 2021',
                'May 3 2021', 'Apr 26 2021', 'Apr 12 2021', 'Apr 12 2021', 'Apr 5 2021', 'Mar 22 2021',
                'Feb 22 2021', 'Feb 13 2021', 'Feb 8 2021', 'Feb 1 2021', 'Nov 2 2020', 'Sep 28 2020',
                'Aug 31 2020', 'Aug 20 2020', 'Aug 10 2020', 'Jun 29 2020', 'Jun 22 2020', 'Jun 15 2020',
                'Mar 2 2020', 'Feb 10 2020', 'Feb 3 2020', 'Jan 27 2020', 'Jan 20 2020', 'Jan 13 2020',
                'Jan 6 2020', 'Aug 26 2019', 'Aug 5 2019', 'Jul 29 2019', 'Jul 22 2019', 'Jul 15 2019']

What's the most efficient way to return a list of those dates that fall within the last 365 days?

Here's my failed attempt:

# Converts strings to datetime format and appends to new list, 'dates.'

dates = []
for item in strings:
    convert_string = datetime.strptime(item, "%b %d %Y").date()
    dates.append(convert_string)


# Given each item in 'dates' list returns corresponding
# list showing elapsed time between each item and today (Nov 11th 2021).

elapsed_time = []
def dateDelta(i):
    today = datetime.today().date()
    date = i
    delta = (today - date).days
    elapsed_time.append(delta)

for i in dates:
    dateDelta(i)


# Nests the 'dates' and 'elapsed_time' lists as two sub-lists in an attempt to somehow connect the two.

date_and_elapsed_time = []
date_and_elapsed_time.append(dates)
date_and_elapsed_time.append(elapsed_time)


# Takes the 'elapsed_time' list and keeps only the elapsed times that fall within the last 365 days.

relevant_elapsed_time_list = []
for i in elapsed_time:
    if i <= 365:
        relevant_elapsed_time_list.append(i)


# Finds the indices of the 'relevant_elapsed_time_list' entries (those within the last 365 days).
# After trawling StackOverflow posts, I imported numpy in an effort to help with indexing.
# My thinking is that I can use the indices of the relevant elapsed times in the
# 'elapsed_time' list to return the corresponding dates from the 'dates' list.

relevant_elapsed_time_list_indices = []
for item in relevant_elapsed_time_list:
    indexes = []
    for index, sub_lst in enumerate(date_and_elapsed_time):
        try:
            indexes.append((index, sub_lst.index(item)))
        except ValueError:
            pass
    relevant_elapsed_time_list_indices.append(indexes)

relevant_elapsed_time_list_indices = np.array([[x[0][0], x[0][1]] for x in relevant_elapsed_time_list_indices])

At this point, I have not been able to map the entries of relevant_elapsed_time_list_indices back to the corresponding indices in the first sub-list of date_and_elapsed_time. The goal is to then isolate the items at those indices (i.e. the dates).

What's the most efficient way to solve this problem?

CodePudding user response:

You can convert the strings to datetime objects using datetime.strptime, then use a conditional list comprehension with timedelta to keep only the ones that fall within the last 365 days:

from datetime import datetime, timedelta

last_365_days = [s for s in strings if datetime.strptime(s, "%b %d %Y") + timedelta(days=365) >= datetime.today()]

Alternatively you can compute the cutoff date in advance:

cutoff = datetime.today() - timedelta(days=365)
last_365_days = [s for s in strings if datetime.strptime(s, "%b %d %Y") >= cutoff]
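
One caveat: datetime.today() includes the current time of day, while the parsed strings sit at midnight, so a date exactly 365 days back would be dropped by the comparison above. If whole-day granularity is what you want, comparing date objects is one option. Below is a minimal sketch of that idea. The function name within_last_days and its today parameter are hypothetical, added only so the reference date can be fixed for testing; it also skips dates in the future, which is an assumption about the desired behaviour.

```python
from datetime import date, datetime, timedelta

def within_last_days(strings, days=365, today=None):
    """Return the strings whose date is at most `days` days before `today`."""
    # `today` is a hypothetical parameter; it defaults to the current date.
    if today is None:
        today = date.today()
    cutoff = today - timedelta(days=days)
    result = []
    for s in strings:
        # "%b" expects English month abbreviations under the default locale.
        d = datetime.strptime(s, "%b %d %Y").date()
        # Keep dates on or after the cutoff; skip dates in the future.
        if cutoff <= d <= today:
            result.append(s)
    return result

sample = ['Nov 1 2021', 'Nov 11 2020', 'Nov 10 2020']
print(within_last_days(sample, today=date(2021, 11, 11)))
# ['Nov 1 2021', 'Nov 11 2020']
```

Each string is parsed once here, and the cutoff is computed once outside the loop.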

The value of last_365_days should then be (with today being Nov 11, 2021, as in the question):

['Nov 1 2021', 'Oct 25 2021', 'Oct 18 2021', 'Oct 11 2021', 'Oct 4 2021',
 'Sep 27 2021', 'Sep 20 2021', 'Aug 24 2021', 'Aug 16 2021', 'Aug 9 2021',
 'Aug 2 2021', 'Jul 26 2021', 'Jun 28 2021', 'Jun 21 2021', 'Jun 14 2021',
 'Jun 7 2021', 'May 24 2021', 'May 10 2021', 'May 3 2021', 'Apr 26 2021',
 'Apr 12 2021', 'Apr 12 2021', 'Apr 5 2021', 'Mar 22 2021', 'Feb 22 2021',
 'Feb 13 2021', 'Feb 8 2021', 'Feb 1 2021']
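
If you also need the elapsed days, as in the attempt from the question, you can keep each string paired with its elapsed-day count instead of looking up indices afterwards. This is only a sketch: the function name elapsed_days is hypothetical, and the reference date is fixed to Nov 11, 2021 (the "today" from the question) so that the output is reproducible.

```python
from datetime import date, datetime

def elapsed_days(strings, today):
    """Pair each date string with the number of days between it and `today`."""
    return [(s, (today - datetime.strptime(s, "%b %d %Y").date()).days)
            for s in strings]

reference = date(2021, 11, 11)  # the "today" mentioned in the question
pairs = elapsed_days(['Nov 1 2021', 'Feb 1 2021', 'Nov 2 2020'], reference)
print(pairs)
# [('Nov 1 2021', 10), ('Feb 1 2021', 283), ('Nov 2 2020', 374)]

# Filter on the paired value; no index bookkeeping or numpy is needed.
recent = [s for s, days in pairs if 0 <= days <= 365]
print(recent)
# ['Nov 1 2021', 'Feb 1 2021']
```

Because each string travels with its own elapsed-day value, there is nothing to map back to the original list.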