Given the following list of strings:
from datetime import datetime
import numpy as np
strings = ['Nov 1 2021', 'Oct 25 2021', 'Oct 18 2021', 'Oct 11 2021', 'Oct 4 2021', 'Sep 27 2021',
'Sep 20 2021', 'Aug 24 2021', 'Aug 16 2021', 'Aug 9 2021', 'Aug 2 2021', 'Jul 26 2021',
'Jun 28 2021', 'Jun 21 2021', 'Jun 14 2021', 'Jun 7 2021', 'May 24 2021', 'May 10 2021',
'May 3 2021', 'Apr 26 2021', 'Apr 12 2021', 'Apr 12 2021', 'Apr 5 2021', 'Mar 22 2021',
'Feb 22 2021', 'Feb 13 2021', 'Feb 8 2021', 'Feb 1 2021', 'Nov 2 2020', 'Sep 28 2020',
'Aug 31 2020', 'Aug 20 2020', 'Aug 10 2020', 'Jun 29 2020', 'Jun 22 2020', 'Jun 15 2020',
'Mar 2 2020', 'Feb 10 2020', 'Feb 3 2020', 'Jan 27 2020', 'Jan 20 2020', 'Jan 13 2020',
'Jan 6 2020', 'Aug 26 2019', 'Aug 5 2019', 'Jul 29 2019', 'Jul 22 2019', 'Jul 15 2019']
What's the most efficient way to return a list of those dates that fall within the last 365 days?
Here's my failed attempt:
# Converts strings to datetime format and appends to new list, 'dates.'
dates = []
for item in strings:
    convert_string = datetime.strptime(item, "%b %d %Y").date()
    dates.append(convert_string)
# Given each item in 'dates' list returns corresponding
# list showing elapsed time between each item and today (Nov 11th 2021).
elapsed_time = []
def dateDelta(i):
    today = datetime.fromisoformat(datetime.today().strftime('%Y-%m-%d')).date()
    date = i
    delta = (today - date).days
    elapsed_time.append(delta)
for i in dates:
    dateDelta(i)
# Concatenates 'dates' list and 'elapsed_times_list' in attempt to somehow connect the two.
date_and_elapsed_time = []
date_and_elapsed_time.append(dates)
date_and_elapsed_time.append(elapsed_time)
# Takes 'elapsed_time list' appends only the dates that fall within the last 365 days.
relevant_elapsed_time_list = []
for i in elapsed_time:
    if i <= 365:
        relevant_elapsed_time_list.append(i)
# Finds indices of 'relevant_elapsed_time_list' within last 365 days.
# After trawling StackOverflow posts, I import numpy in an effort to help with indexing.
# My thinking is I can use the indices of the relevant elapsed times from the
# 'elapsed_time_list' and return the corresponding date from the 'dates' list.
relevant_elapsed_time_list_indices = []
for item in relevant_elapsed_time_list:
    indexes = []
    for index, sub_lst in enumerate(date_and_elapsed_time):
        try:
            indexes.append((index, sub_lst.index(item)))
        except ValueError:
            pass
    relevant_elapsed_time_list_indices.append(indexes)
relevant_elapsed_time_list_indices = np.array([[x[0][0], x[0][1]] for x in relevant_elapsed_time_list_indices])
At this point, I'm as yet unable to convert the relevant_elapsed_time_list_indices
list to the corresponding indices for the first sub-list in date_and_elapsed_time.
The point of this would be to then isolate those indices (i.e. dates).
What's the most efficient way to solve this problem?
CodePudding user response:
You can convert the strings to datetime objects using .strptime, then use a
conditional list comprehension that uses timedelta to pick the ones that fall
within the last 365 days:
from datetime import datetime, timedelta
last_365_days = [s for s in strings if datetime.strptime(s, "%b %d %Y") + timedelta(days=365) >= datetime.today()]
Alternatively you can compute the cutoff date in advance:
cutoff = datetime.today() - timedelta(days=365)
last_365_days = [s for s in strings if datetime.strptime(s, "%b %d %Y") >= cutoff]
The value of last_365_days should then be (for today):
['Nov 1 2021', 'Oct 25 2021', 'Oct 18 2021', 'Oct 11 2021', 'Oct 4 2021',
'Sep 27 2021', 'Sep 20 2021', 'Aug 24 2021', 'Aug 16 2021', 'Aug 9 2021',
'Aug 2 2021', 'Jul 26 2021', 'Jun 28 2021', 'Jun 21 2021', 'Jun 14 2021',
'Jun 7 2021', 'May 24 2021', 'May 10 2021', 'May 3 2021', 'Apr 26 2021',
'Apr 12 2021', 'Apr 12 2021', 'Apr 5 2021', 'Mar 22 2021', 'Feb 22 2021',
'Feb 13 2021', 'Feb 8 2021', 'Feb 1 2021']
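If you want the result to be reproducible regardless of when the code runs, one option is to wrap the cutoff approach in a small function that takes the reference date as a parameter. The sketch below is only an illustration: the name within_last_days and its days and today parameters are made up for this example, it compares date objects rather than datetime objects so the time of day does not matter, and it also assumes you want to exclude dates after the reference date.

```python
from datetime import date, datetime, timedelta

def within_last_days(strings, days=365, today=None):
    """Return the strings whose parsed date lies within `days` days before `today`.

    Hypothetical helper for illustration; `today` defaults to the current date.
    """
    if today is None:
        today = date.today()
    cutoff = today - timedelta(days=days)
    # Keep the original string, but compare on the parsed date.
    # Assumption: dates later than `today` are not wanted, hence the upper bound.
    return [
        s for s in strings
        if cutoff <= datetime.strptime(s, "%b %d %Y").date() <= today
    ]

# Small sample taken from the list in the question, with the reference date
# fixed to Nov 11 2021 (the "today" mentioned in the question).
sample = ['Nov 1 2021', 'Feb 1 2021', 'Nov 2 2020', 'Jul 15 2019']
recent = within_last_days(sample, today=date(2021, 11, 11))
print(recent)  # ['Nov 1 2021', 'Feb 1 2021']
```

Calling within_last_days(strings) with no today argument should behave like the cutoff version above, except that a date exactly 365 days old is kept whatever the current time of day is.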