I have two dictionaries whose values are lists of dates. I would like to match the keys of the two dictionaries and then compare their lists of dates to determine the total number of days between the dates. However, I only want to count a pair when the date in the first dictionary is after the date in the second dictionary.
For example, if I have dictionary 1, dict1, and dictionary 2, dict2, as follows:
dict1 = {... , a : ['2022-07-27'] , ...}
dict2 = {... , a : ['2022-07-21'] , ...}
Then first, the keys a in both dict1 and dict2 match, and I then subtract the date in dict2 from the date in dict1 to get 6 days. This example is easy for me to code up; the problem I have is when the lists of dates become longer and more complex. For example:
dict1 = {... , b : ['2021-09-14', '2022-08-08'], ...}
dict2 = {... , b : ['2022-08-01'], ...}
Now, since the first date in dict1 is before the date in dict2, I don't want to subtract it. However, the second date in dict1 is after the date in dict2, so I do want to subtract it to determine the number of days in between, i.e. (2022-08-08) - (2022-08-01) = 7 days.
Another example is as follows:
dict1 = {... , c : ['2021-07-28', '2022-07-07', '2022-09-17'], ...}
dict2 = {... , c : ['2022-05-01', '2022-07-27'], ...}
As in the previous example, since the first date in dict1 is before the first date in dict2, I don't want to subtract it. However, since the second date in dict1 is after the first date in dict2, I do want to subtract it to determine the number of days in between, i.e. (2022-07-07) - (2022-05-01) = x number of days. And since the third date in dict1 is after the second date in dict2, I also want to subtract it to determine the number of days in between, i.e. (2022-09-17) - (2022-07-27) = y number of days. Since I now have two values, x and y, I want to add them together to get the total, i.e. x + y = total number of days.
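To make the expected numbers concrete, here is a minimal sketch of the subtractions in the three examples above. It assumes Python's standard datetime module; the helper name days_between is made up for illustration and is not part of any library.

```python
from datetime import date

def days_between(later, earlier):
    """Days from `earlier` to `later`, both 'YYYY-MM-DD' strings (hypothetical helper)."""
    return (date.fromisoformat(later) - date.fromisoformat(earlier)).days

# Example a: one date in each list
print(days_between("2022-07-27", "2022-07-21"))  # 6

# Example b: only the second dict1 date is after the dict2 date
print(days_between("2022-08-08", "2022-08-01"))  # 7

# Example c: two qualifying pairs, summed
x = days_between("2022-07-07", "2022-05-01")
y = days_between("2022-09-17", "2022-07-27")
print(x, y, x + y)  # 67 52 119
```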
Is there a computationally light way of doing this? Thank you!
CodePudding user response:
This is one way to do what you want.
This answer assumes that all of the dates are in chronological order.
import datetime

def convert_to_time_obj(str_date):
    return datetime.datetime.strptime(str_date, "%Y-%m-%d")

# get common keys between both dicts
dict1_keys = set(dict1.keys())
dict2_keys = set(dict2.keys())
common_keys = dict1_keys.intersection(dict2_keys)

# compare dict elements
for key in common_keys:  # go through all common keys
    total_days = 0  # used as a sum for case #3
    multiple_dates_used = False  # triggers the printing of total_days
    last_index = 0  # dates are assumed to be ordered earliest to latest, so dict1 dates already examined are not revisited
    for start_value in dict2[key]:
        start_date = convert_to_time_obj(start_value)
        # enumerate from last_index so that index is a position in the full dict1[key] list, not in the slice
        for index, compare_value in enumerate(dict1[key][last_index:], start=last_index):
            compare_date = convert_to_time_obj(compare_value)
            if compare_date <= start_date:
                continue  # skip over these dates
            days_between = compare_date - start_date
            print(days_between)  # prints the days between only 2 dates
            multiple_dates_used = bool(total_days)  # True once an earlier pair has already contributed, i.e. 2 or more pairs
            total_days += days_between.days
            last_index = index + 1
            break  # as only using the first set of dates for in between days
    if multiple_dates_used:
        print(total_days)  # prints all of the days as in case 3
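If the chronological-order assumption is not guaranteed for your data, one option is to sort each list first. The following is only a sketch: the function name sorted_date_lists and the sample input are hypothetical, and it relies on the dates being zero-padded 'YYYY-MM-DD' strings, which sort chronologically as plain strings.

```python
def sorted_date_lists(d):
    """Return a copy of `d` with each list of 'YYYY-MM-DD' strings sorted ascending."""
    # Zero-padded ISO dates compare chronologically as plain strings,
    # so no datetime parsing is needed for this step.
    return {key: sorted(values) for key, values in d.items()}

# Hypothetical unsorted input
example = {"c": ["2022-09-17", "2021-07-28", "2022-07-07"]}
print(sorted_date_lists(example))
# {'c': ['2021-07-28', '2022-07-07', '2022-09-17']}
```

Both dictionaries would be passed through this step before running the comparison.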
Let me know if this helps and anything I may have missed with what you were describing.
CodePudding user response:
Here's an alternative (not so different from Andrew's solution, though) that uses stacks instead of index book-keeping. I find that more readable. (The assumption here, too, is that the date lists are in ascending order.)
from datetime import datetime as dt

dict1 = ...
dict2 = ...

def to_date_stack(strings):
    # reversed, so that pop() returns the earliest remaining date
    return [dt.strptime(string, "%Y-%m-%d") for string in reversed(strings)]

res = {}
for key in dict1.keys() & dict2.keys():
    res[key] = 0
    stack1, stack2 = map(to_date_stack, (dict1[key], dict2[key]))
    while stack2 and stack1:
        date2 = stack2.pop()
        while stack1:
            # discard dict1 dates until one is after the current dict2 date
            if (date1 := stack1.pop()) > date2:
                res[key] += (date1 - date2).days
                break
print(res)
The to_date_stack function converts the list items to datetimes, in reversed order. Once that is done, the processing can simply be done by popping items from the stacks. I've done some testing and found no difference in performance.
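For a version that can be run end to end, here is a sketch of the same matching rule applied to the sample data from the question. It uses a single shared iterator over the dict1 dates rather than stacks or index book-keeping, so it is a variant and not the code from either answer. The function name total_days_after and the quoted keys "a", "b" and "c" are assumptions for illustration, and both lists are assumed to be in ascending order.

```python
from datetime import date

def total_days_after(later_dates, earlier_dates):
    """Sum the day gaps between each date in `earlier_dates` and the first
    not-yet-consumed date in `later_dates` that falls after it."""
    # One shared generator: every dict1 date is looked at once at most.
    later = (date.fromisoformat(s) for s in later_dates)
    total = 0
    for start_str in earlier_dates:
        start = date.fromisoformat(start_str)
        for candidate in later:
            if candidate > start:
                total += (candidate - start).days
                break
        else:
            # No break happened, so the later dates are exhausted.
            break
    return total

# Sample data from the question (keys quoted here as an assumption)
dict1 = {
    "a": ["2022-07-27"],
    "b": ["2021-09-14", "2022-08-08"],
    "c": ["2021-07-28", "2022-07-07", "2022-09-17"],
}
dict2 = {
    "a": ["2022-07-21"],
    "b": ["2022-08-01"],
    "c": ["2022-05-01", "2022-07-27"],
}

totals = {
    key: total_days_after(dict1[key], dict2[key])
    for key in sorted(dict1.keys() & dict2.keys())
}
print(totals)  # {'a': 6, 'b': 7, 'c': 119}
```

Each date string is parsed once and each list is traversed once, so the work per key grows linearly with the combined length of the two lists.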