Comparing Dates in Lists between two Dictionaries

I have two dictionaries whose values are lists of dates. What I would like to do is match the keys of the two dictionaries, and then compare the lists of dates between them in order to determine the total number of days between the dates. However, I only want to count a pair of dates on the condition that the date in the first dictionary is after the date in the second dictionary.

For example, if I have dictionary 1 dict1 and dictionary 2 dict2 as follows:

dict1 = {... , a : ['2022-07-27'] , ...}
dict2 = {... , a : ['2022-07-21'] , ...}

Then first, the key a matches in both dict1 and dict2, and from the date lists I then subtract the date in dict2 from the date in dict1 to get 6 days. This example is easy for me to code up; the problem I have is when the lists of dates become longer and more complex. For example:

dict1 = {... , b : ['2021-09-14', '2022-08-08'], ...}
dict2 = {... , b : ['2022-08-01'], ...}

Now, since the first date in dict1 is before the date in dict2, I don't want to subtract it. However, the second date in dict1 is after the date in dict2, so I do want to subtract it to determine the number of days in between, i.e. (2022-08-08) - (2022-08-01) = 7 days.

Another example is as follows:

dict1 = {... , c : ['2021-07-28', '2022-07-07', '2022-09-17'], ...}
dict2 = {... , c : ['2022-05-01', '2022-07-27'], ...}

Same as the previous example, since the first date in dict1 is before the first date in dict2, I don't want to subtract it. However, since the second date in dict1 is after the first date in dict2, I do want to subtract it to determine the number of days in between, i.e. (2022-07-07) - (2022-05-01) = x number of days. And, since the third date in dict1 is after the second date in dict2, I also want to subtract it to determine the number of days in between, i.e. (2022-09-17) - (2022-07-27) = y number of days. Since I now have two values x and y, I want to add them together to get the total, i.e. x + y = total number of days.
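To make the expected result for key c concrete, here is a quick check of the two differences using Python's standard datetime module. It is only a sketch of the arithmetic, and the names x, y and total simply mirror the text above:

```python
from datetime import date

# Dates taken from the example for key c above
x = (date.fromisoformat("2022-07-07") - date.fromisoformat("2022-05-01")).days
y = (date.fromisoformat("2022-09-17") - date.fromisoformat("2022-07-27")).days
total = x + y

print(x, y, total)  # 67 52 119
```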

Is there a computationally light way of doing this? Thank you!

CodePudding user response：

This is one way to do what you want.

This answer assumes that the dates in each list are in chronological order:

import datetime

def convert_to_time_obj(str_date):
    return datetime.datetime.strptime(str_date, "%Y-%m-%d")
    
# get common keys between both dicts
dict1_keys = set(dict1.keys())
dict2_keys = set(dict2.keys())
common_keys = dict1_keys.intersection(dict2_keys)

# compare dict elements
for key in common_keys: # go through all common keys
    total_days = 0 # used as a sum for case #3
    multiple_dates_used = False # triggers the printing of total_days
    last_index = 0 # assuming that all of the dates are in order from earliest to latest. to save time don't go through previously looked at values
    for start_value in dict2[key]:
        start_date = convert_to_time_obj(start_value)
        
        for index, compare_value in enumerate(dict1[key][last_index:]): # grab the index to selectively not go through those element again based on the dates in dict2 having to be larger
            compare_date = convert_to_time_obj(compare_value)
            if compare_date <= start_date:
                continue # skip over these dates
            days_between = compare_date - start_date
            print(days_between) # prints the days between only 2 dates
            
            multiple_dates_used = bool(total_days) # True once an earlier pair has already been counted, i.e. two or more pairs contribute to the total
            total_days += days_between.days
            last_index += index + 1 # index is relative to the slice above, so advance from the previous offset
            break # as only using the first set of dates for in between days
    if multiple_dates_used:
        print(total_days) # prints all of the days as in case 3

Let me know if this helps and anything I may have missed with what you were describing.
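As a rough sketch, the same idea can be packaged as a function that returns the total instead of printing it. The helper name total_days_after and the sample dictionaries (built from the three examples in the question) are only illustrative, and the lists are still assumed to be sorted in ascending order:

```python
from datetime import datetime

def total_days_after(later_dates, earlier_dates):
    """Sum the gaps between each earlier date and the first unused later date.

    Hypothetical helper: both lists are assumed to be sorted ascending
    and formatted as YYYY-MM-DD.
    """
    later = [datetime.strptime(d, "%Y-%m-%d") for d in later_dates]
    total = 0
    i = 0  # position in `later`; it never moves backwards
    for value in earlier_dates:
        start = datetime.strptime(value, "%Y-%m-%d")
        # skip dates that are not strictly after the start date
        while i < len(later) and later[i] <= start:
            i += 1
        if i == len(later):
            break  # no later dates left to pair with
        total += (later[i] - start).days
        i += 1  # each later date is paired at most once
    return total

# Sample data taken from the three examples in the question
dict1 = {
    "a": ["2022-07-27"],
    "b": ["2021-09-14", "2022-08-08"],
    "c": ["2021-07-28", "2022-07-07", "2022-09-17"],
}
dict2 = {
    "a": ["2022-07-21"],
    "b": ["2022-08-01"],
    "c": ["2022-05-01", "2022-07-27"],
}

totals = {
    key: total_days_after(dict1[key], dict2[key])
    for key in dict1.keys() & dict2.keys()
}
print(sorted(totals.items()))  # [('a', 6), ('b', 7), ('c', 119)]
```

Because the position i only ever moves forward, each list is walked once per key.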

CodePudding user response：

Here's an alternative (not so different from Andrew's solution, though), which uses stacks instead of index book-keeping. I find that more readable. (The assumption here, too, is that the date lists are in ascending order.)

from datetime import datetime as dt

dict1 = ...
dict2 = ...

def to_date_stack(strings):
    return [dt.strptime(string, "%Y-%m-%d") for string in reversed(strings)]

res = {}
for key in dict1.keys() & dict2.keys():
    res[key] = 0
    stack1, stack2 = map(to_date_stack, (dict1[key], dict2[key]))
    while stack2 and stack1:
        date2 = stack2.pop()
        while stack1:
            if (date1 := stack1.pop()) > date2:
                res[key] += (date1 - date2).days
                break
print(res)

The to_date_stack function converts the list items to datetimes, in reversed order. Once that is done, the processing can simply be done by popping items from the stacks. I've done some testing and found no difference in performance.
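If the lists are not guaranteed to be in ascending order, one option is to sort while building the stacks. This works on the raw strings because YYYY-MM-DD dates sort the same way lexicographically as chronologically. The following is only a sketch under that assumption; to_sorted_date_stack is a hypothetical variant of to_date_stack, and the unsorted input reuses the dates from key c in the question:

```python
from datetime import datetime as dt

def to_sorted_date_stack(strings):
    # Sorting in descending order puts the earliest date last,
    # so pop() yields the dates from earliest to latest.
    return [dt.strptime(s, "%Y-%m-%d") for s in sorted(strings, reverse=True)]

# Hypothetical unsorted input, reusing the dates from key c in the question
dict1 = {"c": ["2022-09-17", "2021-07-28", "2022-07-07"]}
dict2 = {"c": ["2022-07-27", "2022-05-01"]}

res = {}
for key in dict1.keys() & dict2.keys():
    res[key] = 0
    stack1, stack2 = map(to_sorted_date_stack, (dict1[key], dict2[key]))
    while stack2 and stack1:
        date2 = stack2.pop()
        while stack1:
            if (date1 := stack1.pop()) > date2:
                res[key] += (date1 - date2).days
                break
print(res)  # {'c': 119}
```

The sort adds an O(n log n) step per list, so it is only worth doing when the order really is uncertain.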