Most efficient way to check if timestamp in list exists between two other timestamps?

Time:09-08

I am trying to search a list of datetimes to check whether there is a timestamp C that falls between timestamps A and B. I found the bisect module in the standard library but am not sure how to apply it here with datetime types.

My setup is similar to this:

insulin_timestamps = ['2017/07/01 13:23:42', '2017/11/01 00:56:40', '2018/02/18 22:01:09']

start_time = '2017/10/31 22:00:11'
end_time = '2017/11/01 01:59:40'

I want to check if a timestamp in the list exists between my two variable times. I am doing this multiple times in a loop. The only solution I can come up with is another loop within it that checks the whole list. I would rather not do that for efficiency purposes as it is a lot of data. Any tips?

CodePudding user response:

Since your datetime values are in zero-padded Y/m/d H:i:s format (most significant field first), plain string comparison orders them chronologically. You can therefore use the bisect module (assuming insulin_timestamps is sorted) to efficiently check whether insulin_timestamps contains a timestamp between start_time and end_time, inclusive:

from bisect import bisect_left, bisect_right

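# index of the first timestamp >= start_time
start_idx = bisect_left(insulin_timestamps, start_time)
# bisect_right only needs to search from start_idx onwards
present = start_idx < bisect_right(insulin_timestamps, end_time, start_idx)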
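
The same comparison works on real datetime objects, since they also compare chronologically. The following is a minimal sketch, assuming the strings are parsed with datetime.strptime and the list is already sorted; the helper name any_between and the FMT constant are illustrative and not part of the question:

```python
from bisect import bisect_left, bisect_right
from datetime import datetime

FMT = "%Y/%m/%d %H:%M:%S"  # assumed format, matching the strings in the question


def any_between(sorted_times, start, end):
    """Return True if some t in sorted_times satisfies start <= t <= end.

    Hypothetical helper; sorted_times must be sorted in ascending order.
    """
    lo = bisect_left(sorted_times, start)     # first index with t >= start
    hi = bisect_right(sorted_times, end, lo)  # first index with t > end
    return lo < hi


insulin_timestamps = [
    datetime.strptime(s, FMT)
    for s in ["2017/07/01 13:23:42", "2017/11/01 00:56:40", "2018/02/18 22:01:09"]
]
start_time = datetime.strptime("2017/10/31 22:00:11", FMT)
end_time = datetime.strptime("2017/11/01 01:59:40", FMT)

print(any_between(insulin_timestamps, start_time, end_time))  # True
```

Parsing once up front keeps each lookup at two binary searches, and it also guards against strings that are not zero-padded, which would break the string-comparison shortcut.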

Note that if your outer loop iterates over start and end times that are themselves sorted, it might be more efficient to step through the timestamps list at the same time, rather than running a fresh search for every pair.
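
A rough sketch of that simultaneous iteration, assuming both the timestamps and the (start, end) windows are sorted (the windows by start time). The function name windows_with_hits and the sample windows are made up for illustration:

```python
def windows_with_hits(sorted_times, sorted_windows):
    """For each (start, end) window, report whether some t has start <= t <= end.

    Assumes sorted_times is ascending and sorted_windows is ascending by start.
    """
    results = []
    i = 0  # index of the first timestamp that is not before the current start
    n = len(sorted_times)
    for start, end in sorted_windows:
        # Starts never decrease, so i only ever moves forward.
        while i < n and sorted_times[i] < start:
            i += 1
        results.append(i < n and sorted_times[i] <= end)
    return results


times = ["2017/07/01 13:23:42", "2017/11/01 00:56:40", "2018/02/18 22:01:09"]
windows = [
    ("2017/08/01 00:00:00", "2017/08/02 00:00:00"),
    ("2017/10/31 22:00:11", "2017/11/01 01:59:40"),
]
print(windows_with_hits(times, windows))  # [False, True]
```

This makes a single pass over both lists, so the total work is linear in their combined length. If the windows arrive in arbitrary order, the bisect approach is likely the simpler choice.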

CodePudding user response:

If you just want to check whether a timestamp exists between start_time and end_time, and the list is sorted, then you only need to look at the first timestamp at or after start_time, which is the element at the index bisect_left returns (assuming that index is still inside the list). If that timestamp is less than end_time, then the result is true. This adds a few checks to the code but roughly halves the search work by removing the need for bisect_right. The code would appear as follows:

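from bisect import bisect_left

# index of the first timestamp >= start_time
left_index = bisect_left(insulin_timestamps, start_time)
present = (
    len(insulin_timestamps) != 0  # if the list is empty the result is False
    and left_index != len(insulin_timestamps)  # some timestamp is >= start_time
    and insulin_timestamps[left_index] < end_time  # use <= to include end_time itself
)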

Here left_index != len(insulin_timestamps) checks that start_time is not greater than every element in the list: if it is, then the result is False.
