I'm working on a script that checks if time slices overlap or not.
I have a handler function that looks like this:
def intersection_checker(foo, bar):
    if foo == bar:
        return True
    if foo[0] == bar[1] or foo[1] == bar[0]:
        return True
    if bar[0] < (foo[0] or foo[1]) < bar[1]:
        return True
    if foo[0] < (bar[0] or bar[1]) < foo[1]:
        return True
    return False
Object foo is a tuple of two datetime.time() objects:
foo = (datetime.strptime('06:30:00','%H:%M:%S').time(), datetime.strptime('08:15:00','%H:%M:%S').time())
Object bar is a set() of foo-like tuples of datetime.time() objects. That set can contain up to 200k of those tuples.
The line that calls the handler (intersection_checker) looks like this:
...
if len(bar) > 1 and True in set(map(intersection_checker, repeat(foo), bar)):
...
This code works. The problem is that it takes forever to process such a large amount of data. I tried using a for loop to call the function on each element, but that didn't work as well as the built-in map. Perhaps there is a way to pass and process large amounts of data more efficiently? Or to check intersections in a different way? And yes, it is enough to find only the first True value; it is not necessary to loop through the entire bar.
CodePudding user response:
You'll likely get a boost in performance by replacing this:
True in set(map(intersection_checker, repeat(foo), bar))
with this:
any(map(intersection_checker, repeat(foo), bar))
By converting to a set first, you're forcing the entire dataset to be mapped before it can determine if any of the values are True. Using any() will stop the map iterator as soon as a True value is found.
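To make the difference concrete, here is a minimal sketch that counts how many times the checker is called under each approach. The helper make_counter and the integer stand-ins for the time tuples are made up for illustration, and bar is a list here (the question uses a set) so the order is predictable.

```python
from itertools import repeat


def make_counter():
    """Return a checker plus the list that records every call made to it."""
    calls = []

    def checker(foo, bar):
        calls.append(bar)
        return foo == bar  # stand-in for the real overlap test

    return checker, calls


foo = 3
bar = list(range(1000))  # hypothetical data; a list keeps the order fixed

# Building a set consumes the whole map before `in` is evaluated.
checker, set_calls = make_counter()
found_with_set = True in set(map(checker, repeat(foo), bar))

# any() stops pulling from the map at the first True.
checker, any_calls = make_counter()
found_with_any = any(map(checker, repeat(foo), bar))

print(found_with_set, len(set_calls))  # True 1000
print(found_with_any, len(any_calls))  # True 4
```

How much this saves depends on where the first match sits in bar. If there is no match at all, any() still has to visit every element.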
CodePudding user response:
Time objects support comparisons, and therefore you can do:
def overlap(t1, t2):
    # Check if two time ranges overlap.
    # Pass two tuples, each holding the start and end time of a range.
    # Ranges that only touch at an endpoint count as overlapping.
    return not (t1[1] < t2[0] or t1[0] > t2[1])
Test this:
import datetime as dt

t1 = (dt.datetime.strptime('06:30:00','%H:%M:%S').time(), dt.datetime.strptime('08:30:00','%H:%M:%S').time())
t2 = (dt.datetime.strptime('08:29:00','%H:%M:%S').time(), dt.datetime.strptime('09:15:00','%H:%M:%S').time())
t3 = (dt.datetime.strptime('09:47:00','%H:%M:%S').time(), dt.datetime.strptime('09:33:00','%H:%M:%S').time())
>>> overlap(t1,t2)
True
>>> overlap(t1,t3)
False
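As a side note, the two-comparison form probably also fixes a correctness issue in the question's function. In Python, `(foo[0] or foo[1])` evaluates to `foo[0]` whenever `foo[0]` is truthy, and on Python 3.5+ every time object is truthy, so the end times never take part in the chained comparisons. The sketch below reproduces the question's function and compares it with overlap on two ranges that share a start time. The sample ranges are made up for illustration.

```python
from datetime import time


def intersection_checker(foo, bar):
    # The function from the question, reproduced unchanged.
    if foo == bar:
        return True
    if foo[0] == bar[1] or foo[1] == bar[0]:
        return True
    if bar[0] < (foo[0] or foo[1]) < bar[1]:
        return True
    if foo[0] < (bar[0] or bar[1]) < foo[1]:
        return True
    return False


def overlap(t1, t2):
    # Two ranges overlap unless one ends before the other starts.
    return not (t1[1] < t2[0] or t1[0] > t2[1])


# `x or y` returns x when x is truthy, so only the start time is compared.
picked = time(6, 30) or time(8, 15)

# Hypothetical ranges with the same start and different ends.
a = (time(6, 0), time(8, 0))
b = (time(6, 0), time(9, 0))

old_result = intersection_checker(a, b)
new_result = overlap(a, b)
print(picked)      # 06:30:00
print(old_result)  # False, although the ranges clearly overlap
print(new_result)  # True
```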
Then use next on an iterator or any to break on the first True.
any(overlap(foo,x) for x in your_sequence)
Or,
next(((i, f'{x} overlaps {foo}') for i, x in enumerate(your_sequence) if overlap(foo, x)), (-1, 'no overlaps found'))
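Put together, a small end-to-end sketch could look like the following. The sample ranges in your_sequence are invented for illustration, and the next() call returns only the index to keep the output short.

```python
from datetime import time


def overlap(t1, t2):
    # Two ranges overlap unless one ends before the other starts.
    return not (t1[1] < t2[0] or t1[0] > t2[1])


foo = (time(6, 30), time(8, 15))

# Hypothetical sample data; in the question this would be the large set `bar`.
your_sequence = [
    (time(1, 0), time(2, 0)),
    (time(9, 0), time(10, 0)),
    (time(8, 0), time(8, 45)),
    (time(7, 0), time(7, 30)),
]

# any() answers "is there at least one overlap?" and stops at the first hit.
has_overlap = any(overlap(foo, x) for x in your_sequence)

# next() additionally reports where the first hit is, or -1 if there is none.
first_hit = next((i for i, x in enumerate(your_sequence) if overlap(foo, x)), -1)

print(has_overlap, first_hit)  # True 2
```

Note that if your_sequence is a set, as in the question, the iteration order is arbitrary, so the index from enumerate only tells you which element was reached first on that particular run.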