Suppose I have two lists list_1
and list_2
list_1 = [1, 5, 10]
list_2 = [3, 4, 15]
I want to get a list of tuples containing elements from both list_1 and list_2 such that the difference between the numbers in a tuple is under a some constant c.
E.g. suppose c is 2 then the tuples I would have would be:
[(1, 3), (5, 3), (5, 4)]
Of course one can iterate over both lists and check that the difference between 2 elements is less than c, but that has a complexity of n^2 and I would rather reduce that complexity.
CodePudding user response:
Here is an implementation of the idea of Marat from the comments:
import bisect
def close_pairs(list1,list2,c):
#assumes that list2 is sorted
for x in list1:
i = bisect.bisect_left(list2,x-c)
j = bisect.bisect_right(list2,x c)
yield from ((x,y) for y in list2[i:j])
list_1 = [1, 5, 10]
list_2 = [3, 4, 15]
print(list(close_pairs(list_1,list_2,2)))
#prints [(1, 3), (5, 3), (5, 4)]
To demonstrate the potential improvement of this strategy over what might be thought of as the "naive" approach, let's timeit
.
import timeit
setup_naive = '''
import numpy
list_a = numpy.random.randint(0, 10, 500)
list_b = numpy.random.randint(0, 10, 500)
c = 2
def close_pairs(list_a, list_b, c):
yield from ((x,y) for x in list_a for y in list_b if abs(x-y) <= c)
'''
setup_john_coleman = '''
import bisect
import numpy
list_a = numpy.random.randint(0, 10, 500)
list_b = numpy.random.randint(0, 10, 500)
c = 2
def close_pairs(list_a, list_b, c):
list_a = sorted(list_a)
list_b = sorted(list_b)
for x in list_a:
i = bisect.bisect_left(list_b,x-c)
j = bisect.bisect_right(list_b,x c)
yield from ((x,y) for y in list_b[i:j])
'''
print(f"john_coleman: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_john_coleman, number=100):.2f}")
print(f"naive: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_naive, number=100):.2f}")
On my laptop that gives result like:
john_coleman: 1.50
naive: 11.86