Home > Software design >  Efficient algorithm to get all the combinations of numbers that are within a certain range from 2 li
Efficient algorithm to get all the combinations of numbers that are within a certain range from 2 li

Time:10-29

Suppose I have two lists list_1 and list_2

list_1 = [1, 5, 10] list_2 = [3, 4, 15]

I want to get a list of tuples containing elements from both list_1 and list_2 such that the difference between the numbers in a tuple is under a some constant c.

E.g. suppose c is 2 then the tuples I would have would be: [(1, 3), (5, 3), (5, 4)]

Of course one can iterate over both lists and check that the difference between 2 elements is less than c, but that has a complexity of n^2 and I would rather reduce that complexity.

CodePudding user response:

Here is an implementation of the idea of Marat from the comments:

import bisect

def close_pairs(list1,list2,c):
  #assumes that list2 is sorted
  for x in list1:
    i = bisect.bisect_left(list2,x-c)
    j = bisect.bisect_right(list2,x c)
    yield from ((x,y) for y in list2[i:j])

list_1 = [1, 5, 10]
list_2 = [3, 4, 15]
print(list(close_pairs(list_1,list_2,2)))
#prints [(1, 3), (5, 3), (5, 4)]

To demonstrate the potential improvement of this strategy over what might be thought of as the "naive" approach, let's timeit.

import timeit

setup_naive = '''
import numpy
list_a = numpy.random.randint(0, 10, 500)
list_b = numpy.random.randint(0, 10, 500)
c = 2
def close_pairs(list_a, list_b, c):
    yield from ((x,y) for x in list_a for y in list_b if abs(x-y) <= c)
'''

setup_john_coleman = '''
import bisect
import numpy
list_a = numpy.random.randint(0, 10, 500)
list_b = numpy.random.randint(0, 10, 500)
c = 2
def close_pairs(list_a, list_b, c):
    list_a = sorted(list_a)
    list_b = sorted(list_b)
    for x in list_a:
        i = bisect.bisect_left(list_b,x-c)
        j = bisect.bisect_right(list_b,x c)
        yield from ((x,y) for y in list_b[i:j])
'''

print(f"john_coleman: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_john_coleman, number=100):.2f}")
print(f"naive: {timeit.timeit('list(close_pairs(list_a, list_b, c))', setup=setup_naive, number=100):.2f}")

On my laptop that gives result like:

john_coleman: 1.50
naive: 11.86
  • Related