How to find all unique combinations of k size tuple using a single element from each list of n lists

Given a list containing N sublists of varying lengths, find all unique combinations of size k, taking at most one element from any sublist.

The order of the elements in the combination is not relevant: (a, b) = (b, a)

sample_k = 2

sample_list = [['B1','B2','B3'], ['T1','T2'], ['L1','L2','L3','L4']]

expected_output = [
('B1', 'T1'),('B1', 'T2'),('B1', 'L1'),('B1', 'L2'),('B1', 'L3'),('B1', 'L4'),
('B2', 'T1'),('B2', 'T2'),('B2', 'L1'),('B2', 'L2'),('B2', 'L3'),('B2', 'L4'),
('B3', 'T1'),('B3', 'T2'),('B3', 'L1'),('B3', 'L2'),('B3', 'L3'),('B3', 'L4'),
('T1', 'L1'),('T1', 'L2'),('T1', 'L3'),('T1', 'L4'),
('T2', 'L1'),('T2', 'L2'),('T2', 'L3'),('T2', 'L4')
]

Extra points for a pythonic way of doing it
Speed/efficiency matters: the idea is to use this on a list with hundreds of sublists, each ranging from 5 to 50 elements in length

What I have been able to accomplish so far: using for and while loops to move pointers and build the answer. However, I am having a hard time figuring out how to include a k parameter to set the size of the tuple combination dynamically. (I'm not really happy about it.)

def build_combinations(lst):
    result = []
    count_of_lst = len(lst)
    for start, sublist in enumerate(lst):
        if start == count_of_lst - 1:
            continue  # the last sublist has no later sublist to pair with
        for item in sublist:
            i = start  # restart from this sublist's own position for every item
            j = 0
            while i < len(lst) - 1:
                while j <= len(lst[i + 1]) - 1:
                    comb = (item, lst[i + 1][j])
                    result.append(comb)
                    j = j + 1
                i = i + 1
                j = 0
    return result

I've seen many similar questions on Stack Overflow, but none of them addressed the parameters the way I am trying to (one item from each list, and the size of the combinations being a parameter of the function)

I tried using itertools combinations, product, permutations and flipping them around without success. Whenever I use itertools, I either have a hard time using only one item from each list, or I cannot set the size of the tuple I need.

I tried NumPy using arrays and a more math/matrix approach, but didn't go too far. There's definitely a way of solving with NumPy, hence why I tagged numpy as well

CodePudding user response:

You need to combine two itertools helpers: combinations to select which k sublists to use (each group of sublists is chosen exactly once, in input order), then product to take one element from each of the chosen sublists:

from itertools import combinations, product

sample_k = 2

sample_list = [['B1','B2','B3'], ['T1','T2'], ['L1','L2','L3','L4']]

expected_output = [pair
                   for lists in combinations(sample_list, sample_k)
                   for pair in product(*lists)]
print(expected_output)
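
Since the question asks for the tuple size to be a function parameter, the same idea can be wrapped in a small generator. This is only a minimal sketch: the name pick_k is hypothetical (it is not from the question), and it assumes no element appears in more than one sublist, otherwise both (a, b) and (b, a) could show up. Note that the tuples come out grouped by sublist pair (all B/T pairs, then B/L, then T/L), which is a different order from the list in the question but contains the same 26 tuples.

```python
from itertools import combinations, product

def pick_k(lists, k):
    """Yield every k-tuple that takes one element from each of k distinct sublists."""
    # combinations() decides which k sublists take part; it keeps input order,
    # so every group of sublists is visited exactly once.
    for chosen in combinations(lists, k):
        # product() then takes one element from each chosen sublist.
        yield from product(*chosen)

sample_list = [['B1','B2','B3'], ['T1','T2'], ['L1','L2','L3','L4']]

pairs = list(pick_k(sample_list, 2))
print(len(pairs))   # 26 = 3*2 + 3*4 + 2*4
print(pairs[:3])    # [('B1', 'T1'), ('B1', 'T2'), ('B2', 'T1')]
```

Because it is a generator, you can loop over pick_k(...) directly without building the whole list first.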


If you want to get really fancy/clever/ugly, you can push all the work down to the C layer with:

from itertools import combinations, product, starmap, chain

sample_k = 2

sample_list = [['B1','B2','B3'], ['T1','T2'], ['L1','L2','L3','L4']]

expected_output = list(chain.from_iterable(starmap(product, combinations(sample_list, sample_k))))
print(expected_output)

That will almost certainly run meaningfully faster for huge inputs (especially if you can loop the results from chain.from_iterable directly rather than realizing them as a list), but it's probably not worth the ugliness unless you're really tight for cycles (I wouldn't expect much more than a 10% speed-up, but you'd need to benchmark to be sure).
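
Before running either version on hundreds of sublists, it may help to know how many tuples will come out, since the size of the output is likely to dominate the cost. The sketch below is an illustration under assumptions, not a benchmark: count_k_tuples is a hypothetical helper that adds up, over every choice of k sublists, the product of their lengths, using a small running table instead of generating any tuples. It also shows the lazy style mentioned above, consuming chain.from_iterable directly.

```python
from itertools import chain, combinations, product, starmap

def count_k_tuples(lists, k):
    """Number of k-tuples produced, computed without generating them."""
    # ways[j] = number of j-tuples obtainable from the sublists seen so far
    ways = [1] + [0] * k
    for sub in lists:
        n = len(sub)
        # Walk j downward so the current sublist contributes at most one element.
        for j in range(k, 0, -1):
            ways[j] += ways[j - 1] * n
    return ways[k]

sample_list = [['B1','B2','B3'], ['T1','T2'], ['L1','L2','L3','L4']]
print(count_k_tuples(sample_list, 2))  # 26

# Lazy consumption: tuples are produced one at a time, never stored as a list.
lazy = chain.from_iterable(starmap(product, combinations(sample_list, 2)))
total = sum(1 for _ in lazy)
print(total == count_k_tuples(sample_list, 2))  # True
```

If the count is in the millions or more, iterating lazily like this is probably a better fit than materializing a list.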