Home > Net >  Sorting a list by another list with duplicates
Sorting a list by another list with duplicates

Time:11-26

I have two lists [1, 2, 3, 1, 2, 1] and [a, b, c, d, e, f]. I want to reorder elements in the second list according to the permutations that sort the first list. Sorting the first list gives [1, 1, 1, 2, 2, 3] but there are many possible permutations for the second list to be sorted by the first i.e. [a, d, f, b, e, c], [d, f, a, e, b, c], etc..

How can I generate all of these permutations in an efficient manner in python?

If I just wanted one permutation I could get one by something like this:

sorted_numbers, sorted_letters = list(zip(*[(x, y) for x, y in sorted(zip(numbers, letters))]))

CodePudding user response:

You could use a list comprehension to filter all the permutations with a helper function:

from itertools import permutations


def is_valid_ordering(perm: str, ch_to_order: dict) -> bool:
    if not perm or len(perm) <= 1:
        return True
    for ch1, ch2 in zip(perm[:-1], perm[1:]):
        if ch_to_order[ch1] > ch_to_order[ch2]:
            return False
    return True


lst_1 = [1, 2, 3, 1, 2, 1]
lst_2 = ['a', 'b', 'c', 'd', 'e', 'f']
ch_to_order = {ch: o for ch, o in zip(lst_2, lst_1)}
valid_permutations = [
    list(p) for p in permutations(lst_2)
    if is_valid_ordering(p, ch_to_order)
]
for valid_perm in valid_permutations:
    print(valid_perm)

Output:

['a', 'd', 'f', 'b', 'e', 'c']
['a', 'd', 'f', 'e', 'b', 'c']
['a', 'f', 'd', 'b', 'e', 'c']
['a', 'f', 'd', 'e', 'b', 'c']
['d', 'a', 'f', 'b', 'e', 'c']
['d', 'a', 'f', 'e', 'b', 'c']
['d', 'f', 'a', 'b', 'e', 'c']
['d', 'f', 'a', 'e', 'b', 'c']
['f', 'a', 'd', 'b', 'e', 'c']
['f', 'a', 'd', 'e', 'b', 'c']
['f', 'd', 'a', 'b', 'e', 'c']
['f', 'd', 'a', 'e', 'b', 'c']

CodePudding user response:

Using itertools to build the Cartesian product of the permutations for each duplicated key:

Code

from itertools import chain, permutations, groupby, product
from operator import itemgetter

def all_sorts(numbers, letters):
    return [list(map(itemgetter(1), chain.from_iterable(p))) for p in product(*(permutations(g) for _,g in groupby(sorted(zip(numbers, letters)), key=itemgetter(0))))]

print( all_sorts([1,2,3,1,2,1], 'abcdef') )
# [['a', 'd', 'f', 'b', 'e', 'c'], ['a', 'd', 'f', 'e', 'b', 'c'], ['a', 'f', 'd', 'b', 'e', 'c'], ['a', 'f', 'd', 'e', 'b', 'c'], ['d', 'a', 'f', 'b', 'e', 'c'], ['d', 'a', 'f', 'e', 'b', 'c'], ['d', 'f', 'a', 'b', 'e', 'c'], ['d', 'f', 'a', 'e', 'b', 'c'], ['f', 'a', 'd', 'b', 'e', 'c'], ['f', 'a', 'd', 'e', 'b', 'c'], ['f', 'd', 'a', 'b', 'e', 'c'], ['f', 'd', 'a', 'e', 'b', 'c']]

This approach is optimal in the sense that it generates the solutions directly, rather that filtering them from a huge list of candidates. With the given example list of size 6, it generates only 12 solutions, rather than filtering through all 720 permutations of a list of size 6.

How it works:

First we sort and group by key, using sorted and itertools.groupby. Note operator.itemgetter(0) is the same as lambda t: t[0].

>>> [list(g) for _,g in groupby(sorted(zip(numbers, letters)), key=itemgetter(0))]
[[(1, 'a'), (1, 'd'), (1, 'f')],
 [(2, 'b'), (2, 'e')],
 [(3, 'c')]]

Then we generate the possible permutations of every key, using itertools.permutation on every group.

>>> [list(permutations(g)) for _,g in groupby(sorted(zip(numbers, letters)), key=itemgetter(0))]
[[((1, 'a'), (1, 'd'), (1, 'f')), ((1, 'a'), (1, 'f'), (1, 'd')), ((1, 'd'), (1, 'a'), (1, 'f')), ((1, 'd'), (1, 'f'), (1, 'a')), ((1, 'f'), (1, 'a'), (1, 'd')), ((1, 'f'), (1, 'd'), (1, 'a'))],
 [((2, 'b'), (2, 'e')), ((2, 'e'), (2, 'b'))],
 [((3, 'c'),)]]

Then we build the Cartesian product of these lists of permutations, using itertools.product; and we rebuild a list from each tuple in the Cartesian product, using itertools.chain to concatenate. Fially we "undecorate", discarding the keys and keeping only the letters, which I did with map(itemgetter(1), ...) but could have equivalently done with a list comprehension [t[1] for t in ...].

>>> [list(map(itemgetter(1), chain.from_iterable(p))) for p in product(*(permutations(g) for _,g in groupby(sorted(zip(numbers, letters)), key=itemgetter(0))))]
[['a', 'd', 'f', 'b', 'e', 'c'], ['a', 'd', 'f', 'e', 'b', 'c'], ['a', 'f', 'd', 'b', 'e', 'c'], ['a', 'f', 'd', 'e', 'b', 'c'], ['d', 'a', 'f', 'b', 'e', 'c'], ['d', 'a', 'f', 'e', 'b', 'c'], ['d', 'f', 'a', 'b', 'e', 'c'], ['d', 'f', 'a', 'e', 'b', 'c'], ['f', 'a', 'd', 'b', 'e', 'c'], ['f', 'a', 'd', 'e', 'b', 'c'], ['f', 'd', 'a', 'b', 'e', 'c'], ['f', 'd', 'a', 'e', 'b', 'c']]
  • Related