How to get all unique assignments between two lists-CodePudding

I have two lists that can each contain duplicate values but any value can only appear in one (or none) of the two lists.

A = [0,1]
B = [2,3]

and i want to get all unique mappings between these two lists.

assignment(A,B) = [((0,2),(1,3)), ((0,3),(1,2))]

I know that this can be done for example using itertools.permutations. There can however be duplicates in either/both of the lists and then this approach would give me duplicates because i do not care about the order of the assignments. Because for every possible assignment i want to just find the total distance.

A = [1,1]
B = [2,3]
mapping(A,B) = [((1,2),(1,3)), ((1,3),(1,2))]

with the desired output being

mapping(A,B) = [((1,2),(1,3))]

If just the first list had duplicated i could achieve this using sympy.utilities.iterables.multiset_permutations. Although that also wouldnt work for me because it doesnt play nice with numba

However the second list can also have duplicated and then even with that i would get results like

A = [0,1]
B = [3,3]
mapping(A,B) = [((0,3),(1,3)), ((1,3),(0,3))]

with the desired output being

mapping(A,B) = [((0,3),(1,3))]

Also ideally i would like to have the code in a way that it works with numba. Is my best way to just produce all permutations of A, put them through a set to filter out duplicates and then just do the extra work of just evaluating the duplicate options on B? (It wont give wrong results just slow the code down)

Currently my best guess is that i have to do

AB = [list(zip(perm, B)) for perm in set(permutations(A, len(B)))]

with

@njit
def permutations(A, k):
    """Calculates permutation of k elements in A. Needed because numba doesnt like itertools"""
    r = [[i for i in range(0)]]
    for i in range(k):
        r = [[a]   b for a in A for b in r if (a in b) == False]
    return r

and just do the extra work for cases where B has duplicates.

Edit: I now got a two versions that give the desired result:

set(tuple(sorted(zip(perm, B))) for perm in set(permutations(A, len(B))))

and

set(tuple(sorted(zip(A, p))) for p in permutations(B))

and the previous version that produces duplicates that come from the second array

[zip(perm, B) for perm in set(permutations(A, len(B)))]

I also tested which version is the fastest for my concrete usecase:

from itertools import product, permutations
import timeit
import numpy as np
A = [0,0,1,1]
B = [2,3,3,3]
mem = np.arange(5)

def vers1(A,B,mem):
    res = float("inf")
    for mapping in set(tuple(sorted(zip(A, p))) for p in permutations(B)):
        temp = 0
        for a, b in mapping:
            temp  = abs(mem[a]-mem[b])
        res = min(res,temp)
    return res
    
def vers2(A,B,mem):
    res = float("inf")
    for mapping in set(tuple(sorted(zip(perm, B))) for perm in set(permutations(A, len(B)))):
        temp = 0
        for a, b in mapping:
            temp  = abs(mem[a]-mem[b])
        res = min(res,temp)
    return res
    
def vers3(A,B,mem):
    res = float("inf")
    for mapping in [zip(perm, B) for perm in set(permutations(A, len(B)))]:
        temp = 0
        for a, b in mapping:
           temp  = abs(mem[a]-mem[b])
        res = min(res,temp)
    return res

print(vers1(A,B,mem))
print(timeit.timeit("vers1(A,B,mem)",'from __main__ import vers1, A, B,mem'))
print(vers2(A,B,mem))
print(timeit.timeit("vers2(A,B,mem)",'from __main__ import vers2, A, B,mem'))
print(vers3(A,B,mem))
print(timeit.timeit("vers3(A,B,mem)",'from __main__ import vers3, A, B,mem'))

and got this result:

9
58.46571195700017
9
23.15174284099703
9
28.064288804998796

CodePudding user response：

Use a set to remove the duplicates

import itertools as it

A=[0,1]
B=[3,3]
AB = list(set(tuple(zip(A, p)) for p in it.permutations(B)))

CodePudding user response：

You can take sets of A and B, then take the product from itertools

from itertools import product
A = [1,1]
B = [2,3]
result = list(product(set(A), set(B)))
print(result) # prints [(1, 2), (1, 3)]

CodePudding user response：

To account for both lists having duplicates, it is indeed easiest to filter duplicates of the outcome from mapping the permutations of a list to another list.

To de-duplicate tuples while maintaining their order and counts, you can convert the permutations of tuples to a set of frozensets of collection.Counter items, and then use the Counter.elements to convert the tuple counts back to tuples.

Below is an example where there are duplicates in both input lists:

from itertools import permutations
from collections import Counter

A = [0,0,1,1]
B = [2,3,3,3]

[
    tuple(Counter(dict(s)).elements())
    for s in {
        frozenset(Counter(zip(A, p)).items())
        for p in permutations(B)
    }
]

This returns:

[((0, 3), (0, 2), (1, 3), (1, 3)), ((1, 2), (0, 3), (0, 3), (1, 3))]

Demo: https://replit.com/@blhsing/LongtermNaturalFiletype