One liner to revert dictionary with non-unique values with O(n) complexity

Time:06-19

Is it possible to write a one-liner (i.e. a comprehension) with better time complexity than the O(n²) version below?

my_map = {'A': 'x',
          'B': 'y',
          'C': 'x',
          'D': 'z'}

rev_map = {b: [a2 for a2 in my_map.keys() if my_map[a2] == b]
           for a, b in my_map.items()}

I have not found one in the related question "Reverse / invert a dictionary mapping".

CodePudding user response:

Here's an O(n log n) one-liner:

>>> {k : list(map(itemgetter(1), g)) for k, g in groupby(sorted(map(itemgetter(1, 0), my_map.items())), itemgetter(0))}
{'x': ['A', 'C'], 'y': ['B'], 'z': ['D']}

The required imports are:

from itertools import groupby
from operator import itemgetter
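For readability, the same idea can be sketched across several lines. The function name `invert_sorted` is only illustrative and is not part of the answer above. `groupby` merges only adjacent items, so the pairs have to be sorted first, and that sort is where the O(n log n) comes from. Note the assumption that both values and keys are orderable, since the sort compares `(value, key)` tuples.

```python
from itertools import groupby
from operator import itemgetter


def invert_sorted(mapping):
    # Hypothetical helper name, used here for illustration only.
    # Swap each (key, value) pair to (value, key) and sort,
    # so that pairs sharing a value end up next to each other.
    pairs = sorted(map(itemgetter(1, 0), mapping.items()))
    # groupby only groups adjacent items, hence the sort above.
    return {value: [key for _, key in group]
            for value, group in groupby(pairs, key=itemgetter(0))}


my_map = {'A': 'x', 'B': 'y', 'C': 'x', 'D': 'z'}
print(invert_sorted(my_map))
```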

CodePudding user response:

I digress here because I don't understand why this has to be done as a one-liner. Consider this:

from timeit import timeit

my_map = {'A': 'x',
          'B': 'y',
          'C': 'x',
          'D': 'z'}

def func1():
    return {b: [a2 for a2, b2 in my_map.items() if b2 == b] for b in my_map.values()}

def func2():
    nm = {}
    for k, v in my_map.items():
        nm.setdefault(v, []).append(k)
    return nm

for func in func1, func2:
    print(func.__name__, timeit(lambda: func()))

Output:

func1 1.9566871839997475
func2 0.6075634010003341

Note that the dictionary comprehension takes more than 3 times as long as a more traditional/simplistic approach.

EDIT:

The time difference (factor) between the 2 methods increases significantly when the source dictionary is larger.

from timeit import timeit
import random
import string
my_map = {}

for k in string.ascii_uppercase:
    my_map[k] = random.choice('xyz')

print(my_map)

def func1():
    return {b: [a2 for a2, b2 in my_map.items() if b2 == b] for b in my_map.values()}

def func2():
    nm = {}
    for k, v in my_map.items():
        nm.setdefault(v, []).append(k)
    return nm

for func in func1, func2:
    print(func.__name__, timeit(lambda: func(), number=250_000))

Output:

{'A': 'y', 'B': 'x', 'C': 'x', 'D': 'y', 'E': 'y', 'F': 'y', 'G': 'z', 'H': 'x', 'I': 'y', 'J': 'z', 'K': 'z', 'L': 'y', 'M': 'x', 'N': 'x', 'O': 'z', 'P': 'x', 'Q': 'x', 'R': 'x', 'S': 'y', 'T': 'y', 'U': 'x', 'V': 'x', 'W': 'x', 'X': 'z', 'Y': 'z', 'Z': 'y'}
func1 8.602543555000011
func2 0.7367826070003503

Proof positive that one-liners are not always a great idea.

CodePudding user response:

If I'm guessing correctly, you want to reverse your dictionary like so:

my_map = {
 'A': 'x',
 'B': 'y',
 'C': 'x',
 'D': 'z'
}
rev_map = {
 'x': ['A', 'C'],
 'y': ['B'],
 'z': ['D'],
}

It can be done in a single pass, which gives you O(n), with the following code:

from collections import defaultdict

rev_map = defaultdict(list)
for key, value in my_map.items():
    rev_map[value].append(key)

If you then want to turn each single-element list into a plain string, you can do that in a second pass; the whole thing is still O(n).
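A minimal sketch of that second pass, assuming you want a bare key wherever exactly one key maps to a value. The helper names `invert` and `unwrap_singletons` are made up for this example.

```python
from collections import defaultdict


def invert(mapping):
    # First pass: group keys by their value, O(n).
    rev_map = defaultdict(list)
    for key, value in mapping.items():
        rev_map[value].append(key)
    return rev_map


def unwrap_singletons(rev_map):
    # Second pass: replace one-element lists with their only element, O(n).
    return {value: keys[0] if len(keys) == 1 else keys
            for value, keys in rev_map.items()}


my_map = {'A': 'x', 'B': 'y', 'C': 'x', 'D': 'z'}
print(unwrap_singletons(invert(my_map)))
# {'x': ['A', 'C'], 'y': 'B', 'z': 'D'}
```

The trade-off is that the values of the result then have mixed types (lists and strings), so any code consuming it has to check which one it received.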
