Let's say I have the following dictionary:
full_dic = {
'aa': 1,
'ac': 1,
'ab': 1,
'ba': 2,
...
}
I normally use standard dictionary comprehension to remove dupes like:
t = {val : key for (key, val) in full_dic.items()}
cleaned_dic = {val : key for (key, val) in t.items()}
Calling print(cleaned_dic)
outputs {'ab': 1,'ba': 2, ...}
With this code, the key that remains seems to always be the final one in the list, but I'm not sure that's even guaranteed as dictionaries are unordered. Instead, I'd like to find a way to ensure that the key I keep is the first alphabetically.
So, regardless of the 'order' the dictionary is in, I want the output to be:
>> {'aa': 1,'ba': 2, ...}
Where 'aa' comes first alphabetically.
I ran some timing tests on the 3 answers below and got the following (the dictionary was created with random key/value pairs):
dict length: 10
# of loops: 100000
HoliSimo (OrderedDict): 0.0000098405 seconds
Ricardo: 0.0000115448 seconds
Mark (itertools.groupby): 0.0000111745 seconds
dict length: 1000000
# of loops: 10
HoliSimo (OrderedDict): 6.1724137300 seconds
Ricardo: 3.3102091300 seconds
Mark (itertools.groupby): 6.1338266200 seconds
We can see that for small dictionaries OrderedDict is fastest, but for large dictionaries Ricardo's answer below is nearly twice as fast as the other two.
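For anyone who wants to run a similar comparison, here is a minimal sketch of how it could be set up with timeit. It is not the exact code behind the numbers above: make_random_dict, time_candidates and the two dedupe_* functions are hypothetical names, and the two candidates are simplified stand-ins for the reverse-sort and groupby answers.

```python
import random
import string
import timeit
from itertools import groupby


def make_random_dict(size, seed=0):
    """Build a dict with `size` random 6-letter keys and random int values."""
    rng = random.Random(seed)
    result = {}
    while len(result) < size:
        key = "".join(rng.choices(string.ascii_lowercase, k=6))
        # Draw values from a small range so that duplicates actually occur.
        result[key] = rng.randint(1, max(1, size // 2))
    return result


def dedupe_reverse_sort(d):
    """Stand-in for the reverse-sort answer: invert twice, smallest key wins."""
    t = {val: key for key, val in sorted(d.items(), reverse=True)}
    return {key: val for val, key in t.items()}


def dedupe_groupby(d):
    """Stand-in for the groupby answer: first item of each value group."""
    items = sorted(d.items(), key=lambda p: (p[1], p[0]))
    return dict(next(g) for _, g in groupby(items, key=lambda p: p[1]))


def time_candidates(candidates, size, loops):
    """Return the average seconds per call for each candidate function."""
    data = make_random_dict(size)
    return {
        name: timeit.timeit(lambda: func(data), number=loops) / loops
        for name, func in candidates.items()
    }


candidates = {"reverse sort": dedupe_reverse_sort, "groupby": dedupe_groupby}
for name, seconds in time_candidates(candidates, size=10, loops=1000).items():
    print(f"{name}: {seconds:.10f} seconds")
```

The absolute numbers depend on the machine and Python version, so only the relative ordering is worth comparing.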
CodePudding user response:
t = {val : key for (key, val) in dict(sorted(full_dic.items(), key=lambda x: x[0].lower(), reverse=True)).items()}
cleaned_dic = {val : key for (key, val) in t.items()}
dict(sorted(cleaned_dic.items(), key=lambda x: x[0].lower()))
>>> {'aa': 1, 'ba': 2}
CodePudding user response:
You should use the OrderedDict class.
import collections
full_dic = {
'aa': 1,
'ac': 1,
'ab': 1
}
od = collections.OrderedDict(sorted(full_dic.items()))
This way you can be sure the dictionary is sorted by key (original code: StackOverflow).
And then:
result = {}
for key, value in od.items():
    # Keys arrive in alphabetical order, so the first key seen for a value is the one to keep.
    if value not in result.values():
        result[key] = value
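The check against result.values() scans every kept value on each iteration, which may be part of why this approach slows down on large dictionaries. A possible variant, sketched below, tracks the values already kept in a set instead. It assumes the values are hashable, and the function name keep_first_alphabetically is made up for illustration.

```python
def keep_first_alphabetically(d):
    """Keep one key per distinct value: the alphabetically first one."""
    result = {}
    seen = set()  # values already kept; membership checks are O(1) on average
    for key, value in sorted(d.items()):  # keys are visited in alphabetical order
        if value not in seen:
            seen.add(value)
            result[key] = value
    return result


full_dic = {'aa': 1, 'ac': 1, 'ab': 1, 'ba': 2, 'xx': 2}
print(keep_first_alphabetically(full_dic))
# {'aa': 1, 'ba': 2}
```

A plain dict is enough here on Python 3.7+, where insertion order is preserved.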
CodePudding user response:
Seems like you can do this with a single sort and itertools.groupby. First sort the items by value, then key. Pass this to groupby and take the first item of each group to pass to the dict constructor:
from itertools import groupby
full_dic = {
'aa': 1,
'ac': 1,
'xx': 2,
'ab': 1,
'ba': 2,
}
groups = groupby(sorted(full_dic.items(), key=lambda p: (p[1], p[0])), key=lambda x: x[1])
dict(next(g) for k, g in groups)
# {'aa': 1, 'ba': 2}
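If the sort itself turns out to be the bottleneck, the same result can probably be reached in one pass without sorting, by remembering the smallest key seen so far for each value. This is a different technique from groupby, it was not part of the timings in the question, and the name keep_min_key is hypothetical. Keys are compared as-is, so the comparison is case-sensitive.

```python
def keep_min_key(d):
    """For each distinct value, keep only the alphabetically smallest key."""
    best = {}  # maps value -> smallest key seen so far
    for key, value in d.items():
        if value not in best or key < best[value]:
            best[value] = key
    # Flip back to key -> value.
    return {key: value for value, key in best.items()}


full_dic = {'aa': 1, 'ac': 1, 'xx': 2, 'ab': 1, 'ba': 2}
print(keep_min_key(full_dic))
# {'aa': 1, 'ba': 2}
```

The result is ordered by the first appearance of each value in the input rather than alphabetically, so sort the output if the order matters.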