Home > Software design >  Best and Fastest way to get first value from dictionary values
Best and Fastest way to get first value from dictionary values

Time:12-27

What is the fastest and best way to get first value from dictionary values? There are multiple techniques, but what is the best (memory and speed) and why?

Examples below. I would be grateful for other techniques as well:

dictis = {
    1: "One"
    , 2: "Two"
    , 3: "Three"
    , 4: "Four"
}

# 1 technique:
#----------------

value = next(iter(dictis.values()))
print(value)

# 2 technique:
#----------------

value = list(dictis.values())[0]
print(value)

# 3 technique:
#----------------

value = [value for value in dictis.values()][0]
print(value)

# 4 technique:
#----------------

value = next(value for value in dictis.values())
print(value)

CodePudding user response:

TL;DR

Use the first way! For big dictionaries it's around 2.5 times faster than fourth, 10000 times faster than second and 20000 times faster that third way. Second and third way creates full list and consumes O(n) memory

Time measurement

For small dictionaries:

size = 10
dictis = {i: str(i) for i in range(size)}
code1 = "value = next(iter(dictis.values()))"
code2 = "value = list(dictis.values())[0]"
code3 = "value = [value for value in dictis.values()][0]"
code4 = "value = next(value for value in dictis.values())"

kw = {"number": 1000000, "globals": globals()}
print("code1: ", timeit.timeit(code1, **kw)) # 0.15313154199999998
print("code2: ", timeit.timeit(code2, **kw)) # 0.25568722099999996
print("code3: ", timeit.timeit(code3, **kw)) # 0.49292356800000003
print("code4: ", timeit.timeit(code4, **kw)) # 0.43500832200000006

Let's check for size = 1_000_000. Second and third approach last really long (number = 100)

code1:  1.7961999999982492e-05
code2:  1.607245366
code3:  3.3421025739999997
code4:  6.061599999984679e-05

And for number = 10_000_00:

code1:  1.7192588249999998
code4:  4.457956223

CodePudding user response:

Let's address some points. First, Python dictionaries are not always ordered by standard, it depends on the version. Implementation-wise they had arbitrary order until Python 3.5, which means the same list(dictis) call made at different times could return lists of inconsistent order if changes were made to the dictionary in the intervening time. So don't be doing that in 3.5 or below. Python is actually implement a few different ways, most common is CPython, while another popular one is PyPy. In both of these, dictionaries were implemented in a way that they preserve the order of insertion. CPython is like this since 3.6. Keep that in mind: the order of the dictionary is just the order of insertion of elements, it has nothing to do with the element's properties.

Second, "O(n)" and "O(1)" refers to big-O notation, which is a topic within the subject of asymptotic notation. There are Wikipedia articles on these, but they can be quite technical. I would recommend reading about it from an algorithms textbook, like Cormen's. The gist of is: the number inside the parenthesis stands for a "multiplier" for how many steps you expect your algorithm to take. If you expect to loop through every item on a list of n items at least once, your algorithm is "O(n)" (nicknamed "linear time"). If you only expect to look at 1 or 2 or any number of items that is fixed and never gonna change, then you're "O(1)" (nicknamed "constant time"). As khelwood commented on your question, first option is O(1). It is because the iter function returns an iterator to the dictionary and the next method looks only at one element at a time. I don't know that the actual implementation of the iterator of dictionary in CPython (3.6 to 3.10) is O(1), but theoretically it should be. Techniques 2 and 3 converts the dictionary to a list, which means they have to visit every element, and thus are bound to take more steps. Technique 4 uses generators to achieve an effect similar to technique 1, but it's more complicated to think about and less flexible.

Still, techniques 1 and 4 use the function values to obtain a "view" of the dictionary values. I don't know how much work this is under the hood (it could be O(n) for all I know), so I think we should cut off the middle man. This should be a better solution:

key = next(iter(dictis)) # this doesn't ask to generate a view object
value = dictis[key]      # this is instant look-up

I made some testing and this solution is actually on par with technique 1, so it doesn't seem that generating a view of the dictionaries values entails much overhead. Utilizing the same testing as in kosciej16's answer, with size = 1_000_000, number = 10_000 and

code5 = '''
key = next(iter(dictis))
value = dictis[key]
'''

I got

>>> print("code1: ", timeit.timeit(code1, **kw))
code1:  0.002648300000146264
>>> print("code4: ", timeit.timeit(code4, **kw))
code4:  0.0054308000001128676
>>> print("code5: ", timeit.timeit(code5, **kw))
code5:  0.0028101000000333443

But may I offer an even better solution, but that would require more work? If you want to keep track of the order the elements are inserted into a collection, you would benefit a lot from using a queue. You can look up details on algorithms textbooks. Basically it's a data structure that keeps track of the order in which things were inserted. You'd probably have to store your data in objects, which is more costly memory-wise.

class Objectis():
    def __init__(self, id, data):
        self.id = id
        self.data = data

Then make a bunch of these items:

item1 = Objectis(1, "One")
item2 = Objectis(2, "Two")
item3 = Objectis(3, "Three")
item4 = Objectis(4, "Four")

Add them to a list on whatever order:

queue = [] # you should start your queue empty
queue.append(item3)
queue.append(item1)
queue.append(item2)
queue.append(item4)

And if you want to know who's first you can do that instantly with queue[0]. If you want to know who's last just do queue[-1]. If you want to remove the first element you can do element = queue.pop(0). If you want to insert an element in the first position you can do queue.insert(0, element).

  • Related