Code 1:
kv = {'a': 'A', 'b': 'B', 'c': 'C'}
d = {'a': 1, 'b': 2, 'c': 3}
for key in d.keys():
d[kv[key]] = d.pop(key)
print(d)
Output:
{'A': 1, 'B': 2, 'C': 3}
The above example works fine
But, when we increase the number of elements in the dictionary,
it raises KeyError as shown in example below
Code 2:
kv = {'a': 'A', 'b': 'B', 'c': 'C', 'd': 'D'}
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
for key in d.keys():
d[kv[key]] = d.pop(key)
print(d)
Output:
Traceback (most recent call last):
File "C:\Users\user\OneDrive\Documents\VSCode_Python\new.py", line 11, in <module>
d[kv[key]] = d.pop(key)
KeyError: 'A'
CodePudding user response:
The simplest answer can be found here in the Python docs on "Dictionary view objects": https://docs.python.org/3/library/stdtypes.html#dict-views:
Iterating views while adding or deleting entries in the dictionary may raise a RuntimeError or fail to iterate over all entries.
It says it may fail to iterate over all entries. In your example with a dict
of length 4, it appears that iterating the keys()
dictionary view object works (i.e., behaves the same way it would if you were not mutating the dictionary inside the loop) for the first 3 entries (just before you pop each entry in turn) and then gets to the newly set key 'A' before getting to key 'd'. In other words, iterating this view has indeed failed to iterate over all entries for your dictionary of length 4.
The question as to why it did not fail for a dictionary of length 3 could perhaps be answered with reference to implementation details that allow us to understand how the language internals give rise to this luck-of-the-draw behavior, but ultimately, the answer is that the Python language specification does not require your code to fail for a view on a dictionary of any length, and it just so happens that it does not fail for a dict
of length 3.
Please see comments by @kcsquared and answers by others for insight into implementation-specific details as to the arbitrary point of failure of the/a current Python language implementation.
CodePudding user response:
The actual reason why this happens is due to the implementation of dictionaries in Python. This article describes it very well.
In essence, Python calculates the hash of the key and applies a mask to find a slot to store the value.
For a small dict you have 8 slots, so the function to determine in which slot you put the value is:
def find_slot(key):
return hash(key) & 7
You'll notice that there is no collision in the original dict:
list(map(find_slot, ('a', 'b', 'c', 'd')))
>>>[5, 0, 6, 1]
But when the 'A'
key is added there is a collision:
list(map(find_slot, ('a', 'b', 'c', 'd', 'A')))
>>>[5, 0, 6, 1, 5]
The same slot is given for 'a'
and 'A'
, so the dictionary has to expand to have more slots.
Until the dictionary is expanded, the keys hash are stored in cache. When it expands, the cache has to be renewed. You can see that happening with some print statements:
from sys import getsizeof
kv = {'a': 'A', 'b': 'B', 'c': 'C', 'd': 'D'}
d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
for key in d.keys():
print(d)
print(getsizeof(d))
print(d.keys(), key)
d[kv[key]] = d.pop(key)
print(d)
This is the principle at play. Not sure the detail is 100% correct but happy to fix it if anyone has any comments.