from sortedcontainers import SortedDict
d = SortedDict(b=20, d=30, c=10, e=50, a=40)
# What is the time complexity of the following code?
for k, v in d.items():
print(k, v)
I think the time complexity should be nlog(n)
as getting an entry from a sorted dictionary costs log(n)
, and even though we're iterating over this dictionary, we essentially perform the get
operation n times. Is my understanding correct?
CodePudding user response:
SortedDict.items()
calls SortedItemsView(self)
, the constructor of which is inherited from collections.abc.MappingView
via collections.abc.ItemsView
, and ItemsView
has the following special method:
def __iter__(self):
for key in self._mapping:
yield (key, self._mapping[key])
So you're right that it is doing a lookup at each step. Here, self._mapping
is a SortedDict
. However, since SortedDict
is a subclass of dict
that does not override __getitem__
, it uses the standard dict.__getitem__
, which is O(1) on average, better than O(log n).
Also note that the for key in self._mapping:
above calls sortedDict.__iter__
, which calls SortedList.__iter__
, which calls iterools.chain.from_iterable
, which runs in linear time.
CodePudding user response:
If I'm understanding the code correctly, you can iterate over the items of a SortedDict in O(n).
Internally, it uses a SortedList, which can iterate over all its elements in O(n) time. (SortedList is implemented as a list of lists, and it uses itertools.chain_iterable()
to turn that into a single generator.) Once it identifies the correct item to access, it can look it up in a hash table, same as an ordinary dict. (The source code says "sorted dict inherits from dict to store items and maintains a sorted list of keys.")
How can this be possible, when any sorting algorithm based on comparisons must take least O(n log n)? When inserting into the SortedDict, that can take O(log n), since that's what the SortedList takes for insertion. So inserting n items takes O(n log n), but iterating over them is only O(n).
CodePudding user response:
Any for loop iteraing through objects has a complexity of O(n), when working with dictionaries this is the same. For each of the dictionary items, you loop once.