Casting generators to dictionaries vs. declaring dictionaries: which is more efficient?

Which of the two is more efficient?

Casting a generator to a dictionary seems to have the same memory usage as using only a dictionary.

def test():
    yield 'a', 4444
    yield 'b', 5555

print(dict(test()))

def test2():
    return {
        "a": 4444,
        "b": 5555
    }

print(test2())

CodePudding user response：

You can use the time module to find this out for yourself. If I call each function, say, a million times and compare how long each run took, I can determine which is faster.

import time

# Time a million dicts built from the generator
time_i = time.time()
[dict(test()) for i in range(10**6)]
time_f = time.time()
print(time_f - time_i)

# Time a million dicts built from the literal
time_i = time.time()
[test2() for i in range(10**6)]
time_f = time.time()
print(time_f - time_i)

Output -

0.7002811431884766
0.6696088314056396

From this it looks like test2 is faster. It's also much easier to write, since for a long dictionary you would have to repeat yield many times.

CodePudding user response：

That's easy to figure out, use timeit:

In [6]: def test1():
   ...:     yield 'a', 4444
   ...:     yield 'b', 5555
   ...: 
   ...: def test2():
   ...:     return {
   ...:         'a': 4444,
   ...:         'b': 5555,
   ...:     }
   ...: 
   ...: from timeit import timeit
   ...: 
   ...: t1 = timeit('dict(test1())', '', globals=globals())
   ...: t2 = timeit('test2()', '', globals=globals())
   ...: 
   ...: print(t1, t2)
0.18616174999999657 0.066665042000011

Why?

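Because dict(test1()) has to create a generator object, resume it once per item and unpack each yielded pair before inserting it, while test2() builds the dict directly from keys and values that were stored as constants at the byte-compilation stage. Note that test2() still creates a new dict on every call, since dicts are mutable; only the keys and values are prepared ahead of time.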
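To check this for yourself, here is a small sketch using the same test1 and test2 as above. It shows that test2() returns a distinct dict object on each call and that test1() only hands back a generator. The opcode names printed at the end are CPython-specific and vary between versions, so treat that part as an inspection aid rather than a guarantee.

```python
import dis
import types

def test1():
    yield 'a', 4444
    yield 'b', 5555

def test2():
    return {
        'a': 4444,
        'b': 5555,
    }

# Each call to test2() builds a fresh dict object, even though the
# keys and values come from constants stored in the code object.
first = test2()
second = test2()
same_object = first is second     # identity check on the two results
same_content = first == second    # equality check on the two results

# Calling test1() only creates a generator object; dict() then has to
# resume it once per item to collect the pairs.
gen = test1()
is_generator = isinstance(gen, types.GeneratorType)

# Collect the opcode names used by test2 to see how the dict is built
# (names differ between CPython versions).
opnames = [ins.opname for ins in dis.get_instructions(test2)]

print(same_object, same_content, is_generator)
print(opnames)
```

If you want the full picture, dis.dis(test1) and dis.dis(test2) print the complete bytecode of both functions.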

What about the idea itself?

Since you are going to materialize that dict in memory anyway, using a generator here improves neither memory consumption nor CPU load.
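A rough way to look at the memory side, assuming the test and test2 functions from the question: compare the two resulting dicts and also look at the temporary generator object. Keep in mind that sys.getsizeof only reports the shallow size of an object, so this is a sketch and not a full memory profile.

```python
import sys

def test():
    yield 'a', 4444
    yield 'b', 5555

def test2():
    return {
        "a": 4444,
        "b": 5555
    }

gen = test()                    # temporary generator object
gen_size = sys.getsizeof(gen)   # shallow size of the generator itself
from_gen = dict(gen)            # materialize the dict from the generator
from_literal = test2()          # build the dict from the literal

# Both approaches end up with an equal dict held in memory.
print(from_gen == from_literal)
# Shallow sizes of the two dicts, plus the extra generator object.
print(sys.getsizeof(from_gen), sys.getsizeof(from_literal), gen_size)
```

The generator is garbage once dict() has consumed it, but it still had to be allocated, so it cannot lower the memory needed compared with the literal.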

So what?

Use dict literals and dict comprehensions where possible. The exception is when you already have a generator yielding (key, value) pairs that you're not supposed to change, and you just want to make a dict from what it yields.
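A minimal sketch of the three cases. The names read_pairs, keys and values are made up for illustration; read_pairs stands in for a generator you already have and shouldn't change.

```python
def read_pairs():
    # Hypothetical existing generator that you are not supposed to change.
    yield 'a', 4444
    yield 'b', 5555

# Contents known up front: use a literal.
literal = {'a': 4444, 'b': 5555}

# Contents derived from other data: use a dict comprehension.
keys = ['a', 'b']
values = [4444, 5555]
comprehension = {k: v for k, v in zip(keys, values)}

# Pairs already come from a generator: pass it straight to dict().
from_generator = dict(read_pairs())

print(literal == comprehension == from_generator)
```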

CodePudding user response：

The dictionaries these produce will be the same size, but the first example will use more memory and take longer. It needs to create the generator and store its state. For each item the generator yields, the interpreter needs to load that state, run the generator until the next item is yielded, then store the state again. If you're not familiar with the call stack, I would recommend doing some reading on that subject, as it will give you an idea of why using a generator involves many more operations than a literal.
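To make the suspend and resume cycle visible, here is a sketch that drives the generator from the question by hand, roughly the way dict() consumes it, and records the generator's state after each step with inspect.getgeneratorstate. The manual loop is only an illustration; dict() does the equivalent work internally.

```python
import inspect

def test():
    yield 'a', 4444
    yield 'b', 5555

gen = test()
states = [inspect.getgeneratorstate(gen)]  # state before the first resume

result = {}
while True:
    try:
        # Resume the generator until it reaches its next yield.
        key, value = next(gen)
    except StopIteration:
        # The generator ran to completion; nothing left to resume.
        break
    result[key] = value
    # The generator is paused again, with its frame kept alive.
    states.append(inspect.getgeneratorstate(gen))

states.append(inspect.getgeneratorstate(gen))  # state after exhaustion

print(result)
print(states)
```

Every pass through the loop is one resume plus one suspend, which is the per-item overhead the literal never pays.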