Why is making a tuple from a generator expression slower than making one from a list comprehension?-CodePudding

(This question is based on a now deleted tweet)

I ran the following with both Python3.9 and Python3.10:

$ python -m timeit 'tuple([-x for x in range(1000)])'
10000 loops, best of 5: 35.7 usec per loop
$ python -m timeit 'tuple(-x for x in range(1000))'
5000 loops, best of 5: 45.7 usec per loop

Surprisingly, the version where an extra object is constructed (the list) is faster. Why is this the case? If this is indeed faster, why does the tuple constructor not just instantiate a list under the hood?

Note: I also timed these for length 10 and length 1 million tuples, and the version with the list is always faster.

CodePudding user response：

Try this one:

python -m timeit 'list(-x for x in range(1000))'

I predict you'll find it takes about the same time as your:

python -m timeit 'tuple(-x for x in range(1000))'

and that

python -m timeit 'list([-x for x in range(1000)])'

would again restore the "lost" speed.

This is because list versus tuple actually has nothing to do with it. It's instead that tuple() and list() are just function calls to Python, and may invoke anything whatsoever (by default, those names are bound to built-in functions, but the user may rebind them to anything).

But [expression for vrb in ...] is dedicated syntax, whose meaning user code cannot change. Python knows at compile-time that it's building a list, and generates special code to exploit that. That code never calls list() at all, but instead builds the list directly itself, one element at a time. No generator expression is actually constructed, and there's no overhead either for yielding one element at a time from a generator expression.

Build simple functions containing these kinds of expressions and use dis.dis() to show the generated byte code. The code is very different :-)