In the following I time 10_000_000
checks of whether if 10
is in {0, ..., 9}
.
In the first check I use an intermediate variable and in the second one I use a literal.
import timeit
x = 10
s = set(range(x))
number = 10 ** 7
stmt = f'my_set = {s} ; {x} in my_set'
print(f'eval "{stmt}"')
print(timeit.timeit(stmt=stmt, number=number))
stmt = f'{x} in {s}'
print(f'eval "{stmt}"')
print(timeit.timeit(stmt=stmt, number=number))
Output:
eval "my_set = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} ; 10 in my_set"
1.2576093
eval "10 in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}"
0.20336140000000036
How is it that the second one is way faster (by a factor of 5-6 approximately)? Is there some runtime optimisation performed by Python, e.g., if the inclusion check if made on a literal? Or maybe is it due to garbage collection (since it is a literal python garbage collects it right after use)?
CodePudding user response:
You're not really testing the same two things. In the first test, you're timing two assignments and lookups in addition to the membership test.
In [1]: import dis
In [2]: x = 10
In [3]: s = set(range(x))
In [4]: dis.dis("x in s")
1 0 LOAD_NAME 0 (x)
2 LOAD_NAME 1 (s)
4 CONTAINS_OP 0
6 RETURN_VALUE
In [5]: dis.dis("my_set = s; x in my_set")
1 0 LOAD_NAME 0 (s)
2 STORE_NAME 1 (my_set)
4 LOAD_NAME 2 (x)
6 LOAD_NAME 1 (my_set)
8 CONTAINS_OP 0
10 POP_TOP
12 LOAD_CONST 0 (None)
14 RETURN_VALUE
# By request
In [6]: dis.dis("s = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}; 10 in s")
1 0 BUILD_SET 0
2 LOAD_CONST 0 (frozenset({0, 1, 2, 3, 4, 5, 6, 7, 8, 9}))
4 SET_UPDATE 1
6 STORE_NAME 0 (s)
8 LOAD_CONST 1 (10)
10 LOAD_NAME 0 (s)
12 CONTAINS_OP 0
14 POP_TOP
16 LOAD_CONST 2 (None)
18 RETURN_VALUE
The actual difference between using literals and x in s
is that the latter needs to go perform a lookup in globals, i.e., the difference is LOAD_NAME
vs LOAD_CONST
:
In [7]: dis.dis("10 in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}")
1 0 LOAD_CONST 0 (10)
2 LOAD_CONST 1 (frozenset({0, 1, 2, 3, 4, 5, 6, 7, 8, 9}))
4 CONTAINS_OP 0
6 RETURN_VALUE
Times:
In [8]: %timeit x in s
28.5 ns ± 0.792 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
In [9]: %timeit 10 in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
20.3 ns ± 0.384 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)