Set inclusion faster with set literals

Time:10-21

In the following I time 10_000_000 checks of whether 10 is in {0, ..., 9}.

In the first check I use an intermediate variable and in the second one I use a literal.

import timeit

x = 10
s = set(range(x))
number = 10 ** 7

stmt = f'my_set = {s} ; {x} in my_set'
print(f'eval "{stmt}"')
print(timeit.timeit(stmt=stmt, number=number))

stmt = f'{x} in {s}'
print(f'eval "{stmt}"')
print(timeit.timeit(stmt=stmt, number=number))

Output:

eval "my_set = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9} ; 10 in my_set"
1.2576093
eval "10 in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}"
0.20336140000000036

How is it that the second one is so much faster (by a factor of roughly 5-6)? Is there some runtime optimisation performed by Python, e.g., when the inclusion check is made on a literal? Or is it due to garbage collection (since it is a literal, Python garbage-collects it right after use)?

CodePudding user response:

You're not really testing the same two things. In the first test, you're timing the construction of a new set, an assignment to my_set, and a name lookup, in addition to the membership test.

In [1]: import dis

In [2]: x = 10

In [3]: s = set(range(x))

In [4]: dis.dis("x in s")
  1           0 LOAD_NAME                0 (x)
              2 LOAD_NAME                1 (s)
              4 CONTAINS_OP              0
              6 RETURN_VALUE

In [5]: dis.dis("my_set = s; x in my_set")
  1           0 LOAD_NAME                0 (s)
              2 STORE_NAME               1 (my_set)
              4 LOAD_NAME                2 (x)
              6 LOAD_NAME                1 (my_set)
              8 CONTAINS_OP              0
             10 POP_TOP
             12 LOAD_CONST               0 (None)
             14 RETURN_VALUE

# By request
In [6]: dis.dis("s = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}; 10 in s")
  1           0 BUILD_SET                0
              2 LOAD_CONST               0 (frozenset({0, 1, 2, 3, 4, 5, 6, 7, 8, 9}))
              4 SET_UPDATE               1
              6 STORE_NAME               0 (s)
              8 LOAD_CONST               1 (10)
             10 LOAD_NAME                0 (s)
             12 CONTAINS_OP              0
             14 POP_TOP
             16 LOAD_CONST               2 (None)
             18 RETURN_VALUE

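The actual difference between using literals and x in s is that the latter has to look up both names in globals, whereas the literal version loads precomputed constants (the set literal is compiled to a frozenset constant), i.e., the difference is LOAD_NAME vs LOAD_CONST: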

In [7]: dis.dis("10 in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}")
  1           0 LOAD_CONST               0 (10)
              2 LOAD_CONST               1 (frozenset({0, 1, 2, 3, 4, 5, 6, 7, 8, 9}))
              4 CONTAINS_OP              0
              6 RETURN_VALUE
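
The same comparison can be checked programmatically rather than by reading the disassembly. This is a minimal sketch, assuming CPython: opcode names are an implementation detail and may differ between versions, and `opnames` is a hypothetical helper written for this example, not part of `dis`.

```python
import dis

def opnames(source):
    """Return the opcode names CPython emits for a statement string."""
    code = compile(source, "<example>", "exec")
    return [ins.opname for ins in dis.get_instructions(code)]

literal = "{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}"

# Literal used directly as the right-hand side of `in`.
literal_ops = opnames(f"10 in {literal}")
# Literal first bound to a name, as in the question's first test.
assigned_ops = opnames(f"my_set = {literal}; 10 in my_set")

# A BUILD_SET opcode means a fresh set object is created on every execution.
print("literal version builds a set: ", "BUILD_SET" in literal_ops)
print("assigned version builds a set:", "BUILD_SET" in assigned_ops)
```

If the assigned version reports a BUILD_SET and the literal version does not, the first benchmark is paying for set construction on every iteration, which the second one never does.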

Times:

In [8]: %timeit x in s
28.5 ns ± 0.792 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)

In [9]: %timeit 10 in {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
20.3 ns ± 0.384 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
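
To make the question's script compare like with like, the set can be built once in the `setup` argument of `timeit.timeit`, so that only the membership test is timed. The following is a sketch under that assumption; the variable names (`t_prebuilt`, `t_rebuilt`, `t_literal`) and the reduced iteration count are choices made for this example, and absolute timings will vary by machine and Python version.

```python
import timeit

number = 10 ** 6  # fewer iterations than in the question, to keep the run short
literal = "{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}"

# Set built once in setup; only the name lookup and membership test are timed.
t_prebuilt = timeit.timeit("10 in my_set", setup=f"my_set = {literal}", number=number)

# Set rebuilt and reassigned on every iteration, as in the question's first test.
t_rebuilt = timeit.timeit(f"my_set = {literal}; 10 in my_set", number=number)

# Literal used directly on the right-hand side of `in`.
t_literal = timeit.timeit(f"10 in {literal}", number=number)

print(f"prebuilt set: {t_prebuilt:.4f} s")
print(f"rebuilt set:  {t_rebuilt:.4f} s")
print(f"literal:      {t_literal:.4f} s")
```

Note that `timeit` compiles the statement inside a function, so `my_set` is a local variable there rather than a global as in the IPython session above.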