Why is single-value tuple plus list is more efficient than appending a number?-CodePudding

Assume we need to append single number 1 to array a.

In Python we have 5 obvious ways:

a.append(1)
a = [1]
a = 1,
a.extend((1,))
a.extend([1])

Let's measure it with timeit :

from timeit import timeit

print(timeit("a.append(1)", "a = []", number=10_000_000))
print(timeit("a  = [1]", "a = []", number=10_000_000))
print(timeit("a  = 1,", "a = []", number=10_000_000))
print(timeit("a.extend((1,))", "a = []", number=10_000_000))
print(timeit("a.extend([1])", "a = []", number=10_000_000))

Here is the console's output:

5.05412472199896
5.869792026000141
3.1280645619990537
4.988895307998973
8.05588494499898

Why is third one more efficient than others?

CodePudding user response：

The creation of a tuple (1,) is optimized away by the compiler. On the other hand, the list is always created. Look at dis.dis

>>> import dis
>>> dis.dis('a.extend((1,))')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_CONST               0 ((1,))
              6 CALL_METHOD              1
              8 RETURN_VALUE
>>> dis.dis('a.extend([1])')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_CONST               0 (1)
              6 BUILD_LIST               1
              8 CALL_METHOD              1
             10 RETURN_VALUE

Notice, it takes less byte-code instructions, and merely does a LOAD_CONST on (1,). On the other hand, for the list, BUILD_LIST is called (with a LOAD_CONST for 1).

Note, you can access these constants on the code object:

>>> code = compile('a.extend((1,))', '', 'eval')
>>> code
<code object <module> at 0x10e91e0e0, file "", line 1>
>>> code.co_consts
((1,),)

Finally, as to why = is faster than .extend, well, again if you look at the bytecode:

>>> dis.dis('a  = b')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_NAME                1 (b)
              4 INPLACE_ADD
              6 STORE_NAME               0 (a)
              8 LOAD_CONST               0 (None)
             10 RETURN_VALUE
>>> dis.dis('a.extend(b)')
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (extend)
              4 LOAD_NAME                2 (b)
              6 CALL_METHOD              1
              8 RETURN_VALUE

You'll notice for .extend, it that requires first resolving the method (which takes extra time). Using the operator on the other hand has it's own bytecode: INPLACE_ADD so everything is pushed down into that C layer (plus, magic methods skip instance namespaces and a bunch of hooplah and are looked up directly on the class).

CodePudding user response：

Ok, to sum up, what @juanpa.arrivillaga, @Samwise and @Barmar mentioned:

a = (1, ) is equiualent to a.__iadd__((1, )) but without loading methods. If we look at dis:

>>> dis.dis("a.__iadd__((1,))")
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (__iadd__)
              4 LOAD_CONST               0 ((1,))
              6 CALL_METHOD              1
              8 RETURN_VALUE

>>> dis.dis("a  = (1, )")
  1           0 LOAD_NAME                0 (a)
              2 LOAD_CONST               0 ((1,))
              4 INPLACE_ADD
              6 STORE_NAME               0 (a)
              8 LOAD_CONST               1 (None)
             10 RETURN_VALUE

>>> dis.dis("a.append(1)")
  1           0 LOAD_NAME                0 (a)
              2 LOAD_METHOD              1 (append)
              4 LOAD_CONST               0 (1)
              6 CALL_METHOD              1
              8 RETURN_VALUE

You can see that in first and third cases we needed to use LOAD_METHOD before call and this is most resourse-mean part, whilst = have direct disassembler

Btw, first case is more worse than previous five and got 8.292503296999712 on timeit

CodePudding user response：

The difference between the tuple and list is entirely due to the difference in how much time it takes to create each. This is easy to verify by timing the list/tuple creation on its own, and by timing the = operation independently of the creation of the objects being added:

>>> timeit("b = [1]", number=10_000_000)
0.447727799997665
>>> timeit("b = (1,)", number=10_000_000)
0.17419059993699193
>>> timeit("a  = b", "a, b = [], [1]", number=10_000_000)
0.5244328000117093
>>> timeit("a  = b", "a, b = [], (1,)", number=10_000_000)
0.5320590999908745