Why list([]) weighs less than []


I have this code:

from sys import getsizeof as sizer

test_list = [i for i in range(10)]
test_list2 = list([i for i in range(10)])
test_list3 = []
for i in range(10):
    test_list3.append(i)

print(f"""{sizer(test_list)} bytes: {test_list}
{sizer(test_list2)} bytes: {test_list2}
{sizer(test_list3)} bytes: {test_list3}""")

It prints the following result:

184 bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
136 bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
184 bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

The question is: why does a list created with list([...]) weigh less than a list created with a plain comprehension or by appending in a for loop?

CodePudding user response:

Lists in Python do not have a fixed size. They expand and contract as items are added and removed. For that to happen, the list sometimes needs to be moved to a different place in memory. Since copying on every added or removed element would slow the operations down a lot, Python can allocate some spare space for the list when it decides to do the move. Depending on the way you create a list, its size might be exactly what is needed, or it might include some of those spare cells waiting to be filled with new elements.

If we append one more element, we can see the difference disappears:

for i in range(1):
    test_list.append(i)
    test_list2.append(i)
    test_list3.append(i)


print(f"""{sizer(test_list)} bytes: {test_list}
{sizer(test_list2)} bytes: {test_list2}
{sizer(test_list3)} bytes: {test_list3}""")
184 bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
184 bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
184 bytes: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
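
To watch that spare space being handed out, here is a minimal sketch (assuming CPython and the sizer alias from the question; exact byte counts vary by version and platform). It appends items one at a time and only prints when getsizeof reports a jump, so each printed line corresponds to one over-allocation that leaves room for the next few appends:

from sys import getsizeof as sizer

growing = []
last_size = sizer(growing)
print(f"{last_size} bytes with {len(growing)} items")
for i in range(32):
    growing.append(i)
    size = sizer(growing)
    if size != last_size:
        # The size jumps by more than one pointer at a time: that extra
        # room is the spare space described above.
        print(f"{size} bytes with {len(growing)} items")
        last_size = size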

CodePudding user response:

(The details in this answer depend on the implementation; they are written to match CPython 3.10.0. Other versions or other implementations of Python may work differently.)

Lists in Python are dynamically sized; unfortunately, our computers live in hardware land, where memory is physical and can't be squished or stretched.

So if you make your list longer by appending something to it, Python may have to allocate a new block of memory and copy all the old cells into it.

Now this is all pretty fast most of the time, but it adds up. If you were to add 1000 elements to an empty list one at a time without any over-allocation, you would have to copy roughly half a million cells in total.
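
A quick back-of-the-envelope check of that half-million figure (a simplified model, not what CPython actually does, precisely because over-allocation avoids most of these copies): appending the k-th item to a list that never over-allocates would mean copying the k-1 items already there, so for 1000 appends:

copies = sum(range(1000))
print(copies)  # 499500 cell copies in the worst case, roughly half a million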

So to make sure it doesn't have to do this too often, Python usually over-allocates the space needed. For example, if you add a single item to an empty list, Python will allocate more than one cell, so it doesn't have to allocate again for the next few items you add. This is true whether you add items with list.append or build the list with a comprehension.
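
As a small illustration of that first over-allocation (CPython-specific; the exact numbers differ between versions and 32/64-bit builds):

from sys import getsizeof as sizer

one_item = []
before = sizer(one_item)   # 56 bytes on a 64-bit CPython 3.10 build
one_item.append(0)
after = sizer(one_item)    # 88 bytes there: room for four 8-byte pointers,
                           # even though only one item was added
print(before, "->", after)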

Now there are other ways to create lists. Notably, you can create a list from an existing iterable object with list(obj). This works for any iterable, but if len(obj) works, Python only has to allocate memory once, using the exact number of cells it is going to need. That is the difference you are seeing. You could also write list(range(10)), which skips the intermediate comprehension list (which does over-allocate) entirely.
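
Here is a rough comparison of the three construction routes (again CPython-specific; the 136 and 184 figures match the question's output on a 64-bit 3.10 build, other versions may report different numbers):

from sys import getsizeof as sizer

print(sizer(list(range(10))))             # 136 bytes: len() is known, so exactly 10 cells are allocated
print(sizer([i for i in range(10)]))      # 184 bytes: built item by item, so it carries spare cells
print(sizer(list(i for i in range(10))))  # a generator has no usable length, so this one may also
                                          # end up with spare capacity; the figure varies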

As a bonus, try this:

from sys import getsizeof as sizer

lists = [[], [1], [1, 2], [1, 2, 3]]
for lst in lists:
    print(sizer(lst), lst)

What is the pattern here? Do you know why the pattern is like that?
