Does NumPy array really take less memory than python list?

Time:11-27

Please refer to the code below:

import sys
import numpy as np

_list = [2,55,87]
print(f'1 - Memory used by Python List - {sys.getsizeof(_list)}')
      
narray = np.array([2,55,87])
size = narray.size * narray.itemsize
print(f'2 - Memory usage of np array using itemsize  - {size}')
print(f'3 - Memory usage of np array using getsizeof  - {sys.getsizeof(narray)}')

Here is the result I get:

1 - Memory used by Python List - 80
2 - Memory usage of np array using itemsize  - 12
3 - Memory usage of np array using getsizeof  - 116

One calculation suggests the NumPy array consumes far less memory, but the other says it consumes more than a regular Python list. Shouldn't I be using sys.getsizeof with a NumPy array? What am I doing wrong here?

Edit - I just checked: an empty Python list consumes 56 bytes, whereas an empty NumPy array consumes 104. Is this space used for pointing to the associated built-in methods and attributes?

CodePudding user response:

The calculation using:

size = narray.size * narray.itemsize

does not include the memory consumed by non-element attributes of the array object. This can be verified with the example from the documentation of ndarray.nbytes:

>>> x = np.zeros((3,5,2), dtype=np.complex128)
>>> x.nbytes
480
>>> np.prod(x.shape) * x.itemsize
480

The same documentation states that ndarray.nbytes:

Does not include memory consumed by non-element attributes of the array object.

Since your calculation (size * itemsize) gives the same value as ndarray.nbytes in the code above, you can conclude that it likewise excludes the non-element attributes.
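To make the difference concrete, here is a minimal sketch that compares sys.getsizeof with nbytes for a small and a large array. It assumes CPython and a reasonably recent NumPy; the variable names are only illustrative, and the exact header size depends on the NumPy version and the number of dimensions.

```python
import sys
import numpy as np

small = np.zeros(3, dtype=np.int32)
large = np.zeros(1_000_000, dtype=np.int32)

# For arrays that own their data, getsizeof is the element bytes plus a
# fixed-size header, so the difference should not grow with the array.
small_overhead = sys.getsizeof(small) - small.nbytes
large_overhead = sys.getsizeof(large) - large.nbytes
print("overhead with 3 elements:        ", small_overhead)
print("overhead with 1,000,000 elements:", large_overhead)

# A view does not own its buffer, so getsizeof reports only the header,
# while nbytes still reports the size of the elements it can see.
view = large[:]
print("view nbytes:   ", view.nbytes)
print("view getsizeof:", sys.getsizeof(view))
```

The fixed overhead is why a 3-element array looks larger than a 3-element list, while for large arrays it becomes negligible.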

A list of the non-element attributes can be found in the Array Attributes section of the documentation, reproduced here for completeness:

ndarray.flags: Information about the memory layout of the array.
ndarray.shape: Tuple of array dimensions.
ndarray.strides: Tuple of bytes to step in each dimension when traversing an array.
ndarray.ndim: Number of array dimensions.
ndarray.data: Python buffer object pointing to the start of the array’s data.
ndarray.size: Number of elements in the array.
ndarray.itemsize: Length of one array element in bytes.
ndarray.nbytes: Total bytes consumed by the elements of the array.
ndarray.base: Base object if memory is from some other object.

With regard to sys.getsizeof, its documentation states that:

Only the memory consumption directly attributed to the object is accounted for, not the memory consumption of objects it refers to.
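For a list this means sys.getsizeof counts the list object and its array of pointers, but not the element objects themselves. The following is a rough sketch of that effect, assuming CPython; deep_list_size is a hypothetical helper written for this illustration, not a standard library function, and it only looks one level deep.

```python
import sys

def deep_list_size(items):
    """Size of the list object plus each distinct element object (one level deep)."""
    seen = set()
    total = sys.getsizeof(items)
    for item in items:
        # Count every element object once, even if it appears several times.
        if id(item) not in seen:
            seen.add(id(item))
            total += sys.getsizeof(item)
    return total

values = [float(i) for i in range(1000)]
shallow = sys.getsizeof(values)   # list object and its pointers only
deep = deep_list_size(values)     # also includes the float objects

print("getsizeof(list):       ", shallow)
print("list plus its elements:", deep)
```

So comparing sys.getsizeof of a list against sys.getsizeof of an array tends to understate what the list really costs.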

CodePudding user response:

Because NumPy arrays have shapes, strides, and other member variables that define the data layout, it is reasonable that they require some extra memory for this!

A list, on the other hand, has no specific element type, shape, etc.

However, if you build the list by appending elements instead of writing it out as a literal, and go to a larger number of elements, e.g. 1e7, you will see different behaviour!

Note that the memory footprint is not the only thing that matters for an application's efficiency! The data layout is sometimes far more important.

Example case:

import numpy as np
import sys

N = int(1e7)

narray = np.zeros(N)
mylist = []

for i in range(N):
    mylist.append(narray[i])

print("size of np.array:", sys.getsizeof(narray))
print("size of list    :", sys.getsizeof(mylist))

On my (ASUS) Ubuntu 20.04 PC I get:

size of np.array: 80000104
size of list    : 81528048
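The list figure above is larger than 8 bytes per pointer times 1e7 elements, which is consistent with the list reserving spare slots as it grows through append. A small sketch of that effect follows, assuming CPython; the names are illustrative and the exact numbers depend on the interpreter version and platform.

```python
import struct
import sys

N = 100_000

# Grown one element at a time: the list over-allocates to make appends cheap.
appended = []
for _ in range(N):
    appended.append(0.0)

# Created in one step, so no growth-related spare capacity is needed.
preallocated = [0.0] * N

pointer_size = struct.calcsize("P")   # bytes per element slot (pointer)
empty_size = sys.getsizeof([])

appended_slots = (sys.getsizeof(appended) - empty_size) // pointer_size
preallocated_slots = (sys.getsizeof(preallocated) - empty_size) // pointer_size

print("slots reserved after append loop:", appended_slots)
print("slots reserved by [0.0] * N:     ", preallocated_slots)
```

Both lists hold N elements, but the appended one typically reserves more slots, so its sys.getsizeof is typically larger.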