I just noticed this:
import numpy as np
import sys
arr = np.broadcast_to(0, (10, 1000000000000))
print(arr.nbytes) # prints "80000000000000"
print(sys.getsizeof(arr)) # prints "120"
Is this a bug or intended behavior? I.e., is nbytes meant to hold the number of "logical" bytes, not accounting for 0-strides?
Answer:
While I don't see it documented, nbytes does look like the product of the shape's dimensions and itemsize, i.e. arr.size*arr.itemsize.
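A quick check of that relationship, as a minimal sketch (the shape here is arbitrary and chosen only to keep the numbers small):

```python
import math
import numpy as np

# Zero-stride view: no large allocation actually happens here.
arr = np.broadcast_to(0, (10, 1_000_000))

# nbytes follows the logical shape and ignores the strides.
logical = math.prod(arr.shape) * arr.itemsize
print(arr.nbytes == arr.size * arr.itemsize == logical)  # True
print(arr.strides)  # (0, 0)
```

So the zero strides do not enter the calculation at all; only shape and itemsize do.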
In all the examples I've looked at, nbytes uses the array's own shape/size, not that of its base. So I wouldn't read too much into the word "consumed" used in the docs.
Your example:
In [117]: arr = np.broadcast_to(0,(1,2,3))
In [119]: arr.shape, arr.strides, arr.nbytes
Out[119]: ((1, 2, 3), (0, 0, 0), 24)
In [120]: arr.base
Out[120]: array(0)
In [121]: arr.base.nbytes
Out[121]: 4
The broadcasted array is a view of a much smaller one; nbytes reflects its own shape, not the shape of the base.
To take another example, where the view is a subset of the base:
In [122]: np.arange(100).nbytes
Out[122]: 400
In [123]: np.arange(100)[::4].nbytes
Out[123]: 100
The code for broadcast_to is viewable at np.lib.stride_tricks._broadcast_to. It uses np.nditer to generate the new view.
sys.getsizeof does a reasonable job of returning memory use for an array with its own data (i.e. base is None). It does not provide any useful information for a view.
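If what you want is the size of the buffer that actually backs a view, one option is to follow the base chain yourself. This is only a sketch: owning_buffer is a hypothetical helper, not a NumPy function, and it assumes the chain ends in an ndarray (it may not, e.g. for arrays built on a bytes object or a memory map).

```python
import sys
import numpy as np

def owning_buffer(a):
    """Hypothetical helper: follow .base links until an object has no base."""
    obj = a
    while getattr(obj, "base", None) is not None:
        obj = obj.base
    return obj

owner_arr = np.zeros(1)                        # owns its data
view = np.broadcast_to(owner_arr, (10, 1000))  # zero-stride view of it

root = owning_buffer(view)
print(view.nbytes)   # logical size: 10 * 1000 * itemsize
print(root.nbytes)   # size of the buffer that was actually allocated
print(sys.getsizeof(view) < view.nbytes)  # True: getsizeof sees only the view's header
```

Comparing view.nbytes with root.nbytes gives a rough idea of how much "larger" the view looks than the memory it really uses.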
sliding_windows
Another example of striding tricks used to make a "larger" array:
In [180]: arr = np.arange(16).reshape(4,4).copy()
In [181]: arr.shape, arr.strides, arr.nbytes
Out[181]: ((4, 4), (16, 4), 64)
In [182]: res = np.lib.stride_tricks.sliding_window_view(arr,(2,2))
In [183]: res.shape, res.strides, res.nbytes
Out[183]: ((3, 3, 2, 2), (16, 4, 16, 4), 144)
It's a view of the original (4,4) arr:
In [184]: res.base
Out[184]: <numpy.lib.stride_tricks.DummyArray at 0x1fa8e7cc730>
In [185]: res.base.base
Out[185]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15]])
In [186]: res.base.base is arr
Out[186]: True
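The same point can be checked without inspecting base, using np.shares_memory. A minimal sketch (it assumes a NumPy version that provides sliding_window_view); copying the view is what actually allocates the full nbytes:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

arr = np.arange(16).reshape(4, 4).copy()
res = sliding_window_view(arr, (2, 2))

# The windows overlap and all point into arr's buffer.
print(np.shares_memory(res, arr))  # True

# A copy materializes every window, so it owns a buffer of res.nbytes bytes.
copied = res.copy()
print(copied.flags.owndata, copied.nbytes == res.nbytes)  # True True
```

For the view, nbytes is a logical size; for the copy, it is the real allocation.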