Are the elements created by numpy.repeat() views of the original numpy.array or unique elements?

I have a 2D array that I would like to repeat 4 times, stacked into a 3D array.

Achieved via a mixture of NumPy and plain Python:

>>> import numpy as np
>>> z = np.arange(9).reshape(3,3)
>>> z
array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])
>>> z2 = []
>>> for i in range(4):
...     z2.append(z)
...
>>> z2
[array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]]), array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]]), array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]]), array([[0, 1, 2],
       [3, 4, 5],
       [6, 7, 8]])]
>>> z2 = np.array(z2)
>>> z2
array([[[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]],

       [[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]],

       [[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]],

       [[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]]])

Achieved via Pure NumPy:

>>> z2 = np.repeat(z[np.newaxis,...], 4, axis=0)
>>> z2
array([[[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]],

       [[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]],

       [[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]],

       [[0, 1, 2],
        [3, 4, 5],
        [6, 7, 8]]])

Are the elements created by numpy.repeat() views of the original numpy.array() or unique elements?

If the latter, is there an equivalent NumPy function that can create views of the original array the same way as numpy.repeat()?

I think such an ability would help reduce the memory used by z2 when z is large and is repeated many times.

CodePudding user response：

If you want a writable version, it is doable, but it's really ugly.

If you want a read-only version, np.broadcast_to(z, (4, 3, 3)) should be all you need.
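As a minimal sketch of the read-only route (assuming the same 3x3 z as in the question; the variable names are illustrative only), the following checks that np.broadcast_to returns a view that shares memory with z and has a stride of 0 on the new leading axis:

```python
import numpy as np

z = np.arange(9).reshape(3, 3)

# Read-only view: no data is copied, and the leading axis has stride 0.
view = np.broadcast_to(z, (4, 3, 3))

shares = np.shares_memory(view, z)   # same underlying buffer as z
leading_stride = view.strides[0]     # 0 bytes between consecutive "repeats"
writeable = view.flags.writeable     # broadcast_to returns a read-only array

# A change made to z is visible through every repeat of the view.
z[1, 1] = 100
seen = view[:, 1, 1].tolist()
```

Because the view is read-only, assigning to view[...] raises an error; modify z itself if the data needs to change.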

Now the ugly writable version. Be careful. You can corrupt memory if you mess the arguments up.

>>> z.shape
(3, 3)
>>> z.strides    # byte strides; (24, 8) assumes 8-byte integers
(24, 8)
>>> from numpy.lib.stride_tricks import as_strided
>>> z2 = as_strided(z, shape=(4, 3, 3), strides=(0, 24, 8))

and you end up with:

>>> z2[1, 1, 1]
4
>>> z2[1, 1, 1] = 100
>>> z2[2, 1, 1]
100

The strides tell NumPy to create a second array overlaid on top of the first one. You set the new shape and prefix a 0 to the original strides, which means that moving along the first dimension does not move through the data at all.

Make sure you understand strides.
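One way to reduce the risk of passing wrong arguments is to derive the new strides from z.strides instead of hard-coding byte counts. This is only a sketch under the assumption that z is the 3x3 array from the question; the names repeats and original are illustrative:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

z = np.arange(9).reshape(3, 3)

# Prefix a 0 stride to z's own strides rather than writing (0, 24, 8),
# so the call stays valid for other dtypes and memory layouts.
z2 = as_strided(z, shape=(4,) + z.shape, strides=(0,) + z.strides)

z2[1, 1, 1] = 100                 # write through one "repeat"
repeats = z2[:, 1, 1].tolist()    # every repeat maps to the same element
original = int(z[1, 1])           # the original array is modified as well
```

All four "copies" alias the same 9 elements, so any in-place operation on z2 touches each element of z once per repeat, which can give surprising results.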

CodePudding user response：

numpy.repeat creates a new array, not a view (you can check this by looking at the __array_interface__ field). In fact, it is not possible to create such a view of the original array in the general case, because NumPy views do not support this pattern. A view is basically just an object containing a pointer to a raw memory buffer, some strides, a shape and a type. While it is possible to repeat one item N times with a stride of 0, it is not possible to repeat 2 items N times each (without adding a new dimension to the output array). Thus, there is no way to build a function like numpy.repeat that repeats the items of the last axis and returns a view with the same output shape.

If adding a new dimension is OK, then you can build an array with a new dimension whose stride is set to 0. Repeating the last dimension is possible, though. The answer of @FrankYellin gives a good example. Note that flattening the resulting array with reshape or ravel forces a copy.

Supporting such advanced views would make the NumPy code more complex and/or less efficient for a feature that users rarely need.
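These claims can be checked with a short sketch, assuming the 3x3 z from the question. np.shares_memory is used here alongside the __array_interface__ data pointer; the variable names are illustrative:

```python
import numpy as np

z = np.arange(9).reshape(3, 3)

# np.repeat allocates a fresh buffer holding all 4 repeats.
repeated = np.repeat(z[np.newaxis, ...], 4, axis=0)
repeat_is_copy = not np.shares_memory(repeated, z)

# The data pointers reported by __array_interface__ differ as well.
same_pointer = (repeated.__array_interface__["data"][0]
                == z.__array_interface__["data"][0])

# A zero-stride view shares z's buffer, but flattening it cannot be
# expressed with strides, so NumPy has to materialize a copy.
view = np.broadcast_to(z, (4, 3, 3))
flat = view.reshape(-1)
flat_is_copy = not np.shares_memory(flat, z)
```

The memory saving of the zero-stride view therefore only holds as long as the array keeps its extra dimension.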