Home > Software design >  How to assign [N * 1 * M] values to only the diagonals of a [N*M*M] array?
How to assign [N * 1 * M] values to only the diagonals of a [N*M*M] array?

Time:09-22

I am importing data that comes in a list of N 1D arrays of [x,y,z] and I want to assign them to the diagonals of an N-dimensional array of shape [N,3,3].

The below works... it is very slow due to the loop. Is there a better way to do this via matrix operations?

import numpy as np
a=np.arange(6).reshape(2,3,1)
I=np.zeros((2,3,3))
points=a.shape[0]
for step in range(points):
    np.fill_diagonal(I[step],a[step]
print(I)

  [[[0., 0., 0.],
    [0., 1., 0.],
    [0., 0., 2.]],

   [[3., 0., 0.],
    [0., 4., 0.],
    [0., 0., 5.]]]

CodePudding user response:

Here's one approach (note that you should set the dtype to fit your needs, I used 8-bit integers for the sake of memory):

I = np.zeros((2, 3, 3), dtype="uint8")
x, y, _ = I.shape
idxs = np.arange(y)

I[..., idxs, idxs] = np.arange(x * y).reshape(x, y)

Output:

In [4]: I
Out[4]:
array([[[0, 0, 0],
        [0, 1, 0],
        [0, 0, 2]],

       [[3, 0, 0],
        [0, 4, 0],
        [0, 0, 5]]], dtype=uint8)

A slightly larger example:

In [5]: I = np.zeros((3, 6, 6), dtype="uint8")

In [6]: x, y, _ = I.shape

In [7]: idxs = np.arange(y)

In [8]: I[..., idxs, idxs] = np.arange(x * y).reshape(x, y)

In [9]: I
Out[9]: 
array([[[ 0,  0,  0,  0,  0,  0],
        [ 0,  1,  0,  0,  0,  0],
        [ 0,  0,  2,  0,  0,  0],
        [ 0,  0,  0,  3,  0,  0],
        [ 0,  0,  0,  0,  4,  0],
        [ 0,  0,  0,  0,  0,  5]],

       [[ 6,  0,  0,  0,  0,  0],
        [ 0,  7,  0,  0,  0,  0],
        [ 0,  0,  8,  0,  0,  0],
        [ 0,  0,  0,  9,  0,  0],
        [ 0,  0,  0,  0, 10,  0],
        [ 0,  0,  0,  0,  0, 11]],

       [[12,  0,  0,  0,  0,  0],
        [ 0, 13,  0,  0,  0,  0],
        [ 0,  0, 14,  0,  0,  0],
        [ 0,  0,  0, 15,  0,  0],
        [ 0,  0,  0,  0, 16,  0],
        [ 0,  0,  0,  0,  0, 17]]], dtype=uint8)

Some basic benchmarking:

In [10]: I = np.zeros((10000, 3, 3), dtype="uint8")

In [11]: x, y, _ = I.shape

In [12]: idxs = np.arange(y)

In [13]: %timeit I[..., idxs, idxs] = np.arange(x * y).reshape(x, y)
26.6 µs ± 61.7 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

CodePudding user response:

np.einsum should be rather efficient here:

a=np.arange(6).reshape(2,3,1)
I=np.zeros((2,3,3))

np.einsum("ijj->ij",I)[...] = a[...,0]

I
# array([[[0., 0., 0.],
#         [0., 1., 0.],
#         [0., 0., 2.]],
#
#        [[3., 0., 0.],
#         [0., 4., 0.],
#         [0., 0., 5.]]])

Simple benchmarking:

import numpy as np
from timeit import timeit

def embed_einsum(a):
    n,m = a.shape
    out = np.zeros((n,m,m),a.dtype)
    np.einsum("ijj->ij",out)[...] = a                        
    return out

def embed_fancy_idx(a):
    n,m = a.shape
    out = np.zeros((n,m,m),a.dtype)
    idx = np.arange(m)
    out[:,idx,idx] = a
    return out

a = np.arange(6).reshape(2,3)
assert (embed_einsum(a)==embed_fancy_idx(a)).all()

for a in (np.arange(3*n).reshape(n,3) for n in (2,200,20000,2000000)):
          print("a.shape =",a.shape)
          print('einsum',timeit(lambda:embed_einsum(a),number=10)*100,'ms')
          print('fancy indexing',
                timeit(lambda:embed_fancy_idx(a),number=10)*100,'ms')

Result:

a.shape = (2, 3)
einsum 0.004329100192990154 ms
fancy indexing 0.005347799742594361 ms
a.shape = (200, 3)
einsum 0.00677639982313849 ms
fancy indexing 0.005546100146602839 ms
a.shape = (20000, 3)
einsum 0.30451889979303814 ms
fancy indexing 0.2589278003142681 ms
a.shape = (2000000, 3)
einsum 57.06863939994946 ms
fancy indexing 106.78901110004517 ms

einsum comes out on top for very large operands, but @ddejohn's method is faster for up to ~100000 elements on my rig.

  • Related