Vectorized way to contract Numpy array using advanced indexing-CodePudding

I have a Numpy array of dimensions (d1,d2,d3,d4), for instance A = np.arange(120).reshape((2,3,4,5)). I would like to contract it so as to obtain B of dimensions (d1,d2,d4). The d3-indices of parts to pick are collected in an indexing array Idx of dimensions (d1,d2). Idx provides, for each couple (x1,x2) of indices along (d1,d2), the index x3 for which B should retain the whole corresponding d4-line in A, for example Idx = rng.integers(4, size=(2,3)).

To sum up, for all (x1,x2), I want B[x1,x2,:] = A[x1,x2,Idx[x1,x2],:].

Is there an efficient, vectorized way to do that, without using a loop? I'm aware that this is similar to Easy way to do nd-array contraction using advanced indexing in Python but I have trouble extending the solution to higher dimensional arrays.

MWE

A = np.arange(120).reshape((2,3,4,5))
Idx = rng.integers(4, size=(2,3))

# correct result:
B = np.zeros((2,3,5))
for i in range(2):
    for j in range(3):
        B[i,j,:] = A[i,j,Idx[i,j],:]

# what I would like, which doesn't work:
B = A[:,:,Idx[:,:],:]

CodePudding user response：

One way to do that is np.squeeze(np.take_along_axis(A, Idx[:,:,None,None], axis=2), axis=2).

For example,

In [49]: A = np.arange(120).reshape(2, 3, 4, 5)

In [50]: rng = np.random.default_rng(0xeeeeeeeeeee)

In [51]: Idx = rng.integers(4, size=(2,3))

In [52]: Idx
Out[52]: 
array([[2, 0, 1],
       [0, 2, 1]])

In [53]: C = np.squeeze(np.take_along_axis(A, Idx[:,:,None,None], axis=2), axis=2)

In [54]: C
Out[54]: 
array([[[ 10,  11,  12,  13,  14],
        [ 20,  21,  22,  23,  24],
        [ 45,  46,  47,  48,  49]],

       [[ 60,  61,  62,  63,  64],
        [ 90,  91,  92,  93,  94],
        [105, 106, 107, 108, 109]]])

Check the known correct result:

In [55]: # correct result:
    ...: B = np.zeros((2,3,5))
    ...: for i in range(2):
    ...:     for j in range(3):
    ...:         B[i,j,:] = A[i,j,Idx[i,j],:]
    ...: 

In [56]: B
Out[56]: 
array([[[ 10.,  11.,  12.,  13.,  14.],
        [ 20.,  21.,  22.,  23.,  24.],
        [ 45.,  46.,  47.,  48.,  49.]],

       [[ 60.,  61.,  62.,  63.,  64.],
        [ 90.,  91.,  92.,  93.,  94.],
        [105., 106., 107., 108., 109.]]])

CodePudding user response：

Times for 3 alternatives:

In [91]: %%timeit
    ...: B = np.zeros((2,3,5),A.dtype)
    ...: for i in range(2):
    ...:     for j in range(3):
    ...:         B[i,j,:] = A[i,j,Idx[i,j],:]
    ...: 

11 µs ± 48.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [92]: timeit A[np.arange(2)[:,None],np.arange(3),Idx]
8.58 µs ± 44 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In [94]: timeit np.squeeze(np.take_along_axis(A, Idx[:,:,None,None], axis=2), axis=2)
29.4 µs ± 448 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Relative times may differ with larger arrays. But this is a good size for testing the correctness.