I have an array in the following form where the first two columns are supposed to be indices of a 2-dimensional array and the following columns are arbitrary values.

data = np.array([[ 0. ,  1. , 48. ,  4. ],
                 [ 1. ,  2. , 44. ,  4.4],
                 [ 1. ,  1. , 34. ,  2.3],
                 [ 0. ,  2. , 55. ,  2.2],
                 [ 0. ,  0. , 42. ,  2. ],
                 [ 1. ,  0. , 22. ,  1. ]])

How do I combine the indices data[:,:2] with their values data[:,2:] so that the resulting array can be accessed by the indices in the first two columns?

In my example that would be:

result = np.array([[[42. ,  2. ], [48. ,  4. ], [55. ,  2.2]],
                   [[22. ,  1. ], [34. ,  2.3], [44. ,  4.4]]])

I know that there is a trivial solution using Python loops, but performance is a concern since I'm dealing with a huge amount of data. Specifically, it's the output of another program that I need to process.
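
For reference, the loop-based baseline mentioned above might look like the following sketch. The names `n_rows`, `n_cols` and the loop variables are illustrative, and it assumes the index columns hold whole numbers. It is only meant to pin down the desired result, not to suggest an approach.

```python
import numpy as np

data = np.array([[0., 1., 48., 4. ],
                 [1., 2., 44., 4.4],
                 [1., 1., 34., 2.3],
                 [0., 2., 55., 2.2],
                 [0., 0., 42., 2. ],
                 [1., 0., 22., 1. ]])

# Output dimensions derived from the largest index in each index column.
n_rows = int(data[:, 0].max()) + 1
n_cols = int(data[:, 1].max()) + 1
result = np.empty((n_rows, n_cols, data.shape[1] - 2))

# Visit each input row once and place its values at result[i, j].
for row in data:
    i, j = int(row[0]), int(row[1])
    result[i, j] = row[2:]
```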

Maybe there is a relatively simple NumPy solution as well, but I'm kind of stuck.

If it helps, the following can be safely assumed:

All numbers in the first two columns are whole numbers (although the array consists of floats).
Every possible index (or rather, combination of indices) in the original array is used exactly once, i.e. there is guaranteed to be exactly one entry of the form [i, j, ...].
The indices start at 0 and I know the highest indices beforehand.

Edit:

Hmm. I see now how my example is misleading. The truth is that some of my input arrays are sorted, but that's unreliable. So I shouldn't assume anything about the order. I reordered some rows in my example to make it clearer. In case anyone wants to make sense of the answer and comment below: In my original question the array appeared to be sorted by the first two columns.

Answer:

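Determine the number of rows, the number of columns, and the depth from your data array, then reshape the value columns as shown below. Note that this relies on the rows being sorted by the first two columns: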

import numpy as np
data = np.array([[ 0. ,  0. , 42. ,  2. ],
                 [ 0. ,  1. , 48. ,  4. ],
                 [ 0. ,  2. , 55. ,  2.2],
                 [ 1. ,  0. , 22. ,  1. ],
                 [ 1. ,  1. , 34. ,  2.3],
                 [ 1. ,  2. , 44. ,  4.4]])

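row = int(data[:, 0].max()) + 1   # size of the first axis (largest first index + 1)
col = int(data[:, 1].max()) + 1   # size of the second axis (largest second index + 1)
depth = data.shape[1] - 2         # number of value columns

# Reshaping is only correct when the rows are sorted by
# (first index, second index), as they are in this example.
out = data[:, 2:].reshape(row, col, depth)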
print(out)

Output:

[[[42.   2. ]
  [48.   4. ]
  [55.   2.2]]

 [[22.   1. ]
  [34.   2.3]
  [44.   4.4]]]
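
Since the question's edit says the row order cannot be relied on, a plain reshape may put values at the wrong positions. A minimal sketch of an order-independent variant is to cast the index columns to integers and assign through NumPy integer-array indexing. It assumes, as stated in the question, that the first two columns hold whole numbers and that every index pair occurs exactly once. The names `idx`, `shape`, `order` and `out_sorted` are illustrative and not part of the original answer. A sort-then-reshape alternative using `np.lexsort` is included for comparison.

```python
import numpy as np

# Unsorted input from the question: columns 0-1 are indices, the rest are values.
data = np.array([[0., 1., 48., 4. ],
                 [1., 2., 44., 4.4],
                 [1., 1., 34., 2.3],
                 [0., 2., 55., 2.2],
                 [0., 0., 42., 2. ],
                 [1., 0., 22., 1. ]])

# Index columns as integers (assumes they hold whole numbers).
idx = data[:, :2].astype(int)

# Output shape: (largest first index + 1, largest second index + 1, value columns).
shape = (idx[:, 0].max() + 1, idx[:, 1].max() + 1, data.shape[1] - 2)

# Scatter each row's values to out[i, j]; the input row order does not matter.
out = np.empty(shape, dtype=data.dtype)
out[idx[:, 0], idx[:, 1]] = data[:, 2:]

# Alternative: sort rows by (first index, second index), then reshape.
# np.lexsort treats the LAST key as the primary sort key.
order = np.lexsort((data[:, 1], data[:, 0]))
out_sorted = data[order, 2:].reshape(shape)

print(out)
```

With the example input, both `out` and `out_sorted` should match the `result` array from the question. Starting from `np.empty` is only safe because every index pair is assumed to be present; if some pairs could be missing, `np.zeros(shape)` or `np.full(shape, np.nan)` would be a safer starting point.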