How to combine many 2D NumPy arrays into a 3D array with padding

I have several numpy 2D arrays containing coordinates. I want to combine all of these 2D arrays into a single 3D array (lists of coordinates), with all missing coordinates padded with [0,0] to make each list the same size. I have something working using np.pad for combining 1D arrays to 2D, but can't get it working to go from 2D to 3D.

For example:

[[0.1,0.1], [0.2,0.2], [0.3,0.3]]

Now, the 2D arrays have different sizes, i.e. they hold different numbers of coordinates. The above is a (3,2) array. I may also have a (5,2), an (8,2), and a (6,2).

So taking the above example, and the two arrays below:

[[0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2]]
[[0.3,0.3], [0.3,0.3], [0.3,0.3], [0.3,0.3]]

The result would be (spaces added for clarity):

[
[[0.1,0.1], [0.2,0.2], [0.3,0.3], [0.0,0.0], [0.0,0.0]]
[[0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2]]
[[0.3,0.3], [0.3,0.3], [0.3,0.3], [0.3,0.3], [0.0,0.0]]
]

Notice the final shape is (3,5,2): the first row slice has been padded with two [0,0] entries, and the third row slice with one.

Appreciate any help!

CodePudding user response：

Maybe somebody will provide awesome NumPy magic that does this directly, but in the meantime, you can pad in a Python loop and form an array afterwards:

import numpy as np

# the padding element
pad = [0.0, 0.0]

# the coords
data = [
    [[0.1,0.1], [0.2,0.2], [0.3,0.3]],
    [[0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2]],
    [[0.3,0.3], [0.3,0.3], [0.3,0.3], [0.3,0.3]],
]

# do the padding
maxl = max(len(v) for v in data)
for v in data:
    npad = maxl - len(v)
    v.extend([pad] * npad)

# convert to a NumPy array
data = np.array(data)

# data now has shape (3, 5, 2); transpose or reshape as needed:

array([[[0.1, 0.1],
        [0.2, 0.2],
        [0.3, 0.3],
        [0. , 0. ],
        [0. , 0. ]],

       [[0.2, 0.2],
        [0.2, 0.2],
        [0.2, 0.2],
        [0.2, 0.2],
        [0.2, 0.2]],

       [[0.3, 0.3],
        [0.3, 0.3],
        [0.3, 0.3],
        [0.3, 0.3],
        [0. , 0. ]]])
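
If the inputs are already NumPy arrays rather than nested lists, the same idea can be sketched by preallocating the result and copying each array into its slice. The helper name pad_and_stack and its fill parameter are made up for illustration; it assumes every input has the same number of columns.

```python
import numpy as np

def pad_and_stack(arrays, fill=0.0):
    """Stack (n_i, k) arrays into one (len(arrays), max n_i, k) array.

    Hypothetical helper: rows missing from shorter arrays are set to `fill`.
    """
    max_len = max(a.shape[0] for a in arrays)
    width = arrays[0].shape[1]
    # Start from an array filled entirely with the padding value.
    out = np.full((len(arrays), max_len, width), fill, dtype=float)
    for i, a in enumerate(arrays):
        # Copy the real coordinates; the remaining rows keep the fill value.
        out[i, :a.shape[0]] = a
    return out

a = np.array([[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]])
b = np.full((5, 2), 0.2)
c = np.full((4, 2), 0.3)

result = pad_and_stack([a, b, c])
print(result.shape)  # (3, 5, 2)
```

This avoids mutating the input and works for any number of arrays.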

CodePudding user response：

By finding the maximum array length, we can pad each array to that length using only NumPy:

import numpy as np

a = np.array([[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]])
b = np.array([[0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2]])
c = np.array([[0.3, 0.3], [0.3, 0.3], [0.3, 0.3], [0.3, 0.3]])

max_len = max(a.shape[0], b.shape[0], c.shape[0])
pad_fill = 0.0  # scalar fill value; np.pad uses it for every padded element

# pad only at the end of axis 0 (rows); leave axis 1 (x, y) untouched
A_pad = np.pad(a, ((0, max_len - a.shape[0]), (0, 0)), constant_values=pad_fill)
B_pad = np.pad(b, ((0, max_len - b.shape[0]), (0, 0)), constant_values=pad_fill)
C_pad = np.pad(c, ((0, max_len - c.shape[0]), (0, 0)), constant_values=pad_fill)

# A_pad:  [[0.1 0.1] [0.2 0.2] [0.3 0.3] [0.  0. ] [0.  0. ]]
# B_pad:  [[0.2 0.2] [0.2 0.2] [0.2 0.2] [0.2 0.2] [0.2 0.2]]
# C_pad:  [[0.3 0.3] [0.3 0.3] [0.3 0.3] [0.3 0.3] [0.  0. ]]

and then use something like np.concatenate to combine them as you want:

np.concatenate((A_pad, B_pad, C_pad), axis=0).reshape(3, 5, 2)

# [[[0.1 0.1] [0.2 0.2] [0.3 0.3] [0.  0. ] [0.  0. ]]
#  [[0.2 0.2] [0.2 0.2] [0.2 0.2] [0.2 0.2] [0.2 0.2]]
#  [[0.3 0.3] [0.3 0.3] [0.3 0.3] [0.3 0.3] [0.  0. ]]]
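
For an arbitrary number of arrays, the per-array padding can be written as a list comprehension, and np.stack can create the new leading axis directly so that no reshape is needed. This is only a sketch of that variation; the names arrays, padded and stacked are illustrative.

```python
import numpy as np

arrays = [
    np.array([[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]]),
    np.full((5, 2), 0.2),
    np.full((4, 2), 0.3),
]

max_len = max(arr.shape[0] for arr in arrays)

# Pad each array with zero rows at the end of axis 0 only.
padded = [
    np.pad(arr, ((0, max_len - arr.shape[0]), (0, 0)), constant_values=0.0)
    for arr in arrays
]

# np.stack joins equally shaped arrays along a new axis.
stacked = np.stack(padded, axis=0)
print(stacked.shape)  # (3, 5, 2)
```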

CodePudding user response：

Using itertools.zip_longest:

from itertools import zip_longest

import numpy as np

d = [
    [[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]],
    [[0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2]],
    [[0.3, 0.3], [0.3, 0.3], [0.3, 0.3], [0.3, 0.3]],
]

out = list(zip(*zip_longest(*d, fillvalue=[0.0, 0.0])))
print(np.array(out))

Prints:

[[[0.1 0.1]
  [0.2 0.2]
  [0.3 0.3]
  [0.  0. ]
  [0.  0. ]]

 [[0.2 0.2]
  [0.2 0.2]
  [0.2 0.2]
  [0.2 0.2]
  [0.2 0.2]]

 [[0.3 0.3]
  [0.3 0.3]
  [0.3 0.3]
  [0.3 0.3]
  [0.  0. ]]]
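
To see why the double zip works, it may help to look at the intermediate step on a smaller, made-up input: zip_longest transposes the rows while filling the gaps, and the outer zip transposes back.

```python
from itertools import zip_longest

# Toy ragged input (not from the question), padded with 0.
rows = [[1, 2, 3], [4], [5, 6]]

# Step 1: transpose, filling short rows with the fill value.
columns = list(zip_longest(*rows, fillvalue=0))
print(columns)  # [(1, 4, 5), (2, 0, 6), (3, 0, 0)]

# Step 2: transpose back; every row now has the same length.
padded = list(zip(*columns))
print(padded)   # [(1, 2, 3), (4, 0, 0), (5, 6, 0)]
```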