I have several numpy 2D arrays containing coordinates. I want to combine all of these 2D arrays into a single 3D array (lists of coordinates), with all missing coordinates padded with [0,0] to make each list the same size. I have something working using np.pad for combining 1D arrays to 2D, but can't get it working to go from 2D to 3D.
For example:
[[0.1,0.1], [0.2,0.2], [0.3,0.3]]
Now, the 2D arrays have different sizes, i.e. they hold different numbers of coordinates. The above is a (3,2) array. I may also have a (5,2), an (8,2) and a (6,2).
So taking the above example, and the two arrays below:
[[0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2]]
[[0.3,0.3], [0.3,0.3], [0.3,0.3], [0.3,0.3]]
The result would be (spaces added for clarity):
[
[[0.1,0.1], [0.2,0.2], [0.3,0.3], [0.0,0.0], [0.0,0.0]]
[[0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2]]
[[0.3,0.3], [0.3,0.3], [0.3,0.3], [0.3,0.3], [0.0,0.0]]
]
Notice the final shape is (3,5,2): the first row slice has been padded with two [0,0] entries, and the third row slice with one [0,0].
Appreciate any help!
CodePudding user response:
Maybe somebody will provide awesome NumPy magic that does this directly, but in the meantime, you can pad in a Python loop and form an array afterwards:
import numpy as np

# the padding element
pad = [0.0, 0.0]
# the coords
data = [
    [[0.1,0.1], [0.2,0.2], [0.3,0.3]],
    [[0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2], [0.2,0.2]],
    [[0.3,0.3], [0.3,0.3], [0.3,0.3], [0.3,0.3]],
]
# do the padding (note: this extends the lists in data in place)
maxl = max(len(v) for v in data)
for v in data:
    npad = maxl - len(v)
    v.extend([pad] * npad)
# convert to a NumPy array
data = np.array(data)
# data is now (3,5,2), you can reshape as needed:
array([[[0.1, 0.1],
        [0.2, 0.2],
        [0.3, 0.3],
        [0. , 0. ],
        [0. , 0. ]],

       [[0.2, 0.2],
        [0.2, 0.2],
        [0.2, 0.2],
        [0.2, 0.2],
        [0.2, 0.2]],

       [[0.3, 0.3],
        [0.3, 0.3],
        [0.3, 0.3],
        [0.3, 0.3],
        [0. , 0. ]]])
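If you would rather not modify the input lists, here is a rough sketch of the same idea: allocate a filled array of the final shape first, then copy each coordinate list into its slot. The helper name pad_to_3d is made up for illustration, and it assumes a non-empty input where every item has shape (n_i, k) with the same k.

```python
import numpy as np

def pad_to_3d(arrays, fill=0.0):
    """Hypothetical helper: stack (n_i, k) arrays into (len(arrays), max n_i, k).

    Assumes a non-empty sequence whose items all share the same k.
    """
    arrays = [np.asarray(a, dtype=float) for a in arrays]
    max_len = max(a.shape[0] for a in arrays)
    width = arrays[0].shape[1]
    # Start from an array that is entirely padding...
    out = np.full((len(arrays), max_len, width), fill, dtype=float)
    # ...then overwrite the leading rows of each slice with the real data.
    for i, a in enumerate(arrays):
        out[i, :a.shape[0]] = a
    return out

data = [
    [[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]],
    [[0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2]],
    [[0.3, 0.3], [0.3, 0.3], [0.3, 0.3], [0.3, 0.3]],
]
result = pad_to_3d(data)
print(result.shape)  # (3, 5, 2)
```

The original lists in data are left untouched here, since each one is copied into the preallocated array.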
CodePudding user response:
Once we know the maximum array length, we can pad every array up to that length using only NumPy:
import numpy as np

a = np.array([[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]])
b = np.array([[0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2]])
c = np.array([[0.3, 0.3], [0.3, 0.3], [0.3, 0.3], [0.3, 0.3]])
max_len = max(a.shape[0], b.shape[0], c.shape[0])
# np.pad expects a scalar fill here; a 2-element array would be read as
# (before, after) fill values, not as a coordinate
pad_value = 0.0
# pad only at the end of axis 0 (the list of coordinates), leave axis 1 alone
A_pad = np.pad(a, ((0, max_len - a.shape[0]), (0, 0)), constant_values=pad_value)
B_pad = np.pad(b, ((0, max_len - b.shape[0]), (0, 0)), constant_values=pad_value)
C_pad = np.pad(c, ((0, max_len - c.shape[0]), (0, 0)), constant_values=pad_value)
# A_pad: [[0.1 0.1] [0.2 0.2] [0.3 0.3] [0. 0. ] [0. 0. ]]
# B_pad: [[0.2 0.2] [0.2 0.2] [0.2 0.2] [0.2 0.2] [0.2 0.2]]
# C_pad: [[0.3 0.3] [0.3 0.3] [0.3 0.3] [0.3 0.3] [0. 0. ]]
and then use np.concatenate to combine them as you want:
np.concatenate((A_pad, B_pad, C_pad), axis=0).reshape(3, 5, 2)
# [[[0.1 0.1] [0.2 0.2] [0.3 0.3] [0. 0. ] [0. 0. ]]
# [[0.2 0.2] [0.2 0.2] [0.2 0.2] [0.2 0.2] [0.2 0.2]]
# [[0.3 0.3] [0.3 0.3] [0.3 0.3] [0.3 0.3] [0. 0. ]]]
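With more than a handful of arrays, the repeated np.pad calls can be folded into a list comprehension. The following is only a sketch under the same assumptions as above (every array has shape (n_i, 2)); the names arrays, padded and stacked are made up for the example. It uses np.stack, which adds the new leading axis directly, so no reshape is needed.

```python
import numpy as np

# Example inputs with 3, 5 and 4 coordinates respectively
arrays = [
    np.array([[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]]),
    np.full((5, 2), 0.2),
    np.full((4, 2), 0.3),
]

max_len = max(a.shape[0] for a in arrays)

# Pad each array with zeros at the end of axis 0 only
padded = [
    np.pad(a, ((0, max_len - a.shape[0]), (0, 0)), constant_values=0.0)
    for a in arrays
]

# Stack along a new first axis: one slice per input array
stacked = np.stack(padded, axis=0)
print(stacked.shape)  # (3, 5, 2)
```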
CodePudding user response:
Using itertools.zip_longest:
import numpy as np
from itertools import zip_longest
d = [
[[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]],
[[0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2], [0.2, 0.2]],
[[0.3, 0.3], [0.3, 0.3], [0.3, 0.3], [0.3, 0.3]],
]
out = list(zip(*zip_longest(*d, fillvalue=[0.0, 0.0])))
print(np.array(out))
Prints:
[[[0.1 0.1]
  [0.2 0.2]
  [0.3 0.3]
  [0.  0. ]
  [0.  0. ]]

 [[0.2 0.2]
  [0.2 0.2]
  [0.2 0.2]
  [0.2 0.2]
  [0.2 0.2]]

 [[0.3 0.3]
  [0.3 0.3]
  [0.3 0.3]
  [0.3 0.3]
  [0.  0. ]]]
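In case the double zip looks opaque, here is a small illustration with plain integers instead of coordinates (the names rows, columns and padded are just for this example). zip_longest transposes the ragged rows into columns and fills the gaps; the outer zip then transposes back, so every row ends up the same length.

```python
from itertools import zip_longest

rows = [[1, 2, 3], [4], [5, 6]]

# Step 1: transpose, filling missing entries with the fill value
columns = list(zip_longest(*rows, fillvalue=0))
print(columns)  # [(1, 4, 5), (2, 0, 6), (3, 0, 0)]

# Step 2: transpose back; each row now has the length of the longest one
padded = list(zip(*columns))
print(padded)  # [(1, 2, 3), (4, 0, 0), (5, 6, 0)]
```

In the coordinate version the fill value is the list [0.0, 0.0] rather than 0, but the mechanics are the same.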