Prevent and correct numpy arrays from being nested rather than multidimensional

Time:03-31

Sometimes when creating new 2D arrays I end up with nested arrays rather than proper multidimensional ones. This leads to a number of complications, such as misleading array.shape values.

What I mean is I end up with

array([array([...]), array([...])], dtype=object)

when I want

array([[...], [...]])

I'm not sure which point in my code leads to the former scenario. I was wondering (1) what is good practice to avoid obtaining such arrays, and (2) whether there are any pragmatic fixes to convert them back to the multidimensional form.

Regarding the latter, performing

np.array([list(i) for i in nested_array])

works but doesn't seem practical, especially if the dimensionality is higher.

CodePudding user response:

If you have an array of arrays, for example:

import numpy as np

arr1 = np.empty(3, dtype=object)
arr1[:] = [np.arange(3), np.arange(3, 6), np.arange(6, 9)]
print(repr(arr1))

Result:

array([array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])], dtype=object)

Note the dtype=object there, which may be causing some of your trouble. Compare:

arr2 = np.array([np.arange(3), np.arange(3, 6), np.arange(6, 9)])
print(arr2)
print(arr2.dtype)

Result:

[[0 1 2]
 [3 4 5]
 [6 7 8]]
int32
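On the first question (how to avoid such arrays), one option is to request a numeric dtype at creation time, so that rows of unequal length fail loudly instead of producing dtype=object. The sketch below assumes the rows are meant to be numeric and of equal length; to_2d is a hypothetical helper name, not a NumPy function.

```python
import numpy as np

def to_2d(rows, dtype=float):
    """Hypothetical helper: build a 2D array or fail loudly."""
    # With an explicit numeric dtype, np.array raises ValueError
    # when the rows do not all have the same length.
    arr = np.array(rows, dtype=dtype)
    if arr.ndim != 2:
        raise ValueError(f"expected a 2D result, got ndim={arr.ndim}")
    return arr

good = to_2d([np.arange(3), np.arange(3, 6)])
print(good.shape, good.dtype)

try:
    to_2d([np.arange(3), np.arange(2)])  # rows of unequal length
except ValueError as exc:
    print("rejected:", exc)
```

Checking arr.dtype and arr.ndim right after creation serves the same purpose when a helper like this is overkill.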

To turn arr1 into an array just like arr2, which is what you are asking about:

arr3 = np.stack(arr1)
print(arr3)
print((arr2 == arr3).all())

Result:

[[0 1 2]
 [3 4 5]
 [6 7 8]]
True

So, make sure your arrays have the dtype you need, and if you cannot avoid ending up with an array of arrays, combine them with numpy.stack(). Note that numpy.stack() requires all of the inner arrays to have the same shape.
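The question also mentions higher dimensionality. When the nesting is more than one level deep, a single numpy.stack() call still leaves an object array, so one possible approach is to apply it recursively. This is a minimal sketch, assuming every level of nesting is regular (all inner arrays at a given level share a shape); unnest is a hypothetical helper name.

```python
import numpy as np

def unnest(arr):
    """Hypothetical helper: recursively stack nested object arrays."""
    arr = np.asarray(arr)
    # Regular arrays (and 0-d leftovers) need no further work.
    if arr.dtype != object or arr.ndim == 0:
        return arr
    # np.stack raises ValueError if the pieces differ in shape.
    return np.stack([unnest(item) for item in arr])

# Build a two-level nested example: an object array of object arrays.
inner = np.empty(3, dtype=object)
for i in range(3):
    inner[i] = np.arange(3 * i, 3 * i + 3)

outer = np.empty(2, dtype=object)
for i in range(2):
    outer[i] = inner

flat = unnest(outer)
print(flat.shape, flat.dtype)
```

If the inner arrays have different shapes, there is no rectangular array to convert to, and the ValueError from numpy.stack() is the signal to fix the data upstream.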
