Why Can't Numpy Produce an Array from a List of Numpy Arrays?

I'm writing some code to group vectors by the angles between them. For example, I might have an array of vectors:

vectors = np.array([[1, 0, 0], [1.1, 0, 0], [0, 2, 2]])

The acceptable angle deviation would be, say, 0.1 radians. Currently, I'm doing this in a while loop like so:

groups = []
while not vectors.size == 0:
    vector = vectors[0]
    angles = (vectors @ vector)/(np.linalg.norm(vector, axis=1))
    angles = np.arccos(angles/np.linalg.norm(vector))
    group = vectors[angles <= angle]
    groups.append(group)
    vectors = vectors[angles > angle]
return np.array(groups)

I expect this to return a numpy array with the following form:

expected_array = np.array([[[1, 0, 0], [1.1, 0, 0]], [[0, 2, 2]]])

But instead I get the following:

actual_array = np.array([array([[1. , 0. , 0. ], [1.1, 0. , 0. ]]),
                         array([[0. , 2, 2]])])

Why doesn't Numpy notice that the list contains arrays and give me what I expect? Is there a way of making Numpy notice this? Or do you always have to use np.concatenate or something similar to get the desired result?

CodePudding user response:

"I expect this to return a numpy array with the following form:"

In [420]: np.array([[[1, 0, 0], [1.1, 0, 0]], [[0, 2, 2]]])
<ipython-input-420-a1f3305ab5c3>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  np.array([[[1, 0, 0], [1.1, 0, 0]], [[0, 2, 2]]])

Out[420]: array([list([[1, 0, 0], [1.1, 0, 0]]), list([[0, 2, 2]])], dtype=object)

Is that really what you expected? An array that preserves the nesting of the lists?

vstack (or concatenate) can join the lists/arrays inside the outer list to make a single 2d array:

In [421]: np.vstack([[[1, 0, 0], [1.1, 0, 0]], [[0, 2, 2]]])
Out[421]: 
array([[1. , 0. , 0. ],
       [1.1, 0. , 0. ],
       [0. , 2. , 2. ]])

Converting those 2 arrays back to lists:

In [422]: _420.tolist()
Out[422]: [[[1, 0, 0], [1.1, 0, 0]], [[0, 2, 2]]]
In [423]: _421.tolist()
Out[423]: [[1.0, 0.0, 0.0], [1.1, 0.0, 0.0], [0.0, 2.0, 2.0]]

The first has 3 levels of nesting, same as the original; the second has only 2.
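If you want the flat 2d array but still need to know where each group starts and ends, one option is to keep the group sizes alongside it and split on demand. This is only a sketch; the names groups, sizes and boundaries are mine, not from the question:

```python
import numpy as np

# Two groups with different numbers of rows, as in the question.
groups = [np.array([[1.0, 0, 0], [1.1, 0, 0]]), np.array([[0.0, 2, 2]])]

flat = np.vstack(groups)                # one rectangular (3, 3) array
sizes = [len(g) for g in groups]        # rows per group: [2, 1]
boundaries = np.cumsum(sizes)[:-1]      # row indices where a new group starts

# np.split cuts the flat array back into the original groups.
recovered = np.split(flat, boundaries)
print([r.shape for r in recovered])
```

The flat array stays a normal numeric array, and the nesting lives in boundaries instead of in the array itself.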

===

Your code isn't runnable:

In [424]: vectors = np.array([[1, 0, 0], [1.1, 0, 0], [0, 2, 2]])
In [425]: groups = []
     ...: while not vectors.size == 0:
     ...:     vector = vectors[0]
     ...:     angles = (vectors @ vector)/(np.linalg.norm(vector, axis=1))
     ...:     angles = np.arccos(angles/np.linalg.norm(vector))
     ...:     group = vectors[angles <= angle]
     ...:     groups.append(group)
     ...:     vectors = vectors[angles > angle]
     ...: 
Traceback (most recent call last):
  File "<ipython-input-425-e50fafbda1c3>", line 4, in <module>
    angles = (vectors @ vector)/(np.linalg.norm(vector, axis=1))
  File "<__array_function__ internals>", line 5, in norm
  File "/usr/local/lib/python3.8/dist-packages/numpy/linalg/linalg.py", line 2561, in norm
    return sqrt(add.reduce(s, axis=axis, keepdims=keepdims))
AxisError: axis 1 is out of bounds for array of dimension 1

I was hoping to see the list groups, before you tried to make an array from it. I don't feel like debugging your sample.
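That said, the traceback points at a likely fix: axis=1 only makes sense for the 2d vectors, not the 1d vector. Assuming the intent was to divide by the norm of each row of vectors and then by the norm of vector, a version that runs might look like this (the function name group_by_angle and the clip step are my additions):

```python
import numpy as np

def group_by_angle(vectors, angle):
    """Group rows whose angle to the first remaining row is <= angle (radians)."""
    vectors = np.asarray(vectors, dtype=float)
    groups = []
    while vectors.size != 0:
        vector = vectors[0]
        # Cosine of the angle between each remaining row and `vector`.
        row_norms = np.linalg.norm(vectors, axis=1)
        cosines = (vectors @ vector) / (row_norms * np.linalg.norm(vector))
        # Rounding can push a cosine slightly outside [-1, 1]; clip before arccos.
        angles = np.arccos(np.clip(cosines, -1.0, 1.0))
        groups.append(vectors[angles <= angle])
        vectors = vectors[angles > angle]
    return groups  # a plain list of 2d arrays, possibly with different row counts

groups = group_by_angle([[1, 0, 0], [1.1, 0, 0], [0, 2, 2]], angle=0.1)
print([g.shape for g in groups])
```

This returns the list groups rather than wrapping it in np.array, since the groups generally have different numbers of rows.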

CodePudding user response:

In short: It can! (But not in the way you want.)

A search around this topic finds this question: Numpy stack with unequal shapes

It makes the following clear: numpy arrays must be rectangular. The groups in the example above have different shapes, so they can't be combined into a single rectangular array, which is why this doesn't work.

If you try and do the same thing with arrays of the same size it will work. For example:

array = np.array([np.array([1, 2, 3]), np.array([1, 0, 1])])

Will produce an array with the shape (2, 3).

However, with arrays of different sizes, the dtype of the resulting numpy array falls back to object, and the individual arrays are stored as separate elements instead.
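A small sketch of both cases follows. For the ragged case it builds the object array explicitly instead of relying on np.array to fall back to it, because the deprecation warning quoted in the other answer suggests that fallback is going away (as far as I know, NumPy 1.24 and later raise an error unless dtype=object is given). The variable names are only for illustration:

```python
import numpy as np

# Same-shaped pieces stack into one rectangular array.
same = np.array([np.array([1, 2, 3]), np.array([1, 0, 1])])
print(same.shape)

# Differently shaped pieces: create an object array of the right length and
# fill it element by element, so NumPy never has to guess a common shape.
groups = [np.array([[1.0, 0, 0], [1.1, 0, 0]]), np.array([[0.0, 2, 2]])]
ragged = np.empty(len(groups), dtype=object)
for i, group in enumerate(groups):
    ragged[i] = group

print(ragged.shape, ragged[0].shape, ragged[1].shape)
```

Keep in mind that an object array like this is little more than a list: vectorised operations across the groups won't work the way they do on a real 3d array.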