I have a question about turning a list of numpy arrays into an object array.
import numpy as np
testing_1=[np.array([1]),np.array([2]),np.array([3]),np.array([4]),np.array([np.nan])]
testing_1_array=np.asarray(testing_1, dtype=object)
testing_2=[np.array([1]),np.array([2,3]),np.array([4]),np.array([np.nan])]
testing_2_array=np.asarray(testing_2, dtype=object)
This results in two very different outcomes:
testing_1_array
Out[12]:
array([[1],
       [2],
       [3],
       [4],
       [nan]], dtype=object)
testing_2_array
Out[13]: array([array([1]), array([2, 3]), array([4]), array([nan])], dtype=object)
I assume the difference arises because in testing_2 the arrays do not all have the same size. Is there any way to force numpy to build testing_1_array in the same form as testing_2_array (a 1-D object array whose elements are arrays), so that I do not have to check separately whether all arrays in the initial list have the same size?
CodePudding user response:
np.array (and np.asarray, which wraps it) tries, where possible, to make a multidimensional array. Creating a ragged object dtype array is a fallback option, and with some combinations of shapes even that raises an error. Specifying object dtype doesn't change that fundamental behavior.
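A small sketch of the three outcomes described above. The variable names are only for illustration, and the last case reflects the behavior I would expect from recent NumPy releases rather than a guarantee, so the code just reports what happens instead of assuming an error:

```python
import numpy as np

same_size = [np.array([1]), np.array([2]), np.array([3])]
ragged = [np.array([1]), np.array([2, 3]), np.array([4])]

# Equal shapes: NumPy stacks the inputs into one 2-D array of scalars.
stacked = np.array(same_size, dtype=object)
print(stacked.shape)  # (3, 1)

# Unequal lengths: NumPy falls back to a 1-D array whose elements are arrays.
nested = np.array(ragged, dtype=object)
print(nested.shape)  # (3,)

# Shapes that agree on the first axis but differ afterwards are the kind of
# combination that can fail outright (assumption: this depends on the NumPy
# version, so the outcome is only reported here).
mixed = [np.ones((2, 3)), np.ones((2, 4))]
try:
    np.array(mixed, dtype=object)
    outcome = "no error"
except ValueError as exc:
    outcome = f"ValueError: {exc}"
print(outcome)
```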
Creating an "empty" object array of the right length and filling it is the most general option.
In [272]: arr = np.empty(5,object) # filled with None
In [273]: arr[:] = [np.array([1]),np.array([2]),np.array([3]),np.array([4]),np.array([np.nan])]
In [274]: arr
Out[274]:
array([array([1]), array([2]), array([3]), array([4]), array([nan])],
      dtype=object)
It also works with the ragged shape:
In [276]: arr = np.empty(4,object)
In [277]: arr[:] = [np.array([1]),np.array([2,3]),np.array([4]),np.array([np.nan])]
In [278]: arr
Out[278]: array([array([1]), array([2, 3]), array([4]), array([nan])], dtype=object)
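If this is needed in more than one place, the empty-then-fill pattern can be wrapped in a small helper. This is only a sketch: `to_object_array` is a made-up name, not a NumPy function, and it fills the slots one at a time instead of using the slice assignment shown above, which should give the same result while keeping NumPy from ever inspecting the list as a whole.

```python
import numpy as np

def to_object_array(items):
    """Return a 1-D object array with one slot per item (hypothetical helper)."""
    items = list(items)
    out = np.empty(len(items), dtype=object)  # every slot starts as None
    for i, item in enumerate(items):
        out[i] = item  # stores the array object itself, no stacking
    return out

testing_1 = [np.array([1]), np.array([2]), np.array([3]), np.array([4]), np.array([np.nan])]
testing_2 = [np.array([1]), np.array([2, 3]), np.array([4]), np.array([np.nan])]

arr_1 = to_object_array(testing_1)
arr_2 = to_object_array(testing_2)
print(arr_1.shape, arr_2.shape)  # (5,) (4,)
```

Both results are 1-D regardless of whether the input arrays happen to share a size, so no separate size check is needed.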