I'm trying to stack NumPy arrays that are stored in a dictionary. So far, I have found several ways to do this. Unfortunately, the most elegant code prints a FutureWarning:
sys:1: FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.
CODE EXAMPLE No.1 (doesn't give a warning, but is too cumbersome):
data_stack = np.stack([data_matrix[0], data_matrix[1], data_matrix[2], data_matrix[3], data_matrix[4], data_matrix[5], data_matrix[6], data_matrix[7], data_matrix[8], data_matrix[9], data_matrix[10], data_matrix[11]], axis=1)
CODE EXAMPLE No.2 (more concise and still doesn't give a warning):
data_stack = np.stack([data_matrix[key] for key in data_matrix.keys()], axis=1)
CODE EXAMPLE No.3 (this one I like the most, but it gives a warning):
data_stack = np.stack(data_matrix.values(), axis=1)
I have tried fixing this as follows:
data_stack = np.stack([data_matrix.values()], axis=1)
But this seems to break the code completely:
Traceback (most recent call last):
File "./CLN40ULPEF_PttV1350W1350G0000S0000T025.lib.py", line 261, in <module>
data_stack = np.stack([data_matrix.values()], axis=1)
File "<__array_function__ internals>", line 5, in stack
File "/usr/lib64/python3.9/site-packages/numpy/core/shape_base.py", line 430, in stack
axis = normalize_axis_index(axis, result_ndim)
numpy.AxisError: axis 1 is out of bounds for array of dimension 1
Is there a (simple) way to make data_matrix.values() work with numpy.stack without giving a warning?
CodePudding user response:
The first thing that np.stack does, after calling the generator-warning function (arrays = _arrays_for_stack_dispatcher(arrays, stacklevel=6)), is:
arrays = [asanyarray(arr) for arr in arrays]
then it checks for matching shapes
shapes = {arr.shape for arr in arrays}
and expands the array dimensions
expanded_arrays = [arr[sl] for arr in arrays]
That's lots of list comprehensions.
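The steps above can be sketched as a small standalone function. This is a simplified illustration, not NumPy's actual source: the name stack_sketch and the sample adict are made up for the example, and the warning dispatcher and extra keyword arguments are left out.

```python
import numpy as np

def stack_sketch(arrays, axis=0):
    """Simplified illustration of the steps np.stack performs (hypothetical helper)."""
    # Step 1: materialize the input as a list of arrays.
    arrays = [np.asanyarray(arr) for arr in arrays]
    if not arrays:
        raise ValueError("need at least one array to stack")

    # Step 2: all inputs must share one shape.
    shapes = {arr.shape for arr in arrays}
    if len(shapes) != 1:
        raise ValueError("all input arrays must have the same shape")

    # Step 3: validate the axis against the dimension of the result.
    result_ndim = arrays[0].ndim + 1
    if not -result_ndim <= axis < result_ndim:
        raise ValueError(f"axis {axis} is out of bounds for dimension {result_ndim}")
    axis = axis % result_ndim

    # Step 4: insert a new length-1 axis in each array, then concatenate along it.
    sl = (slice(None),) * axis + (np.newaxis,)
    expanded_arrays = [arr[sl] for arr in arrays]
    return np.concatenate(expanded_arrays, axis=axis)

# Sample data for illustration only.
adict = {"a": np.ones((3, 2)), "b": np.zeros((3, 2))}
result = stack_sketch(adict.values(), axis=1)
```

Since the input is turned into a list on the very first step anyway, passing list(adict.values()) yourself costs essentially nothing extra.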
Trying to save typing or time by passing the dict_values view instead of a list is a waste of your time and effort.
In [159]: adict.values()
Out[159]:
dict_values([array([[1., 1.],
[1., 1.],
[1., 1.]]), array([[0., 0.],
[0., 0.],
[0., 0.]])])
In [160]: list(adict.values())
Out[160]:
[array([[1., 1.],
[1., 1.],
[1., 1.]]),
array([[0., 0.],
[0., 0.],
[0., 0.]])]
Applying list to values() is normal Python 3 practice. For example, you can't index the dict_values view:
In [162]: adict.values()[0]
Traceback (most recent call last):
File "<ipython-input-162-23f4ccd9e2f7>", line 1, in <module>
adict.values()[0]
TypeError: 'dict_values' object is not subscriptable
In [163]: list(adict.values())[0]
Out[163]:
array([[1., 1.],
[1., 1.],
[1., 1.]])
CodePudding user response:
Here
data_stack = np.stack([data_matrix[key] for key in data_matrix.keys()], axis=1)
you are iterating over the keys of the data_matrix dict and then retrieving the value for each key. You might simply iterate over the values instead, i.e.:
data_stack = np.stack([v for v in data_matrix.values()], axis=1)
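As a quick check that the two comprehensions are interchangeable, here is a minimal sketch with a made-up two-entry data_matrix (the real dictionary from the question is not shown, so the keys and values below are assumptions). It relies on dicts preserving insertion order, which Python guarantees from 3.7 on, so keys() and values() walk the entries in the same order.

```python
import numpy as np

# Hypothetical stand-in for the question's data_matrix.
data_matrix = {"a": np.arange(3), "b": np.arange(3) + 10}

# Look each value up by key (CODE EXAMPLE No.2 in the question).
by_keys = np.stack([data_matrix[key] for key in data_matrix.keys()], axis=1)

# Iterate over the values directly.
by_values = np.stack([v for v in data_matrix.values()], axis=1)

# Both build the same list of arrays, so the stacked results match.
same = np.array_equal(by_keys, by_values)
```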
CodePudding user response:
Thank you all for your help. A suitable answer is found in one of the comments by @mapf. Hence, I'm going to post it here as a final solution:
data_stack = np.stack(list(data_matrix.values()), axis=1)
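For anyone comparing this with the attempt that failed, here is a small sketch using a made-up data_matrix (its contents are an assumption, not the original data). It turns warnings into errors to confirm that the list(...) form stays silent, and it shows why [data_matrix.values()] is a different thing: the brackets wrap the whole view as a single element instead of unpacking it. NumPy then seems to treat that one element as a 0-dimensional object, which would be consistent with the AxisError in the traceback.

```python
import warnings
import numpy as np

# Hypothetical stand-in: three 1-D arrays of length 4.
data_matrix = {i: np.full(4, float(i)) for i in range(3)}

# Promote every warning to an exception; the call below must not raise.
with warnings.catch_warnings():
    warnings.simplefilter("error")
    data_stack = np.stack(list(data_matrix.values()), axis=1)

# list(...) unpacks the view into its arrays ...
as_list = list(data_matrix.values())
# ... while [...] produces a one-element list holding the view itself.
wrapped = [data_matrix.values()]
```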