Let's say one has a NumPy array generated from lists:
import numpy as np
a1 = [1,2,3,4]
a2 = [11,22,33,44]
a3 = [111,222,333,444]
a4 = [1111,2222,3333,4444]
a = []
for x in a1:
    for y in a2:
        for k in a3:
            for l in a4:
                a.append((x, y, k, l))
na = np.array(a)
Now the goal is to retrieve these initial lists from this 2D numpy array. One solution is
na.shape = (4,4,4,4,4)
a1 = na[:,0,0,0,0]
a2 = na[0,:,0,0,1]
a3 = na[0,0,:,0,2]
a4 = na[0,0,0,:,3]
print(a1)
print(a2)
print(a3)
print(a4)
[1 2 3 4]
[11 22 33 44]
[111 222 333 444]
[1111 2222 3333 4444]
This works fine and is my first choice. I'm simply wondering whether there is also a more elegant way of doing this. Thanks.
CodePudding user response:
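If the values in each original list are always unique, you could use numpy's "unique" to find the unique values in each column. Note that np.unique returns its result sorted, so this reproduces the original order only when each list was already in ascending order, as in your example: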
#--- your code
import numpy as np
a1 = [1,2,3,4]
a2 = [11,22,33,44]
a3 = [111,222,333,444]
a4 = [1111,2222,3333,4444]
a = []
for x in a1:
    for y in a2:
        for k in a3:
            for l in a4:
                a.append((x, y, k, l))
na = np.array(a)
#--- suggested solution
original_arrays = [np.unique(column) for column in na.T]
>>> original_arrays
[array([1, 2, 3, 4]),
array([11, 22, 33, 44]),
array([111, 222, 333, 444]),
array([1111, 2222, 3333, 4444])]
Details of the solution:
- First we loop through the columns of the array (the rows of na.T) using a list comprehension to construct a list of our outputs (instead of creating an empty list and appending to it in a for loop):
columns = [column for column in na.T]
- Now, instead of just looping through the columns, we find the unique values in each column using the numpy "unique" function:
original_arrays = [np.unique(column) for column in na.T]
And the result is a list of NumPy arrays containing the unique values in each column:
>>> original_arrays
[array([1, 2, 3, 4]),
array([11, 22, 33, 44]),
array([111, 222, 333, 444]),
array([1111, 2222, 3333, 4444])]
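One caveat is worth illustrating: np.unique returns sorted values, so if the original lists were not in ascending order, their order is lost. Below is a minimal sketch of an order-preserving variant, assuming the values within each list are still unique. The lists b1 and b2 and the helper unique_in_order are made up for illustration and are not part of the original post.

```python
import numpy as np
from itertools import product

# Hypothetical, deliberately unsorted input lists
b1 = [3, 1, 2]
b2 = [20, 10]

# Same row layout as the nested loops: the last list varies fastest
nb = np.array(list(product(b1, b2)))

def unique_in_order(column):
    """Return the unique values of a 1-D array in order of first appearance."""
    # return_index gives the position of the first occurrence of each unique value
    _, first_idx = np.unique(column, return_index=True)
    # Sorting those positions restores the order in which the values first appear
    return column[np.sort(first_idx)]

recovered = [unique_in_order(column) for column in nb.T]
print(recovered)
```

For sorted inputs such as the lists in the question, this gives the same result as calling np.unique directly.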
CodePudding user response:
The initial na and its shape:
In [117]: na
Out[117]:
array([[ 1, 11, 111, 1111],
[ 1, 11, 111, 2222],
[ 1, 11, 111, 3333],
...,
[ 4, 44, 444, 2222],
[ 4, 44, 444, 3333],
[ 4, 44, 444, 4444]])
In [118]: na.shape
Out[118]: (256, 4)
Your indexing works with
naa=na.reshape(4,4,4,4,4)
Initially I missed the fact that you were using
na.shape = (4,4,4,4,4)
to do this reshape. (I use reshape far more often than the in-place reshape.)
The a# values appear in the respective columns, but with many repeats. You can skip those with the right slicing:
In [119]: na[:4,3]
Out[119]: array([1111, 2222, 3333, 4444])
In [122]: na[:16:4,2]
Out[122]: array([111, 222, 333, 444])
In [123]: na[:16*4:16,1]
Out[123]: array([11, 22, 33, 44])
In [124]: na[:16*4*4:16*4,0]
Out[124]: array([1, 2, 3, 4])
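The slicing pattern above can be written as a loop: for each column, the step is the product of the lengths of all later lists. This is only a sketch, assuming the rows are laid out exactly as the nested loops produce them (last list varying fastest). The helper name recover_lists is hypothetical, and math.prod needs Python 3.8 or later.

```python
import math
from itertools import product

import numpy as np

a1 = [1, 2, 3, 4]
a2 = [11, 22, 33, 44]
a3 = [111, 222, 333, 444]
a4 = [1111, 2222, 3333, 4444]

# Equivalent to the four nested for loops in the question
na = np.array(list(product(a1, a2, a3, a4)))

def recover_lists(arr, lengths):
    """Recover the input lists from a row-major Cartesian-product array."""
    recovered = []
    for col, n in enumerate(lengths):
        # Number of consecutive rows over which this column keeps the same value
        step = math.prod(lengths[col + 1:])
        # Take one row per distinct value: n values, spaced `step` rows apart
        recovered.append(arr[:n * step:step, col])
    return recovered

recovered = recover_lists(na, (4, 4, 4, 4))
for r in recovered:
    print(r)
```

Unlike np.unique, this does not require the values to be unique or sorted, but it does require knowing the length of each list.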
On the 5d version, your solution is probably as good as any. It's not a common arrangement of values, so it's unlikely that there will be a built-in shortcut.
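For what it's worth, the four indexing lines from the question can also be generated in a loop, which may help if the number of lists changes. This is a sketch under the same layout assumption as the question; the names lists, na5 and recovered are introduced here for illustration only.

```python
from itertools import product

import numpy as np

a1 = [1, 2, 3, 4]
a2 = [11, 22, 33, 44]
a3 = [111, 222, 333, 444]
a4 = [1111, 2222, 3333, 4444]
lists = [a1, a2, a3, a4]

# Same array as in the question, built without explicit nested loops
na = np.array(list(product(*lists)))

# One axis per input list, plus a final axis selecting the column
lengths = tuple(len(x) for x in lists)
na5 = na.reshape(*lengths, len(lists))  # shape (4, 4, 4, 4, 4)

recovered = []
for axis in range(len(lists)):
    # Full slice along `axis`, index 0 along every other list axis
    index = tuple(slice(None) if ax == axis else 0 for ax in range(len(lists)))
    # The trailing index picks the column that holds this list's values
    recovered.append(na5[index + (axis,)])

for r in recovered:
    print(r)
```

For axis 0 the constructed index is equivalent to na5[:,0,0,0,0], matching the first line of the original solution.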