Retrieving initial lists used for creating a Numpy array

Time:10-11

Let's say one has a NumPy array generated from lists:

import numpy as np


a1 = [1,2,3,4]
a2 = [11,22,33,44]
a3 = [111,222,333,444]
a4 = [1111,2222,3333,4444]

a = []
for x in a1:
    for y in a2:
        for k in a3:
            for l in a4:
                a.append((x, y, k, l))


na = np.array(a)

Now the goal is to retrieve the initial lists from this 2D NumPy array. One solution is:

na.shape = (4,4,4,4,4)

a1 = na[:,0,0,0,0]
a2 = na[0,:,0,0,1]
a3 = na[0,0,:,0,2]
a4 = na[0,0,0,:,3]

print(a1)
print(a2)
print(a3)
print(a4)
[1 2 3 4]
[11 22 33 44]
[111 222 333 444]
[1111 2222 3333 4444]

This works perfectly well and is my first choice. I'm simply wondering whether there is also a more elegant way of doing this. Thanks.

CodePudding user response:

If the values in each original list are always unique (and already sorted, since np.unique returns its result in ascending order), you could use NumPy's "unique" to find the unique values in each column like this:

#--- your code
import numpy as np

a1 = [1,2,3,4]
a2 = [11,22,33,44]
a3 = [111,222,333,444]
a4 = [1111,2222,3333,4444]

a = []
for x in a1:
    for y in a2:
        for k in a3:
            for l in a4:
                a.append((x, y, k, l))

na = np.array(a)

#--- suggested solution
original_arrays = [np.unique(column) for column in na.T]
>>> original_arrays

[array([1, 2, 3, 4]),
 array([11, 22, 33, 44]),
 array([111, 222, 333, 444]),
 array([1111, 2222, 3333, 4444])]

Details of the solution:

  • First we loop through the columns of the array (iterating over the transpose na.T yields one column at a time), using a list comprehension to build the output list instead of creating an empty list and appending to it in a for loop:
columns = [column for column in na.T]
  • Then, instead of just collecting the columns, we take the unique values of each one with NumPy's "unique" function:
original_arrays = [np.unique(column) for column in na.T]

And the result is a list of NumPy arrays containing the unique values in each column:

 >>> original_arrays

[array([1, 2, 3, 4]),
 array([11, 22, 33, 44]),
 array([111, 222, 333, 444]),
 array([1111, 2222, 3333, 4444])]
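If the original lists are not sorted, a variant that keeps the order of first appearance can be sketched as follows. It relies on the return_index argument of np.unique; the helper name unique_in_order and the sample lists are made up for illustration, and the values in each list are still assumed to be unique.

```python
import itertools
import numpy as np

# Hypothetical, deliberately unsorted input lists (not from the question).
lists = [[3, 1, 2], [20, 10], [300, 100, 200]]

# itertools.product yields rows in the same order as the nested for loops.
na = np.array(list(itertools.product(*lists)))


def unique_in_order(column):
    """Return the unique values of a 1D array in order of first appearance."""
    # return_index gives the index of the first occurrence of each unique value.
    _, first_idx = np.unique(column, return_index=True)
    # Sorting those indices restores the order in which the values first appear.
    return column[np.sort(first_idx)]


recovered = [unique_in_order(column) for column in na.T]
print([r.tolist() for r in recovered])
# [[3, 1, 2], [20, 10], [300, 100, 200]]
```

With plain np.unique the same input would come back as [1, 2, 3], [10, 20] and [100, 200, 300].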

CodePudding user response:

The initial na and shape:

In [117]: na
Out[117]: 
array([[   1,   11,  111, 1111],
       [   1,   11,  111, 2222],
       [   1,   11,  111, 3333],
       ...,
       [   4,   44,  444, 2222],
       [   4,   44,  444, 3333],
       [   4,   44,  444, 4444]])
In [118]: na.shape
Out[118]: (256, 4)

Your indexing works on the reshaped array:

naa=na.reshape(4,4,4,4,4)

Initially I missed the fact that you were using

na.shape = (4,4,4,4,4)

to do this reshape. (I use reshape far more often than the in-place reshape.)
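A small sketch of why the distinction matters, using a stand-in array rather than the one from the question: reshape hands back a new array object and leaves the shape of the original untouched, whereas assigning to .shape would change the original array itself. For a contiguous array like this one the reshaped result is a view on the same data.

```python
import numpy as np

# Stand-in for the (256, 4) array; the values are only for illustration.
na = np.arange(24).reshape(6, 4)

# reshape returns a new array object; na keeps its 2D shape.
naa = na.reshape(2, 3, 4)
print(na.shape)   # (6, 4)
print(naa.shape)  # (2, 3, 4)

# For a contiguous array the result is a view, so no data is copied.
print(np.shares_memory(na, naa))  # True

# Writing through the view is therefore visible in the original.
naa[0, 0, 0] = -1
print(na[0, 0])   # -1
```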

The values of a1 to a4 appear in their respective columns, but with many repeats. You can skip the repeats with the right slicing:

In [119]: na[:4,3]
Out[119]: array([1111, 2222, 3333, 4444])
In [122]: na[:16:4,2]
Out[122]: array([111, 222, 333, 444])
In [123]: na[:16*4:16,1]
Out[123]: array([11, 22, 33, 44])
In [124]: na[:16*4*4:16*4,0]
Out[124]: array([1, 2, 3, 4])
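The slices above follow a pattern: each column repeats its values with a step equal to the product of the lengths of all later lists. A sketch that generalizes this, assuming the list lengths are known and the rows were produced by nested loops in the same order as in the question (the function name recover_by_strides and the sample lists are hypothetical):

```python
import itertools
import numpy as np


def recover_by_strides(na, lengths):
    """Recover the input lists from a nested-loop product array by slicing."""
    recovered = []
    stride = na.shape[0]  # total number of rows = product of all lengths
    for col, n in enumerate(lengths):
        # Rows per distinct value in this column: product of the later lengths.
        stride //= n
        # Take n values, one every `stride` rows, from the start of the column.
        recovered.append(na[:stride * n:stride, col])
    return recovered


# Hypothetical lists of unequal length; duplicates and unsorted values are fine.
lists = [[3, 1, 2], [20, 10], [7, 7, 9, 8]]
na = np.array(list(itertools.product(*lists)))

print([r.tolist() for r in recover_by_strides(na, [len(x) for x in lists])])
# [[3, 1, 2], [20, 10], [7, 7, 9, 8]]
```

Unlike the np.unique approach, this does not depend on the values being unique or sorted, but it does need the lengths up front.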

On the 5D version, your solution is probably as good as any. This is not a common arrangement of values, so a built-in shortcut is unlikely.
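For what it's worth, the 5D indexing from the question can be written as a loop so that it works for any number of lists. This is only a sketch under the same assumptions as the question (rows generated by nested loops, list lengths known); the function name recover_by_reshape is made up for illustration.

```python
import itertools
import numpy as np


def recover_by_reshape(na, lengths):
    """Reshape to (*lengths, k) and read each list along its own axis."""
    k = len(lengths)
    grid = na.reshape(*lengths, k)  # non-in-place reshape; na itself is unchanged
    recovered = []
    for axis in range(k):
        # Index 0 on every list axis except `axis`, and pick component `axis`
        # on the last axis, e.g. grid[0, :, 0, 0, 1] for the second of four lists.
        index = [0] * k + [axis]
        index[axis] = slice(None)
        recovered.append(grid[tuple(index)])
    return recovered


lists = [[1, 2, 3, 4], [11, 22, 33, 44], [111, 222, 333, 444], [1111, 2222, 3333, 4444]]
na = np.array(list(itertools.product(*lists)))

for values in recover_by_reshape(na, [len(x) for x in lists]):
    print(values)
# [1 2 3 4]
# [11 22 33 44]
# [111 222 333 444]
# [1111 2222 3333 4444]
```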
