numpy ValueError when trying to sort list of np.ndarray with respect to list of ints

Time:12-30

I am trying to sort a list of numpy arrays in ascending order with respect to a list of integers, which is the problem discussed in this post. Specifically, I am using the top-rated solution from the post.

This first example produces the intended solution:

>>> x1 = [np.array([1,2,3]),np.array([4,5,6]),np.array([7,8,9])]
>>> y1 = [6, 10 , 4]
>>> y1_sorted, x1_sorted = zip(*sorted(zip(y1, x1)))
>>> y1_sorted, x1_sorted
((4, 6, 10), (array([7, 8, 9]), array([1, 2, 3]), array([4, 5, 6])))

However, this second example, with variables seemingly of the same type, produces this error:

>>> x2 = [np.array([1, 2, 3]),
...                   np.array([1, 3, 2]),
...                   np.array([2, 1, 3]),
...                   np.array([2, 3, 1]),
...                   np.array([3, 1, 2]),
...                   np.array([3, 2, 1])]
>>> y2 = [6,3,7,1,3,8]
>>> y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2)))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Would anyone be able to explain what is happening? I am using numpy 1.20.3 with Python 3.8.12.

CodePudding user response:

By default, sorted compares tuples by their first elements; if those tie, it compares the second elements, then the third, and so on.

In y2, 3 appears twice, so sorted has to compare the second elements of those two tuples to break the tie. The second elements are arrays, and comparing two arrays yields an elementwise array of booleans rather than a single True or False, so you get the error. In other words, it's as if you ran the following:

y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2), key=lambda x: (x[0], x[1])))
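The failure can be reproduced without sorted at all. The snippet below is a minimal sketch (the names a and b are just illustrative) comparing two tuples directly, once with different integers and once with the tied value 3:

```python
import numpy as np

a = (3, np.array([1, 3, 2]))
b = (3, np.array([3, 1, 2]))

# Different first elements: the ints decide, the arrays are never compared.
no_tie = (1, a[1]) < b
print(no_tie)  # True

# Equal first elements: Python falls through to comparing the arrays,
# and the elementwise result has no single truth value.
try:
    a < b
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

This is why the first example in the question works: y1 has no repeated values, so the arrays are never compared.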

One way to keep using sorted here is to sort by the first element only, as @niko suggested:

y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2), key=lambda x: x[0]))

In this case, you sort only by the first elements (i.e. by y2). Because sorted is stable, entries that tie in y2 keep the order in which they originally appeared.
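As a quick check of how ties are handled, here is a small sketch using the data from the question. It uses operator.itemgetter(0) in place of the lambda, which is an equivalent way to pick the first element of each pair:

```python
from operator import itemgetter

import numpy as np

x2 = [np.array(p) for p in ([1, 2, 3], [1, 3, 2], [2, 1, 3],
                            [2, 3, 1], [3, 1, 2], [3, 2, 1])]
y2 = [6, 3, 7, 1, 3, 8]

# Sort on the integer only; the arrays are carried along but never compared.
pairs = sorted(zip(y2, x2), key=itemgetter(0))
y2_sorted, x2_sorted = zip(*pairs)
print(y2_sorted)  # (1, 3, 3, 6, 7, 8)

# The two arrays paired with 3 keep their original relative order.
tied = [arr.tolist() for y, arr in pairs if y == 3]
print(tied)  # [[1, 3, 2], [3, 1, 2]]
```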

Another way is to state explicitly how the arrays should be used to break ties. For example, you might want to sort by the first element of each array when there are ties in y2:

y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2), key=lambda x: (x[0], x[1][0])))
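If the first array element can also tie, one possible variation (not from the original answer) is to convert each array to a plain list in the key, so that ties are broken by comparing the whole array lexicographically. The data below is hypothetical and chosen so that both the integers and the first array elements tie:

```python
import numpy as np

# Hypothetical data: the keys tie and the arrays share the same first element.
y = [3, 3]
x = [np.array([1, 3, 2]), np.array([1, 2, 3])]

# Lists of ints compare element by element, so they are safe inside a key tuple.
pairs = sorted(zip(y, x), key=lambda p: (p[0], p[1].tolist()))
result = [arr.tolist() for _, arr in pairs]
print(result)  # [[1, 2, 3], [1, 3, 2]]
```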

Lastly, since you have a list of np.arrays, you can use numpy.argsort instead:

x2_sorted = np.array(x2)[np.argsort(y2)]
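A slightly fuller sketch of the argsort approach follows. It assumes all arrays in x2 have the same length, so that np.array(x2) forms a regular 2-D array, and it passes kind="stable" (an addition here, not part of the answer above) so that ties in y2 keep their original order, matching what sorted does:

```python
import numpy as np

x2 = [np.array(p) for p in ([1, 2, 3], [1, 3, 2], [2, 1, 3],
                            [2, 3, 1], [3, 1, 2], [3, 2, 1])]
y2 = np.array([6, 3, 7, 1, 3, 8])

# Indices that would sort y2; a stable sort preserves the order of equal keys.
order = np.argsort(y2, kind="stable")
print(order.tolist())  # [3, 1, 4, 0, 2, 5]

# Apply the same ordering to both sequences.
y2_sorted = y2[order]
x2_sorted = np.array(x2)[order]
print(y2_sorted.tolist())  # [1, 3, 3, 6, 7, 8]
print(x2_sorted.tolist())
```

Unlike the zip-based versions, this returns NumPy arrays rather than tuples.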

CodePudding user response:

So the line sorted(zip(y1, x1)) in the first part of the code seems to be sorting according to y1.

What you can do is use the key argument of sorted to replicate that behaviour:

y2_sorted, x2_sorted = zip(*sorted(zip(y2, x2), key=lambda _: _[0]))
print(y2_sorted)
# (1, 3, 3, 6, 7, 8)
print(x2_sorted)
# (array([2, 3, 1]), array([1, 3, 2]), array([3, 1, 2]), array([1, 2, 3]), array([2, 1, 3]), array([3, 2, 1]))
