How to compare numpy arrays of tuples?

Here's an MWE that illustrates the issue I have:

import numpy as np

arr = np.full((3, 3), -1, dtype="i,i")

doesnt_work = arr == (-1, -1)

n_arr = np.full((3, 3), -1, dtype=int)

works = n_arr == 10

arr is supposed to be an array of tuples, but it doesn't behave as expected.

works is an array of booleans, as expected, but doesnt_work is False. Is there a way to get numpy to do elementwise comparisons on more complex types, or do I have to resort to list comprehension, flatten and reshape?

There's a second problem:

f = arr[(0, 0)] == (-1, -1)

f is False, because arr[(0,0)] is of type numpy.void rather than a tuple. So even if the componentwise comparison worked, it would give the wrong result. Is there a clever numpy way to do this or should I just resort to list comprehension?

Answer:

Both problems are actually the same problem, and both stem from the structured data type you created when you specified dtype="i,i".

If you run arr.dtype you will get dtype([('f0', '<i4'), ('f1', '<i4')]). That is two signed 32-bit integers placed in one contiguous block of memory, not a Python tuple. So it is clear why the naive comparison fails: (-1, -1) is a Python tuple and is not represented in memory the same way as the numpy data type.
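If you want to check that layout yourself, here is a small sketch that inspects the dtype from the question. It assumes the default packed (unaligned) layout, and f0 and f1 are simply the field names numpy assigns when none are given.

```python
import numpy as np

arr = np.full((3, 3), -1, dtype="i,i")

# Default field names numpy assigns to an unnamed "i,i" dtype
field_names = arr.dtype.names

# Each element holds two C ints stored back to back
bytes_per_element = arr.dtype.itemsize
bytes_per_int = np.dtype("i").itemsize

# A single field can be pulled out as an ordinary integer array
first_field = arr["f0"]

print(arr.dtype)
print(field_names, bytes_per_element, bytes_per_int)
print(first_field.dtype, first_field.shape)
```

Since each field is a plain integer array, ordinary comparisons such as arr["f0"] == -1 work on it directly.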

However, if you compare with a_comp = np.array((-1, -1), dtype="i,i"), you get exactly the behavior you are expecting!
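Put together, the comparison could look roughly like this. The names mask and mask_by_field are mine, and the field-by-field variant is just an alternative I would consider, not something you have to use.

```python
import numpy as np

arr = np.full((3, 3), -1, dtype="i,i")

# 0-d structured array with the same dtype as arr
a_comp = np.array((-1, -1), dtype="i,i")

# Elementwise comparison: gives a (3, 3) boolean array
mask = arr == a_comp

# Comparing a single element (a numpy.void scalar) works the same way
f = arr[0, 0] == a_comp

# Alternative: compare field by field and combine the results
mask_by_field = (arr["f0"] == -1) & (arr["f1"] == -1)

print(mask)
print(bool(f))
```

The field-by-field version is handy when you only care about one component, or want a different condition per component.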

You can read more about how the custom dtype stuff works on the numpy docs: https://numpy.org/doc/stable/reference/arrays.dtypes.html

Oh, and to address what np.void is: the name comes from the idea of a C void pointer, which essentially means an address to a contiguous block of memory of unspecified type. Provided you (the programmer) know what is going to be stored in that memory (in this case two back-to-back integers), it's fine as long as you are careful and compare with the same custom data type.
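If you really do need Python tuples at some point, one option is to convert explicitly instead of comparing the numpy.void scalar directly. A minimal sketch, with variable names of my own choosing:

```python
import numpy as np

arr = np.full((3, 3), -1, dtype="i,i")

# Indexing a structured array yields a numpy.void scalar
element = arr[0, 0]

# item() converts that scalar to a plain Python tuple
as_tuple = element.item()

# tolist() converts the whole array to nested lists of Python tuples
rows = arr.tolist()

print(type(element).__name__)
print(as_tuple == (-1, -1))
```

This leaves numpy's vectorized path, so for large arrays comparing against a structured value of the same dtype is probably the better choice.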