numpy array check whether not all elements are the same where elements are arrays of different lengt

Time:01-17

I need to check whether not all elements of a NumPy array are the same.

import numpy as np
from datetime import date

arr1 = np.array(
[np.array([date(2022,12,14)]),
np.array([date(2022,12,15)])]
)

arr2 = np.array(
[np.array([date(2022,12,14)]),
np.array([date(2022,12,19), date(2022, 12, 20), date(2022, 12, 21)])]
)

Now, arr1 != arr1[0] returns array([[False], [ True]]), and passing this to np.all() correctly returns False. However, arr2 != arr2[0] just returns True instead of the expected array([[False], [ True]]). Why does this happen? Is there a quick workaround?

Edit: I'm thinking of using

[np.array_equiv(arr2[0], dates) for dates in arr2]

This gives me one boolean per inner array, [True, False], to which I can apply np.all(). Is this the best solution for this case?

Answer:

Either you are using an old NumPy version, or you have deliberately omitted or ignored the warnings:

In [2]: from datetime import date
   ...: 
   ...: arr1 = np.array(
   ...: [np.array([date(2022,12,14)]),
   ...: np.array([date(2022,12,15)])]
   ...: )
   ...: 
   ...: arr2 = np.array(
   ...: [np.array([date(2022,12,14)]),
   ...: np.array([date(2022,12,19), date(2022, 12, 20), date(2022, 12, 21)])]
   ...: )
C:\Users\paul\AppData\Local\Temp\ipykernel_1604\1472092167.py:8: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
  arr2 = np.array(

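Now look at the resulting arrays. They are quite different: arr1 is a 2-D (2, 1) array of dates, while arr2 is a 1-D array whose two elements are themselves arrays: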

In [3]: arr1
Out[3]: 
array([[datetime.date(2022, 12, 14)],
       [datetime.date(2022, 12, 15)]], dtype=object)

In [4]: arr2
Out[4]: 
array([array([datetime.date(2022, 12, 14)], dtype=object),
       array([datetime.date(2022, 12, 19), datetime.date(2022, 12, 20),
              datetime.date(2022, 12, 21)], dtype=object)              ],
      dtype=object)
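
One way to avoid ending up with two different layouts is to build the outer array explicitly, so that each inner array occupies exactly one slot regardless of its length. The following is only a minimal sketch, not something from the original post; the helper name as_object_array is made up for illustration.

```python
import numpy as np
from datetime import date

def as_object_array(items):
    """Hypothetical helper: store each item in one slot of a 1-D object array."""
    out = np.empty(len(items), dtype=object)
    for i, item in enumerate(items):
        out[i] = item  # integer indexing stores the inner array as a single object
    return out

same_length = [np.array([date(2022, 12, 14)]),
               np.array([date(2022, 12, 15)])]
ragged = [np.array([date(2022, 12, 14)]),
          np.array([date(2022, 12, 19), date(2022, 12, 20), date(2022, 12, 21)])]

arr1 = as_object_array(same_length)
arr2 = as_object_array(ragged)

# Both outer arrays are now 1-D, whatever the inner lengths are.
print(arr1.shape, arr2.shape)  # (2,) (2,)
```

With this construction arr1 and arr2 have the same structure, so any code that loops over the outer array treats them alike.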

And the test:

In [5]: arr1 == arr1[0]
Out[5]: 
array([[ True],
       [False]])

In [6]: arr2 == arr2[0]
C:\Users\paul\AppData\Local\Temp\ipykernel_1604\3122922660.py:1: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
  arr2 == arr2[0]
Out[6]: False

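Again, a warning that you are doing something wrong. As the message says, this comparison is expected to raise an error in future versions rather than return a single False.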

Look at the list comprehension:

In [7]: [np.array_equiv(arr2[0], dates) for dates in arr2]
Out[7]: [True, False]

Or with the simple equality that worked for arr1:

In [8]: [dates==arr2[0] for dates in arr2]
Out[8]: [array([ True]), array([False, False, False])]

Notice that the 2nd comparison has a value for each element of the inner array.
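
That per-element result comes from broadcasting, and broadcasting also affects np.array_equiv: an inner array of length 1 is considered equivalent to a longer array that just repeats the same value. If different lengths should count as different, np.array_equal is the stricter test. A small sketch with made-up sample values:

```python
import numpy as np
from datetime import date

d = date(2022, 12, 14)
short = np.array([d])            # shape (1,)
repeated = np.array([d, d, d])   # shape (3,), the same date three times

# array_equiv broadcasts `short` against `repeated` before comparing
equiv_result = np.array_equiv(short, repeated)

# array_equal checks that the shapes match before comparing values
equal_result = np.array_equal(short, repeated)

print(equiv_result, equal_result)  # True False
```

For the data in the question both functions give the same answer, since the dates differ anyway; the difference only shows up when a longer array repeats the value of a shorter one.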

Object-dtype arrays behave a lot like lists, and object arrays whose elements are themselves object-dtype arrays are especially complicated.
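
Given that, it may be simpler to skip the outer NumPy array and keep the inner arrays in a plain list. Below is a hedged sketch of the check asked for in the question ("not all elements are the same"). The name not_all_same is hypothetical, and the sketch assumes that inner arrays of different lengths should count as different.

```python
import numpy as np
from datetime import date

def not_all_same(arrays):
    """Hypothetical helper: True if any inner array differs from the first."""
    if len(arrays) == 0:
        return False  # assumption: an empty collection has nothing that differs
    first = arrays[0]
    # np.array_equal compares shapes first, then values
    return not all(np.array_equal(first, other) for other in arrays)

arr1 = [np.array([date(2022, 12, 14)]),
        np.array([date(2022, 12, 15)])]
arr2 = [np.array([date(2022, 12, 14)]),
        np.array([date(2022, 12, 19), date(2022, 12, 20), date(2022, 12, 21)])]
identical = [np.array([date(2022, 12, 14)]),
             np.array([date(2022, 12, 14)])]

print(not_all_same(arr1))       # True
print(not_all_same(arr2))       # True
print(not_all_same(identical))  # False
```

Because the outer container is a list, no ragged array is ever created and none of the warnings shown above are triggered.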
