NumPy: Array of evaluations of function on array

Time:09-30

Consider the following code.

import numpy as np
alist = ["blabla.com","blabla.org","blibli.com"]  # named alist to avoid shadowing the built-in list
array1 = np.array(alist)
array2 = array1 == "blabla.com"
print(array2)
# array2 is the boolean array [ True False False],
# but I would have expected it to be the single boolean False
array3 = array1.endswith(".com")
print(array3)
# error

The error I get is

AttributeError: 'numpy.ndarray' object has no attribute 'endswith'

and I obviously agree.

What I don't understand, is the following: when I write

array1 == "blabla.com"

how does NumPy know that I want the array of boolean results of the corresponding test on each element of the array, and not the result of the single test "is the array equal to "blabla.com"?"

And why does NumPy believe that when I write

array1.endswith(".com")

I want to use the (nonexistent) attribute .endswith of numpy.ndarray?

I mean, it looks as if in one case NumPy is clever enough to understand an abuse of notation, while in the other case it just reads what is explicitly written.

CodePudding user response:

numpy normally does an equality test, element by element, when given '=='. That's true for string dtypes as well as for the more common numeric dtypes.

In [14]: alist = ["blabla.com","blabla.org","blibli.com"]
    ...: array1 = np.array(alist)
    ...: array2 = array1 == "blabla.com"
In [15]: array1
Out[15]: array(['blabla.com', 'blabla.org', 'blibli.com'], dtype='<U10')
In [16]: array2
Out[16]: array([ True, False, False])
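To make the mechanism explicit: Python itself turns the `==` operator into a call to the special method `__eq__`, and numpy.ndarray defines that method to work element by element. The following is only a small sketch of that idea, reusing the names from the session above; `via_operator` and `via_method` are names made up for this illustration.

```python
import numpy as np

array1 = np.array(["blabla.com", "blabla.org", "blibli.com"])

# The operator form: Python dispatches this to ndarray's __eq__.
via_operator = array1 == "blabla.com"

# The same comparison, calling the special method directly.
via_method = array1.__eq__("blabla.com")

print(via_operator)  # [ True False False]
print(via_method)    # [ True False False]

# ndarray defines __eq__, but it has no attribute named endswith,
# so there is nothing for `array1.endswith(...)` to find.
print(hasattr(array1, "__eq__"))    # True
print(hasattr(array1, "endswith"))  # False
```

So numpy is not guessing at intent in either case: it only ever sees the method calls that Python makes on the array.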

But .endswith is a method, not an operator. A method has to be defined on the class of the object it is called on. The string class defines endswith; numpy.ndarray does not. Hence the AttributeError.

In [18]: array1[0]
Out[18]: 'blabla.com'

In [19]: type(_)
Out[19]: numpy.str_

In [20]: array1[0].endswith('com')
Out[20]: True
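The reason the call on a single element works can be checked directly: numpy.str_ is a subclass of Python's str, so an element inherits every string method. A minimal sketch, with `first` as a name chosen just for this example:

```python
import numpy as np

array1 = np.array(["blabla.com", "blabla.org", "blibli.com"])
first = array1[0]

# The element is a numpy.str_, which subclasses the built-in str.
print(type(first).__name__)      # str_
print(isinstance(first, str))    # True

# String methods therefore exist on the element, not on the array.
print(first.endswith("com"))     # True
```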

There is a group of functions in np.char that do apply string methods to elements of a string dtype array:

In [21]: np.char.endswith(array1, 'org')
Out[21]: array([False,  True, False])

This doesn't rely on any special numpy machinery. The list comprehension [astr.endswith('org') for astr in array1] does just as well, although it returns a list rather than an array.
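As a quick check of that equivalence, the sketch below builds the mask both ways for the ".com" suffix from the question and then uses it to select the matching entries. The names `mask_char` and `mask_loop` are made up for the illustration.

```python
import numpy as np

array1 = np.array(["blabla.com", "blabla.org", "blibli.com"])

# Element-wise test via np.char.
mask_char = np.char.endswith(array1, ".com")

# The same test as a plain Python loop, converted back to an array.
mask_loop = np.array([astr.endswith(".com") for astr in array1])

print(mask_char)                             # [ True False  True]
print(np.array_equal(mask_char, mask_loop))  # True

# Either boolean mask can be used to index the original array.
print(array1[mask_char])                     # ['blabla.com' 'blibli.com']
```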
