Consider the following code.
import numpy as np
alist = ["blabla.com","blabla.org","blibli.com"]  # avoid shadowing the built-in list
array = np.array(alist)
array2 = array == "blabla.com"
print(array2)
# array2 is the one-element array containing only "blabla.com",
# but I would have expected it to contain the boolean False
array3 = array.endswith(".com")
print(array3)
# error
The error I get is
AttributeError: 'numpy.ndarray' object has no attribute 'endswith'
and I obviously agree.
What I don't understand is the following: when I write
array == "blabla.com"
how does NumPy know that I want the array of boolean results of that test applied to each element of the array, and not the result of the single test "is array equal to "blabla.com"?"?
And why does NumPy believe that when I write
array.endswith("blabla.com")
I want to use the (nonexistent) attribute .endswith of numpy.ndarray?
I mean, it looks as though in one case NumPy is clever enough to understand an abuse of notation, while in the other case it just reads what is explicitly written.
CodePudding user response:
numpy normally does an equality test, element by element, when given '=='. That's true for string dtypes as well as the more common numeric dtypes.
In [14]: alist = ["blabla.com","blabla.org","blibli.com"]
...: array1 = np.array(alist)
...: array2 = array1 == "blabla.com"
In [15]: array1
Out[15]: array(['blabla.com', 'blabla.org', 'blibli.com'], dtype='<U10')
In [16]: array2
Out[16]: array([ True, False, False])
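But .endswith is a method, not an operator. A method has to be defined on the class of the object it is called on. The string class defines endswith, but numpy.ndarray does not. Hence the AttributeError. The individual elements of the array are still strings, so the method works on them: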
In [18]: array1[0]
Out[18]: 'blabla.com'
In [19]: type(_)
Out[19]: numpy.str_
In [20]: array1[0].endswith('com')
Out[20]: True
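To make the operator/method distinction concrete, here is a minimal sketch. The class Elementwise is a hypothetical toy container invented for illustration, not NumPy's actual implementation. It only shows the mechanism: Python turns `obj == other` into a call to the class's `__eq__`, and the class is free to return whatever it likes, whereas `obj.endswith` is a plain attribute lookup that fails if the class never defined that name.

```python
import numpy as np


class Elementwise:
    """Hypothetical toy container (an assumption for illustration only)."""

    def __init__(self, items):
        self.items = list(items)

    def __eq__(self, other):
        # Python rewrites `self == other` into this call, so the class
        # decides what `==` means. Here: compare each item separately.
        return [item == other for item in self.items]


toy = Elementwise(["blabla.com", "blabla.org", "blibli.com"])
toy_eq = toy == "blabla.com"
print(toy_eq)                    # [True, False, False]
print(hasattr(toy, "endswith"))  # False: nobody defined that method

# ndarray behaves the same way: `==` is defined on the class,
# `endswith` is not, so only the latter raises AttributeError.
array1 = np.array(["blabla.com", "blabla.org", "blibli.com"])
array_eq = array1 == "blabla.com"
print(array_eq.tolist())            # [True, False, False]
print(hasattr(array1, "endswith"))  # False
```

So there is no guessing of intent in either case: `==` works because the array class defines what it means, and `.endswith` fails because the array class defines no such attribute.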
There is a group of functions in np.char that do apply string methods to the elements of a string dtype array:
In [21]: np.char.endswith(array1, 'org')
Out[21]: array([False, True, False])
This doesn't involve special numpy coding. The list comprehension [astr.endswith('org') for astr in array1] does just as well.
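As a quick check of that claim, the sketch below runs both approaches on the same array and compares the results. The variable names via_char and via_loop are just illustrative; the list comprehension returns a plain Python list, so it is wrapped in np.array to compare like with like.

```python
import numpy as np

array1 = np.array(["blabla.com", "blabla.org", "blibli.com"])

# Element-wise string method via the np.char helper
via_char = np.char.endswith(array1, "org")

# The same test written as an explicit loop over the elements
via_loop = np.array([astr.endswith("org") for astr in array1])

print(via_char.tolist())                   # [False, True, False]
print(np.array_equal(via_char, via_loop))  # True
```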