How to get only the index in numpy.where instead of a tuple

I have an array of strings, arr, in which I want to search for an element and get its index. NumPy has a function, where, that searches for elements and returns their indices in a tuple.

import numpy
arr = numpy.array(["string1", "string2", "string3"])
print(numpy.where(arr == "string1"))

It prints:

(array([0], dtype=int64),)

But I only want the index number 0.

I tried this:

i = numpy.where(arr == "string1")
print("idx = {}".format(i[0]))

which has output:

idx = [0]

Is there any way to get the index number itself without using replace or slicing?

CodePudding user response：

TL;DR

Use this if you want the index of the first match:

try:
    i = numpy.where(arr == "string1")[0][0]
except IndexError:
    # "string1" was not found in arr
    i = None

Or this if you want all the matching indices:

indices = list(numpy.where(arr == "string1")[0])

Details

Finding elements in NumPy arrays is not intuitive the first time you try to do it.

Let's decompose the operation:

>>> arr = numpy.array(["string1","string2","string3"])
>>> arr == "string1"
array([ True, False, False])

Notice how just doing arr == "string1" is already doing the search: it's returning an array of booleans of the same shape as arr telling us where the condition is true.

Then, you're using numpy.where which, when called with only one argument (the condition), returns the indices where its input is non-zero. With booleans, that means the positions that are True.

>>> numpy.where(numpy.array([ True, False, False]))
(array([0], dtype=int64),)
>>> numpy.where(arr == "string1")
(array([0], dtype=int64),)

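It may not be obvious why this gives you a tuple of arrays for a 1-D input. The reason is that numpy.where returns one index array per dimension of its input, so a 1-D input yields a tuple of length one. When you use this syntax with a 2-D input, it makes more sense.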
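To make the 2-D case concrete, here is a minimal sketch. The array grid and its contents are made up for illustration and are not part of the question. It shows numpy.where returning one index array per dimension, which can be zipped into (row, column) pairs.

```python
import numpy

# A small 2-D array of strings (hypothetical data, for illustration only).
grid = numpy.array([["a", "b"],
                    ["b", "a"]])

# One index array per dimension: row indices first, then column indices.
rows, cols = numpy.where(grid == "a")

# Pair them up to get the (row, col) coordinates of each match.
coords = [(int(r), int(c)) for r, c in zip(rows, cols)]
print(coords)  # [(0, 0), (1, 1)]
```

With a 1-D array the same rule applies. There is only one dimension, so the tuple holds a single array.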

In any case, what you're getting here is a tuple containing an array of the indices where the condition matches. It has to be an array rather than a single number, because you might have multiple matches.
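If unpacking the one-element tuple feels awkward, one possible alternative for 1-D input is numpy.flatnonzero, which returns the index array directly. This is a sketch of a different function than the one asked about, and the names mask, from_where and from_flat are only illustrative.

```python
import numpy

arr = numpy.array(["string1", "string2", "string3"])
mask = arr == "string1"

from_where = numpy.where(mask)[0]    # unpack the 1-tuple by hand
from_flat = numpy.flatnonzero(mask)  # index array returned directly

print(from_flat)  # [0]
```

Both results are 1-D index arrays, so everything said here about the inner array applies to either one.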

For your code, you want numpy.where(arr == "string1")[0][0], because you know "string1" occurs in the array. In general, though, the inner array may contain zero, one, or several values, depending on how many times the string is found.

>>> arr2 = numpy.array(["string1","string2","string3","string1", "string3"])
>>> numpy.where(arr2 == "foo")
(array([], dtype=int64),)
>>> numpy.where(arr2 == "string3")
(array([2, 4], dtype=int64),)

So when you want to use these indices, you should simply treat numpy.where(arr == "string1")[0] as a list (it's really a 1-D array, though) and continue from there.

Now, just using numpy.where(arr == "some string")[0][0] is risky, because it will raise an IndexError if the string is not found in arr. If you really want to do that, do it in a try/except block.
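One way to package that try/except is a small helper. This is only a sketch: the name first_index and the choice to return None when nothing matches are assumptions, not something NumPy provides.

```python
import numpy

def first_index(arr, value):
    """Return the index of the first match as a plain int, or None if absent.

    Hypothetical helper for illustration; not part of NumPy.
    """
    matches = numpy.where(arr == value)[0]
    try:
        return int(matches[0])
    except IndexError:
        # The index array is empty, so value does not occur in arr.
        return None

arr = numpy.array(["string1", "string2", "string3"])
print(first_index(arr, "string1"))  # 0
print(first_index(arr, "foo"))      # None
```

Returning None keeps the "not found" case explicit for the caller. Raising a ValueError instead, as list.index does, would be an equally reasonable design.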

If you need the list of indices as a Python list, you can do this:

indices = list(numpy.where(arr == "string1")[0])
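A small caveat, sketched below with illustrative variable names: list(...) keeps the elements as NumPy integer scalars, while the array's tolist() method converts them to built-in Python ints. This can matter for things like JSON serialization.

```python
import numpy

arr2 = numpy.array(["string1", "string2", "string3", "string1", "string3"])
matches = numpy.where(arr2 == "string3")[0]

as_list = list(matches)     # list of NumPy integer scalars
as_ints = matches.tolist()  # list of built-in Python ints

print(as_ints)  # [2, 4]
# The NumPy scalar type name depends on the platform, so it is printed, not assumed.
print(type(as_list[0]).__name__, type(as_ints[0]).__name__)
```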