Count the number of 0s until the first 1 in each row

Time:09-28

I have a 2D numpy array and I want to count, for each row, the number of values up to and including the first occurrence of a certain value. If that value doesn't appear in a row, return the length of that row. For example:

import numpy as np

val = 2
arr = np.array([
       [2, 2, 1, 1, 0],
       [0, 3, 1, 0, 0],
       [0, 1, 2, 0, 1]
])

I want array([1, 5, 3]) returned because the first 2 appears in the first column of the first row; doesn't appear in the second row; and appears in the third column of the third row.

My attempt is via building 2 new temporary arrays; very clunky to say the least.

new_arr = np.zeros_like(arr)
new_arr[np.arange(len(arr)), np.argmax(np.cumsum(arr==val, 1)==1, 1)] = 1
new_new_arr = np.nonzero(new_arr==1)[1] + 1
new_new_arr[np.all(arr!=val, 1)] = arr.shape[1]

It's a little like drawing from an exponential distribution, so I was hoping there was a native method in the numpy library that already implements something like this but I couldn't find it. How should I do this in a numpythonic way?

CodePudding user response:

In [180]: val = 2

In [181]: m = arr == val

In [182]: np.where(m.any(axis=1), m.argmax(axis=1) + 1, arr.shape[1])
Out[182]: array([1, 5, 3], dtype=int64)
  • build a boolean mask marking where each entry equals val
  • if a row contains val, take the argmax of that row's mask, i.e., the index of the first True
    • add 1 so that the matching value itself is counted
  • otherwise, return the number of columns, i.e., arr.shape[1]
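For reuse, the steps above could be wrapped in a small helper. This is only a sketch: the function name count_until_first is made up for illustration and is not part of numpy, and it assumes a rectangular 2D array.

```python
import numpy as np


def count_until_first(arr, val):
    """Per row, count entries up to and including the first `val`.

    Rows that do not contain `val` get the full row length.
    (Hypothetical helper name; assumes `arr` is a rectangular 2D array.)
    """
    mask = arr == val                      # True where the entry equals val
    first = mask.argmax(axis=1) + 1        # index of first True, plus 1 to include it
    # argmax returns 0 for an all-False row, so those rows are patched separately
    return np.where(mask.any(axis=1), first, arr.shape[1])


arr = np.array([
    [2, 2, 1, 1, 0],
    [0, 3, 1, 0, 0],
    [0, 1, 2, 0, 1],
])

result = count_until_first(arr, 2)
print(result)  # [1 5 3]
```

The np.where step matters because argmax on a row with no match also returns 0, which would otherwise be indistinguishable from a match in the first column.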