Find runs and lengths of consecutive values in an array

Time:04-05

I'd like to find equal values in an array, and their indices, if they occur consecutively more than 2 times.

[0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4]

So in this example I would find that the value "2" occurred "4" times, starting from position "8". Is there any built-in function to do that?

I found a way with collections.Counter

import collections
a = [0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4]
collections.Counter(a)
# Counter({2: 5, 1: 4, 0: 3, 3: 2, 4: 1})

but this is not what I am looking for, since it counts all occurrences rather than consecutive ones. Of course I can write a loop that compares neighbouring values and counts them, but maybe there is a more elegant solution?

CodePudding user response:

Find consecutive runs and length of runs with condition

import numpy as np

arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4])

res = np.ones_like(arr)
np.bitwise_xor(arr[:-1], arr[1:], out=res[1:])  # 0 where an element equals its predecessor (integer dtypes only)
# for float arrays, use this instead:
# arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2.4, 2.4, 2.4, 2, 1, 3, 4, 4, 4, 5])
# res = np.hstack([True, ~np.isclose(arr[:-1], arr[1:])])
idxs = np.flatnonzero(res)                      # nonzero entries mark the start index of each run
values = arr[idxs]                              # value of each run
counts = np.diff(idxs, append=len(arr))         # the gap between consecutive run starts is the run length

cond = counts > 2
values[cond], counts[cond], idxs[cond]

Output

(array([2]), array([4]), array([8]))
# (array([2.4, 4. ]), array([3, 3]), array([ 8, 14]))
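
For reuse, the steps above can be wrapped in a small helper. This is only a sketch: the name find_runs and the min_len parameter are hypothetical, and it detects changes with != rather than XOR so it is not tied to integer dtypes. The comparison is exact, so floats that carry rounding error would still need the np.isclose variant.

```python
import numpy as np

def find_runs(arr, min_len=3):
    """Return (values, counts, starts) for runs of at least min_len equal elements."""
    arr = np.asarray(arr)
    if arr.size == 0:
        empty = np.array([], dtype=int)
        return arr, empty, empty
    # True wherever a new run starts; the first element always starts one
    is_start = np.concatenate(([True], arr[1:] != arr[:-1]))
    starts = np.flatnonzero(is_start)
    # distance from one run start to the next (or to the end) is the run length
    counts = np.diff(starts, append=arr.size)
    keep = counts >= min_len
    return arr[starts][keep], counts[keep], starts[keep]

values, counts, starts = find_runs([0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4])
print(values.tolist(), counts.tolist(), starts.tolist())  # [2] [4] [8]
```

Passing min_len=3 corresponds to "more than 2 times" in the question.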

CodePudding user response:

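import numpy as np

arr = np.array([0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4])

# the cumulative sum of "value changed" flags gives every run its own id
_, i, c = np.unique(np.r_[[0], ~np.isclose(arr[:-1], arr[1:])].cumsum(),
                    return_index=True,
                    return_counts=True)
for index, count in zip(i, c):
    if count > 2:  # keep runs longer than 2, as the question asks
        print([arr[index], count, index])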

Out[]:  [2, 4, 8]

A slightly more compact way of doing it. Because it compares neighbours with np.isclose, it works for both integer and float input.
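
For non-numeric data such as strings, where np.isclose does not apply, a pure-Python sketch with itertools.groupby from the standard library may be enough. The function name runs and the min_len parameter are hypothetical, chosen only for illustration; this is an alternative technique, not the approach used in the answers above.

```python
from itertools import groupby

def runs(seq, min_len=3):
    """Yield (value, count, start_index) for each run of at least min_len equal items."""
    start = 0
    for value, group in groupby(seq):
        count = sum(1 for _ in group)  # length of the current run
        if count >= min_len:
            yield value, count, start
        start += count  # the next run begins right after this one

data = [0, 3, 0, 1, 0, 1, 2, 1, 2, 2, 2, 2, 1, 3, 4]
print(list(runs(data)))       # [(2, 4, 8)]
print(list(runs("aabbbbc")))  # [('b', 4, 2)]
```

groupby only groups adjacent equal items, which is exactly the run behaviour wanted here. It runs at Python speed, though, so for large numeric arrays the NumPy versions are likely to be faster.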
