Extract the first and last indices of all sequences of 1s in a numpy array and append them to a list

Time:12-10

I have a mask vector for an audio file (time series) that contains 1s and 0s. The mask contains long sequences of 1s for the intervals of the audio signal where there is some favourable activity, and 0s where there is noise. I want to extract all the activity parts from the audio signal and store them as separate audio files. For this, I am looking for the most efficient way to extract the start and end indices of all sequences of 1s from the mask vector and append them to, for instance, a list.

CodePudding user response:

I'd do something like this:

groups = df.groupby(df['your_col'].ne(df['your_col'].shift(1)).cumsum()[df['your_col'].eq(1)])
for _, group in groups:
    # At this point, 'group' is a separate dataframe containing all the rows where 'your_col' is consecutively 1
    ...  # process each group here

This groups the rows into runs of consecutive 1s (any run of one or more 0s ends the previous run of 1s) and then loops over each group, which is a slice of the original dataframe.
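To get from the groups to the start and end indices the question asks for, something like the following sketch might work. The sample data and the names `run_id` and `bounds` are made up for illustration; `your_col` stands in for whatever the mask column is actually called.

```python
import pandas as pd

# Hypothetical sample mask; 'your_col' is a placeholder column name.
df = pd.DataFrame({"your_col": [1, 1, 1, 0, 0, 1, 0, 0, 1, 1]})
col = df["your_col"]

# A new run id starts whenever the value differs from the previous row.
run_id = col.ne(col.shift(1)).cumsum()

# Keep run ids only where the value is 1; rows with 0 get no key
# and are dropped by groupby.
groups = df.groupby(run_id[col.eq(1)])

# First and last index label of each run of 1s (end is inclusive).
bounds = [(int(g.index[0]), int(g.index[-1])) for _, g in groups]
print(bounds)  # [(0, 2), (5, 5), (8, 9)]
```

This assumes the dataframe has a default integer index, so the index labels are also the sample positions.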

CodePudding user response:

Given your data, you can create idx by indexing on 1 and use np.split to split it into subarrays of consecutive indices.

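import numpy as np
import pandas as pd

data = pd.Series([1,1,1,0,0,1,0,0,1,1])
idx = data[data==1].index.values  # positions where the mask is 1
# split idx wherever consecutive positions are not adjacent, then keep the first and last of each piece
out = [arr[[0,-1]] for arr in np.split(idx, np.where(np.diff(idx) != 1)[0] + 1)]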

So in this example, runs of 1s occur three separate times, at indices 0-2, 5 and 8-9:

[array([0, 2], dtype=int64),
 array([5, 5], dtype=int64),
 array([8, 9], dtype=int64)]
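If the mask is already a plain NumPy array, a pandas-free variant is also possible. The sketch below is not from either answer: it pads the mask with a zero on each side and looks for 0-to-1 and 1-to-0 transitions with np.diff. The function name `run_bounds` and the `audio` array are hypothetical, used here only to show how the index pairs could slice the signal.

```python
import numpy as np

def run_bounds(mask):
    """Return (start, end) pairs, end inclusive, for each run of 1s."""
    mask = np.asarray(mask).astype(bool)
    # Pad with zeros so runs touching either edge still yield a transition.
    padded = np.concatenate(([0], mask.astype(np.int8), [0]))
    change = np.diff(padded)
    starts = np.flatnonzero(change == 1)      # 0 -> 1 transitions
    ends = np.flatnonzero(change == -1) - 1   # 1 -> 0 transitions, made inclusive
    return list(zip(starts.tolist(), ends.tolist()))

mask = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 1])
bounds = run_bounds(mask)
print(bounds)  # [(0, 2), (5, 5), (8, 9)]

# Hypothetical audio signal with the same length as the mask.
audio = np.arange(mask.size)
segments = [audio[start:end + 1] for start, end in bounds]
```

Each entry of `segments` could then be written out with whichever audio library is in use. Whether this is faster than the np.split version for a given mask length would need to be measured.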