Compare values within a certain timeframe in arrays

Time:03-09

I am trying to compare values (0s and 1s) in an array. For each "1" that appears in one column, I want to search for another "1" in the other column within a specific timeframe (for example, 5 seconds, 10 seconds, etc.). I will call the 1s "signals".

For example, I have an array such as:

data1 = [[ 0  0  0]
         [ 1  0  0]
         [ 2  0  0]
         [ 3  0  0]
         [ 4  0  0]
         [ 5  0  0]
         [ 6  0  1]
         [ 7  0  0]
         [ 8  0  0]
         [ 9  0  0]
         [10  1  0]
         [11  0  0]
         [12  0  0]
         [13  0  0]
         [14  0  0]
         [15  0  0]
         [16  0  0]
         [17  0  0]
         [18  0  0]
         [19  0  0]
         [20  0  1]
         [21  0  0]
         [22  0  0]
         [23  0  0]
         [24  0  0]
         [25  0  0]]

This is much smaller than my actual data, but the idea is the same: the first column holds the timestamps, and the second and third columns hold the signals. I would like to calculate the proportion of signals that occur within the same time interval as at least one signal in the other column. I would like to do this for multiple timeframes, such as 5 seconds, 10 seconds, etc., to see the differences.

I've tried a for loop over the array and could find the signals in it. However, I was unable to express the condition of checking whether a signal in the other column falls within a certain timeframe.

Hope I was clear. Thank you!

CodePudding user response:

I have a working solution, though I'm sure there are more efficient ones. I have abbreviated data to d, which I am assuming is a NumPy array.

import numpy as np

# Get all signal columns from the array. d1 = middle column, d2 = last column.
d1 = d[:,1]
d2 = d[:,2]

# If there is a signal in either signal column (i.e. if either column has value 1 in a row), then final_d is 1 there. Basically, final_d is 1 if there is a signal in any column.
final_d = np.logical_or(d1,d2).astype(int)
length = final_d.shape[0]

# flags is in int form for now. flags = 0 means False, flags = 1 means True. Starts out with all flags being False.
flags = np.zeros((length), dtype=int)

# What range you want to work within, e.g. 5 seconds, 10 seconds, etc.
time_range = 5

# This loop gets all subgroups/time ranges of time_range consecutive values.
# This is why the loop does not go all the way to len(final_d); there are not that many subgroups.
for i in range(length - time_range + 1):

    # Get each subgroup, i.e. time range.
    # Then get the indices within this_range (the subgroup) that are equal to 1.
    this_range = final_d[i:i + time_range]
    indices_of_signals = np.array(np.where(this_range == 1)) + i

    # There is more than one signal in the subgroup if the sum of the signals is at least 2.
    # If this is the case, then change the flag for all signals within this_range to 1.
    if np.sum(this_range) >= 2:
        flags[indices_of_signals] = 1

# Changes flags from int form to boolean (True/False) form.
flags = flags.astype(bool)
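
To get from the flags to the proportion the question asks for, the loop can be wrapped in a function and run for several window sizes. The following is a minimal sketch, not a definitive implementation: the function name flag_nearby_signals is hypothetical, the sample array is rebuilt from the question's data, and it assumes one row per second, as the loop above does.

```python
import numpy as np

def flag_nearby_signals(d, time_range):
    """Hypothetical wrapper around the sliding-window loop.

    Assumes one row per second, so a window of `time_range` rows
    stands in for `time_range` seconds.
    """
    # 1 wherever either signal column has a signal.
    final_d = np.logical_or(d[:, 1], d[:, 2]).astype(int)
    length = final_d.shape[0]
    flags = np.zeros(length, dtype=bool)
    for i in range(length - time_range + 1):
        this_range = final_d[i:i + time_range]
        # Two or more signals in the window: flag every signal in it.
        if np.sum(this_range) >= 2:
            flags[np.flatnonzero(this_range) + i] = True
    return flags, final_d.astype(bool)

# Sample data from the question: a signal at t=10 in the middle column,
# and signals at t=6 and t=20 in the last column.
d = np.zeros((26, 3), dtype=int)
d[:, 0] = np.arange(26)
d[10, 1] = 1
d[[6, 20], 2] = 1

for time_range in (3, 5, 11):
    flags, is_signal = flag_nearby_signals(d, time_range)
    # Proportion of signals that have at least one other signal nearby.
    proportion = flags[is_signal].mean() if is_signal.any() else 0.0
    print(time_range, proportion)
```

With the sample data and a window of 5 rows, the signals at t=6 and t=10 share a window while the one at t=20 does not, so two of the three signals are flagged.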

I did not use chunking (i.e. considering chunks 0-4, 5-9, 10-14, etc.) because, with chunks, signals in rows 4 and 7 would not fall in the same 5-second chunk even though they are within a 5-second time range of each other. My method returns a True flag if a signal is near any other signal within +/- time_range.
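
One caveat: because the two signal columns are merged with np.logical_or, two signals in the same column also flag each other, and rows are treated as evenly spaced seconds. If the comparison should only be against the other column, as the question describes, a variant along the following lines may be closer. This is a sketch under stated assumptions: proportion_with_partner is a hypothetical name, the timestamps in the first column are assumed to be sorted in ascending order, and "within the window" is taken to mean an absolute time difference of at most window.

```python
import numpy as np

def proportion_with_partner(data, window):
    """Share of signals that have a signal in the OTHER column within
    `window` time units (hypothetical helper; timestamps must be sorted)."""
    t = data[:, 0]
    times_a = t[data[:, 1] == 1]  # timestamps of signals in the middle column
    times_b = t[data[:, 2] == 1]  # timestamps of signals in the last column

    def has_partner(own, other):
        # Count the entries of `other` inside [own - window, own + window]
        # using two binary searches per signal.
        lo = np.searchsorted(other, own - window, side="left")
        hi = np.searchsorted(other, own + window, side="right")
        return hi > lo

    matched = np.concatenate([has_partner(times_a, times_b),
                              has_partner(times_b, times_a)])
    return matched.mean() if matched.size else 0.0

# Sample data from the question.
data = np.zeros((26, 3), dtype=int)
data[:, 0] = np.arange(26)
data[10, 1] = 1
data[[6, 20], 2] = 1

for window in (3, 5, 10):
    print(window, proportion_with_partner(data, window))
```

Because this version works on the timestamps rather than on row positions, it does not depend on the rows being exactly one second apart. Note that its window is symmetric and inclusive, so its results can differ from the sliding-window loop for the same number of seconds.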
