Adding values to a list below a threshold number

I have a list of random data and a threshold:

threshold = 3
data = [2,2,2,2,2,5,5,2,2,2,2,3,4,5,6,4,5,4,3,4,5,3,3,7,8,2,2,2] # data
timestamp =[] 
for i in range(len(data)):
    timestamp.append(i)
print(timestamp)

I am trying to extract the timestamps whose values are below the threshold. However, if a run of fewer than 4 consecutive timestamps that are not below the threshold sits between two ranges that are below it, that run should also be treated as below the threshold.

As such, this example should return:

belowthreshold = [0,1,2,3,4,5,6,7,8,9,10,25,26,27]

Here the consecutive 5, 5 (timestamps 5 and 6) is treated as below the threshold, because the run is shorter than 4 and the values before and after it are below the threshold.

Currently, my method is:

belowthreshold = []
for j in range(len(data)):
    if data[j] < threshold and data[j]: # check if below the threshold (the extra "and data[j]" also skips zero values)
        belowthreshold.append(j) # add this time to a list

However, it clearly only extracts the timestamps whose values are below the threshold.

What is the best way to approach this?

Thanks in advance for your answers

CodePudding user response：

Try with list comprehension using itertools.zip_longest:

import itertools

output = [i for i, (x, y, z) in enumerate(itertools.zip_longest(data,data[1:],data[2:],fillvalue=0)) if x<threshold or y<threshold or z<threshold]

>>> output
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 23, 24, 25, 26, 27]

Edit:

To take into account that the "consecutive" timestamps can be on either side, you can use itertools.groupby using a custom key to check if the value is less than the threshold.

This splits the data into the following groups: [2, 2, 2, 2, 2], [5, 5], [2, 2, 2, 2], [3, 4, 5, 6, 4, 5, 4, 3, 4, 5, 3, 3, 7, 8], [2, 2, 2]
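
To see those groups directly, here is a minimal sketch that records each run together with its starting index. The names `runs` and `start` are only illustrative and are not part of the answer's code:

```python
import itertools

threshold = 3
data = [2,2,2,2,2,5,5,2,2,2,2,3,4,5,6,4,5,4,3,4,5,3,3,7,8,2,2,2]

runs = []   # one (start index, is_below, values) tuple per run
start = 0
for is_below, group in itertools.groupby(data, key=lambda x: x < threshold):
    values = list(group)
    runs.append((start, is_below, values))
    start += len(values)

for run in runs:
    print(run)
```

The loop in this answer walks the same groups and keeps the indices of the below-threshold groups and of the short groups.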

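output = list()
i = 0  # index of the first element of the current group
for k, v in itertools.groupby(data, key=lambda x: x<threshold):
    values = list(v)
    # keep groups below the threshold, and groups of fewer than 4 values
    if k or len(values) < 4:
        output += [i + x for x in range(len(values))]
    i += len(values)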

>>> output
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 25, 26, 27]
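
One thing to watch: a groupby loop that only checks the group length also keeps a short run that sits at the very start or end of the data, where it is not actually between two below-threshold ranges. If that matters for your data, a possible variant is sketched below. The function name `below_threshold_indices` and the `max_gap` parameter are hypothetical, and the sketch assumes "between" means the run has a below-threshold range on both sides:

```python
import itertools

def below_threshold_indices(data, threshold, max_gap=3):
    """Indices below `threshold`, plus gaps of at most `max_gap` values
    that sit between two below-threshold runs (hypothetical helper)."""
    # Collect (start, is_below, length) for each run.
    runs = []
    start = 0
    for is_below, group in itertools.groupby(data, key=lambda x: x < threshold):
        length = sum(1 for _ in group)
        runs.append((start, is_below, length))
        start += length

    result = []
    for n, (start, is_below, length) in enumerate(runs):
        # groupby alternates keys, so an interior run that is not below
        # the threshold always has below-threshold runs on both sides.
        is_interior = 0 < n < len(runs) - 1
        if is_below or (is_interior and length <= max_gap):
            result.extend(range(start, start + length))
    return result

data = [2,2,2,2,2,5,5,2,2,2,2,3,4,5,6,4,5,4,3,4,5,3,3,7,8,2,2,2]
print(below_threshold_indices(data, 3))
# [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 25, 26, 27]
print(below_threshold_indices([5, 5, 2, 2], 3))
# [2, 3]
```

The second call shows the difference: the leading 5, 5 is short, but nothing below the threshold comes before it, so it is left out.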

CodePudding user response：

You can use a temporary list to store the steps that are above the threshold. When the values drop below the threshold again, you add the stored steps to the result list if they stayed above for 3 or fewer steps; otherwise you reset the temporary list. Here we go:

threshold = 3
data = [2,2,2,2,2,5,5,2,2,2,2,3,4,5,6,4,5,4,3,4,5,3,3,7,8,2,2,2] # data
steps =[] # 'time stamp'
for i in range(len(data)):
    steps.append(i)
print(steps)

belowthreshold = []
temp_above_threshold = []
consecutive_above_counter = 0

for j in range(len(data)):
    if data[j] < threshold:
        if consecutive_above_counter < 4: # add only if less than 4 steps were above threshold
            belowthreshold = belowthreshold + temp_above_threshold
        # reset counter and temporary list
        consecutive_above_counter = 0
        temp_above_threshold = []
        belowthreshold.append(j) # add this time to a list
    else:
        consecutive_above_counter += 1
        
        if consecutive_above_counter < 4:
            temp_above_threshold.append(j)
        else:
            temp_above_threshold = []
print(belowthreshold)

edit: I tried to offer a simple solution that follows your code, without adding extra packages whose complexity might be difficult to keep track of later.

CodePudding user response：

I have managed to replicate your output with the following code:

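def below_threshold(threshold, list_of_value, min_run=4):
    # indices that belong to a run of at least min_run consecutive values >= threshold
    indices = set()
    for i in range(min_run - 1, len(list_of_value)):
        window = range(i - min_run + 1, i + 1)
        if all(list_of_value[k] >= threshold for k in window):
            indices.update(window)
    # everything else counts as below the threshold; sort because sets are unordered
    return sorted(set(range(len(list_of_value))).difference(indices))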

print(below_threshold(3, [2, 2, 2, 2, 2, 5, 5, 2, 2, 2, 2, 3, 4, 5, 6, 4, 5, 4, 3, 4, 5, 3, 3, 7, 8, 2, 2, 2]))