Adding values to a list below a threshold number

Time:03-12

I have a list of random data and a threshold:

threshold = 3
data = [2,2,2,2,2,5,5,2,2,2,2,3,4,5,6,4,5,4,3,4,5,3,3,7,8,2,2,2] # data
timestamp =[] 
for i in range(len(data)):
    timestamp.append(i)
print(timestamp)

I am trying to extract the timestamps whose values are below the threshold. In addition, if a run of fewer than 4 consecutive timestamps at or above the threshold falls between two below-threshold ranges, that run should also be treated as below the threshold.

As such, this example should return:

belowthreshold = [0,1,2,3,4,5,6,7,8,9,10,25,26,27]

Here the consecutive 5, 5 (timestamps 5 and 6) is treated as below the threshold because the values before and after it are below the threshold, while the longer run at timestamps 11 to 24 is excluded.

Currently, my method is:

belowthreshold = []
for j in range(len(data)):
    if data[j] < threshold and data[j]: # value is below the threshold (and non-zero)
        belowthreshold.append(j) # add this time to a list

However, this clearly only extracts the timestamps whose values are below the threshold.

What is the best way to approach this?

Thanks in advance for your answers

CodePudding user response:

Try a list comprehension with itertools.zip_longest, which keeps an index if its value or either of the next two values is below the threshold:

import itertools

output = [i for i, (x, y, z) in enumerate(itertools.zip_longest(data,data[1:],data[2:],fillvalue=0)) if x<threshold or y<threshold or z<threshold]

>>> output
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 23, 24, 25, 26, 27]
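As a quick check (a sketch added for illustration, using the expected list from the question), comparing this output against the expected result shows which indices the look-ahead window adds:

```python
import itertools

threshold = 3
data = [2,2,2,2,2,5,5,2,2,2,2,3,4,5,6,4,5,4,3,4,5,3,3,7,8,2,2,2]
expected = [0,1,2,3,4,5,6,7,8,9,10,25,26,27]  # taken from the question

# Same look-ahead comprehension as above.
output = [i for i, (x, y, z) in enumerate(
    itertools.zip_longest(data, data[1:], data[2:], fillvalue=0))
    if x < threshold or y < threshold or z < threshold]

# Indices the look-ahead keeps that the question does not expect.
extra = sorted(set(output) - set(expected))
print(extra)
```

The extra indices come from the end of the long above-threshold run: the window only looks ahead, so the last two values before a below-threshold range are always kept.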
Edit:

To take into account that the "consecutive" timestamps can be on either side, you can use itertools.groupby using a custom key to check if the value is less than the threshold.

This splits the data into the following groups: [2, 2, 2, 2, 2], [5, 5], [2, 2, 2, 2], [3, 4, 5, 6, 4, 5, 4, 3, 4, 5, 3, 3, 7, 8], [2, 2, 2]

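output = []
i = 0  # index in data where the current group starts
for k, v in itertools.groupby(data, key=lambda x: x<threshold):
    values = list(v)
    # keep groups below the threshold and short groups (fewer than 4 values)
    if k or len(values) < 4:
        output += [i+x for x in range(len(values))]
    i += len(values)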

>>> output
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 25, 26, 27]
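Note that the loop above also keeps a short above-threshold group at the very start or end of the data, where it is not enclosed by below-threshold values on both sides. If the rule should apply only to gaps between two below-threshold ranges, as the question words it, a minimal sketch could look like this. The function name below_with_short_gaps and the parameter max_gap are hypothetical, chosen for illustration:

```python
from itertools import groupby

def below_with_short_gaps(data, threshold, max_gap=3):
    """Indices below threshold, plus short above-threshold gaps enclosed by them.

    max_gap is an assumed parameter: the longest enclosed gap that is still kept.
    """
    # Collect (is_below, start, length) for each run of consecutive values.
    runs = []
    start = 0
    for is_below, group in groupby(data, key=lambda x: x < threshold):
        length = sum(1 for _ in group)
        runs.append((is_below, start, length))
        start += length

    result = []
    for n, (is_below, start, length) in enumerate(runs):
        # groupby alternates keys, so an above-threshold run is enclosed
        # exactly when it is neither the first nor the last run.
        enclosed = 0 < n < len(runs) - 1
        if is_below or (enclosed and length <= max_gap):
            result.extend(range(start, start + length))
    return result

data = [2,2,2,2,2,5,5,2,2,2,2,3,4,5,6,4,5,4,3,4,5,3,3,7,8,2,2,2]
print(below_with_short_gaps(data, 3))
```

For the data in the question this gives the same result as the loop above; the two only differ when the data starts or ends with a short above-threshold run.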

CodePudding user response:

You can use a temporary list to store the steps while the values are at or above the threshold. When the values drop below the threshold again, add the stored steps to the result list if the run lasted 3 steps or fewer; otherwise discard them. Either way, reset the temporary list afterwards. Here we go:

threshold = 3
data = [2,2,2,2,2,5,5,2,2,2,2,3,4,5,6,4,5,4,3,4,5,3,3,7,8,2,2,2] # data
steps =[] # 'time stamp'
for i in range(len(data)):
    steps.append(i)
print(steps)

belowthreshold = []
temp_above_threshold = []
consecutive_above_counter = 0

for j in range(len(data)):
    if data[j] < threshold:
        if consecutive_above_counter < 4: # add only if less than 4 steps were above threshold
            belowthreshold = belowthreshold + temp_above_threshold
        # reset counter and temporary list
        consecutive_above_counter = 0
        temp_above_threshold = []
        belowthreshold.append(j) # add this time to a list
    else:
        consecutive_above_counter += 1
        
        if consecutive_above_counter < 4:
            temp_above_threshold.append(j)
        else:
            temp_above_threshold = []
print(belowthreshold)

Edit: I tried to keep the solution simple and close to your code, without pulling in extra packages that could make it harder to follow later.

CodePudding user response:

I have managed to replicate your output with the following code:

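def below_threshold(threshold, list_of_value):
    # indices belonging to a run of 4 or more values at or above the threshold
    indices = set()
    for i in range(3, len(list_of_value)):
        window = [i, i - 1, i - 2, i - 3]
        if all(list_of_value[k] >= threshold for k in window):
            indices = indices.union(window)
    # everything else is below the threshold or in a run shorter than 4
    return sorted(set(range(len(list_of_value))).difference(indices))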

print(below_threshold(3, [2, 2, 2, 2, 2, 5, 5, 2, 2, 2, 2, 3, 4, 5, 6, 4, 5, 4, 3, 4, 5, 3, 3, 7, 8, 2, 2, 2]))