How to find consecutive points and their index below a threshold from a list of points in Python?

I have lists of data points, which I look at to see if they are above a certain threshold.

I can calculate the percentage of total points above the threshold, but I also need the indexes and the number of points for every run of consecutive points above the threshold, e.g.

points_above_threshold = [1,1,1,0,0,0,1,1]  # 1 is yes, 0 is no

I need a function which returns the points in the format: [line_points, (start_index, end_index)]

e.g. the output of points_above_threshold would be [3,(0,2)],[2,(6,7)]

CodePudding user response：

Your question is lacking some detail about the format of the data you're working with. A good starting point is to specify precisely the expected input and output for your function.

For example, if your data is a list of numbers (floats) like this:

[1.56, 2.45, 8.43, ... ]

your threshold is a single floating point number, and your output is expected to be a list of tuples (index, data_point) like this:

[(1, 2.45), (2, 8.43), ... ]

Then you can write a function that looks something like this:

def get_points_above_threshold(data_list, threshold):
    output = []
    for idx, point in enumerate(data_list):
        if point > threshold:
            output.append((idx, point))
    return output

I'll attempt to answer how to implement the points_above_threshold function you describe. We can alter the above function slightly with a tracking system to calculate the index ranges of values that are above the threshold like this:

def compute_ranges(values, threshold):
    start_range = None                             # Start index of the active range, or None if there isn't one
    ranges = []                                    # tuples (start_idx, end_idx), inclusive
    for idx, value in enumerate(values):
        if value <= threshold:                     # This either ends an "active" range, or does nothing if there isn't one.
            if start_range is None:                # If no current range, continue
                continue
            ranges.append((start_range, idx-1))    # Otherwise end current range, append it to ranges, and reset range variables
            start_range = None
        else:                                      # Otherwise, we either start an "active" range or continue one that already exists
            if start_range is None:
                start_range = idx
    if start_range is not None:                    # If still an active range, append it (since range could end at end of list)
        ranges.append((start_range, len(values)-1))
    final = [(r[1]-r[0]+1, r) for r in ranges]     # Convert to the output format, which includes the length of each range
    return final

If we apply this function to a list of numbers with a given threshold, it will output the ranges in the way you describe above. For example, if the input list is the simple example

[1,1,1,0,0,0,1,1]

and the threshold is say 0.5, then the output is

[(3, (0, 2)), (2, (6, 7))]
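
As a more compact variation on the same idea, here is a minimal sketch that uses itertools.groupby from the standard library to split the list into runs of above/not-above values. The function name runs_above_threshold is made up for this illustration, and it assumes "above" means strictly greater than the threshold, as in the function above.

```python
from itertools import groupby

def runs_above_threshold(values, threshold):
    """Return [(run_length, (start_idx, end_idx)), ...] for runs of values > threshold."""
    runs = []
    idx = 0  # index of the first element of the current group
    for is_above, group in groupby(values, key=lambda v: v > threshold):
        length = sum(1 for _ in group)  # number of elements in this group
        if is_above:
            runs.append((length, (idx, idx + length - 1)))
        idx += length
    return runs

print(runs_above_threshold([1, 1, 1, 0, 0, 0, 1, 1], 0.5))
# [(3, (0, 2)), (2, (6, 7))]
```

Because groupby only groups adjacent equal keys, each group is exactly one run, so no explicit "active range" variable is needed.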

CodePudding user response：

Using enumerate and pairwise iteration we can achieve what you want.

# enumerate helps us to isolate the indexes of 1's
points_above_threshold = [1,1,1,0,0,0,1,1]
id_ = [i for i, e in enumerate(points_above_threshold) if e == 1]   # list comprehension
print(id_)
# [0, 1, 2, 6, 7]   all indexes of 1's

# pairwise iteration helps us find the
# sequences of indexes, e.g. (0,1,2) and (6,7) are sequences 
pairwise = [[]]
for item1, item2 in zip(id_, id_[1:]):
    if item2 - item1 == 1:
        if not pairwise[-1]:
            pairwise[-1].extend((item1, item2))
        else:
            pairwise[-1].append(item2)
    elif pairwise[-1]:
        pairwise.append([])

print(pairwise)
# [[0, 1, 2], [6, 7]]

# with the code above we've just iterated over the id_ list
# and created another list with the sequences nested

# now using list comprehension we can achieve the output,
# but with tuples nested inside a list
points_above_threshold = [(len(i), (i[0], i[-1])) for i in pairwise]
print(points_above_threshold)
# [(3, (0, 2)), (2, (6, 7))]
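
One caveat: the pairwise loop only records indexes that have a consecutive neighbour, so an isolated point (a run of length 1) is dropped, and an input such as [1,1,0,0,1] leaves an empty list at the end of pairwise, which makes i[0] raise an IndexError. If those cases matter, a possible sketch is to group the indexes with itertools.groupby instead. The helper name group_consecutive is hypothetical, and it assumes the index list is sorted in ascending order, which is what enumerate produces.

```python
from itertools import groupby

def group_consecutive(indexes):
    """Group a sorted list of ints into (length, (first, last)) runs."""
    runs = []
    # Within a run of consecutive integers, value - position stays constant,
    # so that difference can serve as the grouping key.
    for _, group in groupby(enumerate(indexes), key=lambda pair: pair[1] - pair[0]):
        members = [value for _, value in group]
        runs.append((len(members), (members[0], members[-1])))
    return runs

print(group_consecutive([0, 1, 2, 6, 7]))  # [(3, (0, 2)), (2, (6, 7))]
print(group_consecutive([0, 1, 4]))        # [(2, (0, 1)), (1, (4, 4))]
```

Here a single point above the threshold is reported as a run of length 1 whose start and end index are the same.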

Hope this is helpful!