Index array based on value limits of another

Time:10-02

Let's say I have an array (or even a list) that looks like:

tmp_data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

And then I have another array of distance values:

dist_data = [15.625, 46.875, 78.125, 109.375, 140.625, 171.875, 203.125, 234.375, 265.625, 296.875]

Now, say I want to apply an operation to tmp_data in chunks defined by a distance threshold. For this example, let's take the maximum, with the threshold distance set to 100. What I would like to do is group the elements into successive 100-unit distance intervals and replace every element in each group with that group's maximum. For example, I would want the final output to be

max_tmp_data_100 = [2,2,2,5,5,5,8,8,8,9]

This is because the first 3 elements in dist_data are below 100, so we take the first three elements of tmp_data (0, 1, 2), get their maximum, and replace all three with that value, 2.

Then, the next set of data, below the next threshold of 200, would be

tmp_dist_array_100 = [109.375 140.625 171.875]
tmp_data_100 = [3,4,5]
max_tmp_data_100 = [5,5,5]
(append to [2,2,2])

I have come up with the following:

import numpy as np

# Initialize
final_res = 100  # threshold distance
final_array = []
d_array = []
idx = 1

for i in range(0,10):
    if dist_data[i] < idx * final_res:
        d_array.append(tmp_data[i])
    elif dist_data[i] > idx * final_res:
        # Now get the values
        max_val = np.amax(d_array)
        new_array = np.ones(len(d_array)) * max_val
        final_array.extend(new_array)
        idx = idx + 1

But the outcome is

[2.0, 2.0, 2.0, 5.0, 5.0, 5.0, 5.0, 5.0]

When it should be [2,2,2,5,5,5,8,8,8,9]

CodePudding user response:

You can do this with itertools.groupby:

from itertools import groupby

dist_data = [15.625, 46.875, 78.125, 109.375, 140.625, 171.875, 203.125, 234.375, 265.625, 296.875]
tmp_data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
result = []

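# Group positions by 100-unit distance bin (dist_data must be sorted ascending)
groups = groupby(range(len(dist_data)), key=lambda i: dist_data[i] // 100)
for _, positions in groups:
    positions = list(positions)
    # Use the max of the tmp_data values, not the max position
    group_max = max(tmp_data[i] for i in positions)
    result.extend([group_max] * len(positions))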

print(result)
# [2, 2, 2, 5, 5, 5, 9, 9, 9, 9]

As per your requirements, the last 4 elements all fall under the next threshold value (300), so the max of those 4 elements is 9.
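For reference, the loop from the question can also be repaired directly. As written, it never appends the element that crosses a threshold, never empties d_array after a group is written out, and never writes out the final group, which is why the output stops at eight values. The following is a minimal sketch of one possible fix, assuming dist_data is sorted in ascending order; the function name threshold_max is made up for illustration.

```python
def threshold_max(tmp_data, dist_data, final_res):
    """Replace each value with the max of its distance bin.

    Assumes dist_data is sorted ascending and has the same length as tmp_data.
    """
    final_array = []
    d_array = []  # values collected for the current bin
    idx = 1       # current bin covers distances below idx * final_res
    for value, dist in zip(tmp_data, dist_data):
        # Close every bin whose upper edge this distance has reached.
        while dist >= idx * final_res:
            if d_array:
                final_array.extend([max(d_array)] * len(d_array))
                d_array = []  # start the next bin from scratch
            idx += 1
        # The current element always belongs to the (possibly new) bin.
        d_array.append(value)
    # Write out the last bin, which no later element will close.
    if d_array:
        final_array.extend([max(d_array)] * len(d_array))
    return final_array


dist_data = [15.625, 46.875, 78.125, 109.375, 140.625,
             171.875, 203.125, 234.375, 265.625, 296.875]
tmp_data = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(threshold_max(tmp_data, dist_data, 100))
# [2, 2, 2, 5, 5, 5, 9, 9, 9, 9]
```

Using a while loop rather than a single if also handles the case where a distance jumps over one or more empty bins.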

CodePudding user response:

With numpy:

import numpy as np

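dist_data = [15.625, 46.875, 78.125, 109.375, 140.625, 171.875, 203.125, 234.375, 265.625, 296.875]
cut = 100

a = np.array(dist_data)
# Position of the last element below each bin edge (100, 200, 300)
vals = np.searchsorted(a, np.r_[cut:a.max() + cut:cut]) - 1
# Map each element to its bin and look up that position
print(vals[(a/cut).astype(int)])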

It gives:

[2 2 2 5 5 5 9 9 9 9]
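
Note that the printed numbers are positions of the last element in each bin. They match the desired maxima here only because tmp_data is 0 through 9, so each value equals its own index. If tmp_data holds arbitrary values, one possible vectorized approach is np.maximum.reduceat, as in the sketch below. It assumes dist_data is sorted ascending and non-empty; the function name binned_max is made up for illustration.

```python
import numpy as np


def binned_max(values, dist, cut):
    """Replace each value with the max of its distance bin.

    Assumes dist is sorted ascending, non-empty, and as long as values.
    """
    values = np.asarray(values)
    dist = np.asarray(dist)
    # Bin label of each element: 0 for [0, cut), 1 for [cut, 2*cut), ...
    bins = (dist // cut).astype(int)
    # Start position of each run of identical bin labels
    starts = np.flatnonzero(np.r_[True, bins[1:] != bins[:-1]])
    # Maximum of values within each run
    maxima = np.maximum.reduceat(values, starts)
    # Length of each run, used to broadcast the maxima back out
    counts = np.diff(np.r_[starts, len(values)])
    return np.repeat(maxima, counts)


dist_data = [15.625, 46.875, 78.125, 109.375, 140.625,
             171.875, 203.125, 234.375, 265.625, 296.875]
print(binned_max([5, 1, 2, 0, 9, 3, 4, 4, 4, 1], dist_data, 100))
# [5 5 5 9 9 9 4 4 4 4]
```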