How to extract a data point from two NumPy arrays based on two conditions?

I have two numpy arrays, x and y. I want to extract the value of x that is closest to 1 and also has a y value greater than 0.96, and then get the index of that value.

x = [0.5, 0.8, 0.99, 0.8, 0.85, 0.9, 0.91, 1.01, 10, 20]
y = [0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 0.99, 0.99, 0.99, 0.85]

In this case the x value would be 1.01 because it is closest to 1 and has a y value of 0.99.

Ideal result would be:

idx = 7

I know how to find the point nearest to 1 and how to get the index of it but I don't know how to add the second condition.

CodePudding user response：

This code applies the y condition before looking for the closest x value, and it also works when several indexes tie for the closest value.

import numpy as np

x = np.array([0.5, 0.8, 0.99, 0.8, 0.85, 0.9, 0.91, 1.01, 10, 20])
y = np.array([0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 0.99, 0.99, 0.99, 0.85])
# points that satisfy the y condition
mask = y > 0.96
# distance of every x value from 1
first_check = np.abs(x - 1)
# smallest distance among the points that pass the y condition
# (assumes at least one y value is greater than 0.96)
closest = first_check[mask].min()
# indexes (plural in case two or more passing values tie for the closest)
indexes = np.where(mask & (first_check == closest))[0]

print(indexes)

OUTPUT:

[7]
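One caveat worth checking: if the x value nearest to 1 does not itself satisfy the y condition, the order of the two steps matters. Below is a minimal sketch on made-up data that filters on y first and only then looks for the nearest x. The helper name closest_index_where and the x_demo / y_demo values are hypothetical and only for illustration.

```python
import numpy as np

def closest_index_where(x, y, target=1.0, threshold=0.96):
    """Index of the x value nearest to target among points with y > threshold.

    Hypothetical helper for illustration; returns None if no point passes.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    candidates = np.flatnonzero(y > threshold)  # indices that pass the y test
    if candidates.size == 0:
        return None
    # argmin over the surviving points only, then map back to the original index
    best = np.argmin(np.abs(x[candidates] - target))
    return int(candidates[best])

# Made-up data: the x nearest to 1 (index 0) has y = 0.5 and must be skipped.
x_demo = [1.0, 0.9, 1.2, 5.0]
y_demo = [0.5, 0.97, 0.99, 0.98]
print(closest_index_where(x_demo, y_demo))
```

On the demo data this picks index 1, because index 0 fails the y test even though its x value is exactly 1.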

CodePudding user response：

You can use np.argsort(abs(x - 1)) to sort the indices by distance from 1. Then grab the first of those indices whose y value satisfies y > 0.96 using np.where.

import numpy as np

x = np.array([0.5, 0.8, 0.99, 0.8, 0.85, 0.9, 0.91, 1.01, 10, 20])
y = np.array([0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 0.99, 0.99, 0.99, 0.85])

closest_inds = np.argsort(abs(x - 1))
idx = closest_inds[np.where(y[closest_inds] > 0.96)][0]

This would give:

idx = 7

For short arrays (fewer than, say, 10k elements), the above solution can be slow because NumPy does not yet have a find-first function that stops at the first match; this is a long-awaited feature request.

So, in this case, the following loop would be much faster and will give the same result:

for i in closest_inds:
    if y[i] > 0.96:
        idx = i
        break
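If sorting is not needed at all, another option is to mask out the failing points and take a single argmin. This is only a sketch of that idea, not a benchmarked replacement: it sets the distance to infinity wherever y fails the test, so those points can never win. The names dist and found are introduced here for illustration.

```python
import numpy as np

x = np.array([0.5, 0.8, 0.99, 0.8, 0.85, 0.9, 0.91, 1.01, 10, 20])
y = np.array([0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 0.99, 0.99, 0.99, 0.85])

# Distance to 1 where y passes the test, +inf everywhere else
dist = np.where(y > 0.96, np.abs(x - 1), np.inf)
idx = int(np.argmin(dist))

# If no y passes, every entry is inf and argmin returns 0, so check explicitly
found = np.isfinite(dist[idx])
print(idx if found else None)
```

This does one pass over the data instead of a full sort, at the cost of needing the explicit check for the case where nothing passes.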

CodePudding user response：

This works with plain lists or tuples and can be extended to more conditions.

x = [0.5, 0.8, 0.99, 0.8, 0.85, 0.9, 0.91, 1.01, 10, 20]

y = [0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 0.99, 0.99, 0.99, 0.85]


condition1 = 1.0
condition2 = 0.96


def convert(*args):
    """
    Returns a list of tuples generated from multiple lists and tuples
    """
    for x in args:
        if not isinstance(x, list) and not isinstance(x, tuple):
            return []

    size = float("inf")
    
    for x in args:
        size = min(size, len(x)) 
    result = []
    for i in range(size):
        result.append(tuple([x[i] for x in args]))
    print(result)
    return result

result = convert(x, y)

# keep only the pairs whose y value passes condition2,
# then pick the one whose x value is nearest to condition1
closest = min([tpl for tpl in result if tpl[1] > condition2], key=lambda tpl: abs(tpl[0] - condition1))

index = result.index(closest)

print(f'The index of the x value closest to {condition1} with y greater than {condition2} is {index}')

Output

[(0.5, 0.7), (0.8, 0.75), (0.99, 0.8), (0.8, 0.85), (0.85, 0.9), (0.9, 0.95), (0.91, 0.99), (1.01, 0.99), (10, 0.99), (20, 0.85)]
The index of the x value closest to 1.0 with y greater than 0.96 is 7
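For plain lists, the same lookup can probably be written more compactly by working with indices directly. This is a minimal sketch, assuming the goal is the index of the x value nearest to 1 among points with y > 0.96. The names target, threshold and candidates are introduced here for illustration.

```python
x = [0.5, 0.8, 0.99, 0.8, 0.85, 0.9, 0.91, 1.01, 10, 20]
y = [0.7, 0.75, 0.8, 0.85, 0.9, 0.95, 0.99, 0.99, 0.99, 0.85]

target = 1.0
threshold = 0.96

# Indices whose y value passes the threshold
candidates = [i for i, y_val in enumerate(y) if y_val > threshold]

# Among those, pick the index whose x value is nearest to the target;
# default=None covers the case where no point passes
index = min(candidates, key=lambda i: abs(x[i] - target), default=None)
print(index)
```

Working with indices avoids the extra result.index(...) lookup, which returns the first matching tuple and could point at the wrong position if the same (x, y) pair appears more than once.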