Home > Mobile >  Get an array of corresponding values in a reference array from very big input array
Get an array of corresponding values in a reference array from very big input array

Time:10-28

I have the following array:

table = np.array([
[1.0, 1.0, 3.0, 5.0],
[1.0, 2.0, 5.0, 3.0],
...
[2.0, 5.0, 2.0, 1.0],
[8.0, 9.0, 7.0, 2.0]])

Let's name the different columns respectively by ['a', 'b', 'm', 'n'].

"table" is my my reference table where I want to extract 'm' and 'n' given 'a' and 'b' contained in a list we will call 'my_list'. In that list, we allow duplicate pairs (a, b).

N.B.: Here list can be referred as array (not in the python sense)

It is easier to do it with for loop. But, for my problem, my list 'my_list' can contain more than 100000 pairs (a, b) so doing it with for loop is not optimal for my work. How can I do it with numpy functions or pandas functions in a few lines (1 to 3 lines)?

An example of what I want: Given the following list

my_list = np.array([
[1.0, 2.0],
[1.0, 2.0],
[8.0, 9.0]])

I want to have the following result:

results = np.array([
[5.0, 3.0],
[5.0, 3.0],
[7.0, 2.0]])

Thank you in advance

Edit 1: equivalence with for loop

Here is the equivalent with for loop (simplest way with for loop without dichotomous search):

result = []
for x in my_list:
    for y in table:
        if (x[0] == y[0]) and (x[1] == y[1]):
            result.append([y[2], y[3]])
        break

print(results)

CodePudding user response:

One possible approach using pandas is to perform inner merge

pd.DataFrame(table).merge(pd.DataFrame(my_list))[[2, 3]].to_numpy()

array([[5., 3.],
       [5., 3.],
       [7., 2.]])
  • Related