I have the following array:
table = np.array([
    [1.0, 1.0, 3.0, 5.0],
    [1.0, 2.0, 5.0, 3.0],
    ...
    [2.0, 5.0, 2.0, 1.0],
    [8.0, 9.0, 7.0, 2.0]])
Let's name the columns ['a', 'b', 'm', 'n'], respectively.
"table" is my reference table, from which I want to extract 'm' and 'n' for each pair (a, b) contained in a list we will call 'my_list'. That list may contain duplicate (a, b) pairs.
N.B.: here "list" can be read as "array" (not a Python list).
It is easy to do this with a for loop, but 'my_list' can contain more than 100,000 pairs (a, b), so a for loop is too slow for my work. How can I do it with numpy or pandas functions in a few lines (1 to 3 lines)?
An example of what I want: Given the following list
my_list = np.array([
    [1.0, 2.0],
    [1.0, 2.0],
    [8.0, 9.0]])
I want to have the following result:
results = np.array([
    [5.0, 3.0],
    [5.0, 3.0],
    [7.0, 2.0]])
Thank you in advance
Edit 1: equivalent with a for loop
Here is the equivalent with a for loop (the simplest approach, without binary search):
result = []
for x in my_list:
    for y in table:
        if (x[0] == y[0]) and (x[1] == y[1]):
            result.append([y[2], y[3]])
            break
print(result)
CodePudding user response:
One possible approach using pandas is to perform an inner merge:
pd.DataFrame(table).merge(pd.DataFrame(my_list))[[2, 3]].to_numpy()
array([[5., 3.],
       [5., 3.],
       [7., 2.]])
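One thing to watch: with table on the left of the merge, the output rows follow table's order, which happens to match my_list here. If the result must line up with my_list row by row even when my_list is in a different order, a variant (a sketch with hypothetical variable names df_table/df_list, not part of the original answer) is to put my_list on the left and use how='left', since a left merge preserves the left frame's row order:

import numpy as np
import pandas as pd

# table and my_list as defined in the question
# Name the columns as in the question: 'a', 'b' are the keys, 'm', 'n' the values
df_table = pd.DataFrame(table, columns=['a', 'b', 'm', 'n'])
df_list = pd.DataFrame(my_list, columns=['a', 'b'])

# Left merge: one output row per row of my_list, in the same order
results = df_list.merge(df_table, on=['a', 'b'], how='left')[['m', 'n']].to_numpy()
print(results)
# [[5. 3.]
#  [5. 3.]
#  [7. 2.]]

This assumes every (a, b) pair in my_list exists in table; any missing pair would show up as a NaN row in the result.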