Intersect float rows in two numpy matrices with precision

Time:05-14

example: eps = 0.1

A = [[1.22, 1.33], [1.45, 1.66]]

B = [[1.25, 1.34], [1.77, 1.66]]

Result: [[1.22, 1.33]] (or average value)

CodePudding user response:

I agree with Professor Pantsless' comment and would expand on its use here:

Your arrays look like so:

# array A
1.22 1.33
1.45 1.66

# array B
1.25 1.34
1.77 1.66

Your desired result (per your OP) matches A[0], i.e. 1.22 1.33. This suggests you want the rows of array A for which every element differs by less than eps from the corresponding element in the same-indexed row of array B:

# array C
np.abs(A[0] - B[0]) < eps # --> True  True
np.abs(A[1] - B[1]) < eps # --> False True

This is achievable with a boolean index:

>>> A[(np.abs(A-B) < eps).all(axis = 1)]

Breaking down this line:

>>> np.abs(A-B)
0.03 0.01
0.32 0.00
>>> np.abs(A-B) < eps
True  True
False True
# notice this matches the comments above
>>> (np.abs(A-B) < eps).all(axis = 1)
True False

The expression np.abs(A-B) < eps returns a new array of booleans, and .all(axis = 1) reduces along the second axis (axis 1, i.e. across the columns of each row) to check whether all elements in that row are True. Since the first row of array C is all True, it returns True; the second row is False True, so it returns False. What you're left with is a boolean array of shape (N,), one entry per row.

So now the final breakdown is:

>>> A[(np.abs(A-B) < eps).all(axis = 1)]
1.22 1.33
# since this indexes A with the boolean mask [True, False]
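
For reference, here is a minimal self-contained sketch of the steps above, using the arrays from the question. The names row_mask, matched and averaged are just illustrative, and the averaging step is only one possible reading of the "(or average value)" remark in the question:

```python
import numpy as np

eps = 0.1
A = np.array([[1.22, 1.33], [1.45, 1.66]])
B = np.array([[1.25, 1.34], [1.77, 1.66]])

# True for each row of A whose every element is within eps
# of the element at the same position in B
row_mask = (np.abs(A - B) < eps).all(axis=1)

# keep only the matching rows of A
matched = A[row_mask]

# assumption: "average value" means the mean of the matched rows of A and B
averaged = (A[row_mask] + B[row_mask]) / 2

print(row_mask)  # [ True False]
print(matched)   # [[1.22 1.33]]
```

Note that this only compares row i of A with row i of B, so both arrays need the same shape.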

W.r.t. your latest comment, you cannot do that because of NumPy's broadcasting rules. Two shapes are compatible only if, comparing dimensions from the last one backwards, each pair is either equal or one of them is 1. So you cannot subtract a 3x2 array from a 2x2 array directly: the last dimensions match (2 and 2), but the first ones (2 and 3) do not.
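
A small sketch of that shape check, assuming zero-filled placeholder arrays (only the shapes matter here). It also shows how inserting length-1 axes makes the two shapes compatible; broadcast_ok and diff are illustrative names:

```python
import numpy as np

A = np.zeros((2, 2))
B = np.zeros((3, 2))

# shapes (2, 2) and (3, 2) are incompatible: 2 != 3 and neither is 1
try:
    A - B
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

print(broadcast_ok)  # False

# adding length-1 axes gives (2, 1, 2) and (1, 3, 2), which broadcast to (2, 3, 2)
diff = A[:, np.newaxis, :] - B[np.newaxis, :, :]
print(diff.shape)  # (2, 3, 2)
```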

CodePudding user response:

If you are looking to filter to the rows in A which are close to any row in B, you can use np.broadcast_to and np.tile to do an exhaustive check:

import numpy as np

eps = .1
A = np.array([[1.22, 1.33], [1.45, 1.66]])
B = np.array([[1.25, 1.34], [1.77, 1.66], [1.44, 1.67]])


# broadcast A based on the shape of B
A_ext = np.broadcast_to(A, (B.shape[0],) + A.shape)

# tile B and reshape, this will allow comparison of all rows in A to all rows in B
B_ext = np.tile(B, A.shape[0]).reshape(A_ext.shape)

# create the boolean array
A_bool = np.abs(A_ext - B_ext) < eps

# reduce array to match the number of rows in A
# .all(axis = -1) gives an array of shape (len(B), len(A)) marking which rows of A are close to each row of B
# .any(axis = 0) marks whether a row of A is close to any of the rows in B
A_mask = A_bool.all(axis = -1).any(axis = 0)

# final result
A[A_mask]
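
As a possible alternative, the same exhaustive check can be sketched with np.newaxis instead of np.broadcast_to and np.tile. The function name rows_close_to_any is hypothetical, not part of NumPy, and the sketch assumes A and B have the same number of columns:

```python
import numpy as np

def rows_close_to_any(A, B, eps):
    """Boolean mask over the rows of A: True if the row is within eps
    (element-wise) of at least one row of B. Hypothetical helper."""
    # shape (len(A), len(B), ncols): every row of A against every row of B
    diff = np.abs(A[:, np.newaxis, :] - B[np.newaxis, :, :])
    # all columns must be close, for at least one row of B
    return (diff < eps).all(axis=-1).any(axis=1)

eps = .1
A = np.array([[1.22, 1.33], [1.45, 1.66]])
B = np.array([[1.25, 1.34], [1.77, 1.66], [1.44, 1.67]])

mask = rows_close_to_any(A, B, eps)
print(mask)     # [ True  True]
print(A[mask])
```

With this B, both rows of A are kept: A[0] is close to B[0] and A[1] is close to B[2]. The intermediate array grows with len(A) * len(B), so this is best suited to small inputs.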