I have the lists xA, yA, zA and the lists xB, yB, zB in Matlab. The contain the the x,y and z coordinates of points of type A and type B. There may be different numbers of type A and type B points.
I would like to calculate a matrix containing the Euclidean distances between points of type A and type B. Of course, I only need to calculate one half of the matrix, since the other half contains duplicate data.
What is the most efficient way to do that?
When I'm done with that, I want to find the points of type B that are closest to one point of type A. How do I then find the coordinates the closest, second closest, third closest and so on points of type B?
CodePudding user response:
Given a matrix A
of size [N,3]
and a matrix B
of size [M,3]
, you can use the pdist2 function to get a matrix of size [M,N]
containing all the pairwise distances.
If you want to order the points from B
relative to their distance from the r
th point in A
then you can sort the r
th row of the pairwise distance matrix.
% generate some example data
N = 4
M = 7
A = randn(N,3)
B = randn(M,3)
% compute N x M matrix containing pairwise distances
D = pdist2(A, B, 'euclidean')
% sort points in B by their distance to the rth point in A
r = 3
[~, b_idx] = sort(D(r,:))
Then b_idx
will contain the indices of the points in B
sorted w.r.t. their distance to the r
th in A
. So B(b_idx,:)
would be a [M,3]
matrix of the points in B
sorted w.r.t. their distance from the r
th point in A
.
If you want to do this for all r
you could use
[~, B_idx] = sort(D, 1)
to sort all the rows of D
at the same time. Then the r
th row of B_idx
will contain b_idx
.
If our only objective is to find the k closest points in B
for each point in A
then we would generally not explicitly compute the pairwise distance. This is because space partitioning data-structures like KD-trees can be used to improve the efficiency of searching without explicitly computing all the pair-wise distances.
Matlab provides a knnsearch
function that implements these optimizations, so if for each point in B
you need only the k
closest points, then you should use that function. For example
k = 2
B_kidx = knnsearch(B, A, 'K', k)
would return the first two columns of B_idx
, i.e. for each point in A
the closest two points in B
. Also, note that this is only going to be more efficient than the pdist2
method when k
is relatively small.