Given a 2-d numpy array, X, of shape [m,m], I wish to apply a function and obtain a new 2-d numpy matrix P, also of shape [m,m], whose [i,j]th element is obtained as follows:
P[i][j] = exp (-|| X[i] - X[j] ||**2)
where ||.|| represents the standard L-2 norm of a vector. Is there any way faster than a simple nested for loop?
For example,
X = [[1,1,1],[2,3,4],[5,6,7]]
Then, at diagonal entries the rows accessed will be the same and the norm/magnitude of their difference will be 0. Hence,
P[0][0] = P[1][1] = P[2][2] = exp (0) = 1.0
Also,
P[0][1] = exp (- || X[0] - X[1] ||**2) = exp (- || [-1,-2,-3] || ** 2) = exp (-14)
etc.
The most trivial solution using a nested for loop is as follows:
import numpy as np
X = np.array([[1,2,3],[4,5,6],[7,8,9]])
P = np.zeros (shape=[len(X),len(X)])
for i in range (len(X)):
    for j in range (len(X)):
        P[i][j] = np.exp (- np.linalg.norm (X[i]-X[j])**2)
print (P)
This prints:
[[1.00000000e+00 1.87952882e-12 1.24794646e-47]
 [1.87952882e-12 1.00000000e+00 1.87952882e-12]
 [1.24794646e-47 1.87952882e-12 1.00000000e+00]]
Here, m is of the order of 5e4.
CodePudding user response:
In [143]: X = np.array([[1,2,3],[4,5,6],[7,8,9]])
     ...: P = np.zeros (shape=[len(X),len(X)])
     ...: for i in range (len(X)):
     ...:     for j in range (len(X)):
     ...:         P[i][j] = np.exp (- np.linalg.norm (X[i]-X[j]))
     ...:
In [144]: P
Out[144]:
array([[1.00000000e+00, 5.53783071e-03, 3.06675690e-05],
       [5.53783071e-03, 1.00000000e+00, 5.53783071e-03],
       [3.06675690e-05, 5.53783071e-03, 1.00000000e+00]])
A no-loop version:
In [145]: np.exp(-np.sqrt(((X[:,None,:]-X[None,:,:])**2).sum(axis=2)))
Out[145]:
array([[1.00000000e+00, 5.53783071e-03, 3.06675690e-05],
       [5.53783071e-03, 1.00000000e+00, 5.53783071e-03],
       [3.06675690e-05, 5.53783071e-03, 1.00000000e+00]])
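I had to drop your **2 to match these values: they are exp(-||X[i] - X[j]||), without the square. To get the squared form from your formula, leave out the np.sqrt.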
With the norm applied to the 3d difference array:
In [148]: np.exp(-np.linalg.norm(X[:,None,:]-X[None,:,:], axis=2))
Out[148]:
array([[1.00000000e+00, 5.53783071e-03, 3.06675690e-05],
       [5.53783071e-03, 1.00000000e+00, 5.53783071e-03],
       [3.06675690e-05, 5.53783071e-03, 1.00000000e+00]])
scipy.spatial.distance has a cdist function (and scikit-learn has pairwise_distances) that may handle this sort of thing faster.
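A minimal sketch of the cdist route, assuming SciPy is installed. The names P_squared and P_plain are just illustrative; whether this is actually faster for your data is something to measure rather than assume.

```python
import numpy as np
from scipy.spatial.distance import cdist

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)

# 'sqeuclidean' returns squared L2 distances directly,
# which corresponds to the **2 form in the question.
P_squared = np.exp(-cdist(X, X, metric="sqeuclidean"))

# 'euclidean' (the default metric) gives the plain norm,
# which corresponds to the outputs shown in this answer.
P_plain = np.exp(-cdist(X, X, metric="euclidean"))
```

Both results have shape (m, m), so memory use is dominated by the output itself rather than by a 3d intermediate.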
CodePudding user response:
This would be easier if you provided a sample array. You can create an array Q of size [m, m, m] where Q[i, j, k] = X[i, k] - X[j, k] by using
X[:,None,:] - X[None,:,:]
At this point, you're performing simple numpy operations against the third axis.
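As a rough sketch of what those operations could look like, the first function below squares and sums Q along axis 2. The second function is not from the answer above: it is an alternative that uses the identity ||a - b||**2 = ||a||**2 + ||b||**2 - 2 a.b, so that no 3d intermediate is built. Both function names are made up for illustration.

```python
import numpy as np

def gaussian_pairwise_broadcast(X):
    # Q[i, j, k] = X[i, k] - X[j, k]; this is a 3d intermediate array
    Q = X[:, None, :] - X[None, :, :]
    # sum of squares along the third axis is the squared L2 norm
    return np.exp(-(Q ** 2).sum(axis=2))

def gaussian_pairwise_gram(X):
    # squared norm of each row, shape (m,)
    sq = np.einsum("ij,ij->i", X, X)
    # ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, only 2d arrays are created
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # rounding can produce tiny negative values; clip them to zero
    np.maximum(d2, 0.0, out=d2)
    return np.exp(-d2)

X = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=float)
P_broadcast = gaussian_pairwise_broadcast(X)
P_gram = gaussian_pairwise_gram(X)
```

With m of the order of 5e4, even the [m, m] result holds 2.5e9 float64 values (about 20 GB), so in practice you would probably need to compute P in blocks of rows whichever variant you pick.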