Creating a Kernel matrix without for-loops in Python

I know there are other posts asking similar questions, but I couldn't find one that answers my specific question. I have the code below:

def kernel_function(self, x1, x2):
    h = 0.5
    return np.exp(-(np.linalg.norm(x2 - x1)/h)**2)

for i, x1 in enumerate(train_x):
    for j, x2 in enumerate(train_x):
        K[i,j] = self.kernel_function(x1, x2)

where x1 and x2 are arrays of shape (2,). I need to vectorize it for performance. I looked at np.fromfunction and np.outer, but they don't seem to be what I am looking for.

Thank you in advance. Sorry if there is already an answer somewhere!

CodePudding user response:

Assuming train_x has the following format:

>>> train_x = np.array(((-.2, -.1), (0, .1), (.2, 0), (.1, -.1)))

Running your code on it gives:

>>> np.set_printoptions(precision=2)
>>> K
[[1.   0.73 0.51 0.7 ]
 [0.73 1.   0.82 0.82]
 [0.51 0.82 1.   0.92]
 [0.7  0.82 0.92 1.  ]]

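You can reshape train_x so that the coordinates lie on the first axis and the points lie on either the second axis (columns) or the third axis (rows):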

>>> train_x_cols = train_x.T.reshape(2, -1, 1)
>>> train_x_rows = train_x.T.reshape(2, 1, -1)

Thanks to broadcasting, subtracting them gives the difference for every pair of points, with one 4x4 slice per coordinate:

>>> train_x_rows - train_x_cols
[[[ 0.   0.2  0.4  0.3]
  [-0.2  0.   0.2  0.1]
  [-0.4 -0.2  0.  -0.1]
  [-0.3 -0.1  0.1  0. ]]

 [[ 0.   0.2  0.1  0. ]
  [-0.2  0.  -0.1 -0.2]
  [-0.1  0.1  0.  -0.1]
  [ 0.   0.2  0.1  0. ]]]
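
To see why the subtraction yields every pair, it can help to look at the shapes involved. The snippet below is only a sketch that reuses the sample train_x from this answer; the indices i and j are arbitrary picks for illustration.

```python
import numpy as np

train_x = np.array(((-.2, -.1), (0, .1), (.2, 0), (.1, -.1)))

# Axis 0 holds the coordinate; the points sit on axis 1 (cols) or axis 2 (rows).
train_x_cols = train_x.T.reshape(2, -1, 1)
train_x_rows = train_x.T.reshape(2, 1, -1)
print(train_x_cols.shape, train_x_rows.shape)  # (2, 4, 1) (2, 1, 4)

# The size-1 axes are stretched to length 4, giving one difference per pair.
diff = train_x_rows - train_x_cols
print(diff.shape)  # (2, 4, 4)

# diff[:, i, j] is the vector train_x[j] - train_x[i].
i, j = 0, 2
print(np.allclose(diff[:, i, j], train_x[j] - train_x[i]))  # True
```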

Then rewrite kernel_function() so that the norm is taken along the first axis (the coordinates) only:

def kernel_function(x1, x2):
    h = 0.5
    return np.exp(-(np.linalg.norm(x2 - x1, axis=0) / h) ** 2)
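
The axis=0 argument matters here: without it, np.linalg.norm would flatten the whole 3-D array and return a single number. A small sketch with a made-up stack of differences (the values are hypothetical, chosen so the distances are easy to check by hand):

```python
import numpy as np

# Hypothetical stack of differences: axis 0 is the coordinate (x, y),
# axes 1 and 2 index the pair of points.
diff = np.array([[[3.0, 0.0], [6.0, 1.0]],
                 [[4.0, 0.0], [8.0, 0.0]]])  # shape (2, 2, 2)

# axis=0 collapses the coordinate axis, leaving one distance per pair.
dist = np.linalg.norm(diff, axis=0)
print(dist.shape)  # (2, 2)
print(dist)        # distances 5, 0 in the first row and 10, 1 in the second

# With no axis, the array is flattened and a single scalar comes back.
total = np.linalg.norm(diff)
print(np.ndim(total))  # 0
```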

Calling it on the reshaped arrays gives the same matrix as the loop:

>>> kernel_function(train_x_cols, train_x_rows)
[[1.   0.73 0.51 0.7 ]
 [0.73 1.   0.82 0.82]
 [0.51 0.82 1.   0.92]
 [0.7  0.82 0.92 1.  ]]
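
Putting the pieces together, here is a minimal self-contained sketch that wraps the broadcasting approach in one function and checks it against the original double loop. The name kernel_matrix and its h parameter are my own choices for illustration, not part of the question's class, and it puts the coordinates on the last axis instead of the first, which is equivalent.

```python
import numpy as np

def kernel_matrix(points, h=0.5):
    """Gaussian kernel matrix for an (n, d) array of points, without Python loops."""
    # (n, 1, d) - (1, n, d) broadcasts to (n, n, d): one difference per pair.
    diff = points[:, np.newaxis, :] - points[np.newaxis, :, :]
    # Squared Euclidean distance, summed over the coordinate axis.
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / h ** 2)

train_x = np.array(((-.2, -.1), (0, .1), (.2, 0), (.1, -.1)))
K_fast = kernel_matrix(train_x)

# Reference result from the original double loop.
K_loop = np.empty((len(train_x), len(train_x)))
for i, x1 in enumerate(train_x):
    for j, x2 in enumerate(train_x):
        K_loop[i, j] = np.exp(-(np.linalg.norm(x2 - x1) / 0.5) ** 2)

print(np.allclose(K_fast, K_loop))  # True
```

Note that the intermediate array has shape (n, n, d), so memory grows quadratically with the number of points. If that becomes a problem, scipy.spatial.distance.cdist with metric='sqeuclidean' is one option worth trying for the distance step.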