How to calculate the cosine similarity between a list and a list of lists?

Time:06-23

I have the following code:

import numpy as np
from numpy import dot
from numpy.linalg import norm

a = np.array([3, 45, 7, 2])
b = np.array([[2, 54, 13, 15], [2, 54, 13, 14], [2, 54, 13, 13]])

cos_sim = dot(a, b)/(norm(a)*norm(b))

but it fails. The idea is to calculate the cosine similarity between a and each row of b, and get a list like this:

[0.97..., 0.97..., 0.97...]

Can I do this concisely, or do I need to use a for loop?

CodePudding user response:

You need to change a couple of things. First, when using dot() with a 2-D second argument, the last dimension of the first array needs to match the second-to-last dimension of the second array. Here a has shape (4,) and b has shape (3, 4), so the shapes do not line up; what you probably want is dot(a, b.T). Second, norm(b) computes a single norm of b as a whole matrix (the Frobenius norm by default); to compute the norm of each row of b instead, you need norm(b, axis=1). Putting those together, you probably should use dot(a, b.T) / (norm(a) * norm(b, axis=1)), which gives one cosine similarity per row of b without a for loop.
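
Here is a minimal sketch of that fix applied to the arrays from the question. The names cos_sim_loop and the explicit loop are only an illustrative cross-check added here, not something the original code needs.

```python
import numpy as np
from numpy.linalg import norm

a = np.array([3, 45, 7, 2])
b = np.array([[2, 54, 13, 15], [2, 54, 13, 14], [2, 54, 13, 13]])

# b.T has shape (4, 3), so dot(a, b.T) gives one dot product per row of b.
# norm(b, axis=1) gives one Euclidean norm per row of b, shape (3,).
cos_sim = np.dot(a, b.T) / (norm(a) * norm(b, axis=1))

# Cross-check (illustrative only): the same values from an explicit for loop.
cos_sim_loop = np.array([np.dot(a, row) / (norm(a) * norm(row)) for row in b])

print(cos_sim)                             # three values, each roughly 0.97
print(np.allclose(cos_sim, cos_sim_loop))  # True
```

If every row of b has a nonzero norm, as it does here, the division is safe; a row of all zeros would produce nan and would need separate handling.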
