I have the following code:
import numpy as np
from numpy import dot
from numpy.linalg import norm
a = np.array([3, 45, 7, 2])
b = np.array([[2, 54, 13, 15], [2, 54, 13, 14], [2, 54, 13, 13]])
cos_sim = dot(a, b)/(norm(a)*norm(b))
but it fails. The idea is to calculate the cosine similarity between a and each row of b, and get a list like this:
[0.97..., 0.97..., 0.97...]
Can I do this concisely, or do I need to use a for loop?
CodePudding user response:
You need to change a couple of things. First, when using dot(), the last dimension of the first array needs to match the second-to-last dimension of the second array. Here a has shape (4,) and b has shape (3, 4), so what you probably want is dot(a, b.T). Second, to compute the norm of each row of b instead of the norm of b as a whole matrix (which is a single scalar, the Frobenius norm), you need norm(b, axis=1). Putting those together, you probably should use dot(a, b.T) / (norm(a) * norm(b, axis=1)). No for loop is needed.
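For reference, here is a minimal runnable sketch of that expression using the arrays from the question. The loop_sim variable is a name introduced here purely for illustration; it recomputes the same values with an explicit loop so the two approaches can be compared.

```python
import numpy as np
from numpy.linalg import norm

a = np.array([3, 45, 7, 2])
b = np.array([[2, 54, 13, 15], [2, 54, 13, 14], [2, 54, 13, 13]])

# b.T has shape (4, 3), so dot(a, b.T) gives one dot product per row of b.
# norm(b, axis=1) gives one Euclidean norm per row of b.
cos_sim = np.dot(a, b.T) / (norm(a) * norm(b, axis=1))

# Illustrative cross-check: the same computation with an explicit loop.
loop_sim = np.array([np.dot(a, row) / (norm(a) * norm(row)) for row in b])

print(cos_sim)
print(np.allclose(cos_sim, loop_sim))
```

The result is a 1-D array with one similarity per row of b. If you need a plain Python list, calling cos_sim.tolist() should do it.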