I know this is a very basic question, but please forgive me. I have a Python script that calculates the cosine similarity of sentences. The script returns a result like this: [[0.72894156 0.96235985 0.61194754]]. I want to store these three values individually in an array or list so that I can find the minimum and maximum values. When I store them in an array, they are stored together as a single value. Here is the script:
sentence_embeddings = model.encode(sentences)
sentence_embeddings.shape
result = cosine_similarity(
    [sentence_embeddings[0]],
    sentence_embeddings[1:]
)
print(result)
Your help is much appreciated!
CodePudding user response:
To clarify, OP is asking for the result [[0.72894156, 0.96235985, 0.61194754]] of shape (1, 3) to become [[0.72894156], [0.96235985], [0.61194754]] of shape (3, 1).
As mentioned in the comments, we can use either .reshape or, as a more generalizable approach, the transpose attribute .T: result.reshape(-1, 1) and result.T both return an array of shape (3, 1).
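As a minimal sketch of the two options, assuming result is the (1, 3) NumPy array from the question (hard-coded here so the snippet runs without the model), the following checks that both calls give shape (3, 1). For the stated goal of finding the minimum and maximum, flattening with ravel() may be the simpler route. The variable names column, transposed, scores, lowest, highest, and best_index are illustrative, not from the original script.

```python
import numpy as np

# Stand-in for the (1, 3) array that cosine_similarity returned in the question.
result = np.array([[0.72894156, 0.96235985, 0.61194754]])

# Both of these turn the single row into a column of shape (3, 1).
column = result.reshape(-1, 1)
transposed = result.T

# Flattening to 1-D and converting to a list gives three separate Python floats.
scores = result.ravel().tolist()

# With individual values, the built-in min and max work directly.
lowest = min(scores)
highest = max(scores)

# argmax on the array gives the position of the most similar sentence
# (index into sentence_embeddings[1:], so add 1 for the original list).
best_index = int(result.argmax())

print(column.shape, transposed.shape)
print(scores, lowest, highest, best_index)
```

Note that result.min() and result.max() also work on the 2-D array as it is, so reshaping is only needed if the values have to be stored individually.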
CodePudding user response:
This may help. I came across something similar and treated the output like an array. To get the specific score based on the texts I compared I did the following:
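import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# TextA and TextB are the two strings being compared (defined elsewhere)
cos_text = [TextA, TextB]
cv = CountVectorizer()
count_matrix = cv.fit_transform(cos_text)

# word matrix: one row per text, one column per vocabulary term
doc_term_matrix = count_matrix.todense()
df_matrix = pd.DataFrame(doc_term_matrix,
                         columns=cv.get_feature_names_out())

# individual scores; the similarity matrix is symmetric, so both values are equal
score = cosine_similarity(df_matrix)
cs_scoreA = score[0, 1]
cs_scoreB = score[1, 0]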
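To illustrate why score[0,1] and score[1,0] hold the same number, here is a rough standard-library sketch of the same calculation on word counts. It is not scikit-learn's implementation: the helper cosine_from_counts is hypothetical, the sample sentences are made up, and it assumes simple lowercase whitespace tokenization, which differs from CountVectorizer's default tokenizer.

```python
import math
from collections import Counter


def cosine_from_counts(text_a: str, text_b: str) -> float:
    """Cosine similarity of two texts based on raw word counts."""
    counts_a = Counter(text_a.lower().split())
    counts_b = Counter(text_b.lower().split())

    # Only words present in both texts contribute to the dot product.
    shared = counts_a.keys() & counts_b.keys()
    dot = sum(counts_a[w] * counts_b[w] for w in shared)

    # Euclidean length of each count vector.
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))

    # An empty text has no direction, so treat its similarity as 0.
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Made-up sample texts, used only for illustration.
text_a = "the cat sat on the mat"
text_b = "the cat lay on the rug"

score_ab = cosine_from_counts(text_a, text_b)
score_ba = cosine_from_counts(text_b, text_a)
print(score_ab, score_ba)
```

Because the dot product and the product of the norms do not depend on argument order, the two calls return the same value, which matches the symmetry of the matrix returned by cosine_similarity.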