Can the f1, precision, accuracy and recall all have the same values?

I've been trying to train a support vector machine using scikit-learn, and after computing several metrics, all the scores come out with the same value.

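from sklearn import svm
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

# df is an existing DataFrame with a "Text" column and a "Mood" label column
x = df["Text"]
y = df["Mood"]

test_size = 5122

# Everything except the last 5122 rows is used for testing
x_test = x[:-test_size]
y_test = y[:-test_size]

# The last 5122 rows are used for training
x_train = x[-test_size:]
y_train = y[-test_size:]

# Build TF-IDF features for the training text
count_vect = CountVectorizer()
X_train_counts = count_vect.fit_transform(x_train)
tfidf_transformer = TfidfTransformer()
X_train_tfidf = tfidf_transformer.fit_transform(X_train_counts)
# The test text is only converted to raw counts here (tfidf_transformer is not applied)
x_test = count_vect.transform(x_test).toarray()

# degree and gamma have no effect with a linear kernel
SVM = svm.SVC(C=1.0, kernel='linear', degree=3, gamma='auto')
SVM.fit(X_train_tfidf, y_train)
predictions_SVM = SVM.predict(x_test)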

print('Accuracy score is: ', accuracy_score(y_test, predictions_SVM))
print('F1 score is: ', f1_score(y_test, predictions_SVM, average='micro'))
print('Precision score is: ', precision_score(y_test, predictions_SVM, average='micro'))
print('Recall score is: ', recall_score(y_test, predictions_SVM, average='micro'))

Output:

Accuracy score is:  0.9687622022647403
F1 score is:  0.9687622022647403
Precision score is:  0.9687622022647403
Recall score is:  0.9687622022647403

Is this normal or have I made an error somewhere?

Answer:

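Yes, this is normal. According to the documentation for these scores, they should all come out the same when you use average='micro' and each sample has exactly one label, as it does here.

With micro averaging, the true positives, false positives and false negatives are pooled over all classes. Every wrong prediction is a false positive for the predicted class and a false negative for the true class, so the pooled false positives equal the pooled false negatives. Precision, recall and F1 therefore all reduce to the fraction of samples that get the correct label, which is exactly the accuracy.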
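To make that concrete, here is a minimal sketch in plain Python (no scikit-learn) of how micro-averaged scores can be computed from pooled counts. The helper name `micro_scores` and the toy labels are assumptions made for illustration, and the sketch assumes a non-empty, single-label problem; it is not how scikit-learn implements these functions internally.

```python
from collections import Counter


def micro_scores(y_true, y_pred):
    """Hypothetical helper: return (accuracy, precision, recall, f1), micro-averaged."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # the predicted class gains a false positive
            fn[t] += 1  # the true class gains a false negative

    # Pool the counts over all classes (this is what "micro" means)
    total_tp = sum(tp.values())
    total_fp = sum(fp.values())
    total_fn = sum(fn.values())

    accuracy = total_tp / len(y_true)
    precision = total_tp / (total_tp + total_fp)
    recall = total_tp / (total_tp + total_fn)
    # Guard against 0/0 when nothing was predicted correctly
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return accuracy, precision, recall, f1


# Toy labels, chosen only for illustration
y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

accuracy, precision, recall, f1 = micro_scores(y_true, y_pred)
print(accuracy, precision, recall, f1)
```

Because each mistake adds one to both the pooled false positives and the pooled false negatives, the two denominators are always equal to the number of samples, and all four values here come out as one third.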

See the examples:

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.accuracy_score.html

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.f1_score.html

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_score.html

https://scikit-learn.org/stable/modules/generated/sklearn.metrics.recall_score.html

In fact, the last three pages all use the same example data, and with average='micro' each of them reports the same score.
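If you want the scores to tell you different things, macro averaging is one option: it computes each metric per class and then takes the unweighted mean, so precision, recall and F1 generally differ. Below is a rough plain-Python sketch of that idea, which corresponds to passing average='macro' in scikit-learn. The helper name `macro_scores` is hypothetical, the labels are the toy data from those documentation examples, and treating an undefined ratio as 0 is an assumption of this sketch.

```python
def macro_scores(y_true, y_pred):
    """Hypothetical helper: return (precision, recall, f1), macro-averaged."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for c in labels:
        # Per-class counts, treating class c as "positive"
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # Assumption: an undefined ratio (0/0) is scored as 0
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * prec * rec / (prec + rec) if prec + rec else 0.0
        precisions.append(prec)
        recalls.append(rec)
        f1s.append(f1)
    n = len(labels)
    # Unweighted mean over classes (this is what "macro" means)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n


y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

macro_precision, macro_recall, macro_f1 = macro_scores(y_true, y_pred)
print(macro_precision, macro_recall, macro_f1)
```

On this data the three macro scores are no longer equal (roughly 0.22, 0.33 and 0.27), whereas the micro-averaged versions are all 0.33. Which averaging to report depends on whether every sample or every class should carry equal weight.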