I'm building an ML model. I would like to run the prediction step several times and then calculate the mean of the accuracy scores.
My code looks like this:
predictions = test_df[['histor', 'philosoph', 'cook', 'roman', 'bibl']].apply(lambda x: baseline.predict(*x), axis=1)
y_true = test_df["label"].values
print("Accuracy: ", accuracy_score(y_true, predictions))
Is there a way to loop the predictions? For example, with n=10 the predictions would run 10 times, the accuracy of each run would be printed, and the mean of all of them would be printed at the end.
Hope this makes sense.
CodePudding user response:
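I would use scikit-learn's cross_val_score for this. Note that it refits the model on each of the 10 folds rather than repeating the same predictions, so baseline has to implement the scikit-learn estimator interface: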
from sklearn.model_selection import cross_val_score
X = test_df[['histor', 'philosoph', 'cook', 'roman', 'bibl']]
y = test_df["label"].values
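# One accuracy score per fold, returned as a NumPy array
scores = cross_val_score(baseline, X, y, cv=10)
print("Accuracy per fold: ", scores)
print("Mean accuracy: ", scores.mean())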
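To make clear what cv=10 means, here is a rough pure-Python sketch of k-fold cross-validation. It is not scikit-learn's implementation: MajorityClassifier, k_fold_accuracies and the toy data are made up for illustration, and the strided fold assignment is a simplification.

```python
from collections import Counter
from statistics import mean


class MajorityClassifier:
    """Hypothetical stand-in model: always predicts the most common training label."""

    def fit(self, X, y):
        self.label_ = Counter(y).most_common(1)[0][0]
        return self

    def predict(self, X):
        return [self.label_ for _ in X]


def k_fold_accuracies(model_factory, X, y, k):
    """Fit a fresh model on k-1 folds and score it on the held-out fold, k times."""
    n = len(X)
    scores = []
    for fold in range(k):
        # Simplified split: every k-th sample, offset by `fold`, is held out.
        test_idx = set(range(fold, n, k))
        X_train = [X[i] for i in range(n) if i not in test_idx]
        y_train = [y[i] for i in range(n) if i not in test_idx]
        X_test = [X[i] for i in sorted(test_idx)]
        y_test = [y[i] for i in sorted(test_idx)]

        # The model is refitted from scratch for every fold.
        model = model_factory().fit(X_train, y_train)
        y_pred = model.predict(X_test)
        correct = sum(p == t for p, t in zip(y_pred, y_test))
        scores.append(correct / len(y_test))
    return scores


# Toy data (an assumption for this sketch): 14 samples of class 0, 6 of class 1.
X = [[i] for i in range(20)]
y = [0] * 14 + [1] * 6

scores = k_fold_accuracies(MajorityClassifier, X, y, k=5)
for fold, score in enumerate(scores, start=1):
    print(f"Fold {fold} accuracy: {score:.2f}")
print(f"Mean accuracy: {mean(scores):.2f}")
```

The scores differ between folds because each fold trains and tests on different rows. That is not the same as calling predict repeatedly on a single fixed test set.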
CodePudding user response:
You can store the accuracy scores in an array and then use it to calculate the mean accuracy at the end:
import numpy as np
from sklearn.metrics import accuracy_score

n = 10
accuracies = np.zeros(n)  # one slot per run
for i in range(n):
    predictions = test_df[['histor', 'philosoph', 'cook', 'roman', 'bibl']].apply(lambda x: baseline.predict(*x), axis=1)
    accuracy = accuracy_score(y_true, predictions)
    accuracies[i] = accuracy
    print("Run ", i + 1, " Accuracy: ", accuracy)
mean_accuracy = np.mean(accuracies)
print("Mean Accuracy: ", mean_accuracy)
Or, with a plain Python list instead of a NumPy array:
n = 10
accuracies = []
for i in range(n):
    predictions = test_df[['histor', 'philosoph', 'cook', 'roman', 'bibl']].apply(lambda x: baseline.predict(*x), axis=1)
    accuracy = accuracy_score(y_true, predictions)
    accuracies.append(accuracy)
    print("Run ", i + 1, " Accuracy: ", accuracy)
mean_accuracy = sum(accuracies) / n
print("Mean Accuracy: ", mean_accuracy)
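One caveat about both loops: the runs only differ if the prediction step itself involves randomness. A deterministic model returns the same predictions, and so the same accuracy, on every run. The sketch below wraps the loop in a reusable function and shows the difference. It uses only the standard library; repeat_evaluation, noisy_predict and the toy rows are hypothetical names and data, not part of the original code, and the small accuracy helper stands in for accuracy_score.

```python
import random
from statistics import mean


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def repeat_evaluation(predict_fn, rows, y_true, n_runs):
    """Predict every row n_runs times; print and return one accuracy per run."""
    accuracies = []
    for run in range(1, n_runs + 1):
        # Mirrors baseline.predict(*x): each row is unpacked into arguments.
        predictions = [predict_fn(*row) for row in rows]
        score = accuracy(y_true, predictions)
        accuracies.append(score)
        print(f"Run {run} accuracy: {score:.3f}")
    print(f"Mean accuracy: {mean(accuracies):.3f}")
    return accuracies


# Hypothetical toy data: two features per row, label is 1 when the first is larger.
rows = [(i % 7, i % 5) for i in range(50)]
y_true = [int(a > b) for a, b in rows]

rng = random.Random(0)  # seeded so the sketch is reproducible


def noisy_predict(a, b):
    """Hypothetical stochastic model: flips the correct answer about 20% of the time."""
    guess = int(a > b)
    return guess if rng.random() < 0.8 else 1 - guess


def deterministic_predict(a, b):
    """Hypothetical deterministic model: the same output for the same input."""
    return int(a > b)


noisy_scores = repeat_evaluation(noisy_predict, rows, y_true, n_runs=10)
fixed_scores = repeat_evaluation(deterministic_predict, rows, y_true, n_runs=3)
```

If the accuracies come out identical on every run, as with deterministic_predict here, averaging them adds no information, and evaluating on different data splits is probably what you want instead.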