I have data that is similar to these two arrays:
predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class      = ['A','B','C','A','B','C','A','B','C']
I would like to find the proportion of classes that are predicted correctly once the majority consensus is taken. For example, in my data the predictions for 'A' are 66% correct, for 'B' 66% correct, and for 'C' 33% correct, so the overall accuracy would be 66%: the most common prediction for classes 'A' and 'B' is correct, but the most common prediction for 'C' isn't.
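One possible reading of the calculation described above, as a minimal sketch: group the predictions by true class, take the most common prediction in each group, and report the share of classes where that prediction matches the class. The function name majority_consensus_accuracy is hypothetical, and ties are broken by whichever prediction was seen first, which is an assumption rather than something the question specifies.

```python
from collections import Counter, defaultdict

def majority_consensus_accuracy(true, predicted):
    # For each true class, count how often each label was predicted.
    votes = defaultdict(Counter)
    for t, p in zip(true, predicted):
        votes[t][p] += 1
    # A class counts as correct when its most common prediction equals it.
    # Assumption: ties go to the prediction that was counted first.
    correct = sum(
        1 for t, counts in votes.items()
        if counts.most_common(1)[0][0] == t
    )
    return correct / len(votes)

predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class = ['A','B','C','A','B','C','A','B','C']
print(majority_consensus_accuracy(true_class, predicted_class))
# 0.6666666666666666
```

Here 'A' and 'B' are each predicted correctly two times out of three, while 'C' is most often predicted as 'A', so two of the three classes pass.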
CodePudding user response:
From what you write in the example and comments, it looks like you are looking for the maximum of the correct-to-all prediction ratio for each class.
Here is one way of doing so using collections.Counter:
import collections

def max_model_match(true, predicted):
    # count all occurrences of the classes
    counter_all = collections.Counter(true)
    # initialize the "correct" or "good" counter
    counter_good = collections.Counter()
    # loop through all outcomes
    for (x, y) in zip(true, predicted):
        # if the prediction is correct, increment the counter
        if x == y:
            counter_good[x] += 1
    # find the maximum correct-to-all ratio
    max_good_ratio = 0.0
    for key in counter_all.keys():
        good_ratio = counter_good[key] / counter_all[key]
        if good_ratio > max_good_ratio:
            max_good_ratio = good_ratio
    return max_good_ratio

predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class = ['A','B','C','A','B','C','A','B','C']
max_model_match(true_class, predicted_class)
# 0.6666666666666666
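To inspect the individual ratios behind that maximum, a small sketch along the same lines may help. The helper name per_class_ratios is hypothetical, and it assumes every class of interest appears at least once in the true labels.

```python
from collections import Counter

def per_class_ratios(true, predicted):
    # Total occurrences of each true class.
    total = Counter(true)
    # Occurrences where the prediction matched the true class.
    good = Counter(t for t, p in zip(true, predicted) if t == p)
    # Correct-to-all ratio per class (missing keys in `good` count as 0).
    return {label: good[label] / total[label] for label in total}

predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class = ['A','B','C','A','B','C','A','B','C']
ratios = per_class_ratios(true_class, predicted_class)
print(ratios)
# {'A': 0.6666666666666666, 'B': 0.6666666666666666, 'C': 0.3333333333333333}
print(max(ratios.values()))
# 0.6666666666666666
```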
CodePudding user response:
A simple approach using a defaultdict and max:
from collections import defaultdict

predicted_class = ['A','B','C','A','B','A','B','C','A']
true_class = ['A','B','C','A','B','C','A','B','C']

d = defaultdict(lambda: [0, 0])  # [total, correct]
for p, t in zip(predicted_class, true_class):
    # count every occurrence of the true class
    d[t][0] += 1
    # count the occurrences that were predicted correctly
    if p == t:
        d[t][1] += 1

# max correct-to-total ratio
max(correct / total for total, correct in d.values())
# 0.6666666666666666