How can I reduce lines of code and make it work with other data?-CodePudding

I have multi-class labels like this:

predicted = [1, 0, 2, 1, 1, 0, 1, 2, 1, 2, 2, 0, 0, 0, 0, 2, 2, 1, 1, 1, 0, 1, 0, 1, 2, 1, 1, 2, 0, 0]
actual    = [1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2, 0, 2, 2, 2, 2, 2, 0, 0]

I want to find the precision for each class (0, 1, 2).

This is my code (y_pred holds the predicted labels and y_test the actual ones):

TP_0 = 0
TP_1 = 0
TP_2 = 0
FP_0 = 0
FP_1 = 0
FP_2 = 0

for i in range(len(y_pred)):
    if y_pred[i] == y_test[i]:
        if y_pred[i] == 0:
            TP_0 += 1
        elif y_pred[i] == 1:
            TP_1 += 1
        else:
            TP_2 += 1
    else:
        if y_pred[i] == 0:
            FP_0 += 1
        elif y_pred[i] == 1:
            FP_1 += 1
        else:
            FP_2 += 1

precision_0 = TP_0 / (TP_0 + FP_0)
precision_1 = TP_1 / (TP_1 + FP_1)
precision_2 = TP_2 / (TP_2 + FP_2)

It works if I know the number of classes and the data beforehand. But now I want it to work whether or not I know them, for example when there is a larger number of classes.

How can I reduce the code or make it dynamic?

Note: I would prefer not to use a library for this.

CodePudding user response:

You can try this:

def precision(y_test, y_pred): 
    # to count false-pos and true-pos  
    classes = sorted(set(y_test + y_pred))
    tp = {cls: 0 for cls in classes}
    fp = {cls: 0 for cls in classes}
    
    # count tp and fp
    for i in range(len(y_pred)):
        if y_pred[i] == y_test[i]:
            tp[y_pred[i]] += 1
        else:
            # a wrong prediction is a false positive for the predicted class
            fp[y_pred[i]] += 1

    # calculate prec for every class
    precision = dict()
    for cls in classes:
        try:
            precision[cls] = tp[cls] / (tp[cls] + fp[cls])
        except ZeroDivisionError:
            # class was never predicted
            precision[cls] = 0.0

    return precision


predicted = [0, 1, 2, 3, 0, 1, 4]
actual = [0, 1, 2, 0, 1, 2, 3]
print(precision(actual, predicted))

Output:

{0: 0.5, 1: 0.5, 2: 1.0, 3: 0.0, 4: 0.0}

You get a dictionary whose keys are the classes and whose values are the precisions.
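
A more compact variant of the same counting idea could look like the sketch below. It assumes that collections.Counter from the standard library is acceptable (it is not a third-party package), and the function name precision_per_class is made up for this illustration. Returning 0.0 for a class that was never predicted is a convention chosen here, not the only reasonable one.

```python
from collections import Counter


def precision_per_class(y_true, y_pred):
    """Return {class: precision} for every class seen in either list."""
    # How often each class was predicted at all (TP + FP).
    predicted_counts = Counter(y_pred)
    # How often each class was predicted correctly (TP).
    correct_counts = Counter(p for t, p in zip(y_true, y_pred) if t == p)

    classes = sorted(set(y_true) | set(y_pred))
    return {
        # Assumed convention: a class that was never predicted gets 0.0.
        c: correct_counts[c] / predicted_counts[c] if predicted_counts[c] else 0.0
        for c in classes
    }


# The data from the question.
predicted = [1, 0, 2, 1, 1, 0, 1, 2, 1, 2, 2, 0, 0, 0, 0, 2, 2, 1, 1, 1, 0, 1, 0, 1, 2, 1, 1, 2, 0, 0]
actual    = [1, 0, 2, 1, 1, 0, 1, 2, 1, 1, 2, 0, 0, 0, 0, 1, 2, 1, 1, 2, 0, 2, 0, 2, 2, 2, 2, 2, 0, 0]

result = precision_per_class(actual, predicted)
print(result)  # class 0 -> 10/10, class 1 -> 7/12, class 2 -> 6/8
```

Because the classes are collected from the data, nothing has to change when more classes appear.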

CodePudding user response:

import numpy as np

# Convert to arrays
y_pred = np.array(y_pred)
y_test = np.array(y_test)

def get_precision(pred, truth, num_classes):
    precision_by_class = []
    match = (pred == truth)  # Boolean array indicating whether each prediction is correct
    for i_class in range(num_classes):  # Iterate over classes
        # match[pred == i_class].sum() -> number of correct predictions of this class
        # (pred == i_class).sum() -> number of times this class was predicted
        precision_by_class.append(match[pred == i_class].sum() / (pred == i_class).sum())
    accuracy = match.mean()  # Overall accuracy
    return precision_by_class, accuracy
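
If the number of classes is not known in advance, it can be read off the data instead of being passed in. The following is only a sketch of that idea under a few assumptions: labels are integers, the helper name precision_by_class_inferred is hypothetical, and a class that was never predicted is reported as nan (dividing 0 by 0 in NumPy would otherwise produce nan together with a runtime warning).

```python
import numpy as np


def precision_by_class_inferred(pred, truth):
    """Map each class found in pred or truth to its precision."""
    pred = np.asarray(pred)
    truth = np.asarray(truth)
    classes = np.union1d(pred, truth)  # sorted unique labels from both arrays

    result = {}
    for c in classes:
        predicted_c = pred == c                         # mask: class c was predicted
        n_predicted = predicted_c.sum()                 # TP + FP
        n_correct = (predicted_c & (truth == c)).sum()  # TP
        # Assumed convention: nan when the class was never predicted.
        result[int(c)] = float(n_correct / n_predicted) if n_predicted else float("nan")
    return result


example = precision_by_class_inferred([0, 1, 2, 3, 0, 1, 4], [0, 1, 2, 0, 1, 2, 3])
print(example)  # {0: 0.5, 1: 0.5, 2: 1.0, 3: 0.0, 4: 0.0}
```

Returning a dictionary keyed by class also avoids relying on the labels being exactly 0 to num_classes - 1.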