ML accuracy for a particular group/range

Time:03-08

The terms I searched for on Google, such as localised accuracy, custom accuracy, and biased cost functions, all seem wrong, and maybe I am not even asking the right question.

Imagine I have some data, be it:

  1. The famous Iris Classification Problem
  2. Pictures of felines
  3. The following dataset that I made up for predicting house prices:

[Image not preserved: the made-up house price dataset]

In all these scenarios, I am really interested in the accuracy for one class, or for one range of the regression target.

  1. For irises, I really need Iris setosa to be classified correctly; I don't care if Iris virginica and Iris versicolor are all wrong.

  2. For felines, I really need the model to tell me when it has spotted a tiger (for obvious reasons); whether it is a Persian or a Ragdoll, I don't really care.

  3. For the house prices, I want the error on higher-end houses to be minimised, because errors there are costly.

How do I do this? If I want setosa to be classified correctly, removing virginica or versicolor both seem wrong. Trying different algorithms such as linear models or SVMs is all well and good, but it only improves the OVERALL accuracy. I really need, for example, "tigers" to be predicted correctly, even at the expense of the overall accuracy of the model.

Is there a way to use a custom cost function that gives me high accuracy in a localised region of a regression problem, or for a specific category in a classification problem?

If this cannot be answered, pointing me to some terms that I can search for would still be greatly appreciated.

CodePudding user response:

You can use weights to achieve that. If you're using the SVC class of scikit-learn, you can pass class_weight to the constructor. You could also pass sample_weight to the fit method.

For example:

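from sklearn import datasets
from sklearn import svm

iris = datasets.load_iris()
X = iris.data
y = iris.target  # 0 = setosa, 1 = versicolor, 2 = virginica

# Penalise mistakes on class 0 (setosa) three times as heavily as the others
clf = svm.SVC(class_weight={0: 3, 1: 1, 2: 1})
clf.fit(X, y)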

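This way, misclassifying setosa (class 0) is penalised three times as heavily as misclassifying the other classes, so the model prioritises getting setosa right.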
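For the house-price case, the same idea can be sketched with sample_weight. This is only an illustration under my own assumptions: the data is synthetic, and the top-20% cut-off for "higher-end", the weight of 10, and the choice of LinearRegression are all made up for the example rather than taken from the question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic, made-up data: price rises non-linearly with floor area.
rng = np.random.default_rng(0)
area = rng.uniform(50, 400, size=200)
price = 1_000 * area + 0.02 * area**3 + rng.normal(0, 20_000, size=200)
X = area.reshape(-1, 1)

# Hypothetical definition of "higher-end": the top 20% of prices.
high_end = price >= np.quantile(price, 0.8)

# Squared errors on higher-end houses count 10x in the fitting objective.
weights = np.where(high_end, 10.0, 1.0)

plain = LinearRegression().fit(X, price)
weighted = LinearRegression().fit(X, price, sample_weight=weights)

def mse(model, mask):
    """Mean squared error of `model` on the rows selected by `mask`."""
    return np.mean((model.predict(X[mask]) - price[mask]) ** 2)

print("high-end MSE, unweighted:", mse(plain, high_end))
print("high-end MSE, weighted:  ", mse(weighted, high_end))
print("overall MSE, unweighted: ", mse(plain, np.ones_like(high_end)))
print("overall MSE, weighted:   ", mse(weighted, np.ones_like(high_end)))
```

On the training data the weighted fit cannot have a larger squared error on the higher-end houses than the unweighted one, and it pays for that with a larger (or equal) overall error, which is the trade-off asked about. How well this carries over to unseen data would still need checking on a held-out set.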
