How decision function of Logistic Regression on scikit-learn works?


I am trying to understand how this function works and the mathematics behind it. Does decision_function() in scikit-learn give us log odds? The function returns values ranging from minus infinity to infinity, and 0 seems to be the prediction threshold when using decision_function(), whereas the threshold is 0.5 when using predict_proba(). This is exactly the relationship between probability and log odds described on GeeksforGeeks.

I couldn't see anything about this in the documentation, but the function behaves like the log-likelihood, I think. Am I right?
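One way to see the threshold correspondence empirically is to fit a model on synthetic data and compare the two outputs directly. This is a sketch (the dataset and parameters here are arbitrary, not from the question); it checks that decision_function is the raw linear score and that its 0 threshold agrees with the 0.5 threshold of predict_proba:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# arbitrary synthetic binary-classification data
X, y = make_classification(n_samples=200, n_features=5, random_state=0)
clf = LogisticRegression().fit(X, y)

f = clf.decision_function(X)    # raw scores in (-inf, inf)
p = clf.predict_proba(X)[:, 1]  # probabilities in (0, 1)

# decision_function is just the linear score <w, x> + b
manual_f = X @ clf.coef_.ravel() + clf.intercept_[0]
assert np.allclose(f, manual_f)

# the two thresholds agree: f > 0 exactly when p > 0.5,
# and predict() uses that same cutoff
assert np.array_equal(f > 0, p > 0.5)
assert np.array_equal(clf.predict(X), (f > 0).astype(int))
```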

CodePudding user response:

The decision function is nothing but the value of (as you can see in the source)

f(x) = <w, x> + b

whereas predict_proba is (as you can see in the source)

p(x) = exp(f(x)) / [exp(f(x)) + exp(-f(x))] = 1 / (1 + exp(-2f(x)))

which, up to the factor of 2 inside the exponent, is just the regular sigmoid function.
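The equivalence between this two-class softmax and a rescaled sigmoid can be verified numerically. A minimal sketch (using SciPy's expit and softmax helpers; the score values are arbitrary):

```python
import numpy as np
from scipy.special import expit, softmax

f = np.linspace(-3.0, 3.0, 7)          # arbitrary raw scores f(x)
scores = np.c_[-f, f]                  # two-class score matrix [-f, f]

# softmax over [-f, f] ...
p_softmax = softmax(scores, axis=1)[:, 1]
# ... equals the sigmoid with a doubled argument: 1 / (1 + exp(-2f))
p_sigmoid = expit(2 * f)

assert np.allclose(p_softmax, p_sigmoid)
```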

Consequently, the corresponding threshold points will be 0 for f(x), and 0.5 for p(x), since

exp(0) / [exp(0) + exp(-0)] = 1 / 2 = 0.5

So how do you interpret the decision function? It is essentially 2 times the logit of the probability modeled by the LR model. (The factor of 2 comes from a trick scikit-learn uses so that it can always apply softmax instead of handling the sigmoid as a special case, which is unfortunate.)
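You can check this interpretation by recovering the logit (log-odds) from predict_proba and comparing it to the decision function. A sketch, hedged because the exact scaling can depend on the scikit-learn version and the multi_class setting (the logit may equal f(x) directly, or 2·f(x) when the softmax trick described above is used):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# arbitrary synthetic data, as before
X, y = make_classification(n_samples=100, n_features=5, random_state=1)
clf = LogisticRegression().fit(X, y)

p = clf.predict_proba(X)[:, 1]
f = clf.decision_function(X)

# logit (log-odds) recovered from the predicted probability
logit = np.log(p / (1 - p))

# the decision function matches the logit up to a constant factor
# (1 or 2, depending on version/settings)
assert np.allclose(logit, f) or np.allclose(logit, 2 * f)
```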
