I am trying to fit a multinomial logit model using the LogisticRegression class from sklearn.
My outcome (y) has 4 levels. I need to specify one of these levels as the reference category (or baseline). Does LogisticRegression provide a way of specifying this reference category?
CodePudding user response:
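For multiple classes, LogisticRegression in sklearn uses either a one-vs-rest scheme or a softmax (multinomial) parameterization of the problem, depending on whether you specify multinomial. In either case it does not fit the model relative to a reference category; instead it computes a separate vector of coefficients (and an intercept) for each output class. If you use the multinomial specification, you can take the coefficient vector of the class you want as the reference and subtract it from the coefficient vectors of all classes, doing the same with the intercepts. The reference class then has all-zero coefficients, and the predicted probabilities are unchanged because softmax is invariant to subtracting the same vector from every class. This should recover a solution equivalent to the one you seem to want. One caveat: LogisticRegression applies an L2 penalty by default, so the resulting numbers will generally differ somewhat from an unpenalized reference-category fit unless you turn the penalty off or make it very weak.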
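A minimal sketch of the subtraction described above. The data is synthetic (generated with make_classification), the names `ref`, `coef_rel` and `intercept_rel` are just illustrative, and it assumes a reasonably recent sklearn where the default lbfgs solver fits the multinomial model for more than two classes, so no extra argument is passed:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic 4-class data, only for illustration.
X, y = make_classification(
    n_samples=500, n_features=5, n_informative=4, n_redundant=0,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Assumption: with the default solver this is a multinomial (softmax) fit.
model = LogisticRegression(max_iter=1000)
model.fit(X, y)

# Position of the chosen reference category within model.classes_.
ref = 0

# Subtract the reference row from every row; the reference row becomes zero.
coef_rel = model.coef_ - model.coef_[ref]
intercept_rel = model.intercept_ - model.intercept_[ref]

# Sanity check: softmax over the re-based logits gives the same probabilities.
logits = X @ coef_rel.T + intercept_rel
logits -= logits.max(axis=1, keepdims=True)  # for numerical stability
probs = np.exp(logits)
probs /= probs.sum(axis=1, keepdims=True)

print(coef_rel)
print(np.allclose(probs, model.predict_proba(X)))
```

Each non-reference row of `coef_rel` can then be read as log-odds coefficients of that class versus the reference class, subject to the default regularization of the fit.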
See the docs for how to specify multinomial: https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.LogisticRegression.html