I have a dataset that looks like this.
Note that A and B are binary variables with the levels Low and High.
I ran the following code in R:
logit <- glm(y ~ A * B, family = binomial(link = "logit"), data = df)
summary(logit)
I included the interaction between A and B because my hypothesis does NOT align with the separate effects of A and B. Not surprisingly, the interaction turned out to be quite significant.
But how do I interpret these coefficients?
I know how to interpret them when either A or B is numeric, but I find it hard to get my head around two categorical variables.
Looking forward to any expert advice or comments.
Many thanks in advance.
CodePudding user response:
General background: interpreting logistic regression coefficients
First of all, to learn more about interpreting logistic regression coefficients generally, take a look at
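Because A is either 0 or 1 and B is either 0 or 1, the last term in the equation above (the A:B interaction term) is 0 unless both A=1 and B=1. That corresponds to both variables being Low, assuming you're using the default factor coding, in which the alphabetically first level, High, is the reference. The interaction coefficient of 1.41 is positive, which says that if A is Low, the effect of B being Low on y is more positive: it shifts the log-odds of y being Terminated by 1.41 more than it does when A is High. Specifically, exp(1.41) = 4.1 is a ratio of odds ratios. The odds ratio of Terminated for B being Low versus High is about 4.1 times larger when A is Low than when A is High. It is not the odds of Terminated when both are Low compared with every other combination, because that comparison also involves the main effects of A and B.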
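One way to check what the interaction coefficient measures is to fit the same kind of model to a small 2x2 table and compare it with odds ratios computed by hand. The sketch below uses made-up counts (not your data) and assumed column names `terminated` and `retained`; the factor levels are set explicitly so that High is the reference, as in the default coding.

```r
# Hypothetical counts per A/B cell; these are NOT the asker's data.
cells <- data.frame(
  A = factor(c("High", "High", "Low", "Low"), levels = c("High", "Low")),
  B = factor(c("High", "Low", "High", "Low"), levels = c("High", "Low")),
  terminated = c(40, 20, 30, 45),
  retained   = c(60, 80, 70, 55)
)

# Same model structure as y ~ A * B, fitted to grouped counts.
fit <- glm(cbind(terminated, retained) ~ A * B, family = binomial, data = cells)
b <- coef(fit)

# Odds of termination in each cell, computed directly from the counts.
odds <- cells$terminated / cells$retained

# Odds ratio for B (Low vs High) within each level of A.
or_B_when_A_high <- odds[2] / odds[1]
or_B_when_A_low  <- odds[4] / odds[3]

# exp(interaction) should match the ratio of the two odds ratios.
print(c(
  exp_interaction = unname(exp(b["ALow:BLow"])),
  ratio_of_ORs    = or_B_when_A_low / or_B_when_A_high
))
```

The two printed numbers should agree up to the fitting tolerance, which is what "ratio of odds ratios" means in practice.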
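You can say "if A is Low, then B being Low has a positive effect on the probability of termination, but if A is High, then B being Low has a negative effect on the probability of termination." That's because the main effect of B, which is the effect of B when A is High, is < 0, while the effect of B when A is Low is the main effect plus the interaction coefficient, and that sum is > 0 as long as the interaction coefficient outweighs the main effect.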
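To see the sign flip numerically, you can add the coefficients by hand. In this sketch the interaction value 1.41 is the one discussed above, while the main effect of B (-0.80) is a hypothetical placeholder; substitute the estimate from your own summary(logit) output.

```r
# Interaction coefficient discussed above.
b_AB <- 1.41
# Hypothetical main effect of B (Low vs High); replace with your own estimate.
b_B <- -0.80

# Log-odds effect of B being Low at each level of A.
effect_B <- c(A_High = b_B, A_Low = b_B + b_AB)

# The same effects expressed as odds ratios.
odds_ratio_B <- exp(effect_B)

print(round(rbind(log_odds = effect_B, odds_ratio = odds_ratio_B), 2))
```

With these placeholder values the effect of B is negative when A is High and positive when A is Low. If your main effect of B were more negative than -1.41, the effect would stay negative at both levels of A, only less so when A is Low.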