Home > OS >  Unscale coefficient of scaled continuous variable in negative binomial regression
Unscale coefficient of scaled continuous variable in negative binomial regression

Time:10-19

I'm fitting a negative binomial regression. I scaled all continuous predictors prior to fitting the model. I need to transform the coefficients of scaled predictors to be able to interpret them on their original scale. Example:

# example dataset
set.seed(1)
dep <- dnbinom(seq(1:150), size = 150, prob = 0.75)
ind.1 <- ifelse(sign(rnorm(150))==-1,0,1)
ind.2 <- rnorm(150, 10, 1.7)
df <- data.frame(dep, ind.1, ind.2)

# scale continuous independent variable
df$ind.2 <- scale(df$ind.2) 

# fit model
m1 <- MASS::glm.nb(dep ~ ind.1   ind.2, data = df)
summz <- summary(m1)

To get the result for ind.1 I take the exponential of the coefficient:

# result for ind.1
exp(summz$coefficients["ind.1","Estimate"])
> [1] 1.276929 

Which shows that for every 1 unit increase in ind.1 you'd expect a 1.276929 increase in dep. But what about for ind.2? I gather that as the predictor is scaled the coefficient can be interpreted as the effect an increase of 1 standard deviation of ind.2 has on dep. How to transform this back to original units? This answer says to multiply the coefficient by the sd of the predictor, but how to do this in the case of a logit link? exp(summz$coefficients["ind.2","Estimate"] * sc) doesn't seem to make sense.

CodePudding user response:

Set up data:

set.seed(1)
dep <- dnbinom(seq(1:150), size = 150, prob = 0.75)
ind.1 <- ifelse(sign(rnorm(150))==-1,0,1)
ind.2 <- rnorm(150, 10, 1.7)
df <- data.frame(dep, ind.1, ind.2)
sc <- sd(df$ind.2)

Fit unscaled and scaled models:

m_unsc <- MASS::glm.nb(dep ~ ind.1   ind.2, data = df)
m_sc <- update(m_unsc, data = transform(df, ind.2 = drop(scale(df$ind.2))))

Compare coefficients:

 cbind(coef(m_unsc), coef(m_sc))
                   [,1]        [,2]
(Intercept) -5.50449624 -5.13543854
ind.1        0.24445805  0.24445805
ind.2        0.03662308  0.06366992

Check equivalence (we divide the scaled coefficient by the scaling factor (sc=sd(ind.2)) to get back the unscaled coefficient):

all.equal(coef(m_sc)["ind.2"]/sc, coef(m_unsc)["ind.2"])

The negative binomial model uses a log link, not a logit link, so if you want to back-transform the coefficient to get proportional or "fold" changes per unit of ind2:

exp(coef(m_sc)["ind.2"]/sc)

this gives 1.0373, a 4% change in the response per unit change in ind.2 (you can confirm that it's the same as exponentiating the unscaled coefficient).

Note that 2/3 of the answers in the linked question, including the currently accepted answer, are wrong: you should be dividing the scaled coefficient by the scaling factor, not multiplying ...

  • Related