Home > Software engineering >  How do i extract the standard error from a gaussian GLM model?
How do i extract the standard error from a gaussian GLM model?

Time:05-30

I wish to extract the standard error from my Gaussian GLM and my poisson GLM if anyone knows how i can do this?

code for simulated data and both models are below;

#data simulated before fitting models
set.seed(20220520)  
#simulating 200 values between 0 and 1 from a uniform distribution
x = runif(200, min = 0, max = 1) 

lam = exp(0.3 5*x)

y = rpois(200, lambda = lam) 

#before we do this each Yi may contain zeros so we need to add a small constant
y <- y   .1 
#combining x and y into a dataframe so we can plot
df = data.frame(x, y)

#Gausian GLM
model1 <- glm(y ~ x, 
          data = df,
          family = gaussian(link = 'log'))


#Poisson GLM
model2 <- glm(y ~ x, 
          data = df,
          family = poisson(link='log'))

CodePudding user response:

This question is a little deeper than might otherwise appear. In general, sigma() will extract the residual standard deviation:

Extract the estimated standard deviation of the errors, the “residual standard deviation” (misnamed also “residual standard error”, e.g., in ‘summary.lm()’'s output, from a fitted model).

Many classical statistical models have a scale parameter, typically the standard deviation of a zero-mean normal (or Gaussian) random variable which is denoted as sigma. ‘sigma(.)’ extracts the estimated parameter from a fitted model, i.e., sigma^.

This works as expected for a linear model (sigma(model1)). However, it doesn't necessarily do what you're expecting for the Poisson model; it returns the square root of the deviance divided by the number of observations, which is analogous to the residual standard deviation but not the same.

identical(
   sigma(model1),   ## 5.424689
   sqrt(sum(residuals(model1)^2)/(df.residual(model1)))
)  ## TRUE
sigma(model2)  ## 1.017891
sqrt(sum(residuals(model2, type="response")^2)/(df.residual(model2)))  ## 5.452

(If you redo this calculation with type = "deviance" [the default value for residuals.glm], you will get the same value as sigma() ...)

If you want to compare goodness of fit, you should consider a metric like the AIC ...

PS you probably shouldn't add 0.1 to your response; not only is this unnecessary (either for a log-link Gaussian or for a Poisson model), it leads to a series of warnings about "non-integer x" when you fit the Poisson model (harmless in this case, but a further indication that you probably shouldn't do this); however, you do need to specify starting values for the log-link Gaussian model (start = c(1,1) seems to work).

  • Related