I wish to extract the standard error from my Gaussian GLM and my poisson GLM if anyone knows how i can do this?
code for simulated data and both models are below;
#data simulated before fitting models
set.seed(20220520)
#simulating 200 values between 0 and 1 from a uniform distribution
x = runif(200, min = 0, max = 1)
lam = exp(0.3 5*x)
y = rpois(200, lambda = lam)
#before we do this each Yi may contain zeros so we need to add a small constant
y <- y .1
#combining x and y into a dataframe so we can plot
df = data.frame(x, y)
#Gausian GLM
model1 <- glm(y ~ x,
data = df,
family = gaussian(link = 'log'))
#Poisson GLM
model2 <- glm(y ~ x,
data = df,
family = poisson(link='log'))
CodePudding user response:
This question is a little deeper than might otherwise appear. In general, sigma()
will extract the residual standard deviation:
Extract the estimated standard deviation of the errors, the “residual standard deviation” (misnamed also “residual standard error”, e.g., in ‘summary.lm()’'s output, from a fitted model).
Many classical statistical models have a scale parameter, typically the standard deviation of a zero-mean normal (or Gaussian) random variable which is denoted as sigma. ‘sigma(.)’ extracts the estimated parameter from a fitted model, i.e., sigma^.
This works as expected for a linear model (sigma(model1)
). However, it doesn't necessarily do what you're expecting for the Poisson model; it returns the square root of the deviance divided by the number of observations, which is analogous to the residual standard deviation but not the same.
identical(
sigma(model1), ## 5.424689
sqrt(sum(residuals(model1)^2)/(df.residual(model1)))
) ## TRUE
sigma(model2) ## 1.017891
sqrt(sum(residuals(model2, type="response")^2)/(df.residual(model2))) ## 5.452
(If you redo this calculation with type = "deviance"
[the default value for residuals.glm
], you will get the same value as sigma()
...)
If you want to compare goodness of fit, you should consider a metric like the AIC ...
PS you probably shouldn't add 0.1 to your response; not only is this unnecessary (either for a log-link Gaussian or for a Poisson model), it leads to a series of warnings about "non-integer x" when you fit the Poisson model (harmless in this case, but a further indication that you probably shouldn't do this); however, you do need to specify starting values for the log-link Gaussian model (start = c(1,1)
seems to work).