Unable to plot confidence intervals using ggplot, (geom_ribbon() argument)

Time:05-30

I am trying to plot 95% confidence intervals on some simulated values, but I am running into some issues when plotting the CIs with geom_ribbon(). The trouble is that the CIs do not show up when I plot the model.

I have included all of my code below in case anyone can see where I have gone wrong:

set.seed(20220520)  
#simulating 200 values between 0 and 1 from a uniform distribution
x = runif(200, min = 0, max = 1) 

lam = exp(0.3 + 5*x)

y = rpois(200, lambda = lam)

#before we do this each Yi may contain zeros so we need to add a small constant
y <- y + .1
#combining x and y into a dataframe so we can plot
df = data.frame(x, y)

#fitting a Poisson GLM
model2 <- glm(y ~ x, 
          data = df,
          family = poisson(link='log'))

#make predictions (this may be the same as predictions_mod2)
preds <- predict(model2, type = "response")

#making CI predictions
predictions_mod2 = predict(model2, df, se.fit = TRUE, type = 'response')

#calculate confidence intervals limit
upper_mod2 = predictions_mod2$fit + 1.96*predictions_mod2$se.fit
lower_mod2 = predictions_mod2$fit-1.96*predictions_mod2$se.fit

#transform the CI limit to get one at the level of the mean
upper_mod2 = exp(upper_mod2)/(1 + exp(upper_mod2))
lower_mod2 = exp(lower_mod2)/(1 + exp(lower_mod2))

#combining into a df
predframe = data.frame(lwr=lower_mod2,upr=upper_mod2, x = df$x, y = df$y)

#plot model with 95% confidence intervals using ggplot
ggplot(df, aes(x, y)) +
  geom_ribbon(data = predframe, aes(ymin=lwr, ymax=upr), alpha = 0.4) +
  geom_point() +
  geom_line(aes(x, preds2), col = 'blue')

CodePudding user response:

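A comment on the question asks why the predicted values should not be logit-transformed. The reason is that the prediction type requested is "response", so the fitted values and the limits built from them are already on the scale of the response and need no back-transformation. In addition, exp(x)/(1 + exp(x)) is the inverse logit, which maps every value into (0, 1); the model here uses a log link, whose inverse is exp(). From the documentation of predict.glm: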

type
the type of prediction required. The default is on the scale of the linear predictors; the alternative "response" is on the scale of the response variable. Thus for a default binomial model the default predictions are of log-odds (probabilities on logit scale) and type = "response" gives the predicted probabilities. The "terms" option returns a matrix giving the fitted values of each term in the model formula on the linear predictor scale.
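As a small check of what the documentation says, the sketch below fits a Poisson GLM to made-up data and compares the two prediction scales. The seed, sample size and coefficients are assumptions chosen for illustration, not the question's data. It also shows why applying the inverse logit to response-scale values flattens the ribbon: every transformed value lands in (0, 1).

```r
set.seed(1)  # arbitrary seed, for illustration only
x <- runif(50)
y <- rpois(50, lambda = exp(0.3 + 2 * x))
fit <- glm(y ~ x, family = poisson(link = "log"))

eta <- predict(fit, type = "link")      # linear predictor (log scale)
mu  <- predict(fit, type = "response")  # expected counts

# For a log link, exponentiating the linear predictor gives the response
same_scale <- all.equal(exp(eta), mu)
print(same_scale)

# The inverse logit maps any real number into (0, 1), so values that are
# already expected counts get squashed to just below 1
squashed <- exp(mu) / (1 + exp(mu))
print(range(squashed))
```

With counts that run well above 1, a ribbon drawn from such squashed limits sits near the bottom of the plot and is easy to miss.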

The clearest way to answer is to show the code, with the transformation step removed.

library(ggplot2, quietly = TRUE)

set.seed(20220520)  
#simulating 200 values between 0 and 1 from a uniform distribution
x = runif(200, min = 0, max = 1) 

lam = exp(0.3 + 5*x)

y = rpois(200, lambda = lam)

#before we do this each Yi may contain zeros so we need to add a small constant
y <- y + 0.1
#combining x and y into a dataframe so we can plot
df = data.frame(x, y)

#fitting a Poisson GLM (y is no longer an integer, so glm() warns; hence suppressWarnings)
suppressWarnings(
  model2 <- glm(y ~ x, 
                data = df,
                family = poisson(link='log'))
)
#make predictions (this may be the same as predictions_mod2)
preds <- predict(model2, type = "response")

#making CI predictions
predictions_mod2 = predict(model2, df, se.fit = TRUE, type = 'response')

#calculate confidence intervals limit
upper_mod2 = predictions_mod2$fit + 1.96*predictions_mod2$se.fit
lower_mod2 = predictions_mod2$fit-1.96*predictions_mod2$se.fit

#combining into a df
predframe = data.frame(lwr=lower_mod2,upr=upper_mod2, x = df$x, y = df$y)

#plot model with 95% confidence intervals using ggplot
ggplot(df, aes(x, y)) +
  geom_ribbon(data = predframe, aes(ymin=lwr, ymax=upr), alpha = 0.4) +
  geom_point() +
  geom_line(aes(x, preds), col = 'blue')

Created on 2022-05-29 by the reprex package (v2.0.1)
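A common alternative, not used in the answer above, is to build the interval on the link scale and back-transform it with the inverse link, which for a log link is exp(). This keeps the lower limit positive. The sketch below reuses the question's simulation settings but, as an assumption, leaves y as integer counts (no 0.1 added). The names link_pred, inv_link and predframe_link are hypothetical.

```r
library(ggplot2)

set.seed(20220520)
x <- runif(200, min = 0, max = 1)
y <- rpois(200, lambda = exp(0.3 + 5 * x))
df <- data.frame(x, y)

model2 <- glm(y ~ x, data = df, family = poisson(link = "log"))

# Predictions and standard errors on the linear predictor (log) scale
link_pred <- predict(model2, newdata = df, se.fit = TRUE, type = "link")

# Inverse link taken from the fitted model; for this model it is exp()
inv_link <- family(model2)$linkinv

predframe_link <- data.frame(
  x   = df$x,
  fit = inv_link(link_pred$fit),
  lwr = inv_link(link_pred$fit - 1.96 * link_pred$se.fit),
  upr = inv_link(link_pred$fit + 1.96 * link_pred$se.fit)
)
# Sort by x so the line and ribbon are drawn left to right
predframe_link <- predframe_link[order(predframe_link$x), ]

p <- ggplot(df, aes(x, y)) +
  geom_ribbon(data = predframe_link, aes(x = x, ymin = lwr, ymax = upr),
              alpha = 0.4, inherit.aes = FALSE) +
  geom_point() +
  geom_line(data = predframe_link, aes(x, fit), col = "blue")
print(p)
```

Because exp() is monotone and positive, the back-transformed limits always bracket the fitted mean and never drop below zero, which the symmetric response-scale interval does not guarantee.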
