I am trying to plot a logistic regression using ggplot with this type of binomial data (the full data set has more than 6,000 rows):
| age | result |
|---|---|
| 50 and older | 1 |
| 18-49 years old | 1 |
| 50 and older | 0 |
| 50 and older | 1 |
| 18-49 years old | 0 |
Using geom_smooth I am trying to make a visualization of this logistic regression model:
ggplot(data, aes(age, result)) +
  geom_smooth(method = "glm", formula = y ~ x, colour = "black", method.args = list(family = binomial))
However, the result looks like this:
CodePudding user response:
Because the x axis is discrete, you need to ensure that you give each x value the same group aesthetic:
library(ggplot2)
ggplot(data, aes(age, result, group = 1)) +
  geom_smooth(method = "glm", formula = y ~ x, colour = "black",
              method.args = list(family = binomial))
However, I'm not sure how meaningful this end result is, since your x axis groups are discrete, and it therefore doesn't make a lot of sense to have a continuous line or SE between them. If this were me, I would probably use point estimates with error bars:
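# Logistic regression of result on age (assumed model specification)
model <- glm(result ~ age, family = binomial, data = data)
pred_df <- data.frame(age = c('50 and older', '18-49 years old'))
# Predicted probabilities and their standard errors for each age group
fit <- predict(model, newdata = pred_df, se.fit = TRUE, type = 'response')
pred_df$fit <- fit$fit
pred_df$upper <- fit$fit + 1.96 * fit$se.fit
pred_df$lower <- fit$fit - 1.96 * fit$se.fit
ggplot(pred_df, aes(age, fit)) +
  geom_errorbar(aes(ymin = lower, ymax = upper), width = 0.25) +
  geom_point(size = 3) +
  ylim(c(0, 1))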
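One caveat about the error bars: adding and subtracting 1.96 standard errors on the response scale can, in principle, give limits below 0 or above 1 when a group's probability is close to either bound. A common alternative is to build the interval on the link (log-odds) scale and back-transform it with `plogis()`. The following is only a sketch of that idea in base R; the model formula `result ~ age` is an assumption, and the simulated data are the ones from the "Data used" part of this answer.

```r
# Simulated data, same as in the "Data used" part of the answer
set.seed(1)
data <- data.frame(
  age = rep(c("50 and older", "18-49 years old"), each = 3000),
  result = rbinom(6000, 1, rep(c(0.3, 0.5), each = 3000))
)

# Assumed model specification: result explained by age group
model <- glm(result ~ age, family = binomial, data = data)

pred_df <- data.frame(age = c("50 and older", "18-49 years old"))

# Predict on the link (log-odds) scale instead of the response scale
link_fit <- predict(model, newdata = pred_df, se.fit = TRUE, type = "link")

# Back-transform the estimate and both limits to probabilities
pred_df$fit   <- plogis(link_fit$fit)
pred_df$lower <- plogis(link_fit$fit - 1.96 * link_fit$se.fit)
pred_df$upper <- plogis(link_fit$fit + 1.96 * link_fit$se.fit)

print(pred_df)
```

The resulting `pred_df` has the same columns as before (`fit`, `lower`, `upper`), so it should drop into the same `geom_errorbar()` / `geom_point()` plot. With 3000 rows per group the two kinds of interval will be nearly identical; the difference matters mainly for small groups or extreme probabilities.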
Data used
set.seed(1)
data <- data.frame(age = rep(c('50 and older', '18-49 years old'), each = 3000),
result = rbinom(6000, 1, rep(c(0.3, 0.5), each = 3000)))
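As a sanity check on the point about discrete groups: with a single two-level predictor, the logistic regression has one parameter per group, so its fitted probabilities should simply match the observed proportion of 1s in each group. A minimal base-R sketch using the simulated data above (the formula `result ~ age` and the helper names `observed`, `predicted`, and `comparison` are assumptions for illustration):

```r
# Simulated data, as defined above
set.seed(1)
data <- data.frame(
  age = rep(c("50 and older", "18-49 years old"), each = 3000),
  result = rbinom(6000, 1, rep(c(0.3, 0.5), each = 3000))
)

# Assumed model specification
model <- glm(result ~ age, family = binomial, data = data)

# Observed proportion of 1s in each age group
observed <- tapply(data$result, data$age, mean)

# Model-predicted probability for each age group
predicted <- predict(model,
                     newdata = data.frame(age = names(observed)),
                     type = "response")

comparison <- data.frame(
  age = names(observed),
  observed = as.numeric(observed),
  predicted = as.numeric(predicted)
)
print(comparison)
```

If the two columns agree, the model is adding nothing beyond the group proportions, which is one more reason to prefer point estimates with error bars over a continuous line here.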