I'm trying to visualize a dataset that uses a binomial response variable (proportions). I'm using a GAM to examine the trend, but I'm having difficulty getting it to plot with ggplot. How do I get the smooth added to the plot?
Example:
library(ggplot2)

set.seed(42)
df <- data.frame(y1 = sample.int(100),
                 y2 = sample.int(100),
                 x = runif(100, 0, 100))
ggplot(data = df,
       aes(y = y1/(y1 + y2), x = x)) +
  geom_point(shape = 1) +
  geom_smooth(method = "gam",
              method.args = list(family = binomial),
              formula = cbind(y1, y2) ~ s(x))
Warning message:
Computation failed in `stat_smooth()`
Caused by error in `cbind()`:
! object 'y1' not found
CodePudding user response:
The formula in geom_smooth has to be in terms of x and y, representing the variables on your x and y axes, so you can't pass in y1 and y2.
The way round this is that rather than attempting to use the cbind-type left-hand side of your gam, you can expand the counts into 1s and 0s so that there is only a single y variable. Although this makes for a little extra pre-processing, it allows you to draw your points just as easily using stat = 'summary' inside geom_point and makes your geom_smooth very straightforward:
library(tidyverse)

set.seed(42)
df <- data.frame(y1 = sample.int(100),
                 y2 = sample.int(100),
                 x = runif(100, 0, 100))

df %>%
  rowwise() %>%
  # Expand each row into y1 ones and y2 zeros
  # (dplyr >= 1.1.0 prefers reframe() for multi-row results)
  summarize(y = rep(c(1, 0), times = c(y1, y2)), x = x) %>%
  ggplot(aes(x, y)) +
  # The mean of the 0/1 values at each x is the observed proportion
  geom_point(stat = 'summary', fun = mean, shape = 1) +
  geom_smooth(method = "gam",
              method.args = list(family = binomial),
              formula = y ~ s(x)) +
  theme_classic()
Created on 2023-01-20 with reprex v2.0.2
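To see what the expansion step is doing, here is a minimal sketch on a tiny made-up data frame. The name toy and its values are hypothetical, and it uses base R rather than the dplyr pipeline above so that it runs without any packages. Each row becomes y1 ones followed by y2 zeros, and the mean of those values at each x should match the original proportion y1/(y1 + y2).

```r
# Hypothetical toy data: two rows of success (y1) and failure (y2) counts
toy <- data.frame(y1 = c(2, 1), y2 = c(1, 3), x = c(10, 20))

# Expand each row into y1 ones followed by y2 zeros, repeating x alongside
expanded <- do.call(rbind, lapply(seq_len(nrow(toy)), function(i) {
  data.frame(y = rep(c(1, 0), times = c(toy$y1[i], toy$y2[i])),
             x = toy$x[i])
}))

# The mean of the 0/1 values at each x recovers the original proportion
prop_expanded <- as.vector(tapply(expanded$y, expanded$x, mean))
prop_original <- toy$y1 / (toy$y1 + toy$y2)

print(expanded)
print(all.equal(prop_expanded, prop_original))
```

Because the expanded data has one row per trial, its size grows with the total count, so this is probably best suited to moderate counts like those in the example.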