I would like to make a contour plot with ggplot2
by using gam
results. Below is a detailed explanation of what I want:
#packages
library(mgcv)
library(ggplot2)
library(tidyr)
#prepare data
df <- data.frame(x = iris$Sepal.Width,
y = iris$Sepal.Length,
z = iris$Petal.Length)
#fit gam
gam_fit <- gam(z ~
s(x)
s(y),
data=df,na.action = "na.fail")
To predict z values based on the gam_fit
, I found a way from
My goal is removing tile and contour at where there are no measured x-y points. For example, there is no measured points around the top-right & top-left corners of the plot.
I wonder if mgcViz
can achieve this, but it requires including x & y as an interaction term as below (also I am not sure how to add measured points on the below figure):
library(mgcViz)
gamm_fit2 <- gam(z ~
s(x,y),
data=df,na.action = "na.fail") #,REML=TRUE
b <- getViz(gamm_fit2)
plot(sm(b, 1))
I think df_pred
may not the best format to achieve my goal, but I am not sure how to do this. I would be grateful if you give me any solution with ggplot2
.
CodePudding user response:
There might be a package designed to handle this task, but if you can't find the right 'tool' for the job one option is to draw a polygon around the 'points' and colour everything outside the polygon grey, e.g.
library(tidyverse)
library(mgcv)
#prepare data
df <- data.frame(x = iris$Sepal.Width,
y = iris$Sepal.Length,
z = iris$Petal.Length)
#fit gam
gam_fit <- gam(z ~
s(x)
s(y),
data=df,na.action = "na.fail")
df_pred <- expand_grid(
x = seq(from=min(df$x),
to=max(df$x),
length.out = 100),
y = seq(from=min(df$y),
to=max(df$y),
length.out = 100)
)
df_pred <- predict(gam_fit, newdata = df_pred,
se.fit = TRUE) %>%
as_tibble() %>%
cbind(df_pred)
ggplot()
geom_tile(data=df_pred, aes(x=x, y=y, fill = fit))
geom_point(data=df,aes(x=x, y=y))
scale_fill_distiller(palette = "YlGnBu")
geom_contour(data=df_pred, aes(x=x, y=y, z = fit), colour = "white")
coord_cartesian(xlim = c(1.9, 4.5),
ylim = c(4, 8))
# Get the 'hull' around all of the dots
hulls <- df[chull(df$x, df$y), ]
# Get the 'edges' of the frame, starting at the first hull point
edges <- data.frame(x = c(4.1,4.5,4.5,1.9,1.9,4.5),
y = c(5.2,4,8,8,4,4),
z = NA)
# Combine
draw_poly <- rbind(hulls, edges)
# Draw the plot, and overlay the gray polygon
ggplot()
geom_tile(data=df_pred, aes(x=x, y=y, fill = fit))
geom_point(data=df, aes(x=x, y=y))
scale_fill_distiller(palette = "YlGnBu")
geom_contour(data=df_pred, aes(x=x, y=y, z = fit), colour = "white")
geom_polygon(data=draw_poly, aes(x=x, y=y), fill = "grey")
# Without the points
ggplot()
geom_tile(data=df_pred, aes(x=x, y=y, fill = fit))
# geom_point(data=df, aes(x=x, y=y))
scale_fill_distiller(palette = "YlGnBu")
geom_contour(data=df_pred, aes(x=x, y=y, z = fit), colour = "white")
geom_polygon(data=draw_poly, aes(x=x, y=y), fill = "grey")
You can do this a bit more easily using my gratia package (IMHO), but the general idea is the same
# remotes::install_github("gavinsimpson/gratia") # need's dev version
library("gratia")
# prepare data
df <- with(iris, data.frame(x = Sepal.Width,
y = Sepal.Length,
z = Petal.Length))
# fit model
gam_fit <- gam(z ~ s(x) s(y), data = df, method = "REML")
# prepare a data slice through the covariate space
ds <- data_slice(gam_fit, x = evenly(x, n = 100), y = evenly(y, n = 100))
# predict
fv <- fitted_values(gam_fit, data = ds)
# exclude points that are too far
drop <- too_far(ds$x, ds$y, df$x, df$y, dist = 0.1)
fv <- fv |>
mutate(fitted = if_else(drop, NA_real_, fitted))
# then plot
fv |>
ggplot(aes(x = x, y = y, fill = fitted))
geom_tile()
geom_point(data = df, aes(x = x, y = y, fill = NULL))
scale_fill_distiller(palette = "YlGnBu")
geom_contour(aes(z = fitted, fill = NULL), colour = "white")