I have a data frame with water quality information on 20 lakes. For each lake, the pH and dissolved oxygen (DO) levels were measured multiple times per visit, and the mean and max values were recorded in the data frame. Model-derived estimates for dissolved oxygen and pH were calculated separately using two different methods. I would like to make a figure with 4 panels of scatterplots depicting the relationships between the observed values and the estimated values for each estimation "method" and water quality parameter. I would like to avoid using functions like ggarrange(), as I have a custom theme that would require a lot of extra work. My best idea is to use facet_wrap() in ggplot(), but this yields all possible combinations of the different variables, and I am only interested in 4 specific combinations.
Example:
library(dplyr)
library(tidyr)    # provides pivot_longer()
library(ggplot2)
# creating a data frame
df <- data.frame(
  method = c(rep("quad", 10), rep("linear", 10)),
  date = c(rep("2021-11-17", 10),
           rep("2022-02-27", 5),
           rep("2021-11-20", 5)),
  ph_est = rnorm(20, 3),
  disso_est = rnorm(20, 1),
  mean_ph = rnorm(20, 0),
  max_ph = rnorm(20, 2),
  mean_disso = rnorm(20, 5),
  max_disso = rnorm(20, 10)
)
df$ID <- seq.int(nrow(df))
# pivoting longer in order to plot
df_l <- pivot_longer(df,
                     cols = c("ph_est", "disso_est"),
                     names_to = "samp_pars",
                     values_to = "samp_vals")
df_l <- pivot_longer(df_l,
                     cols = c("mean_ph", "max_ph", "mean_disso", "max_disso"),
                     names_to = "est_pars",
                     values_to = "est_vals")
# attempting the plot
ggplot(df_l, aes(x = est_vals, y = samp_vals)) +
  geom_point() +
  facet_wrap(~ method + samp_pars + est_pars, scales = "free")
The output is close, but I only want 4 panels:
1. "quad" method for estimated pH vs mean pH
2. "quad" method for estimated DO vs mean DO
3. "linear" method for estimated pH vs max pH
4. "linear" method for estimated DO vs mean DO
Is there a way I can rearrange my data frame to make this work? Or do I need to go about this whole thing differently?
Any help would be appreciated and thank you in advance!
CodePudding user response:
You can filter the data frame to keep only the combinations you require for the 4 plots.
Add the following before your ggplot() call:
df_l <- filter(df_l,
  (method == "quad" & samp_pars == "ph_est" & est_pars == "mean_ph") |
  (method == "quad" & samp_pars == "disso_est" & est_pars == "mean_disso") |
  (method == "linear" & samp_pars == "ph_est" & est_pars == "max_ph") |
  (method == "linear" & samp_pars == "disso_est" & est_pars == "mean_disso")
)
Because facet_wrap() only draws panels for combinations that are present in the data, the ggplot() call will then output only the 4 panels you require.
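If the list of combinations grows, the same filtering can be written as a small lookup table plus a join, which may be easier to maintain than a long chain of conditions. The following is only a sketch under some assumptions: the names wanted, df_4, and panel are made up for illustration, set.seed() is there only to make the example reproducible, and the date and ID columns are left out for brevity. It uses dplyr's semi_join() to keep the rows matching the lookup table, then builds a single label column to facet on.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

set.seed(1)  # assumption: only here so the illustration is reproducible

# Same shape as the example data in the question (date and ID omitted)
df <- data.frame(
  method = c(rep("quad", 10), rep("linear", 10)),
  ph_est = rnorm(20, 3),
  disso_est = rnorm(20, 1),
  mean_ph = rnorm(20, 0),
  max_ph = rnorm(20, 2),
  mean_disso = rnorm(20, 5),
  max_disso = rnorm(20, 10)
)

# Pivot twice, as in the question
df_l <- df %>%
  pivot_longer(c(ph_est, disso_est),
               names_to = "samp_pars", values_to = "samp_vals") %>%
  pivot_longer(c(mean_ph, max_ph, mean_disso, max_disso),
               names_to = "est_pars", values_to = "est_vals")

# Hypothetical lookup table: one row per panel to keep
wanted <- data.frame(
  method    = c("quad", "quad", "linear", "linear"),
  samp_pars = c("ph_est", "disso_est", "ph_est", "disso_est"),
  est_pars  = c("mean_ph", "mean_disso", "max_ph", "mean_disso")
)

# Keep only rows whose combination appears in the lookup table,
# then build one label per panel
df_4 <- df_l %>%
  semi_join(wanted, by = c("method", "samp_pars", "est_pars")) %>%
  mutate(panel = paste(method, samp_pars, "vs", est_pars, sep = " "))

p <- ggplot(df_4, aes(x = est_vals, y = samp_vals)) +
  geom_point() +
  facet_wrap(~ panel, scales = "free")
```

Printing p should draw the four panels. Faceting on a single label column also gives one strip per panel instead of three stacked strips, which may sit better with a custom theme.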