I have a very long data frame of results. There are 148 exposures and 148 outcomes, and each exposure has been regressed against each outcome (148 * 148 = 21,904, the number of rows in the df).
I want to plot the results for each exposure against the 148 outcomes, so 148 plots in total. The code below does this for one exposure and generates one plot.
**Question:** How best to do this for all 148 exposures and export the plots to a multi-page PDF and/or separate PDF files?
# libraries
library(qs)
library(dplyr)
library(ggplot2)
library(ggrepel)
# make data
set.seed(15)
res_df <- data.frame(exp = randomStrings(N = 148, string_size = 4))
res_df <- data.frame(res_df[rep(seq_len(nrow(res_df)), each = 148), ])
colnames(res_df)[1] <- "exp"
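# mutate() only recycles length-1 vectors, so build full-length columns explicitly
res_df <- mutate(res_df,
  y = rep(randomStrings(N = 148, string_size = 5), times = 148),
  logp = abs(rnorm(n = n(), mean = 5, sd = 6)),
  r = rnorm(n = n(), mean = 0.5, sd = 0.1))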
# subset df for an individual plot (first exposure only)
first_exp <- res_df[1, 1]
res_df_a <- subset(res_df, exp == first_exp)
# PLOT
ggplot(res_df_a, aes(x = r, y = logp, label = y)) +
  geom_point(data = res_df_a[res_df_a$logp < 10, ], color = "grey50") +
  geom_text_repel(data = res_df_a[res_df_a$logp > 10, ], box.padding = 0.5, max.overlaps = Inf) +
  geom_point(data = res_df_a[res_df_a$logp > 10, ], color = "red") +
  xlab("Variance explained (%)") +
  ylab("-log10(pvalue)") +
  ggtitle("y ~ exp")
User response:
Rather than use your example, I provide a simpler one below using a synthetic dataset. The key to a multi-page PDF is to open the pdf device with the argument onefile = TRUE (which is also the default for pdf()) and then print each plot while the device is open:
# required libraries ------------------------------------------------------
library(ggplot2)
# make data ---------------------------------------------------------------
set.seed(1)
df <- data.frame(x = cumsum(rnorm(10)), y = cumsum(rnorm(10)))
# make sequential plot and send output to pdf device ----------------------
pdf("plotseq.pdf", width = 5, height = 5, onefile = TRUE)
for (i in seq_len(nrow(df))) {
  p <- ggplot(df) + aes(x = x, y = y) +
    geom_point(shape = 1) +
    geom_point(data = df[i, ]) +
    labs(title = paste("i =", i))
  print(p)  # each print() call starts a new page in the open pdf device
}
dev.off()
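Applied to data shaped like the question's, the same pattern might look like the sketch below. It is only an illustration under assumptions: `toy_df` is a small hypothetical stand-in for `res_df` (3 exposures instead of 148), `plot_exposure()` is a made-up helper, and the output goes to a temporary directory. The label layer from ggrepel is left out to keep the example dependent on ggplot2 only.

```r
library(ggplot2)

# Hypothetical stand-in for res_df: 3 exposures x 20 outcomes
set.seed(15)
toy_df <- data.frame(
  exp  = rep(c("expA", "expB", "expC"), each = 20),
  y    = rep(paste0("out", 1:20), times = 3),
  logp = abs(rnorm(60, mean = 5, sd = 6)),
  r    = rnorm(60, mean = 0.5, sd = 0.1)
)

# Hypothetical helper: build the plot for the rows of a single exposure
plot_exposure <- function(d, threshold = 10) {
  d$hit <- d$logp > threshold  # flag points above the threshold
  ggplot(d, aes(x = r, y = logp)) +
    geom_point(aes(color = hit)) +
    scale_color_manual(values = c(`FALSE` = "grey50", `TRUE` = "red"),
                       guide = "none") +
    labs(x = "Variance explained (%)", y = "-log10(pvalue)",
         title = paste("y ~", d$exp[1]))
}

# One page per exposure in a single PDF
out_file <- file.path(tempdir(), "by_exposure.pdf")
pdf(out_file, width = 5, height = 5, onefile = TRUE)
for (e in unique(toy_df$exp)) {
  print(plot_exposure(toy_df[toy_df$exp == e, ]))
}
dev.off()
```

With the real data, replacing `toy_df` with `res_df` should produce one page per exposure, assuming the column names match.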
User response:
I figured out the answer in the end: split the one large df into a list of data frames (one per exposure), then plot each data frame and save the results. Using the code above to make the data, then:
library(gridExtra)
# make list of data frames
obs_lists <- split( res_df , f = res_df$exp )
# plot each df within the list and write out to PDFs
p <- lapply(obs_lists, function(d) {
  ggplot(data = d, aes(x = r, y = logp)) +
    geom_point()
})
# 6 per page
ggsave("multi.pdf", gridExtra::marrangeGrob(grobs = p, nrow=3, ncol=2, top = NULL))
# 1 per page
pdf("single.pdf", onefile = TRUE)
invisible(lapply(p, print))  # print explicitly so this also works inside source() or a function
dev.off()
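The question also asks about separate PDF files, which the code above does not cover. One possible approach is to call `ggsave()` once per plot, as in the minimal sketch below. The data (`toy_df`), the output directory, and the file naming scheme are all assumptions made for illustration, not part of the original code.

```r
library(ggplot2)

# Hypothetical stand-in for res_df: 3 exposures x 20 outcomes
set.seed(15)
toy_df <- data.frame(
  exp  = rep(c("expA", "expB", "expC"), each = 20),
  logp = abs(rnorm(60, mean = 5, sd = 6)),
  r    = rnorm(60, mean = 0.5, sd = 0.1)
)

# One plot per exposure; the list is named after the exposures
plots <- lapply(split(toy_df, toy_df$exp), function(d) {
  ggplot(d, aes(x = r, y = logp)) +
    geom_point() +
    ggtitle(d$exp[1])
})

# Assumed output location: a folder inside the session's temp directory
out_dir <- file.path(tempdir(), "exposure_plots")
dir.create(out_dir, showWarnings = FALSE)

# One PDF file per exposure, named after the list element
files <- file.path(out_dir, paste0(names(plots), ".pdf"))
for (i in seq_along(plots)) {
  ggsave(files[i], plot = plots[[i]], width = 5, height = 5)
}
```

If real exposure names contain characters that are not valid in file names, they would need to be sanitised first; using the list index in the file name instead is one alternative.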