How to loop/generate many plots using ggplot on the same data frame

Time:10-15

I have a very long data frame of results. There are 148 exposures and 148 outcomes and each has been regressed against the other (148*148 = 21,904 - the number of rows in the df).

I want to plot the results for each exposure against the 148 outcomes, so I need 148 plots in total. The code below does this for one exposure and generates one plot.

**Question:** How best to do this for all 148 exposures and export to a multi-page PDF and/or separate PDF files?

# libraries 

library(qs)
library(dplyr)
library(ggplot2)
library(ggrepel)

# make data

set.seed(15)
res_df <- data.frame(exp = randomStrings(N = 148, string_size = 4))
res_df <- data.frame(res_df[rep(seq_len(nrow(res_df)), each = 148), ])
colnames(res_df)[1] <- "exp"
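# mutate() needs full-length vectors, so repeat the 148 outcome names for every exposure
res_df <- mutate(res_df, y = rep(randomStrings(N = 148, string_size = 5), times = 148),
                 logp = abs(rnorm(n = n(), mean = 5, sd = 6)),
                 r = rnorm(n = n(), mean = 0.5, sd = 0.1))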

# subset df for individual plot

exp_name <- res_df[1, 1]  # first exposure; named so it does not shadow subset()
res_df_a <- subset(res_df, exp == exp_name)

# PLOT

ggplot(res_df_a, aes(x = r, y = logp, label = y)) +
  geom_point(data = res_df_a[res_df_a$logp < 10,], color = "grey50") +
  geom_text_repel(data = res_df_a[res_df_a$logp > 10,], box.padding = 0.5, max.overlaps = Inf) +
  geom_point(data = res_df_a[res_df_a$logp > 10,], color = "red") +
  xlab("Variance explained (%)") + ylab("-log10(pvalue)") +
  ggtitle("y ~ exp")

CodePudding user response:

Rather than use your example, I provide a simpler one below using a synthetic dataset. To get a multi-page PDF, open the pdf device with onefile = TRUE (this is also the default for pdf()) and print one plot per loop iteration; each printed plot becomes a new page:

# required libraries ------------------------------------------------------
library(ggplot2)

# make data ---------------------------------------------------------------
set.seed(1)
df <- data.frame(x = cumsum(rnorm(10)), y = cumsum(rnorm(10)))

# make sequential plot and send output to pdf device ----------------------
pdf("plotseq.pdf", width = 5, height = 5, onefile = TRUE)
for(i in seq(nrow(df))){
  p <- ggplot(df) + aes(x = x, y = y) +
    geom_point(shape = 1) +
    geom_point(data = df[i,]) +
    labs(title=paste("i =", i))
  print(p)
}
dev.off()
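
Applied to data shaped like the question's, the same pattern might look like the sketch below. It is only an illustration under stated assumptions: the small res_df built here (3 exposures, 20 outcomes) is a hypothetical stand-in for the real 148 x 148 table, the file name exposures.pdf is made up, and the ggrepel labels are left out to keep the example short.

```r
library(ggplot2)

# Hypothetical stand-in for the question's res_df: 3 exposures x 20 outcomes
set.seed(15)
exposures <- c("expA", "expB", "expC")
res_df <- data.frame(
  exp  = rep(exposures, each = 20),
  y    = rep(sprintf("out%02d", 1:20), times = 3),
  logp = abs(rnorm(60, mean = 5, sd = 6)),
  r    = rnorm(60, mean = 0.5, sd = 0.1)
)

# Assumed output location; written to a temp dir so the sketch leaves no clutter
out_file <- file.path(tempdir(), "exposures.pdf")

pdf(out_file, width = 5, height = 5, onefile = TRUE)
for (e in unique(res_df$exp)) {
  d <- res_df[res_df$exp == e, ]          # rows for this exposure only
  p <- ggplot(d, aes(x = r, y = logp)) +
    geom_point(color = "grey50") +
    geom_point(data = d[d$logp > 10, ], color = "red") +  # highlight strong hits
    labs(title = paste("y ~", e),
         x = "Variance explained (%)", y = "-log10(pvalue)")
  print(p)                                # one page per exposure
}
dev.off()
```

The explicit print(p) matters: inside a for loop ggplot objects are not auto-printed, so without it the PDF would contain no pages.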

CodePudding user response:

I managed to figure out the answer in the end by splitting the one large df into a list of data frames, then plotting each df and saving. Using the code above to make the data and then:

library(gridExtra)
# make list of data frames

obs_lists <- split( res_df , f = res_df$exp )

# plot each df within the list and write out to PDFs

p <- lapply(obs_lists, function(d) ggplot(
  data = d, aes(x = r, y = logp)) + geom_point()
)

# 6 per page
ggsave("multi.pdf", gridExtra::marrangeGrob(grobs = p, nrow=3, ncol=2, top = NULL))

# 1 per page 
pdf("single.pdf", onefile = TRUE)
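invisible(lapply(p, print))  # print each plot explicitly so this also works when the script is sourced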
dev.off()
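
For the "separate PDF files" half of the question, one possible approach is to loop over the split list and call ggsave() once per exposure. This is a rough sketch, not a tested recipe for the real data: the tiny res_df, the output folder exposure_plots, and the assumption that every exposure name is a valid file name are all hypothetical.

```r
library(ggplot2)

# Hypothetical stand-in for the question's res_df: 3 exposures x 20 outcomes
set.seed(15)
res_df <- data.frame(
  exp  = rep(c("expA", "expB", "expC"), each = 20),
  logp = abs(rnorm(60, mean = 5, sd = 6)),
  r    = rnorm(60, mean = 0.5, sd = 0.1)
)

# Assumed output folder (inside a temp dir for this sketch)
out_dir <- file.path(tempdir(), "exposure_plots")
dir.create(out_dir, showWarnings = FALSE)

# One data frame per exposure, named by exposure
obs_lists <- split(res_df, f = res_df$exp)

for (nm in names(obs_lists)) {
  p <- ggplot(obs_lists[[nm]], aes(x = r, y = logp)) +
    geom_point() +
    ggtitle(paste("y ~", nm))
  # One file per exposure; assumes nm contains no characters illegal in file names
  ggsave(file.path(out_dir, paste0(nm, ".pdf")), plot = p, width = 5, height = 5)
}

pdf_files <- list.files(out_dir, pattern = "\\.pdf$")
```

Looping over names(obs_lists) rather than the list itself keeps the exposure name available for both the title and the file name.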
