Save density plot for each column in pdfs

Time:02-25

I want to create a density plot for each numerical column of a dataset and then save each plot to a file named after its column.

To preserve data anonymity, I'll use mtcars.

My ideal output would be density plots saved as the following:

mpg.pdf
cyl.pdf
disp.pdf
hp.pdf
drat.pdf
wt.pdf
qsec.pdf
vs.pdf
am.pdf
gear.pdf
carb.pdf

My attempt, which obviously doesn't work:

library(dplyr)
library(ggplot2)
col_tmp <- colnames(mtcars)

make_plots <- function(col){
  column <- mtcars %>% select(col) 
  col_plot <- column %>%
    ggplot(aes(x = col)) +
    geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8)
  
  ggsave(col_plot, file="col.pdf")
}

lapply(mtcars, make_plots(col_tmp))

I thought the issue might be how the column names are passed from the vector into the function. I tried using [[col]], but that didn't work either.

CodePudding user response:

There are a few ways. One is to use the .data pronoun, which looks up a column by the name stored in a string.

make_plots <- function(col){
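  # .data[[col]] refers to the column whose name is stored in `col`
  col_plot <- mtcars %>%
    ggplot(aes(x = .data[[col]])) +
    geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8)

  # Build the file name from the column name, e.g. "mpg.pdf"
  file_name <- paste0(col, ".pdf")
  ggsave(filename = file_name, plot = col_plot)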
}

lapply(col_tmp, make_plots)

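Note that your lapply call could not work: lapply needs the vector of column names as its first argument and the function itself (make_plots, not the call make_plots(col_tmp)) as its second. Selecting the column beforehand isn't necessary either.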
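
The difference between passing a function and passing a function call can be seen without any plotting. This is a minimal sketch in which file_name_for is a hypothetical stand-in for make_plots that only builds the file name:

```r
col_tmp <- colnames(mtcars)

# Hypothetical stand-in for make_plots(): builds the output file name only.
file_name_for <- function(col) paste0(col, ".pdf")

# Correct: iterate over the column names and pass the function itself.
file_names <- vapply(col_tmp, file_name_for, character(1))

# Incorrect: file_name_for(col_tmp) is evaluated first, so lapply() receives
# a character vector where it expects a function and stops with an error.
bad_call <- try(lapply(mtcars, file_name_for(col_tmp)), silent = TRUE)
failed <- inherits(bad_call, "try-error")
```

Here failed is TRUE, while file_names holds one name per column, so the same pattern with make_plots would save one file per column.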

CodePudding user response:

The challenge is that, inside your function, col is a character string that contains the column name. It is not a symbol that refers to the column itself.

The following works for turning text strings into column names:

my_string = "gear"

mtcars %>%
  mutate(new = 2*!!sym(my_string)) %>%
  select(new, !!sym(my_string))

Which is equivalent to:

mtcars %>%
  mutate(new = 2*gear) %>%
  select(new , gear)

On this basis, I would modify your function as follows:

make_plots <- function(col){
  column <- mtcars %>%
    select(!!sym(col)) %>%
    rename(only_column = !!sym(col))
  # now, regardless of `col`, the name of the column is 'only_column'

  col_plot <- column %>%
    ggplot(aes(x = only_column)) +
    geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8)

  # Build the file name from the original column name, e.g. "gear.pdf"
  file_name <- paste0(col, ".pdf")
  ggsave(filename = file_name, plot = col_plot)
}
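
As a rough comparison, the string-to-column conversion above can be checked against the .data pronoun and against all_of(), which is the selection helper dplyr provides for column names held in a character vector. The variable names below are illustrative only, and sym is called as rlang::sym to make its origin explicit:

```r
library(dplyr)

col <- "gear"  # a column name held as a string

# Three ways to refer to the column named by `col`:
via_sym  <- mtcars %>% mutate(new = 2 * !!rlang::sym(col)) %>% pull(new)
via_data <- mtcars %>% mutate(new = 2 * .data[[col]]) %>% pull(new)
via_sel  <- mtcars %>% select(all_of(col))

# Both data-masking forms compute the same values as 2 * mtcars$gear.
same_result <- identical(via_sym, via_data)
```

With any of these forms the rename step is optional; it is only needed if the later code wants a fixed column name.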

CodePudding user response:

In base R, you can plot the density estimates with plot() and shade them with polygon(). Appending 'cc' to the color, as in '#69b3a2cc', sets an alpha of 0.8.

make_plots <- \(x) {
  pdf(paste0(x, '.pdf'))
  d <- density(mtcars[[x]])
  plot(d, main=x, col='#e9ecef')
  polygon(d, col='#69b3a2cc')
  dev.off()
}
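
The alpha value follows from reading the last two hex digits as a byte. A small check of that arithmetic (the variable names are illustrative):

```r
# The last two hex digits of "#69b3a2cc" are the alpha channel.
alpha_byte <- strtoi("cc", base = 16L)  # 204
alpha_frac <- alpha_byte / 255          # 0.8

# col2rgb() reports the same alpha channel on the 0-255 scale.
rgba <- col2rgb("#69b3a2cc", alpha = TRUE)
alpha_from_colour <- rgba["alpha", 1]
```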

To restrict this to numerical columns, check whether each column inherits from 'numeric' and save the resulting logical vector first.

num_cols <- sapply(mtcars, inherits, 'numeric')
lapply(names(mtcars[num_cols]), make_plots)
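
One caveat worth checking on other datasets: every column of mtcars is stored as double, so the inherits test keeps them all, but an integer column has class 'integer' and would be dropped. A small sketch on a made-up data frame (df and its column names are hypothetical), comparing it with is.numeric:

```r
# Hypothetical data frame with a double, an integer, and a character column.
df <- data.frame(dbl = c(1.5, 2.5), int = 1:2, chr = c("a", "b"))

# Class-based test: integer columns do not inherit from "numeric".
by_class <- sapply(df, inherits, "numeric")

# is.numeric() is TRUE for both double and integer columns.
by_type <- vapply(df, is.numeric, logical(1))
```

If integer columns should be plotted as well, is.numeric may be the safer filter.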

This writes one PDF per numerical column, each named after its column.

To write the files somewhere other than the working directory, pass a full path as the file argument of pdf(), using something like paste0(path, x, '.pdf').
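
A possible variant that takes the output directory as an argument is sketched below. The path parameter, the use of tempdir() as the default, and file.path() in place of paste0() are assumptions made for illustration; they are not part of the answer above.

```r
# Assumed output directory for this sketch; replace with a real folder.
out_dir <- tempdir()

make_plots <- function(x, path = out_dir) {
  # file.path() inserts the separator, so `path` needs no trailing slash.
  out_file <- file.path(path, paste0(x, ".pdf"))
  pdf(out_file)
  on.exit(dev.off())  # close the device even if plotting fails
  d <- density(mtcars[[x]])
  plot(d, main = x, col = "#e9ecef")
  polygon(d, col = "#69b3a2cc")
  invisible(out_file)  # return the path of the file that was written
}

# All mtcars columns are numeric, so every column name can be used here.
written <- vapply(names(mtcars), make_plots, character(1))
```

Returning the path makes it easy to confirm afterwards which files were created, for example with file.exists(written).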
