I want to create a density plot for each numerical column of a dataset and then save each plot to a file named after its column.
To preserve data anonymity, I'll use mtcars.
My ideal output would be density plots saved as the following:
mpg.pdf
cyl.pdf
disp.pdf
hp.pdf
drat.pdf
qsec.pdf
vs.pdf
am.pdf
gear.pdf
carb.pdf
My attempt, which obviously doesn't work:
library(dplyr)
library(ggplot2)
col_tmp <- colnames(mtcars)
make_plots <- function(col){
  column <- mtcars %>% select(col)
  col_plot <- column %>%
    ggplot( aes(x=col)) +
    geom_density(fill="#69b3a2", color="#e9ecef", alpha=0.8)
  ggsave(col_plot, file="col.pdf")
}
lapply(mtcars, make_plots(col_tmp))
I thought maybe the issue was passing the column names from the vector into the function? I tried using [[col]], but that didn't work either.
CodePudding user response:
There are a few ways. One is to use the .data pronoun.
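make_plots <- function(col){
  # .data[[col]] looks up the column whose name is stored in the string col
  col_plot <- mtcars %>%
    ggplot(aes(x = .data[[col]])) +
    geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8)
  # build the file name from the column name, e.g. "mpg.pdf"
  file_name <- paste0(col, ".pdf")
  ggsave(filename = file_name, plot = col_plot)
}
# iterate over the column names and pass the function itself
lapply(col_tmp, make_plots)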
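Note that your lapply call did not work: it iterated over mtcars itself rather than the column names, and it called make_plots(col_tmp) instead of passing the function make_plots. Selecting the column first isn't necessary either.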
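To illustrate why the original lapply call failed, here is a minimal sketch that uses a hypothetical stand-in function, file_name_for, which only builds the file name instead of plotting. lapply expects the object to iterate over first and the function itself second; writing file_name_for(col_tmp) evaluates the function once on the whole vector before lapply gets to run.

```r
# Hypothetical stand-in for make_plots: returns the file name it would write to.
file_name_for <- function(col) paste0(col, ".pdf")

col_tmp <- colnames(mtcars)

# Correct: iterate over the column names and pass the function itself.
file_names <- lapply(col_tmp, file_name_for)

# Pattern from the question: file_name_for(col_tmp) is evaluated first and
# returns a character vector, which lapply cannot use as a function.
bad_call <- try(lapply(mtcars, file_name_for(col_tmp)), silent = TRUE)
```

Here file_names holds one file name per column, while bad_call holds the captured error.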
CodePudding user response:
The challenge is that inside your function, col is a text string that contains the column name. It is not a symbol that refers to the column itself.
The following works for turning text strings into column names:
my_string = "gear"
mtcars %>%
  mutate(new = 2*!!sym(my_string)) %>%
  select(new, !!sym(my_string))
Which is equivalent to:
mtcars %>%
  mutate(new = 2*gear) %>%
  select(new, gear)
On this basis, I would modify your function as follows:
make_plots <- function(col){
  column <- mtcars %>%
    select(!!sym(col)) %>%
    rename(only_column = !!sym(col))
  # now regardless of `col` the name of the column is 'only_column'
  col_plot <- column %>%
    ggplot(aes(x = only_column)) +
    geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8)
  file_name = paste0(col, ".pdf")
  ggsave(filename = file_name, plot = col_plot)
}
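As a possible shortcut, the same unquoting can be done directly inside aes(), which skips the select() and rename() steps and keeps the real column name as the axis label. This is only a sketch, assuming ggplot2 and rlang are installed; it calls rlang::sym explicitly so it does not rely on which packages re-export sym.

```r
library(ggplot2)

# Sketch: turn the string into a symbol and unquote it inside aes().
make_plots <- function(col) {
  col_plot <- ggplot(mtcars, aes(x = !!rlang::sym(col))) +
    geom_density(fill = "#69b3a2", color = "#e9ecef", alpha = 0.8)
  file_name <- paste0(col, ".pdf")
  # Writes to the current working directory, like the versions above.
  ggsave(filename = file_name, plot = col_plot)
  invisible(file_name)
}
```

It is called the same way as before, for example lapply(colnames(mtcars), make_plots).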
CodePudding user response:
You may plot the density of each column with base R and use polygon to shade it. Adding 'cc' to the color, as in '#69b3a2cc', defines an alpha of 0.8.
make_plots <- \(x) {
  pdf(paste0(x, '.pdf'))
  d <- density(mtcars[[x]])
  plot(d, main=x, col='#e9ecef')
  polygon(d, col='#69b3a2cc')
  dev.off()
}
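The 'cc' suffix in the fill color can be checked with a little arithmetic: the last two hex digits of an eight-digit color are the alpha channel, on a scale from 00 to ff. A small sketch, using only base R and grDevices:

```r
# 'cc' in hexadecimal is 204, and 204 / 255 is the alpha as a fraction.
alpha_from_hex <- strtoi("cc", base = 16L) / 255

# adjustcolor() builds an eight-digit color from a six-digit one plus an alpha
# factor, which should correspond to the hand-written '#69b3a2cc'.
shaded <- adjustcolor("#69b3a2", alpha.f = 0.8)
```

Using adjustcolor can be easier to read than a hand-written hex suffix when the alpha value is likely to change.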
To keep only the numerical columns, check whether each column inherits from 'numeric' and save the resulting logical vector first.
num_cols <- sapply(mtcars, inherits, 'numeric')
lapply(names(mtcars[num_cols]), make_plots)
This gives one .pdf per numerical column.
You may explicitly specify the path in the file= argument of pdf, using something like paste0(path, x, '.pdf').
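One way to put the path handling and the numeric filter together is sketched below. The out_dir argument and its tempdir() default are assumptions made for illustration, not part of the answer above; file.path is used instead of paste0 so the separator between directory and file name is added automatically, and is.numeric is swapped in for inherits because it is TRUE for both double and integer columns.

```r
# Sketch with a hypothetical out_dir argument (defaults to a temp directory).
make_plots <- function(x, out_dir = tempdir()) {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  out_file <- file.path(out_dir, paste0(x, ".pdf"))
  pdf(out_file)
  on.exit(dev.off())  # close the device even if plotting fails
  d <- density(mtcars[[x]])
  plot(d, main = x, col = "#e9ecef")
  polygon(d, col = "#69b3a2cc")
  invisible(out_file)
}

# Keep only numeric columns, then write one file per column.
num_cols <- vapply(mtcars, is.numeric, logical(1))
out_files <- vapply(names(mtcars)[num_cols], make_plots, character(1))
```

out_files then holds the full path of every file that was written.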