Apologies, I'm going to do my best to create a reproducible example, but I'm not sure it's good enough, so excuse me if it isn't. I am trying to create multiple density plots to inspect the distribution of counts in roughly 300 variables (species), separated by participant_group. I have a data frame with dimensions 97 (participants) x 320. The first 2 columns are participant_id and participant_group, and the remaining 318 columns are named after the species and hold their respective counts. I want to create a density plot for each of them.
participant_id <- c("P01","P02","P03","P04","P05","P06","P07","P08","P09","P10")
participant_group <- c("control", "responsive", "non-responsive", "control", "responsive", "non-responsive", "non-responsive", "control", "responsive", "non-responsive")
A <- c (0, 54, 0, 35, 76, 890, 45, 0, 1, 99)
B <- c (10, 504, 1, 52, 76, 90, 15, 20, 21, 9)
C <- c (460, 54, 5, 35, 7, 9, 45, 0, 1, 0)
D <- c (870, 654, 40, 5, 760, 80, 45, 0, 1, 76)
example_df <- data.frame(participant_id, participant_group, A, B, C, D)
So in this example, I would like to create density plots for column 3 through ncol(example_df). I have tried the following code, looping over either the column numbers or the column names:
library(ggplot2)
# looping number of columns
loop.vector <- 3:6
plot_by_number <- for (i in loop.vector) {
  taxa <- example_df[,i]
  ggplot(example_df, aes(x=taxa, group=participant_group, fill=participant_group)) +
    geom_density(adjust=1.5, alpha=.4)
}
# OR
# looping species names
species_names<-colnames(example_df[,3:ncol(example_df)])
plot_by_name <- for (i in species_names) {
  ggplot(example_df, aes(x=i, group=participant_group, fill=participant_group)) +
    geom_density(adjust=1.5, alpha=.4)
}
However, I get NULL for the plots. When I run the same code directly with the name or number of a column, it works:
ggplot(example_df, aes(x=A, group=participant_group, fill=participant_group)) +
  geom_density(adjust=1.5, alpha=.4)
# OR
ggplot(example_df, aes(x=example_df[,3], group=participant_group, fill=participant_group)) +
  geom_density(adjust=1.5, alpha=.4)
I would also like to add the name of every column (species) as the title of its plot and save them all in a PDF file, but I am still far from that.
I would really appreciate any help. Thanks for reading!
Answer:
Here is one way of doing it.
Create a custom function that uses aes_string() for the single variable (column):
f_ggplot <- function(v_column){
  ggplot(data = example_df,
         aes(group = participant_group,
             fill = participant_group)) +
    # aes_string() takes the column name as a string
    geom_density(aes_string(x = v_column),
                 adjust = 1.5,
                 alpha = 0.4) +
    labs(title = paste("Title for variable", v_column))
}
You can use the function on a single column:
f_ggplot("A")
Or pass a vector of column names (strings) to lapply():
l_cols <- c(LETTERS[1:4])
lapply(l_cols, f_ggplot)
PS: To make a report with all these results, use Rmarkdown with PDF as the output format.
A note on tidyeval. Since ggplot2 3.0 (see here)
aes() now supports quasiquotation so that you can use !!, !!!, and :=. This replaces aes_() and aes_string() which are now soft-deprecated (but will remain around for a long time)
With that in mind, you could use sym() and !!, and rewrite the function as:
f_ggplot <- function(v_column){
  # turn the column name (a string) into a symbol
  v_column2 <- sym(v_column)
  ggplot(data = example_df,
         aes(x = !!v_column2,
             group = participant_group,
             fill = participant_group)) +
    geom_density(adjust = 1.5,
                 alpha = 0.4) +
    labs(title = paste("Title for variable", v_column2))
}
The use is the same as before. More options on this question.
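One such option, sketched here as an assumption rather than something taken from the linked question, is the .data pronoun that ggplot2 re-exports from rlang. It lets you index the column by its name as a string, so no sym() or !! is needed. The function name f_ggplot_data is made up for this illustration.

```r
library(ggplot2)

# Same toy data as in the question
example_df <- data.frame(
  participant_id = sprintf("P%02d", 1:10),
  participant_group = c("control", "responsive", "non-responsive", "control",
                        "responsive", "non-responsive", "non-responsive",
                        "control", "responsive", "non-responsive"),
  A = c(0, 54, 0, 35, 76, 890, 45, 0, 1, 99),
  B = c(10, 504, 1, 52, 76, 90, 15, 20, 21, 9),
  C = c(460, 54, 5, 35, 7, 9, 45, 0, 1, 0),
  D = c(870, 654, 40, 5, 760, 80, 45, 0, 1, 76)
)

# f_ggplot_data is a hypothetical name; v_column is a column name as a string
f_ggplot_data <- function(v_column) {
  ggplot(data = example_df,
         aes(x = .data[[v_column]],   # look the column up by name
             group = participant_group,
             fill = participant_group)) +
    geom_density(adjust = 1.5, alpha = 0.4) +
    # set the x label explicitly so the axis shows the column name
    labs(title = paste("Title for variable", v_column), x = v_column)
}

p <- f_ggplot_data("A")
```

It is called the same way as the other versions, for example lapply(c("A", "B", "C", "D"), f_ggplot_data).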
More on quasiquotation in Advanced R.
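If a plain PDF with one plot per page is enough, base R's pdf() device is a possible alternative to Rmarkdown. The sketch below also shows why the loops in the question gave NULL: a for loop in R always evaluates to NULL, and ggplot objects created inside a loop or function are neither stored nor drawn unless you assign or print() them. The output path and the name plots are assumptions made for this illustration.

```r
library(ggplot2)

# Same toy data as in the question
example_df <- data.frame(
  participant_id = sprintf("P%02d", 1:10),
  participant_group = c("control", "responsive", "non-responsive", "control",
                        "responsive", "non-responsive", "non-responsive",
                        "control", "responsive", "non-responsive"),
  A = c(0, 54, 0, 35, 76, 890, 45, 0, 1, 99),
  B = c(10, 504, 1, 52, 76, 90, 15, 20, 21, 9),
  C = c(460, 54, 5, 35, 7, 9, 45, 0, 1, 0),
  D = c(870, 654, 40, 5, 760, 80, 45, 0, 1, 76)
)

# The tidyeval version of the plotting function from the answer
f_ggplot <- function(v_column) {
  ggplot(data = example_df,
         aes(x = !!sym(v_column),
             group = participant_group,
             fill = participant_group)) +
    geom_density(adjust = 1.5, alpha = 0.4) +
    labs(title = paste("Title for variable", v_column))
}

# Unlike a for loop, lapply() returns its results, so the plots are kept
species_names <- colnames(example_df)[3:ncol(example_df)]
plots <- lapply(species_names, f_ggplot)
names(plots) <- species_names

# Hypothetical output location; change it to wherever the file should go
out_file <- file.path(tempdir(), "density_plots.pdf")

pdf(out_file, width = 7, height = 5)  # open the PDF device
for (p in plots) print(p)             # explicit print(): one page per plot
dev.off()                             # close the device to finish the file
```

With the real data, only species_names changes, since it is derived from the column positions, and each species ends up on its own page with its name in the title.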