I want to draw violin plots in a facet grid and annotate each violin with the number of observations used to draw it.
Here is an example of what I have without observation counts:
library(ggplot2)
library(dplyr)
library(tidyverse)
data("iris")
c <- rep(c('r', 'g', 'b'), 50)
c <- sample(c)
facet_row <- rep(c('row1', 'row2', 'row3', 'row4', 'row5'), 30)
facet_col <- rep(c('col1', 'col2', 'col3'), 50)
iris$facet_rows <- facet_row
iris$facet_cols <- facet_col
iris$color <- c
iris$count <- sample(1:10, size = 150, replace = T)
p <- ggplot(iris, aes(x=Species, y=Petal.Length, fill=color)) +
  geom_violin(alpha = 0.7, na.rm = TRUE) +
  coord_flip() +
  facet_grid(rows = vars(facet_rows), cols = vars(facet_cols))
print(p)
I want to put the number of observations right behind those violins. I tried this so far:
count_data <- function (y){
  df <- data.frame(y = min(y) - 0.2, label = length(y))
  return(df)
}
p <- ggplot(iris, aes(x=Species, y=Petal.Length, fill=color)) +
  geom_violin(alpha = 0.7, na.rm = TRUE) +
  stat_summary(fun.data = count_data, geom = "text", aes(group = Species)) +
  coord_flip() +
  facet_grid(rows = vars(facet_rows), cols = vars(facet_cols))
print(p)
This produces an output with an issue:
Grouped violins now share a single count value. The problem is that those violins will almost certainly have different numbers of observations.
I have also tried simply drawing a geom_text using a precomputed number of observations (assume that iris$count actually contains observation counts, which would have the same value on different rows; here they are random):
p <- ggplot(iris, aes(x=Species, y=Petal.Length, fill=color)) +
  geom_violin(alpha = 0.7, na.rm = TRUE) +
  geom_text(aes(label=count, y=Petal.Length), nudge_y = -0.1) +
  coord_flip() +
  facet_grid(rows = vars(facet_rows), cols = vars(facet_cols))
print(p)
This has problems similar to the previous approach:
- It has values for two violins in the same group in one line.
- Each violin repeats the number of observations once for each observation.
I am relatively new to R. I feel like there is a clean way to do this, but I can't figure it out.
CodePudding user response:
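Removing the explicit grouping (aes(group = Species)) and adding position = position_dodge(1) resolved the issue. Without the group override, stat_summary groups by both Species and the fill colour, so it computes one count per violin, and the dodge places each label beside its violin: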
count_data <- function (y){
  df <- data.frame(y = min(y) - 0.2, label = length(y))
  return(df)
}
p <- ggplot(iris, aes(x=Species, y=Petal.Length, fill=color)) +
  geom_violin(alpha = 0.7, na.rm = TRUE) +
  stat_summary(fun.data = count_data, geom = "text", position = position_dodge(1)) +
  coord_flip() +
  facet_grid(rows = vars(facet_rows), cols = vars(facet_cols))
print(p)
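If you would rather work from precomputed counts, as in the geom_text attempt from the question, one option is to summarise the data to one row per violin first and pass that summary to geom_text. This is only a sketch under a few assumptions of mine: the names df and counts are made up for illustration, the label offset of 0.2 is copied from count_data, and the dodge width of 0.9 is chosen because, as far as I know, that is the width geom_violin dodges by when none is given.

```r
library(ggplot2)
library(dplyr)

set.seed(1)  # make the random colour assignment reproducible

# Same toy data as in the question, built on a copy of iris
df <- iris
df$color      <- sample(rep(c("r", "g", "b"), 50))
df$facet_rows <- rep(c("row1", "row2", "row3", "row4", "row5"), 30)
df$facet_cols <- rep(c("col1", "col2", "col3"), 50)

# One row per violin: the observation count and a label position
# slightly below the smallest value in that violin
counts <- df %>%
  group_by(facet_rows, facet_cols, Species, color) %>%
  summarise(n = n(), y = min(Petal.Length) - 0.2, .groups = "drop")

p <- ggplot(df, aes(x = Species, y = Petal.Length, fill = color)) +
  geom_violin(alpha = 0.7, na.rm = TRUE) +
  # group = color makes the dodging of the labels explicit
  geom_text(data = counts, aes(y = y, label = n, group = color),
            position = position_dodge(width = 0.9)) +
  coord_flip() +
  facet_grid(rows = vars(facet_rows), cols = vars(facet_cols))
print(p)
```

Because counts has exactly one row per facet, species and colour combination, each label is drawn once instead of once per observation, which avoids the repeated labels from the geom_text attempt.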