I'm new to R and I'm struggling with some plotting in ggplot.
I have some monthly data I simply plotted as points connected with lines.
ggplot(data=df, aes(x=x,y=y)) +
  geom_line(aes(group=g)) +
  geom_point()
Now, I'd like to add the results of pairwise Wilcoxon tests between the three groups. It should look like this.
I'm a bit confused: I know stat_pvalue_manual works with categories, but I have a continuous y axis, and the brackets should be horizontal.
Maybe there are other functions that can do this. Does anyone have an example of how this could be done?
Thanks in advance.
structure(list(x = c("April", "April", "April", "May", "May", "May", "June", "June", "June", "July", "July", "July", "August", "August", "August", "September", "September", "September", "October", "October", "October", "November", "November", "November", "December", "December", "December", "January", "January", "January", "February", "February", "February"), g = c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c"), y = c(4.748, 5.3388, 5.7433, 4.744, 5.4938, 6.1583, 4.767, 5.6, 6.2067, 4.889, 5.8363, 6.295, 4.887, 5.6413, 6.15, 4.94, 5.73, 6.1833, 4.974, 5.2113, 5.77, 5.022, 5.47, 5.9117, 4.964, 5.3425, 5.7217, 4.95, 5.15, 5.9833, 4.75, 5.425, 5.7833)), class = "data.frame", row.names = c(NA, -33L))
CodePudding user response:
There are a few things that make this fiddly. The main ones are that you have a discrete scale for your x-axis while stat_pvalue_manual seems to only work with continuous scales, and that a coordinate swap is needed. As a result, the month factor needs to be ordered and converted to numbers, geom_line needs to be changed to geom_path (geom_line connects points in order of the x aesthetic, which is now the measured value, whereas geom_path connects them in data order), and the mean of y for each group needs to be calculated and added to the stat.test object. This results in:
library(ggplot2)
library(ggpubr) # provides compare_means() and stat_pvalue_manual()
#Test data
df <- structure(list(x = c("April", "April", "April", "May", "May", "May", "June", "June", "June", "July", "July", "July", "August", "August", "August", "September", "September", "September", "October", "October", "October", "November", "November", "November", "December", "December", "December", "January", "January", "January", "February", "February", "February"), g = c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c"), y = c(4.748, 5.3388, 5.7433, 4.744, 5.4938, 6.1583, 4.767, 5.6, 6.2067, 4.889, 5.8363, 6.295, 4.887, 5.6413, 6.15, 4.94, 5.73, 6.1833, 4.974, 5.2113, 5.77, 5.022, 5.47, 5.9117, 4.964, 5.3425, 5.7217, 4.95, 5.15, 5.9833, 4.75, 5.425, 5.7833)), class = "data.frame", row.names = c(NA, -33L))
df$x <- factor(df$x, levels=unique(df$x))
#Pairwise tests between groups (compare_means defaults to method = "wilcox.test")
stat.test <- compare_means(y ~ g, data = df)
#Calculate mean values by group
means <- aggregate(df$y, list(g=df$g), mean)
means2 <- means$x
names(means2) <- means$g
stat.test$group1 <- means2[stat.test$group1]
stat.test$group2 <- means2[stat.test$group2]
stat.test$y.position <- c(13, 13.5, 13) #arbitrary location for plotting brackets
#Modify the plot
ggplot(data=df, aes(x=y,y=as.numeric(x))) +
  geom_path(aes(group=g)) +
  geom_point() +
  stat_pvalue_manual(stat.test, coord.flip = TRUE) +
  coord_flip() +
  scale_y_continuous("Month", labels=levels(df$x),
                     breaks=seq_along(levels(df$x)), minor_breaks = 1)
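The two data-preparation steps above (ordering the month factor so it can be turned into numeric positions, and looking up group means by label) can be tried on their own with base R. This is only a minimal sketch: the data frame demo and the names month_pos and mean_by_group are made up for illustration and are not part of the answer's code.

```r
# Small illustrative data frame (hypothetical values, same layout as df above)
demo <- data.frame(
  x = c("April", "April", "May", "May", "June", "June"),
  g = c("a", "b", "a", "b", "a", "b"),
  y = c(4.7, 5.3, 4.8, 5.5, 4.9, 5.6)
)

# Without explicit levels, factor() sorts the months alphabetically
levels(factor(demo$x))

# levels = unique(x) keeps the order in which the months first appear
demo$x <- factor(demo$x, levels = unique(demo$x))
month_pos <- as.numeric(demo$x)  # April = 1, May = 2, June = 3

# Group means as a named vector, so group labels can be used as lookup keys
means <- aggregate(y ~ g, data = demo, FUN = mean)
mean_by_group <- setNames(means$y, as.character(means$g))

# Look up the mean for each label; these become the bracket end points
mean_by_group[c("a", "b")]
```

Indexing the named vector with the group labels is what turns the group1 and group2 columns of stat.test into positions on the continuous axis.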