horizontal p-val geom_point plot

Time:11-15

I'm new to R and I'm struggling with some plotting in ggplot.

I have some monthly data that I simply plotted as points connected with lines.

  ggplot(data=df, aes(x=x,y=y)) +
    geom_line(aes(group=g)) + geom_point()


Now I'd like to add the pairwise results of Wilcoxon tests between the three groups. It should look like this: [image]

I'm a bit confused: I know stat_pvalue_manual works with categories, but I have a continuous y axis, and it should be drawn horizontally.

Maybe there are other functions that can do this. Does anyone have an example of how this could be done?

Thanks in advance.

structure(list(x = c("April", "April", "April", "May", "May", "May", "June", "June", "June", "July", "July", "July", "August", "August", "August", "September", "September", "September", "October", "October", "October", "November", "November", "November", "December", "December", "December", "January", "January", "January", "February", "February", "February"), g = c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c"), y = c(4.748, 5.3388, 5.7433, 4.744, 5.4938, 6.1583, 4.767, 5.6, 6.2067, 4.889, 5.8363, 6.295, 4.887, 5.6413, 6.15, 4.94, 5.73, 6.1833, 4.974, 5.2113, 5.77, 5.022, 5.47, 5.9117, 4.964, 5.3425, 5.7217, 4.95, 5.15, 5.9833, 4.75, 5.425, 5.7833)), class = "data.frame", row.names = c(NA, -33L))

CodePudding user response:

A few things make this fiddly. The main ones are that your x-axis is on a discrete scale, whereas stat_pvalue_manual seems to work only with continuous scales here, and that a coordinate swap is needed. As a result, the month factor needs to be ordered, geom_line needs to be replaced with geom_path, and the mean of each group needs to be calculated and substituted into the stat.test object. This results in:

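#Load packages (compare_means and stat_pvalue_manual come from ggpubr)
library(ggplot2)
library(ggpubr)

#Test data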
df <- structure(list(x = c("April", "April", "April", "May", "May", "May", "June", "June", "June", "July", "July", "July", "August", "August", "August", "September", "September", "September", "October", "October", "October", "November", "November", "November", "December", "December", "December", "January", "January", "January", "February", "February", "February"), g = c("a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c"), y = c(4.748, 5.3388, 5.7433, 4.744, 5.4938, 6.1583, 4.767, 5.6, 6.2067, 4.889, 5.8363, 6.295, 4.887, 5.6413, 6.15, 4.94, 5.73, 6.1833, 4.974, 5.2113, 5.77, 5.022, 5.47, 5.9117, 4.964, 5.3425, 5.7217, 4.95, 5.15, 5.9833, 4.75, 5.425, 5.7833)), class = "data.frame", row.names = c(NA, -33L))
#Keep the months in the order they appear in the data, not alphabetical
df$x <- factor(df$x, levels=unique(df$x))

#Pairwise tests between groups (compare_means defaults to method = "wilcox.test")
stat.test <- compare_means(
  y ~ g, data = df
)

#Calculate mean values by group
means <- aggregate(df$y, list(g=df$g), mean)
means2 <- means$x
names(means2) <- means$g

#Replace the group labels with the group means so the brackets sit on the continuous axis
stat.test$group1 <- means2[stat.test$group1]
stat.test$group2 <- means2[stat.test$group2]
stat.test$y.position <- c(13, 13.5, 13)  #arbitrary location for plotting brackets

#Modify the plot
ggplot(data=df, aes(x=y,y=as.numeric(x))) +
  geom_path(aes(group=g)) +
  geom_point() +
  stat_pvalue_manual(stat.test, coord.flip = TRUE) + coord_flip() +
  scale_y_continuous("Month", labels=levels(df$x),
                     breaks=seq_along(levels(df$x)), minor_breaks = 1)
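
The step that swaps group labels for group means is easy to misread, so here is a minimal base-R sketch of just that lookup. The toy data, the `pairs` table, and the `xmin`/`xmax` column names are made up for illustration; they are not part of the answer's code or of ggpubr.

```r
# Toy data: three groups with made-up values (not the question's data).
toy <- data.frame(
  g = rep(c("a", "b", "c"), each = 2),
  y = c(1, 3, 10, 12, 20, 24),
  stringsAsFactors = FALSE
)

# One mean per group; aggregate() names the value column "x" by default.
means <- aggregate(toy$y, list(g = toy$g), mean)

# Named numeric vector: names are the group labels, values are the group means.
lookup <- setNames(means$x, means$g)

# Hypothetical comparison table with the same group1/group2 layout.
pairs <- data.frame(
  group1 = c("a", "a", "b"),
  group2 = c("b", "c", "c"),
  stringsAsFactors = FALSE
)

# Indexing the named vector by label returns the matching mean for each row.
pairs$xmin <- unname(lookup[pairs$group1])
pairs$xmax <- unname(lookup[pairs$group2])

print(pairs)
```

Indexing by character label, not by factor, matters here: a factor would be used as integer codes, which only gives the right answer when the level order happens to match the order of the lookup vector.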

[Image: output plot]
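
As a cross-check on the p-values behind the brackets, the pairwise Wilcoxon tests can also be run with base R alone. This is a sketch under assumptions: it uses `pairwise.wilcox.test` from the stats package with no multiplicity adjustment, and it has not been verified to reproduce ggpubr's labels exactly. The data frame is the question's data rebuilt compactly.

```r
# The question's values, rebuilt compactly: 11 months x 3 groups (a, b, c).
df <- data.frame(
  g = rep(c("a", "b", "c"), times = 11),
  y = c(4.748, 5.3388, 5.7433, 4.744, 5.4938, 6.1583, 4.767, 5.6, 6.2067,
        4.889, 5.8363, 6.295, 4.887, 5.6413, 6.15, 4.94, 5.73, 6.1833,
        4.974, 5.2113, 5.77, 5.022, 5.47, 5.9117, 4.964, 5.3425, 5.7217,
        4.95, 5.15, 5.9833, 4.75, 5.425, 5.7833),
  stringsAsFactors = FALSE
)

# Unpaired pairwise Wilcoxon rank-sum tests between the three groups.
# p.adjust.method = "none" returns raw p-values; use "holm" to adjust them.
res <- pairwise.wilcox.test(df$y, df$g, p.adjust.method = "none")

# Lower-triangular matrix of p-values: rows b and c, columns a and b.
print(res$p.value)
```

Every value in group a lies below every value in groups b and c, so both comparisons involving a come out very small; the b versus c comparison is the one where the groups overlap.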
