Grouped plot with consideration of the row order of the given data

Time:08-25

Suppose the following data set is given:

df <- data.frame(x1 = seq(1,8,1),
                 x2 = c("G1","G2","G1","G2",
                        "G1","G2","G1","G2"))

I want to create a grouped plot, for example a grouped bar chart. But I want to use the order of the rows of the dataset for the plot, i.e. instead of combining all values of group 1 and group 2 and then plotting them, I want the bar plot to have a kind of alternating structure in which first the value of group 1 is displayed in red as specified in the dataset, then blue for group 2, then red again for group 1, and so on. Is this possible? I have added a picture of what I envision.


CodePudding user response:

Maybe you want something like this: build a stacked bar chart, then use ggplot_build to overwrite the computed bar positions so that the segments appear in the same order as the rows of your data frame:

df <- data.frame(x1 = seq(1,8,1),
                 x2 = c("G1","G2","G1","G2",
                        "G1","G2","G1","G2"))

df$group <- ""
library(ggplot2)
p <- ggplot(df, aes(x = group, y = x1, fill = x2)) +
  geom_bar(position = "stack", stat = "identity") +
  labs(x = "", y = "") +
  coord_flip()

q <- ggplot_build(p)

q$data[[1]]$ymin <- c(0, 1, 3, 6, 10, 15, 21, 28)
q$data[[1]]$ymax <- c(1, 3, 6, 10, 15, 21, 28, 36)

q <- ggplot_gtable(q)
plot(q)

Created on 2022-08-25 with reprex v2.0.2
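
As a possible alternative to editing the built plot, the segment boundaries can be computed up front and drawn directly with geom_rect. This is only a sketch under assumptions: the column names start and end are made up for illustration, and the bar is drawn horizontally with a fixed height of 1.

```r
library(ggplot2)

df <- data.frame(x1 = seq(1, 8, 1),
                 x2 = c("G1","G2","G1","G2",
                        "G1","G2","G1","G2"))

# Hypothetical helper columns: each segment ends at the running total
# and starts where the previous segment ended
df$end   <- cumsum(df$x1)
df$start <- df$end - df$x1

# One rectangle per row, so the row order of df is preserved
p <- ggplot(df) +
  geom_rect(aes(xmin = start, xmax = end, fill = x2),
            ymin = 0, ymax = 1) +
  scale_y_continuous(limits = c(0, 1), breaks = NULL) +
  labs(x = "", y = "")

print(p)
```

Because every row gets its own rectangle, no values are merged per group, and the colours alternate exactly as they do in the data.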

CodePudding user response:

It seems like these groups have values for specific time points? You could add an additional variable for the time point and then generate the plot as follows:

library(tidyverse)
df <- data.frame(x1 = seq(1,8,1),
                 x2 = c("G1","G2","G1","G2",
                        "G1","G2","G1","G2"),
                 time = rep(c(paste("t", c(1:4), sep = "")), each = 2)) 

df %>%
  ggplot(aes(y = x1, x = time, fill = x2)) +
  geom_bar(stat = "identity", position = "dodge2")

Created on 2022-08-25 with reprex v2.0.2

CodePudding user response:

Maybe you could replace the part

q$data[[1]]$ymin <- c(0, 1, 3, 6, 10, 15, 21, 28)
q$data[[1]]$ymax <- c(1, 3, 6, 10, 15, 21, 28, 36)

with the following part:

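y_min <- rep(0, length(df$x1))
y_max <- rep(0, length(df$x1))
# Lower bound of bar k is the sum of all preceding values
for (k in 2:length(df$x1)) {
  y_min[k] <- sum(df$x1[1:(k - 1)])
}
# Upper bound of each bar is the lower bound of the next one
y_max[1:(length(y_max) - 1)] <- y_min[2:length(y_min)]
# The last bar ends at its lower bound plus its own value
y_max[length(y_max)] <- y_min[length(y_min)] + df$x1[length(df$x1)]
q$data[[1]]$ymin <- y_min
q$data[[1]]$ymax <- y_max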

This computes the bounds from the data instead of hard-coding them, which gives some level of generality.
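
The same bounds can probably be obtained without a loop by using cumsum. This is a minimal sketch, assuming all values in x1 are non-negative and that the rows are already in the desired plotting order:

```r
df <- data.frame(x1 = seq(1, 8, 1),
                 x2 = c("G1","G2","G1","G2",
                        "G1","G2","G1","G2"))

# Upper bound of each bar is the running total of the values
y_max <- cumsum(df$x1)
# Lower bound is the running total before adding the bar's own value
y_min <- y_max - df$x1

print(y_min)
print(y_max)
```

For the example data this reproduces the hard-coded vectors from the first answer, so y_min and y_max can be assigned to q$data[[1]]$ymin and q$data[[1]]$ymax in the same way.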
