Create box plot using starting and ending points for bars from data frame

Time:11-22

I am trying to create a boxplot but only have two values per factor, which I want to use as a starting and ending point for the boxplot bars.

I have a data frame (df) that looks like this:

   ID   spp         lrr       Est                  SE
1  25   species 1   -1.029    -0.423814246776361   0.309105763160605
2  25   species 1    0.1820   -0.423814246776361   0.309105763160605
5  24   species 2   -3.694    -1.67397643357167    1.03077640640442
6  24   species 2    0.3463   -1.67397643357167    1.03077640640442
7  21   species 3    0.5181    2.484906649788      1.4142135623731
8  21   species 3    4.4516    2.484906649788      1.4142135623731

I need a bar per species (spp) using the values in lrr. For example, I expect the bar from species 1 to range from -1.029 to 0.1820, the bar from species 2 to range from -3.694 to 0.3463 and so on.

I tried using the following code:

ggplot(df) +
  aes(x = lrr, y = spp) +
  geom_boxplot() +
  theme_minimal()

However, instead of creating a single bar per species, it creates two separate points. I have also tried to rearrange the data by having two lrr columns (one for the starting point and one for the endpoint):

   ID   spp         lrr1      lrr2     Est                  SE
1  25   species 1   -1.029    0.1820   -0.423814246776361   0.309105763160605
5  24   species 2   -3.694    0.3463   -1.67397643357167    1.03077640640442
7  21   species 3    0.5181   4.4516    2.484906649788      1.4142135623731

However, I still do not know how to force bars into a starting and ending point. Any help is appreciated.

CodePudding user response:

Something like this, using geom_crossbar()?

library(dplyr)
library(ggplot2)
library(scales)

df1 %>% 
  group_by(spp) %>% 
  mutate(upper = max(lrr), 
         lower = min(lrr)) %>% 
  ungroup() %>% 
  ggplot(aes(spp, lrr)) +
  geom_crossbar(aes(ymin = lower, 
                    ymax = upper), 
                fatten = 1,
                width = 0.5) +
  scale_y_continuous(breaks = pretty_breaks())
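
To see what the group_by()/mutate() step hands to geom_crossbar(), here is a minimal base-R sketch of the same per-species minimum and maximum, using ave() instead of dplyr. The small `demo` data frame is a hypothetical cut-down copy of the spp and lrr columns from the question; it is not part of the original answer.

```r
# Hypothetical cut-down copy of the question's data (spp and lrr only)
demo <- data.frame(
  spp = rep(c("species 1", "species 2", "species 3"), each = 2),
  lrr = c(-1.029, 0.182, -3.694, 0.3463, 0.5181, 4.4516)
)

# ave() applies FUN within each group and returns a vector as long as the
# input, which mirrors group_by(spp) %>% mutate(...)
demo$lower <- ave(demo$lrr, demo$spp, FUN = min)
demo$upper <- ave(demo$lrr, demo$spp, FUN = max)

print(demo)
```

Every row of a species carries the same lower and upper values, so the crossbar drawn for each row of that species lands in the same place and the plot shows one box per species.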

Data:

df1 <- structure(list(ID = c(25L, 25L, 24L, 24L, 21L, 21L), spp = c("species 1", 
"species 1", "species 2", "species 2", "species 3", "species 3"
), lrr = c(-1.029, 0.182, -3.694, 0.3463, 0.5181, 4.4516), Est = c(-0.423814246776361, 
-0.423814246776361, -1.67397643357167, -1.67397643357167, 2.484906649788, 
2.484906649788), SE = c(0.309105763160605, 0.309105763160605, 
1.03077640640442, 1.03077640640442, 1.4142135623731, 1.4142135623731
)), class = "data.frame", row.names = c("1", "2", "5", "6", "7", 
"8"))

CodePudding user response:

Using your wide dataframe, you can set stat = "identity" inside geom_boxplot() and manually set the boxplot parameters:

library(ggplot2)

ggplot(df_wide) +
  geom_boxplot(
    aes(
      y = spp, 
      xmin = lrr1, xlower = lrr1, 
      xupper = lrr2, xmax = lrr2, 
      xmiddle = (lrr1 + lrr2)/2
    ),
    ),
    stat = "identity"
  )
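
The df_wide object is not defined anywhere in this thread; presumably it is the questioner's second table. If only the long data is at hand, one possible base-R way to build it is sketched below. The name `df_long` is hypothetical, the Est and SE columns are left out for brevity, and the sketch assumes the smaller value in each species is the starting point (lrr1) and the larger one the endpoint (lrr2), which matches the table in the question.

```r
# Long data from the question (only the columns needed here)
df_long <- data.frame(
  ID  = c(25L, 25L, 24L, 24L, 21L, 21L),
  spp = rep(c("species 1", "species 2", "species 3"), each = 2),
  lrr = c(-1.029, 0.182, -3.694, 0.3463, 0.5181, 4.4516)
)

# Per-species minimum and maximum, one row per species in each result
lo <- aggregate(lrr ~ ID + spp, data = df_long, FUN = min)
hi <- aggregate(lrr ~ ID + spp, data = df_long, FUN = max)

# merge() appends the suffixes to the clashing column name "lrr",
# giving the columns lrr1 (start) and lrr2 (end)
df_wide <- merge(lo, hi, by = c("ID", "spp"), suffixes = c("1", "2"))

print(df_wide)
```

Note that merge() sorts the result by the key columns, so the rows come out ordered by ID rather than in the order shown in the question; this does not affect the plot.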

But if you don’t care about the middle bar, it may be easier to use your original (long) dataframe with geom = "bar" inside stat_summary():

ggplot(df, aes(lrr, spp)) +
  stat_summary(
    fun.min = min, 
    fun = median, 
    fun.max = max, 
    geom = "bar", 
    color = "black", 
    fill = "white"
  )
