I am trying to create a boxplot but only have two values per factor, which I want to use as a starting and ending point for the boxplot bars.
I have a data frame (df) that looks like this:
   ID  spp        lrr      Est                 SE
1  25  species 1  -1.029   -0.423814246776361  0.309105763160605
2  25  species 1  0.1820   -0.423814246776361  0.309105763160605
5  24  species 2  -3.694   -1.67397643357167   1.03077640640442
6  24  species 2  0.3463   -1.67397643357167   1.03077640640442
7  21  species 3  0.5181   2.484906649788      1.4142135623731
8  21  species 3  4.4516   2.484906649788      1.4142135623731
I need a bar per species (spp) using the values in lrr. For example, I expect the bar from species 1 to range from -1.029 to 0.1820, the bar from species 2 to range from -3.694 to 0.3463 and so on.
I tried using the following code:
ggplot(df) +
  aes(x = lrr, y = spp) +
  geom_boxplot() +
  theme_minimal()
However, instead of creating a single bar per species, it creates two separate points. I have also tried to rearrange the data by having two lrr columns (one for the starting point and one for the endpoint):
   ID  spp        lrr1     lrr2    Est                 SE
1  25  species 1  -1.029   0.1820  -0.423814246776361  0.309105763160605
5  24  species 2  -3.694   0.3463  -1.67397643357167   1.03077640640442
7  21  species 3  0.5181   4.4516  2.484906649788      1.4142135623731
However, I still do not know how to force bars into a starting and ending point. Any help is appreciated.
CodePudding user response:
Something like this, using geom_crossbar()?
library(dplyr)
library(ggplot2)
library(scales)
df1 %>%
  # Compute the bar limits per species from the two lrr values
  group_by(spp) %>%
  mutate(upper = max(lrr),
         lower = min(lrr)) %>%
  ungroup() %>%
  ggplot(aes(spp, lrr)) +
  geom_crossbar(aes(ymin = lower,
                    ymax = upper),
                fatten = 1,
                width = 0.5) +
  scale_y_continuous(breaks = pretty_breaks())
Data:
df1 <- structure(list(ID = c(25L, 25L, 24L, 24L, 21L, 21L), spp = c("species 1",
"species 1", "species 2", "species 2", "species 3", "species 3"
), lrr = c(-1.029, 0.182, -3.694, 0.3463, 0.5181, 4.4516), Est = c(-0.423814246776361,
-0.423814246776361, -1.67397643357167, -1.67397643357167, 2.484906649788,
2.484906649788), SE = c(0.309105763160605, 0.309105763160605,
1.03077640640442, 1.03077640640442, 1.4142135623731, 1.4142135623731
)), class = "data.frame", row.names = c("1", "2", "5", "6", "7",
"8"))
CodePudding user response:
Using your wide dataframe, you can set stat = "identity" inside geom_boxplot() and manually set the boxplot parameters:
library(ggplot2)
ggplot(df_wide) +
  geom_boxplot(
    aes(
      y = spp,
      # Whiskers and box edges coincide, so only the box is drawn
      xmin = lrr1, xlower = lrr1,
      xupper = lrr2, xmax = lrr2,
      xmiddle = (lrr1 + lrr2) / 2
    ),
    stat = "identity"
  )
But if you don’t care about the middle bar, it may be easier to use your original (long) dataframe with geom = "bar" inside stat_summary():
ggplot(df, aes(lrr, spp)) +
  stat_summary(
    fun.min = min,
    fun = median,
    fun.max = max,
    geom = "bar",
    color = "black",
    fill = "white"
  )
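The wide-format approach above assumes a data frame called df_wide with lrr1 and lrr2 columns. If only the long data is at hand, a minimal base R sketch for building it might look like the following. It assumes lrr1 should be the smaller and lrr2 the larger of the two values per species; the name df_long and the reduced set of columns are illustrative assumptions, not part of the original post.

```r
# Long data: two lrr values per species (values taken from the question)
df_long <- data.frame(
  ID  = c(25L, 25L, 24L, 24L, 21L, 21L),
  spp = rep(c("species 1", "species 2", "species 3"), each = 2),
  lrr = c(-1.029, 0.182, -3.694, 0.3463, 0.5181, 4.4516)
)

# Smallest and largest lrr per species
lo <- aggregate(lrr ~ ID + spp, data = df_long, FUN = min)
hi <- aggregate(lrr ~ ID + spp, data = df_long, FUN = max)
names(lo)[names(lo) == "lrr"] <- "lrr1"
names(hi)[names(hi) == "lrr"] <- "lrr2"

# One row per species with the start (lrr1) and end (lrr2) of the bar
df_wide <- merge(lo, hi, by = c("ID", "spp"))
```

Merging on ID and spp, rather than binding the columns side by side, keeps the two summaries aligned without relying on row order.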