I've got data on two regression models (each has an estimate with the corresponding lower and upper bounds of a confidence interval). Both models include three variables (age, sex, and smoking status). The variable term
(corresponding to the variables in the model) is an ordered factor, so it appears in the order I've specified.
The data looks as follows:
library(tidyverse)
library(ggplot2)
mydata <- structure(list(term = structure(c(1L, 1L, 2L, 2L, 3L, 4L, 4L,
5L, 5L), .Label = c("Age (years)", "Sex (male)", "Never smoking (reference)",
"Current smoking", "Former smoking", ">90 (reference)", "60-89",
"<60"), class = c("ordered", "factor")), estimate = c(1.5, 2.2,
0.7, 1.8, 1, 1.5, 2.2, 0.7, 1.6), conf_low = c(1.3, 1.8, 0.9,
1, 1, 1.3, 1.8, 0.9, 1), conf_high = c(1.7, 2.6, 0.5, 2.6, 1,
1.7, 2.6, 0.5, 2.4), model = structure(c(1L, 2L, 1L, 2L, NA,
1L, 2L, 1L, 2L), .Label = c("Model 1", "Model 2"), class = c("ordered",
"factor")), label = structure(c(3L, 6L, 1L, 5L, 2L, 3L, 6L, 1L,
4L), .Label = c("0.7 (0.9-0.5)", "1.0 (1.0-1.0)", "1.5 (1.3-1.7)",
"1.6 (1.0-2.4)", "1.8 (1.0-2.6)", "2.2 (1.8-2.6)"), class = "factor")), row.names = c(NA,
-9L), class = c("tbl_df", "tbl", "data.frame"))
head(mydata)
# A tibble: 6 x 6
  term                      estimate conf_low conf_high model   label
  <ord>                        <dbl>    <dbl>     <dbl> <ord>   <fct>
1 Age (years)                    1.5      1.3       1.7 Model 1 1.5 (1.3-1.7)
2 Age (years)                    2.2      1.8       2.6 Model 2 2.2 (1.8-2.6)
3 Sex (male)                     0.7      0.9       0.5 Model 1 0.7 (0.9-0.5)
4 Sex (male)                     1.8      1         2.6 Model 2 1.8 (1.0-2.6)
5 Never smoking (reference)      1        1         1   NA      1.0 (1.0-1.0)
6 Current smoking                1.5      1.3       1.7 Model 1 1.5 (1.3-1.7)
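For reference, a text column like label can be rebuilt from the numeric columns. This is only a minimal sketch: the vectors below copy three rows from the data above, and the one-decimal "estimate (low-high)" format is an assumption read off the existing labels, not something the dataset documents.

```r
# Three example rows copied from the data shown above
estimate  <- c(1.5, 0.7, 1)
conf_low  <- c(1.3, 0.9, 1)
conf_high <- c(1.7, 0.5, 1)

# Assumed format: one decimal place, "estimate (low-high)"
label <- sprintf("%.1f (%.1f-%.1f)", estimate, conf_low, conf_high)
label
```

sprintf() is vectorised, so the same call should work on whole columns, for example inside mutate().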
And I've made the following plot:
ggplot(data = mydata,
       aes(x = estimate,
           y = fct_rev(term),
           color = model)) +
  geom_point(position = position_dodge(width = 0.3)) +
  geom_errorbarh(aes(xmin = conf_low, xmax = conf_high, height = 0.15), position = position_dodge(width = 0.3))
Now I'm trying to achieve two further things that I can't get working:
- I'd like to add a title/variable name above "Never smoking (reference)" on the y-axis of the plot, so that just above that label it says "Smoking status" (in bold). I could add another row to the original data frame, but then the distance between the title and "Never smoking" would be rather large. Is there another way to add a title, or a way to adjust the distance between specific labels on the y-axis?
- There is also a column label in the dataset, holding the estimate and 95% CI as text. I'd like to add this as an extra column on the right of the plot, so that each label is at the same height as its corresponding point/error bar.
Thanks!
CodePudding user response:
One option to achieve your desired result would be to use annotation_custom to add the group header and geom_text to add the text labels for the error bars.
Note: To put the annotation outside of the plot panel I use coord_cartesian(clip = "off").
library(ggplot2)
library(forcats)
ggplot(
  data = mydata,
  aes(
    x = estimate,
    y = fct_rev(term),
    color = model
  )
) +
  geom_point(position = position_dodge(width = 0.3)) +
  geom_errorbarh(aes(xmin = conf_low, xmax = conf_high, height = 0.15), position = position_dodge(width = 0.3)) +
  # Bold group header, right-aligned at the left edge of the panel
  annotation_custom(grob = grid::textGrob(label = "Smoking status", gp = grid::gpar(fontface = "bold"), hjust = 1), xmin = -Inf, xmax = -Inf, ymin = 3.2, ymax = 3.2) +
  # Text labels placed just right of the widest error bar
  geom_text(aes(x = max(conf_high) * 1.05, label = label, group = model), position = position_dodge(width = 0.3), hjust = 0, show.legend = FALSE, color = "black") +
  # Extra space on the right of the x-axis for the labels
  scale_x_continuous(expand = expansion(mult = c(.05, .3))) +
  # Allow drawing outside the panel
  coord_cartesian(clip = "off")
#> Warning: Removed 1 rows containing missing values (geom_point).
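The value 3.2 passed as ymin and ymax above depends on where "Never smoking (reference)" lands on the discrete axis. As a rough sketch of how that number could be derived instead of hard-coded: ggplot2 drops unused factor levels by default and places the remaining ones at positions 1, 2, 3, ... from bottom to top. The names axis_levels, ref_pos and header_offset are hypothetical helpers introduced only for this illustration.

```r
# Levels of term that actually occur in the data, in factor order
# (copied from the question; unused levels are dropped by the scale).
used_levels <- c("Age (years)", "Sex (male)", "Never smoking (reference)",
                 "Current smoking", "Former smoking")

# fct_rev() reverses the level order, so the first level ends up at the top.
axis_levels <- rev(used_levels)

# Discrete axis positions are 1, 2, 3, ... from bottom to top.
ref_pos <- match("Never smoking (reference)", axis_levels)

# Hypothetical offset: how far above the reference row the header sits.
header_offset <- 0.2
header_y <- ref_pos + header_offset
header_y
```

With the question's data this gives position 3 for the reference row, so header_y matches the 3.2 used above. Changing header_offset is one way to tune the gap between the header and the "Never smoking (reference)" label.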