If this question has already been answered, please link as I have not been able to locate a similar question. I have referred to
I would like to have the continuous variable of SES_Cscore as well on the x-axis. Is there a way to do this?
Thank you in advance and have a lovely day!
CodePudding user response:
This could be done by reshaping the data slightly so that SES_Cscore is recorded as a variable with one value per Species, and not as a variable to be mapped to bar height for each observation. I do that here by reshaping wide (so that the three variables each get their own columns), and then reshaping long again, but only for the variables we want to map to y.
library(tidyverse)
data %>%
  pivot_wider(names_from = variable, values_from = value) %>%
  pivot_longer(2:3) %>%
  mutate(Species2 = paste(Species, round(SES_Cscore, digits = 2), sep = "\n") %>%
           fct_reorder(SES_Cscore)) -> data2
data2
## A tibble: 20 × 5
# Species SES_Cscore name value Species2
# <chr> <dbl> <chr> <dbl> <fct>
# 1 BABO_BW 6.73 obs.C-score 328680 "BABO_BW\n6.73"
# 2 BABO_BW 6.73 exp.C-score 276507 "BABO_BW\n6.73"
# 3 BABO_RC 5.31 obs.C-score 408360 "BABO_RC\n5.31"
# 4 BABO_RC 5.31 exp.C-score 345488 "BABO_RC\n5.31"
# 5 BABO_SKS 4.35 obs.C-score 285090 "BABO_SKS\n4.35"
# etc.
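To make the wide-then-long step easier to follow, here is a minimal sketch on a tiny made-up data set. The `toy` tibble, its two species, and all of its values are hypothetical, and I am only assuming that the real `data` has the same three columns (Species, variable, value). The sketch selects the y variables by name instead of by position (`2:3`), which should be less fragile if the column order differs.

```r
library(dplyr)
library(tidyr)
library(forcats)

# Hypothetical long-format data; the values are invented for illustration
toy <- tibble::tribble(
  ~Species, ~variable,     ~value,
  "A",      "obs.C-score", 300,
  "A",      "exp.C-score", 250,
  "A",      "SES_Cscore",  2.5,
  "B",      "obs.C-score", 400,
  "B",      "exp.C-score", 380,
  "B",      "SES_Cscore",  1.2
)

toy2 <- toy %>%
  # wide: one row per Species, one column per variable
  pivot_wider(names_from = variable, values_from = value) %>%
  # long again, but only for the two variables that become bar heights
  pivot_longer(c(`obs.C-score`, `exp.C-score`)) %>%
  # label that carries the score, ordered by the score
  mutate(Species2 = paste(Species, round(SES_Cscore, digits = 2), sep = "\n") %>%
           fct_reorder(SES_Cscore))

levels(toy2$Species2)
```

SES_Cscore stays behind as its own column with one value per species, and the factor levels of Species2 follow the score rather than the alphabet.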
Alternatively, we could do the reshaping as a join, which might be more performant for large data: we join the observations we want to map to y with the observations we want to use for each species' x position:
left_join(data %>% filter(variable != "SES_Cscore"),
          data %>% filter(variable == "SES_Cscore") %>%
            transmute(Species, x_val = value,
                      Species_label = paste(Species, sprintf(value, fmt = "%#.2f"),
                                            sep = "\n") %>% fct_reorder(value)))
Once reshaped, it's more straightforward to get a plot that is ordered by the SES_Cscore for each species:
ggplot(data2, aes(Species2, value, fill = name)) +
  geom_col(position = "dodge")
If you want to plot with a continuous x axis related to SES_Cscore, you may run into some graphic design challenges, since the data might be bunched up in some cases. Note how the default bar width gets quite squished so that ggplot can keep the 2nd and 3rd Species bars from overlapping.
This approach also takes a little more work, since ggplot's axes work for either discrete (categorical) data or continuous data, and there isn't a default designed to manage a combination, with categorical data that is mapped continuously. So you'd have to revert to some sort of geom_text to make manual labels, and some customization if you want them to look more like normal axis labels.
ggplot(data2, aes(SES_Cscore, value, fill = name)) +
  geom_col(position = "dodge") +
  ggrepel::geom_text_repel(aes(y = 0, label = Species),
    angle = 90, direction = "x", hjust = 0, lineheight = 0.8, size = 3,
    data = data2 %>% distinct(Species, .keep_all = TRUE))
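If manual text labels feel too heavy, one alternative sketch (not what the code above does) is to keep the continuous x axis and place the axis breaks at each species' score, labelled with the species name. The `toy2` tibble below is hypothetical and only mimics the shape of data2. The same caveat applies: where scores are close together, the tick labels will overlap, so this is only reasonable when the species are fairly spread out.

```r
library(ggplot2)
library(dplyr)

# Hypothetical data in the same shape as data2; values are invented
toy2 <- tibble::tibble(
  Species    = rep(c("A", "B", "C"), each = 2),
  SES_Cscore = rep(c(1.2, 2.5, 6.7), each = 2),
  name       = rep(c("obs.C-score", "exp.C-score"), times = 3),
  value      = c(400, 380, 300, 250, 330, 280)
)

# One row per species: where the tick goes and what it says
labs_df <- distinct(toy2, Species, SES_Cscore)

p <- ggplot(toy2, aes(SES_Cscore, value, fill = name)) +
  geom_col(position = "dodge") +
  # ticks sit at the scores, but show the species names
  scale_x_continuous(breaks = labs_df$SES_Cscore, labels = labs_df$Species)

p
```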
CodePudding user response:
Up front, scaling the data and using a second axis can visually misrepresent the data: it's not hard to look at this plot hastily and infer that the blue bars' values mean the same thing as the red/green bars.
Having said that, try this:
library(ggplot2)
library(dplyr)
fac <- 50000
mycolors <- c("obs.C-score" = "red", "exp.C-score" = "green", "SES_Cscore" = "blue")
data %>%
  # rescale SES_Cscore so it is visible next to the much larger C-scores
  mutate(value = value * ifelse(variable == "SES_Cscore", fac, 1)) %>%
  ggplot(aes(x = Species, y = value)) +
  geom_bar(aes(fill = variable), stat = "identity", position = "dodge") +
  scale_y_continuous(
    # the secondary axis undoes the rescaling
    sec.axis = sec_axis(name = "SES_Cscore", ~ . / fac),
    breaks = ~ scales::extended_breaks()(pmax(0, .))
  ) +
  # note: the bars are mapped to fill, so scale_fill_manual() is the scale
  # that would apply mycolors to the bars themselves
  scale_color_manual(values = mycolors) +
  theme(
    axis.title.y.right = element_text(color = mycolors["SES_Cscore"]),
    axis.text.y.right = element_text(color = mycolors["SES_Cscore"]),
    axis.ticks.y.right = element_line(color = mycolors["SES_Cscore"])
  )
I'm using blue colors on the second (right) axis to try to visually pair it with the blue bars. I also took the liberty of keeping the primary (left) axis at 0 or more based on my inference of the data; it is not required at all. Also, I could have omitted scale_color_manual(.) and just assumed that our use of element_text(color="blue") is going to be correct; that would fail if/when your data changes with either fewer or more levels within variable, so I control them manually ... and I try to assign everything on the second axis the right color :-)
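For anyone wondering where a number like fac <- 50000 might come from, here is a small sketch of the arithmetic. The three values are the BABO_BW numbers visible in the first answer's output; the max-ratio rule for picking `fac` is just one possible heuristic and an assumption on my part, not how the factor above was chosen.

```r
library(dplyr)

# BABO_BW values as shown in the first answer's output
toy <- tibble::tibble(
  variable = c("obs.C-score", "exp.C-score", "SES_Cscore"),
  value    = c(328680, 276507, 6.73)
)

# Assumed heuristic: largest primary value over largest secondary value,
# so the tallest SES_Cscore bar reaches the tallest C-score bar
fac <- with(toy, max(value[variable != "SES_Cscore"]) /
                 max(value[variable == "SES_Cscore"]))

# What actually gets drawn: SES_Cscore is multiplied up, the rest is untouched
scaled <- toy %>%
  mutate(plotted = value * ifelse(variable == "SES_Cscore", fac, 1))

# What the secondary axis reports for that bar: plotted / fac,
# which recovers the original SES_Cscore
scaled$plotted[scaled$variable == "SES_Cscore"] / fac
```

For these three values the ratio comes out a little under 50000. Whatever factor you pick, the multiplication in mutate() and the division in sec_axis() have to use the same number, otherwise the right-hand axis no longer describes the blue bars.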