Let's say we have the following hierarchical data on the habitats that make up my fantasy island (which is, of course, always warm and sunny!):
library(dplyr)
library(ggplot2)
set.seed(1)
hab_dat <- data.frame(
  habitat_type = rep(c("sea", "coast", "land"), times = 1, each = 3),
  habitat_name = c("rocky", "sandy", "seaweed",
                   "beach", "pebbles", "rockpools",
                   "fields", "hills", "forest"),
  area_km2 = sample(10:40, size = 9))
hab_dat
I want to plot the total area of each habitat type, so I wrote the following code:
hab_dat %>%
  group_by(habitat_type) %>%
  summarise(area_km2 = sum(area_km2)) %>%
  ggplot(aes(x = habitat_type, y = area_km2, fill = habitat_type)) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = c("gold", "forestgreen", "blue"))
Looks good, but the legend is not very informative. I would like the habitats contained within each habitat type to be listed in the legend under the appropriate habitat type, purely as qualitative information. Here is an example I made in Paint.
I can get a bit closer with the following code without affecting the appearance of the plot. However, the habitat_type titles are missing, and there are multiple tiles for the same colour.
hab_dat <- hab_dat %>% mutate(col = rep(c("blue", "gold", "forestgreen"), times = 1, each = 3))
pal <- setNames(as.character(hab_dat$col), as.character(hab_dat$habitat_name))
ggplot(hab_dat, aes(x = habitat_type, y = area_km2, fill = habitat_name)) +
  geom_bar(position = "stack", stat = "identity") +
  scale_fill_manual(values = pal)
I have been looking at solutions along the lines of this one, but I am after a more automated solution because my actual data is somewhat larger than this, and one that shows the colour tile once per group, as in my drawing.
Answer:
I don't think there is an elegant solution that addresses your problem. Instead, I suggest formatting the legend labels to imply the hierarchy.
library(ggplot2)
library(dplyr)
set.seed(1)
hab_dat <- data.frame(
  habitat_type = rep(c("sea", "coast", "land"), times = 1, each = 3),
  habitat_name = c("rocky", "sandy", "seaweed",
                   "beach", "pebbles", "rockpools",
                   "fields", "hills", "forest"),
  area_km2 = sample(10:40, size = 9))
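# Format labels: one multi-line string per habitat type,
# with the type on the first line and its habitats listed below it
labels <- split(hab_dat$habitat_name, hab_dat$habitat_type)
labels <- unlist(Map(function(top, bottom) {
  paste0(top, "\n", paste0("- ", bottom, collapse = "\n"))
}, top = names(labels), bottom = labels))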
hab_dat %>%
  group_by(habitat_type) %>%
  summarise(area_km2 = sum(area_km2)) %>%
  ggplot(aes(x = habitat_type, y = area_km2, fill = habitat_type)) +
  geom_bar(stat = "identity") +
  scale_fill_manual(
    values = c("gold", "forestgreen", "blue"),
    labels = function(i) {labels[i]} # Look up the label by habitat type
  )
Created on 2022-07-19 by the reprex package (v2.0.1)
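Since the question asks for something more automated, the label-building step could be wrapped in a small function that works for any grouping column. This is only a sketch in base R: make_group_labels is a hypothetical helper name, not something ggplot2 provides, and it assumes each member belongs to exactly one group.

```r
# Hypothetical helper (not part of ggplot2): build one multi-line
# legend label per group, with the group name on the first line and
# its members listed underneath.
make_group_labels <- function(group, member) {
  members <- split(as.character(member), as.character(group))
  vapply(names(members), function(g) {
    paste0(g, "\n", paste0("- ", members[[g]], collapse = "\n"))
  }, character(1))
}

# Same grouping as the example data in the question
habitat_type <- rep(c("sea", "coast", "land"), each = 3)
habitat_name <- c("rocky", "sandy", "seaweed",
                  "beach", "pebbles", "rockpools",
                  "fields", "hills", "forest")

group_labels <- make_group_labels(habitat_type, habitat_name)

# Inspect one label to check the layout
cat(group_labels[["coast"]], "\n")
```

The result is a character vector named by habitat type, so it can be passed to scale_fill_manual() the same way as above, with labels = function(b) group_labels[b]. Giving values as a named vector, for example c(sea = "blue", coast = "gold", land = "forestgreen"), would also stop the colours from depending on the alphabetical order of the habitat types.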