Adding subcategories to a legend in ggplot2


Let's say we have the following hierarchical data on the habitats that make up my fantasy island (which is of course always warm and sunny!):


library(dplyr)    # %>%, group_by(), summarise(), mutate()
library(ggplot2)

hab_dat <- data.frame(
  habitat_type = rep(c("sea", "coast", "land"), times = 1, each = 3),
  habitat_name = c("rocky", "sandy", "seaweed",
                   "beach", "pebbles", "rockpools",
                   "fields", "hills", "forest"),
  area_km2 = sample(10:40, size = 9))

I want to plot the total area of each habitat type, so I wrote the following code:

hab_dat %>% 
  group_by(habitat_type) %>% 
  summarise(area_km2 = sum(area_km2)) %>%
  ggplot(aes(x = habitat_type, y = area_km2, fill = habitat_type)) +
  geom_bar(stat = "identity") +
  scale_fill_manual(values = c("gold", "forestgreen", "blue"))

Looks good, but the legend is not very informative. I would like the habitats contained within each habitat type to be listed in the legend under the appropriate habitat type, just as qualitative information. Here is an example I made in Paint.

I can get a bit closer using the following code without affecting the appearance of the plot. However, I am missing the habitat_type titles and also have multiple tiles for the same colour.

hab_dat <- hab_dat %>% mutate(col = rep(c("blue", "gold", "forestgreen"), times = 1, each = 3))

pal <- setNames(as.character(hab_dat$col), as.character(hab_dat$habitat_name))

ggplot(hab_dat, aes(x = habitat_type, y = area_km2, fill = habitat_name)) +
  geom_bar(position = "stack", stat = "identity") +
  scale_fill_manual(values = pal)


I have been looking at solutions along the lines of this one but am trying for a more automated solution as my actual data is a bit larger than this, and also one that presents the colour tile once per group as per my drawing.

I don't think there is an elegant solution that addresses your problem. Instead, I suggest formatting the legend labels to imply the hierarchy: each habitat type becomes a multi-line label with its habitats listed underneath.

hab_dat <- data.frame(
  habitat_type = rep(c("sea", "coast", "land"), times = 1, each = 3),
  habitat_name = c("rocky", "sandy", "seaweed",
                   "beach", "pebbles", "rockpools",
                   "fields", "hills", "forest"),
  area_km2 = sample(10:40, size =9))

# Format labels
labels <- split(hab_dat$habitat_name, hab_dat$habitat_type)
labels <- unlist(Map(function(top, bottom) {
  paste0(top, "\n", paste("- ", bottom, collapse = "\n"))
}, top = names(labels), bottom = labels))

hab_dat %>% 
  group_by(habitat_type) %>% 
  summarise(area_km2 = sum(area_km2)) %>%
  ggplot(aes(x = habitat_type, y = area_km2, fill = habitat_type)) +
  geom_bar(stat = "identity") +
  scale_fill_manual(
    values = c("gold", "forestgreen", "blue"),
    labels = function(i) {labels[i]} # Lookup label
  )
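
Since the question asks for something more automated, the label-formatting step can be wrapped in a small function and inspected without drawing a plot. This is only a sketch in base R: the helper name make_hier_labels and the vectors hab_type and hab_name are made up for illustration and are not part of ggplot2.

```r
# Hypothetical helper (not a ggplot2 function): builds one multi-line
# legend label per habitat type, with its habitats listed underneath.
make_hier_labels <- function(type, name) {
  # split() groups the habitat names by type; group names come out sorted
  groups <- split(as.character(name), as.character(type))
  vapply(names(groups), function(top) {
    paste0(top, "\n", paste0("- ", groups[[top]], collapse = "\n"))
  }, character(1))
}

# Same toy data as above, kept as plain vectors for illustration
hab_type <- rep(c("sea", "coast", "land"), each = 3)
hab_name <- c("rocky", "sandy", "seaweed",
              "beach", "pebbles", "rockpools",
              "fields", "hills", "forest")

hier_labels <- make_hier_labels(hab_type, hab_name)

# Print one label to check the layout
cat(hier_labels[["sea"]], "\n")
```

The result is a character vector named by habitat type, so it can be passed to scale_fill_manual() through the same kind of lookup as above, e.g. labels = function(i) hier_labels[i].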

