How do I order X-Axis of geom_col plot with 2 factors?

Time:07-01

I have tried the method from similar problems on this site but still have not been able to figure this out.


I'm using the Titanic dataset from Kaggle. Within each Pclass level, I want the age.class bars to be sorted from low to high by n. My attempt is below.

library(tidyverse)
library(titanic)


df <- titanic::titanic_train

head(df)

# Start -------------------------------------------------------------------
df = df %>% 
  mutate(has.cabin = if_else(Cabin == '', 0, 1) %>% as.factor(),
         Pclass = Pclass %>% as.factor(),
         age.class = case_when(
           Age < 5  ~ 'baby',
           Age >5 & Age < 12 ~ 'Child',
           Age > 12 & Age < 18 ~ 'Teen', 
           Age > 18 & Age < 25 ~ 'Young Adult',
           Age > 25 & Age <35 ~ 'Mid Adult', 
           Age > 35 & Age < 60 ~  'Adult', 
           Age > 60 ~ 'Elderly',
           TRUE ~ 'Undefined'
                              )
         )

plot.data = df %>% count(has.cabin, Pclass, age.class) 
  

lvls <- unique(plot.data$Pclass[order(plot.data$age.class,-plot.data$n)])
plot.data$age.classv2 = factor(plot.data$age.class, levels=lvls)

plot.data %>% 
  ggplot(., aes(x = Pclass, y = n, fill = age.class)) + 
  geom_col(position = 'dodge') + 
  facet_grid(~ has.cabin)

CodePudding user response:

I don't know of an easy way to order a factor within another factor within facets. Two functions come close:

  • forcats::fct_reorder to reorder factors
  • tidytext::reorder_within to reorder one factor within a facet

I used the second one: I faceted by Pclass, made two plots (one for has.cabin == 0 and one for has.cabin == 1), and then stitched them together with patchwork.

A separate variable is needed for the fill aesthetic because reorder_within internally generates new factor levels with the facet name appended. If you don't use the extra variable, you see these suffixed names in the plot; see the comments in Julia Silge's blog.
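To make the suffix behaviour concrete, here is a minimal base-R sketch of roughly what reorder_within does. It assumes the default separator "___" used by tidytext, and the toy counts are made up for illustration (they are not the Titanic numbers).

```r
# Toy counts (hypothetical, for illustration only)
toy <- data.frame(
  Pclass    = c("1", "1", "2", "2"),
  age.class = c("Adult", "Teen", "Adult", "Teen"),
  n         = c(10, 3, 2, 8)
)

# Mimic reorder_within(age.class, n, Pclass):
# paste the facet value onto each label, then order the levels by n
key <- paste(toy$age.class, toy$Pclass, sep = "___")
toy$age.class.plot <- reorder(key, toy$n)

# The levels now carry the facet suffix, e.g. "Adult___2"
levels(toy$age.class.plot)

# scale_x_reordered() removes the suffix again for the axis labels;
# this gsub() call imitates that step
axis.labels <- gsub("___.+$", "", levels(toy$age.class.plot))
axis.labels
```

Because the suffixed levels are distinct for every facet, mapping fill to age.class.plot would produce one legend entry per level and facet, which is why the plots below keep the original age.class for fill.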

library(titanic)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
library(tidytext)
library(ggplot2)
library(patchwork)

df <- titanic::titanic_train

df <- df %>% 
  mutate(has.cabin = if_else(Cabin == '', 0, 1) %>% as.factor(),
         Pclass = as.factor(Pclass),
         age.class = case_when(
           Age < 5  ~ 'baby',
           Age >5 & Age < 12 ~ 'Child',
           Age > 12 & Age < 18 ~ 'Teen', 
           Age > 18 & Age < 25 ~ 'Young Adult',
           Age > 25 & Age <35 ~ 'Mid Adult', 
           Age > 35 & Age < 60 ~  'Adult', 
           Age > 60 ~ 'Elderly',
           TRUE ~ 'Undefined'
         )
  )

# Passengers without a cabin: order age.class by n within each Pclass
p1 <- df %>%
  count(has.cabin, Pclass, age.class) %>% 
  filter(has.cabin == "0") %>% 
  mutate(age.class.plot = reorder_within(age.class, n, Pclass),
         Pclass = paste0("Pclass ", Pclass)) %>% 
  ggplot(aes(x = age.class.plot, y = n, fill = age.class)) + 
  geom_col(position = 'dodge') + 
  scale_x_reordered() +
  facet_grid(~ Pclass, scales = "free_x") +
  theme(
    axis.title.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.text.x = element_blank()
  ) +
  labs(title = "has.cabin 0") +
  coord_cartesian(ylim = c(0, 180))

# Passengers with a cabin: same ordering, all axis decorations hidden
p2 <- df %>%
  count(has.cabin, Pclass, age.class) %>% 
  filter(has.cabin == "1") %>% 
  mutate(age.class.plot = reorder_within(age.class, n, Pclass),
         Pclass = paste0("Pclass ", Pclass)) %>% 
  ggplot(aes(x = age.class.plot, y = n, fill = age.class)) + 
  geom_col(position = 'dodge') + 
  scale_x_reordered() +
  facet_grid(~ Pclass, scales = "free_x") +
  theme(
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    axis.text = element_blank()
  ) +
  labs(title = "has.cabin 1") +
  coord_cartesian(ylim = c(0, 180))

# Place both plots side by side with one shared legend
p1 + p2 + plot_layout(guides = "collect")

Created on 2022-06-30 by the reprex package (v1.0.0)

I think in order to reorder a factor within a factor within a facet one would need to adapt reorder_within.
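A possible workaround that might avoid both the second plot and a modified reorder_within, sketched under assumptions: paste has.cabin and Pclass into one key, reorder within that key, and facet on both variables. The `panel` column and the toy counts are hypothetical and only stand in for the result of count(has.cabin, Pclass, age.class); I have not checked this against the full Titanic data.

```r
library(dplyr)
library(ggplot2)
library(tidytext)

# Hypothetical counts, one row per has.cabin / Pclass / age.class combination
toy <- expand.grid(
  has.cabin = c("0", "1"),
  Pclass    = c("1", "2"),
  age.class = c("Adult", "Child", "Teen"),
  stringsAsFactors = FALSE
)
toy$n <- c(5, 9, 2, 7, 8, 1, 6, 3, 4, 12, 10, 11)

p <- toy %>%
  # One key per panel, so the ordering is computed separately for each panel
  mutate(panel = paste(has.cabin, Pclass),
         age.class.plot = reorder_within(age.class, n, panel)) %>%
  ggplot(aes(x = age.class.plot, y = n, fill = age.class)) +
  geom_col() +
  scale_x_reordered() +
  # Free x scales let every panel keep its own bar order
  facet_grid(~ has.cabin + Pclass, scales = "free_x", labeller = label_both)

p
```

The trade-off is that every has.cabin / Pclass combination becomes its own panel, so the layout differs from the two stitched plots above.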
