I have tried the methods from similar questions on this site but still have not been able to figure this out.
I'm using the Titanic dataset from Kaggle. The result I am looking for is that, within each Pclass level, the age.class bars are sorted from low to high by n. My attempt is below.
library(tidyverse)
library(titanic)
df <- titanic::titanic_train
head(df)
# Start -------------------------------------------------------------------
df = df %>%
  mutate(has.cabin = if_else(Cabin == '', 0, 1) %>% as.factor(),
         Pclass = Pclass %>% as.factor(),
         age.class = case_when(
           Age < 5 ~ 'baby',
           Age > 5 & Age < 12 ~ 'Child',
           Age > 12 & Age < 18 ~ 'Teen',
           Age > 18 & Age < 25 ~ 'Young Adult',
           Age > 25 & Age < 35 ~ 'Mid Adult',
           Age > 35 & Age < 60 ~ 'Adult',
           Age > 60 ~ 'Elderly',
           TRUE ~ 'Undefined'
         )
  )
plot.data = df %>% count(has.cabin, Pclass, age.class)
lvls <- unique(plot.data$Pclass[order(plot.data$age.class,-plot.data$n)])
plot.data$age.classv2 = factor(plot.data$age.class, levels=lvls)
plot.data %>%
  ggplot(., aes(x = Pclass, y = n, fill = age.class)) +
  geom_col(position = 'dodge') +
  facet_grid(~ has.cabin)
Answer:
I don't know of an easy way to order factors within a factor within facets. There are the functions forcats::fct_reorder, to reorder factors, and tidytext::reorder_within, to reorder one factor within a facet.
I've used the second one: I faceted by Pclass, made two plots (one for has.cabin == 0 and one for has.cabin == 1), and afterwards stitched them together.
One needs to keep the original variable for the fill aesthetic and use a separate, reordered variable for x because, internally, reorder_within generates factor levels with the facet name appended. If you don't use the extra variable, you see these names in the legend; see the comments in Julia Silge's blog.
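To make the difference between the two functions concrete, here is a minimal sketch on made-up toy data (the data frame `toy` and its columns are hypothetical, not taken from the Titanic set). It assumes that reorder_within uses its default separator, so the generated levels are the item name with the facet name appended.

```r
library(forcats)
library(tidytext)

# Hypothetical toy data: two items counted in two facets
toy <- data.frame(
  facet = c("A", "A", "B", "B"),
  item  = c("x", "y", "x", "y"),
  n     = c(3, 1, 2, 6)
)

# fct_reorder gives ONE global ordering of the items
# (by the median of n across all facets)
global <- fct_reorder(factor(toy$item), toy$n)
levels(global)

# reorder_within builds one level per item/facet combination,
# so the ordering can differ from facet to facet
within_facet <- reorder_within(toy$item, toy$n, toy$facet)
levels(within_facet)
```

The second call returns levels that carry the facet name as a suffix, which is why the plots below map the reordered variable to x, clean the labels with scale_x_reordered(), and keep the original age.class for fill.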
library(titanic)
library(dplyr)
library(tidytext)
library(ggplot2)
library(patchwork)
df <- titanic::titanic_train
df <- df %>%
  mutate(has.cabin = if_else(Cabin == '', 0, 1) %>% as.factor(),
         Pclass = as.factor(Pclass),
         age.class = case_when(
           Age < 5 ~ 'baby',
           Age > 5 & Age < 12 ~ 'Child',
           Age > 12 & Age < 18 ~ 'Teen',
           Age > 18 & Age < 25 ~ 'Young Adult',
           Age > 25 & Age < 35 ~ 'Mid Adult',
           Age > 35 & Age < 60 ~ 'Adult',
           Age > 60 ~ 'Elderly',
           TRUE ~ 'Undefined'
         )
  )
# Plot for passengers without a cabin; bars are reordered by n within each Pclass facet
p1 <- df %>%
  count(has.cabin, Pclass, age.class) %>%
  filter(has.cabin == "0") %>%
  mutate(age.class.plot = reorder_within(age.class, n, Pclass),
         Pclass = paste0("Pclass ", Pclass)) %>%
  ggplot(aes(x = age.class.plot, y = n, fill = age.class)) +
  geom_col(position = 'dodge') +
  scale_x_reordered() +
  facet_grid(~ Pclass, scales = "free_x") +
  theme(
    axis.title.x = element_blank(),
    axis.ticks.x = element_blank(),
    axis.text.x = element_blank()
  ) +
  labs(title = "has.cabin 0") +
  coord_cartesian(ylim = c(0, 180))
# Same plot for passengers with a cabin; all axes are blanked because p1 carries the y axis
p2 <- df %>%
  count(has.cabin, Pclass, age.class) %>%
  filter(has.cabin == "1") %>%
  mutate(age.class.plot = reorder_within(age.class, n, Pclass),
         Pclass = paste0("Pclass ", Pclass)) %>%
  ggplot(aes(x = age.class.plot, y = n, fill = age.class)) +
  geom_col(position = 'dodge') +
  scale_x_reordered() +
  facet_grid(~ Pclass, scales = "free_x") +
  theme(
    axis.title = element_blank(),
    axis.ticks = element_blank(),
    axis.text = element_blank()
  ) +
  labs(title = "has.cabin 1") +
  coord_cartesian(ylim = c(0, 180))
# Combine the two plots side by side with a single shared legend (patchwork)
p1 + p2 + plot_layout(guides = "collect")
Created on 2022-06-30 by the reprex package (v1.0.0)
I think that, in order to reorder a factor within a factor within a facet, one would need to adapt reorder_within.