I have a data frame of public and private schools within counties, and each school has an assigned value. I want to use forcats::fct_reorder
to arrange the counties by median value, but only for the private schools. By default, forcats::fct_reorder
orders by the overall median, which is less useful for what I'm doing.
Reprex here:
library(dplyr)
library(ggplot2)
# make df
set.seed(1)
df <-
data.frame(
county = rep(c("Bexar","Travis","Tarrant","Aransas"), each=20),
type = rep(c("public","private"), each=10)
) %>%
mutate(value = case_when(type == "public" ~ runif(80,0,1),
type == "private" ~ runif(80, 0, 10)))
# private values are way higher than public
# relevel by median value
df %>%
mutate(county = forcats::fct_reorder(county, value, .fun=median)) %>%
# this rearranges counties by total median, but I only want to arrange by median of the private schools
# plot
ggplot(aes(x=county, y = value, color = type)) +
geom_point(position = position_dodge(
width=.75
)) +
geom_boxplot(alpha=.5)
Desired output would order them by increasing median of private schools only: Aransas, Travis, Tarrant, Bexar.
thanks!
CodePudding user response:
library(tidyverse)
set.seed(1)
df <-
data.frame(
county = rep(c("Bexar","Travis","Tarrant","Aransas"), each=20),
type = rep(c("public","private"), each=10)
) %>%
mutate(value = case_when(type == "public" ~ runif(80,0,1),
type == "private" ~ runif(80, 0, 10)))
private_medians <-
df %>%
filter(type == "private") %>%
group_by(county) %>%
summarise(median = median(value)) %>%
arrange(median)
private_medians
#> # A tibble: 4 x 2
#> county median
#> <chr> <dbl>
#> 1 Aransas 3.91
#> 2 Travis 4.39
#> 3 Tarrant 5.68
#> 4 Bexar 6.24
# add other counties at the end in case they do not appear in the private subset
levels <- private_medians$county %>% union(df$county %>% unique())
df %>%
mutate(county = county %>% factor(levels = levels)) %>%
ggplot(aes(x=county, y = value, color = type)) +
geom_point(position = position_dodge(
width=.75
)) +
geom_boxplot(alpha=.5)
Created on 2021-10-18 by the reprex package (v2.0.1)
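If you would rather stay within a single fct_reorder call, one possible alternative is to blank out the non-private values before taking the median. This is only a sketch on a small hypothetical data frame (toy and its values are made up for illustration, not taken from the reprex above). It relies on extra arguments to fct_reorder being passed on to .fun, so na.rm = TRUE reaches median. Depending on your forcats version, you may see a warning about removed missing values.

```r
library(forcats)

# Hypothetical toy data: two counties, two public and two private schools each
toy <- data.frame(
  county = rep(c("Bexar", "Travis"), each = 4),
  type   = rep(c("public", "public", "private", "private"), times = 2),
  value  = c(0, 0, 5, 7, 100, 100, 1, 3)
)

# Keep only the private values; everything else becomes NA
private_value <- ifelse(toy$type == "private", toy$value, NA)

# Reorder counties by the median of the private values only;
# na.rm = TRUE is forwarded to median() so the NAs are ignored
toy$county <- fct_reorder(toy$county, private_value, .fun = median, na.rm = TRUE)

levels(toy$county)
```

Here the private medians are 6 for Bexar and 2 for Travis, so Travis should come first even though its overall median is higher. A county with no private schools has only NA values and should end up last.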