How to Display Categorical Groups for x-axis Label of Multiple Box plots in ggplot2

Time:04-20

I am trying to examine the distribution of flights from NYC per day for each month of the year. The dataset I'm using is "nycflights13", which can be installed as a package. I then transform the data as follows:

# load
flights <- nycflights13::flights

# rid NA
flights <- flights %>% tidyr::drop_na(tailnum)

# filter out only flights going FROM NYC and add month
flights_with_month <- flights %>%
  filter(origin != "EWR") %>%
  mutate(mth = month(time_hour), label = TRUE)

# calculate flights per day
flights_with_month <- flights_with_month %>%
  group_by(mth, day) %>%
  mutate(total_daily_flights = n())

# making boxplots
ggplot(flights_with_month, aes(y = total_daily_flights, group = mth)) +
  geom_boxplot()

This is the boxplot I'm getting. Notice that the x-axis does not show the names of the months, and the boxes also do not seem to be ordered from January to December.

CodePudding user response:

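Your code has two issues. First, mutate(mth = month(time_hour), label = TRUE) should be mutate(mth = month(time_hour, label = TRUE)): as written, label = TRUE is an argument to mutate(), so it creates a column named label filled with TRUE while mth stays a plain month number. Second, your ggplot call should set x = mth rather than group = mth; without an x aesthetic, ggplot2 has no month variable to label the axis with. The data were being plotted in the correct order, but with incorrect labeling.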
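As a quick check of the label = TRUE point, here is a minimal sketch on three made-up dates (the names dates, month_num and month_lab are illustrative and not from the question). It suggests why the argument has to go inside month(): only then is the result an ordered factor that ggplot2 can use for a discrete, calendar-ordered axis.

```r
library(lubridate)

# Three made-up dates, deliberately out of calendar order
dates <- as.Date(c("2013-12-25", "2013-01-15", "2013-07-04"))

# Without label = TRUE, month() returns plain month numbers
month_num <- month(dates)

# With label = TRUE, month() returns an ordered factor with 12 levels
# running from January to December
month_lab <- month(dates, label = TRUE)

is.ordered(month_lab)       # is it an ordered factor?
length(levels(month_lab))   # number of levels
sort(month_lab)             # sorts by calendar order, not alphabetically
```

The month names are taken from the session locale, so they may not be English abbreviations on every machine. Applied to the question's data, the corrected code is: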

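# load packages used below
library(dplyr)      # %>%, filter(), mutate(), group_by(), n()
library(lubridate)  # month()
library(ggplot2)    # ggplot(), geom_boxplot()

# load
flights <- nycflights13::flights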

# rid NA
flights <- flights %>% tidyr::drop_na(tailnum)

# filter out only flights going FROM NYC and add month
flights_with_month <- flights %>%
  filter(origin != "EWR") %>%
  mutate(mth = month(time_hour, label = TRUE))

# calculate flights per day
flights_with_month <- flights_with_month %>%
  group_by(mth, day) %>%
  mutate(total_daily_flights = n())

# making boxplots
ggplot(flights_with_month, aes(y = total_daily_flights, x = mth)) +
  geom_boxplot()
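One thing to be aware of: group_by() followed by mutate() keeps one row per flight, so each daily total is repeated once for every flight on that day, and busy days weigh more heavily in the box statistics. If the goal is one observation per day, a possible alternative is to collapse the data with count() before plotting. The sketch below uses a small made-up table (toy_flights is a hypothetical stand-in, not the real nycflights13 data) and is only meant to show the shape of the result.

```r
library(dplyr)
library(lubridate)
library(ggplot2)

# Made-up stand-in for the flights table: one row per flight (not real data)
toy_flights <- tibble(
  time_hour = as.POSIXct(
    c("2013-01-01 06:00", "2013-01-01 09:00", "2013-01-02 07:00",
      "2013-02-01 08:00", "2013-02-01 10:00", "2013-02-01 12:00"),
    tz = "UTC"
  )
)

# count() collapses the data to one row per (month, day) pair and stores
# the number of flights on that day in total_daily_flights
daily_counts <- toy_flights %>%
  mutate(mth = month(time_hour, label = TRUE), day = day(time_hour)) %>%
  count(mth, day, name = "total_daily_flights")

# Same plot as before, but each box is now built from one value per day
p <- ggplot(daily_counts, aes(x = mth, y = total_daily_flights)) +
  geom_boxplot()
```

With the real data, the same count() step could replace the group_by() and mutate() pair, since the flights table already has a day column.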
