I'm creating a graph in R that shows the total number of rides per month. The data starts in April 2020 and ends in March 2021, and this order matters both for the analysis and for explaining it to others. However, R automatically orders the months from 1 to 12. Is there a way to rearrange them so the axis runs from April to March?
library(ggplot2)
library(dplyr)
Bike %>%
  group_by(member_casual, Date_month) %>%
  summarise(.groups = 'drop',
            average_duration = mean(Time_duration)) %>%
  arrange(member_casual, Date_month) %>%
  # hms() is not part of ggplot2 or dplyr; it needs lubridate or hms to be loaded
  mutate(Average_Time = hms(average_duration)) %>%
  ggplot(aes(
    x = Date_month,
    y = Average_Time,
    group = member_casual,
    colour = member_casual
  )) +
  geom_line() +
  geom_point(size = 3) +
  scale_colour_manual(name = "Membership Type",
                      values = c(casual = '#ff9900', member = '#0099cc')) +
  labs(
    title = "Average Ride Duration By Month",
    x = "Month",
    y = "Average Time",
    subtitle = "April 2020 to March 2021"
  )
CodePudding user response:
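As @rawr mentioned, you can use factor() to set the order along the x-axis. The levels are drawn in the order you list them, so c(4:12, 1:3) puts April first and March last.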
library(tidyverse)
data %>%
  ggplot(aes(
    # levels are listed April..December, then January..March
    x = factor(month, levels = c(4:12, 1:3)),
    y = value,
    group = group
  )) +
  geom_line() +
  geom_point(size = 3) +
  scale_colour_manual(name = "Membership Type",
                      values = c(casual = '#ff9900', member = '#0099cc')) +
  labs(
    title = "Average Ride Duration By Month",
    x = "Month",
    y = "Average Time",
    subtitle = "April 2020 to March 2021"
  )
Data:
data <- structure(
list(
group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
month = 1:12,
value = c(
22.1462195669301,
34.7227226593532,
11.3724566600285,
37.9181093303487,
11.8698693322949,
17.4393743579276,
21.4907402452081,
17.6362748816609,
29.6476712427102,
36.9375107018277,
25.3655201010406,
20.449273835402
)
),
class = "data.frame",
row.names = c(NA, -12L)
)
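To see why this works, here is a minimal base-R sketch (the names month_num, fiscal_order and month_fct are made up for illustration). A factor keeps its levels in the order given, and a discrete axis is drawn in level order. Passing labels as well would show month abbreviations instead of numbers.

```r
# Month numbers as they might appear in the data (1 = January, ..., 12 = December)
month_num <- 1:12

# Desired axis order: April..December, then January..March
fiscal_order <- c(4:12, 1:3)

# levels fixes the order; labels replaces the numbers with month abbreviations
month_fct <- factor(month_num,
                    levels = fiscal_order,
                    labels = month.abb[fiscal_order])

levels(month_fct)      # the order in which a discrete axis would be drawn
as.integer(month_fct)  # axis position of each original month (January is 10th)
```

In the plot above, the same idea would mean adding labels = month.abb[c(4:12, 1:3)] inside the factor() call.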
CodePudding user response:
If you have the year as well, you could use the lubridate package to create a date and plot that along the x-axis instead. The months should then appear in chronological order, and I think the axis labels look better as well.
Building on Andrew's example, we could do something like this:
library(dplyr)
library(ggplot2)
data %>%
  # combine year and month into a Date (the day defaults to the 1st)
  mutate(date = lubridate::make_date(year = year, month = month)) %>%
  ggplot(aes(
    x = date,
    y = value,
    group = group
  )) +
  geom_line() +
  geom_point(size = 3) +
  scale_colour_manual(name = "Membership Type",
                      values = c(casual = '#ff9900', member = '#0099cc')) +
  labs(
    title = "Average Ride Duration By Month",
    x = "Date",
    y = "Average Time",
    subtitle = "April 2020 to March 2021"
  )
Data:
data <- structure(
list(
group = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
month = 1:12,
year = c(rep(2021, 3), rep(2020, 9)),
value = c(
22.1462195669301,
34.7227226593532,
11.3724566600285,
37.9181093303487,
11.8698693322949,
17.4393743579276,
21.4907402452081,
17.6362748816609,
29.6476712427102,
36.9375107018277,
25.3655201010406,
20.449273835402
)
),
class = "data.frame",
row.names = c(NA, -12L)
)
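If lubridate is not available, the same kind of date column can be built in base R. The sketch below assumes year and month columns like those in the data above (the small data frame df is invented for illustration) and sorts by date to show the chronological order a date axis would use. With ggplot2, scale_x_date(date_breaks = "1 month", date_labels = "%b %Y") could then be added to format the tick labels.

```r
# Toy data: months 1-3 belong to 2021, months 4-12 to 2020
df <- data.frame(
  month = 1:12,
  year  = c(rep(2021, 3), rep(2020, 9))
)

# Build the first day of each month as a Date (a base-R stand-in for make_date)
df$date <- as.Date(sprintf("%d-%02d-01", df$year, df$month))

# Sorting by date puts April 2020 first and March 2021 last
df <- df[order(df$date), ]
format(df$date, "%Y-%m")
```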