I'm working with a bike share dataset that I've named "all_rides_v02".
The relevant columns are day_of_the_week (self-explanatory) and member_casual (each ride is logged as either casual or member):
$ ride_id <chr> "99103BB87CC6C1BB", "EAFCCCFB0A3FC5A1", "9EF4F46C57AD23…
$ rideable_type <chr> "electric_bike", "electric_bike", "electric_bike", "ele…
$ member_casual <chr> "member", "member", "member", "member", "member", "memb…
$ date <date> 2021-08-10, 2021-08-10, 2021-08-21, 2021-08-21, 2021-0…
$ month <chr> "08", "08", "08", "08", "08", "08", "08", "08", "08", "…
$ day <chr> "10", "10", "21", "21", "19", "19", "19", "13", "17", "…
$ year <chr> "21", "21", "21", "21", "21", "21", "21", "21", "21", "…
$ day_of_the_week <chr> "Tuesday", "Tuesday", "Saturday", "Saturday", "Thursday…
I'm trying to create a line graph with two lines, where one line represents "member rides" and the other represents "casual rides". The x-axis would be day_of_the_week and the y-axis would be the number of rides (which is not explicitly logged in the dataset).
Any advice?
ggplot(data=all_rides_v02) +
  geom_line(aes(x=day_of_the_week, y=value, color=as.factor(member_casual))) +
  geom_line() +
  geom_point()
I could probably post a dozen ways I've done it incorrectly. The main issue I keep running into is that I don't know how to work around not having the "y value". I just want it to be the number of rides.
CodePudding user response:
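You need to aggregate your data first: the number of rides is just the number of rows for each combination of day and rider type. If you're using the full tidyverse, you can do:
library(tidyverse)
all_rides_v02 %>%
  # count rides per weekday and rider type
  group_by(day_of_the_week, member_casual) %>%
  summarise(count = n(), .groups = "drop") %>%
  ggplot() +
  # group = member_casual draws one line per rider type across the discrete x-axis
  geom_line(aes(x = day_of_the_week, y = count,
                colour = member_casual, group = member_casual))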
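To try the aggregate-then-plot idea without the full dataset, here is a minimal sketch on a made-up stand-in. The names toy_rides, weekday_levels, ride_counts and p are hypothetical and not part of the original data; only the two column names are taken from the question. It also converts day_of_the_week to a factor, on the assumption that you want the days in calendar order rather than the alphabetical order a character column would give, and it sets the group aesthetic so that geom_line() connects points across a discrete x-axis.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for all_rides_v02: made-up rows, same two columns.
toy_rides <- tibble(
  day_of_the_week = c("Monday", "Monday", "Tuesday", "Tuesday", "Tuesday", "Sunday"),
  member_casual   = c("member", "casual", "member", "member", "casual", "casual")
)

# Calendar order for the x-axis (assumed week start: Monday).
weekday_levels <- c("Monday", "Tuesday", "Wednesday", "Thursday",
                    "Friday", "Saturday", "Sunday")

ride_counts <- toy_rides %>%
  mutate(day_of_the_week = factor(day_of_the_week, levels = weekday_levels)) %>%
  # count() is shorthand for group_by() + summarise(count = n())
  count(day_of_the_week, member_casual, name = "count")

p <- ggplot(ride_counts,
            aes(x = day_of_the_week, y = count,
                colour = member_casual, group = member_casual)) +
  geom_line() +
  geom_point() +
  labs(x = "Day of the week", y = "Number of rides", colour = "Rider type")

print(p)
```

Swapping toy_rides for all_rides_v02 should give the same kind of plot on the real data, provided day_of_the_week only contains the seven English day names used in weekday_levels.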