I'm new to R and would like help understanding why my code behaves the way it does. I've started doing the Tidy Tuesday projects to learn R, so that is where the data comes from.
Goal: Sort my bar graph by which country's runners had the most first-place finishes, and display only the top 10 countries.
Thought process: As I picture it, R would add up each occurrence of a country and save the total in a variable.
So my first attempt is returning this:
The top_n call is something I found by googling. If I take it out, the plot looks right, but it is not limited to the top ten.
Questions:
- Am I using reorder correctly to control the order of nationalities?
- What is the best way to limit which results are shown?
- Where exactly in the code is each nationality being counted? I think it is in the sum, but I'm not 100% sure. Most examples I've found use it for numerical values, not strings, and that has me a bit confused.
library(tidyverse)
library(ggplot2)
library(readr)
library(dplyr)
ultra_rankings <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-26/ultra_rankings.csv')
race <- readr::read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2021/2021-10-26/race.csv')
ultra_rankings %>%
  filter(rank == '1') %>% #Only looks at rows that have a first place finish
  top_n(10, nationality) %>% #I think this is what is throwing me off
  ggplot(aes(x = reorder(nationality, -rank, sum), y = rank)) +
  geom_bar(stat = "identity") +
  labs(title = "First Place Rankings by Country", caption = "Data from runrepeat.com") +
  theme(plot.title = element_text(hjust = .5)) +
  ylab("Total First Place Finishes") +
  xlab("Runner Nationalities")
Answer:
Try this:
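gt <- ultra_rankings %>%
  filter(rank == 1) %>%    # first-place finishes only
  count(nationality) %>%   # number of rows per nationality, stored in column n
  arrange(-n) %>%          # most wins first
  head(10)                 # keep the top ten
Then convert nationality to a factor whose levels follow this sorted order. Otherwise ggplot orders the x axis alphabetically rather than by count: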
gt$nationality <- factor(gt$nationality, levels = unique(gt$nationality))
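To see why the factor conversion matters, here is a small sketch in base R. The `toy` data frame and its counts are made up for illustration and are not taken from the Tidy Tuesday data.

```r
# Made-up counts, already sorted from most to fewest wins (illustrative values only)
toy <- data.frame(
  nationality = c("USA", "FRA", "GBR"),
  n = c(30, 20, 10),
  stringsAsFactors = FALSE
)

# Default factor(): levels are sorted alphabetically, so the sort by n is lost
default_levels <- levels(factor(toy$nationality))

# Explicit levels: unique() keeps the row order, so the levels follow the sort by n
toy$nationality <- factor(toy$nationality, levels = unique(toy$nationality))
ordered_levels <- levels(toy$nationality)
```

ggplot lays out a discrete axis in level order, so the second version is what keeps the bars sorted by count.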
Now it can be plotted:
ggplot(data = gt, aes(x = nationality, y = n)) +
  geom_bar(stat = "identity")
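On the three questions themselves: in the original attempt, `top_n(10, nationality)` ranks rows by the alphabetical value of `nationality`, not by how often each country appears, which is why the plot looks wrong. The counting there happens implicitly: `reorder(..., sum)` adds up `-rank` (that is, -1 per winner) to order the bars, and `geom_bar(stat = "identity")` stacks one unit of `rank` per row to set their height. Below is a minimal sketch that makes each step explicit. It runs on a made-up `toy_rankings` data frame standing in for `ultra_rankings`, and the column name `wins` is chosen here for illustration.

```r
library(dplyr)
library(ggplot2)

# Hypothetical stand-in for ultra_rankings: one row per finisher
toy_rankings <- data.frame(
  rank = c(1, 1, 1, 1, 1, 1, 2, 3),
  nationality = c("USA", "FRA", "USA", "GBR", "FRA", "USA", "ESP", "USA")
)

top_countries <- toy_rankings %>%
  filter(rank == 1) %>%                      # first-place finishes only
  count(nationality, name = "wins") %>%      # this is where each nationality is tallied
  slice_max(wins, n = 2, with_ties = FALSE)  # top 2 here; use n = 10 on the real data

# reorder() sorts the bars by descending wins, so no manual factor() step is needed
p <- ggplot(top_countries, aes(x = reorder(nationality, -wins), y = wins)) +
  geom_col() +                               # equivalent to geom_bar(stat = "identity")
  labs(x = "Runner Nationalities", y = "Total First Place Finishes")
p
```

With the real data, replacing `toy_rankings` with `ultra_rankings` and setting `n = 10` should give the top ten countries, assuming `rank` is read in as a number.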