I'm new to programming and I've run into a problem I can't solve, so I was hoping you could help.
I'm working with the FIFA 19 data set. See a section of the data set below:
  Name              Nationality Overall Club                Balance
1 L. Messi          Argentina        94 FC Barcelona             95
2 Cristiano Ronaldo Portugal         94 Juventus                 70
3 Neymar Jr         Brazil           92 Paris Saint-Germain      84
4 De Gea            Spain            91 Manchester United        43
5 K. De Bruyne      Belgium          91 Manchester City          77
6 E. Hazard         Belgium          91 Chelsea                  94
What I want is a world map for a single club: I enter the club name via the readline(prompt) function, and the map then shows where in the world that club's players come from (that is, the count of each nationality within the club, if that makes sense).
I saw on the following website how I could make a worldmap of this data set.
The code for making a world map of total player count (so for the whole data set), is the following. It gives me the right output.
library(tidyverse)  # dplyr and ggplot2; map_data() also needs the maps package

# One row per nationality with player count and average ratings
overall_data <- fifa %>%
  group_by(Nationality) %>%
  summarise(Count = n(),
            Avg_Overall = mean(Overall),
            Avg_Potential = mean(Potential),
            Avg_Pot_Diff = mean(Potential - Overall))
worldmap = map_data("world")
# all.x = TRUE keeps every country polygon, even those without players
merged_data <- merge(x = worldmap, y = overall_data, by.x = "region", by.y = "Nationality", all.x = TRUE) %>%
  arrange(order)
ggplot(data = merged_data, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = Count)) +
  labs(fill = 'Total Player Counts')
Namely, this world map: [image: Worldmap Total Player Count]
But this does not let me make a separate map for each club. So I built a separate data frame that counts, for each club, how many players of each nationality it has. I made a crosstab using the crosstab() function from this source: http://pcwww.liv.ac.uk/~william/R/crosstab.r. This is the code I used:
nat_per_club <- crosstab(fifa, row.vars = "Nationality", col.vars = "Club", type ="f")
nat_per_club_crosstab <- nat_per_club$crosstab
nat_per_club_dataframe <- data.frame(nat_per_club_crosstab)
nat_per_club_dataframe <- nat_per_club_dataframe[!(nat_per_club_dataframe$Freq==0), ]
This gives me the following table:
    Nationality Club                Freq
8   Armenia     SSV Jahn Regensburg    1
60  Germany     SSV Jahn Regensburg   19
87  Kosovo      SSV Jahn Regensburg    1
94  Lithuania   SSV Jahn Regensburg    1
104 Morocco     SSV Jahn Regensburg    1
121 Poland      SSV Jahn Regensburg    1
This is exactly what I want, as it shows the frequency per club and per nationality. Now the question is: how can I combine this with the world map and the readline(prompt) function, so that if I enter e.g. SSV Jahn Regensburg, it shows me a world map of where their players come from?
This is what I tried but it unfortunately doesn't work.
worldmap2 = map_data("world")
merged_data2 <- merge(x = worldmap2, y = nat_per_club_dataframe, by.x = "region", by.y = "Nationality", all.x = TRUE) %>%
  arrange(order)
ggplot(data = merged_data2, aes(x = long, y = lat, group = group)) +
  geom_polygon(aes(fill = Freq)) +
  labs(fill = 'Total Player Count')
players_from_where <- function() {
  club_name <- as.character(readline(prompt = "Club: "))
  for (i in 1:nrow(nat_per_club_dataframe)) {
    if (nat_per_club_dataframe[i, "Club"] == club_name) {
      ggplot(data = merged_data2, aes(x = long, y = lat, group = group)) +
        geom_polygon(aes(fill = Freq)) +
        labs(fill = 'Total Player Counts')
    }
  }
}
I hope someone can help!
Answer:
There is no need for a for loop. You could simply filter your aggregated dataset using dplyr::filter or subset for the inputted club name, merge it to the map data and plot.
library(tidyverse)

players_from_where <- function(club_name) {
  # Keep only the rows for the requested club
  x <- filter(nat_per_club_dataframe, Club %in% club_name)
  worldmap <- map_data("world")
  # all.x = TRUE keeps every country; those without players get NA
  merged_data2 <- merge(x = worldmap, y = x, by.x = "region", by.y = "Nationality", all.x = TRUE) %>%
    arrange(order)
  ggplot(data = merged_data2, aes(x = long, y = lat, group = group)) +
    geom_polygon(aes(fill = n)) +
    labs(fill = "Total Player Counts")
}

# club_name <- as.character(readline(prompt = "Club: "))
club_name <- "FC Barcelona"
players_from_where(club_name)
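Note that the function above fills by a column called n, while the crosstab-based data frame in the question has a column called Freq. As a rough sketch of how the same idea could be wired to readline() and made independent of that column name, something like the following might work. The names plot_club_map, ask_and_plot, count_col and sample_counts are hypothetical and only for illustration; the sample rows are the ones shown in the question.

```r
library(dplyr)
library(ggplot2)  # map_data("world") also needs the maps package installed

# Sample rows copied from the table in the question
sample_counts <- data.frame(
  Nationality = c("Armenia", "Germany", "Kosovo", "Lithuania", "Morocco", "Poland"),
  Club = "SSV Jahn Regensburg",
  Freq = c(1L, 19L, 1L, 1L, 1L, 1L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: build the map for one club.
# `counts` needs columns Nationality and Club plus a count column,
# whose name is passed in `count_col` ("Freq" here, "n" in the answer's data).
plot_club_map <- function(counts, club_name, count_col = "Freq") {
  club_counts <- counts[counts$Club == club_name, , drop = FALSE]
  if (nrow(club_counts) == 0) {
    stop("No players found for club: ", club_name)
  }
  worldmap <- map_data("world")
  # Keep every country polygon; countries without players get NA
  merged <- merge(worldmap, club_counts,
                  by.x = "region", by.y = "Nationality", all.x = TRUE) %>%
    arrange(order)
  ggplot(merged, aes(x = long, y = lat, group = group)) +
    geom_polygon(aes(fill = .data[[count_col]])) +
    labs(fill = "Total Player Counts", title = club_name)
}

# Hypothetical interactive wrapper: a ggplot created inside a function
# is only drawn when it is returned to the console or printed explicitly.
ask_and_plot <- function(counts, count_col = "Freq") {
  club_name <- readline(prompt = "Club: ")
  print(plot_club_map(counts, club_name, count_col))
}

p <- plot_club_map(sample_counts, "SSV Jahn Regensburg")
# ask_and_plot(sample_counts)  # run this in an interactive session
```

Splitting the plotting from the prompt keeps the map function usable in scripts, where readline() returns an empty string without waiting for input. The country names in the data must match the region names in map_data("world") for the fill to show up.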
DATA
nat_per_club_dataframe <- structure(list(Nationality = c(
"Argentina", "Belgium", "Brazil",
"Chile", "Croatia", "France", "Germany", "Netherlands", "Portugal",
"Senegal", "Spain", "Uruguay", "Austria", "Brazil", "Colombia",
"France", "Germany", "Netherlands", "Poland", "Portugal", "Spain"
), Club = c(
"FC Barcelona", "FC Barcelona", "FC Barcelona", "FC Barcelona",
"FC Barcelona", "FC Barcelona", "FC Barcelona", "FC Barcelona",
"FC Barcelona", "FC Barcelona", "FC Barcelona", "FC Barcelona",
"FC Bayern München", "FC Bayern München", "FC Bayern München",
"FC Bayern München", "FC Bayern München", "FC Bayern München",
"FC Bayern München", "FC Bayern München", "FC Bayern München"
), n = c(
1L, 1L, 4L, 1L, 1L, 3L, 1L, 1L, 1L, 1L, 17L, 1L, 1L,
1L, 1L, 3L, 18L, 1L, 1L, 1L, 2L
)), class = "data.frame", row.names = c(
NA,
-21L
))
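The sample data above has a count column named n, which is the name dplyr::count() gives its result by default. Assuming the data was built that way (the answer does not say so), a minimal sketch of producing such a table without the external crosstab() script could look like this. The fifa_sample data frame is a hypothetical stand-in built from the six rows shown in the question.

```r
library(dplyr)

# Stand-in for the full fifa data set, using the rows shown in the question
fifa_sample <- data.frame(
  Name = c("L. Messi", "Cristiano Ronaldo", "Neymar Jr",
           "De Gea", "K. De Bruyne", "E. Hazard"),
  Nationality = c("Argentina", "Portugal", "Brazil",
                  "Spain", "Belgium", "Belgium"),
  Club = c("FC Barcelona", "Juventus", "Paris Saint-Germain",
           "Manchester United", "Manchester City", "Chelsea"),
  stringsAsFactors = FALSE
)

# One row per observed Club/Nationality pair; the count lands in column n.
# Pairs that never occur are not listed, so no zero rows need removing.
nat_per_club_counts <- count(fifa_sample, Club, Nationality)
```

With the real data set, count(fifa, Club, Nationality) would give a data frame that can be passed to players_from_where() as written, since that function fills by n.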