I am working with football data from 2010 to 2020 for teams and I want to make an excel file with data that has the index of the share of countries in a particular team in a particular year. Currently I am using R with the code :
data_collect <- ARS_data %>%
filter(Team == "ARS") %>%
filter(Year == 2010) %>%
count(Nation)
sum(data_collect$n)
After this, I am manually inputting data into the excel sheet based on the table created. This is the data that I am using :
I want to find a way in R that I can use to automate this process and then export the result back into excel.
My results look like this right now:
I would really appreciate any help with this!
CodePudding user response:
Using some mock data as example data you could achieve your desired result like so:
- Use
count
to get the counts by team, year and nation - Use
add_count
to add the total number of players per team and year - Use
pivot_wider
to convert to wide format
library(dplyr)
library(tidyr)
data_collect <- ARS_data %>%
count(Team, Year, Nation) %>%
add_count(Team, Year, wt = n, name = "#players") %>%
pivot_wider(names_from = Nation, values_from = n, values_fill = 0)
data_collect
#> # A tibble: 3 × 8
#> Team Year `#players` ESP FRA GER ITA ENG
#> <chr> <int> <int> <int> <int> <int> <int> <int>
#> 1 ARS 2010 9 2 4 1 2 0
#> 2 ARS 2011 6 4 0 2 0 0
#> 3 ARS 2012 5 1 1 2 0 1
There are several options to export to excel. Using the writexl
package you could do:
writexl::write_xlsx(list(mysheet = data_collect), path = tempfile(fileext = ".xlsx"))
Simply replace path
by your desired filename and mysheet
by your desired sheet name.
DATA
set.seed(42)
ARS_data <-
data.frame(
Team = "ARS",
Year = sample(2010:2012, 20, replace = TRUE),
Nation = sample(c("ENG", "GER", "ITA", "ESP", "FRA"), 20, replace = TRUE)
)