I have a dataframe with the following structure:
structure(list(year = c(2000, 2000, 2000, 2000), country = c("Argentina",
"Argentina", "Bolivia", "Bolivia"), sex = c("female", "male",
"female", "male"), n = c(559L, 732L, 429L, 401L), prop = c(0.432997676219985,
0.567002323780015, 0.516867469879518, 0.483132530120482)), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), row.names = c(NA, -4L), groups = structure(list(
year = c(2000, 2000), country = c("Argentina", "Bolivia"),
.rows = structure(list(1:2, 3:4), ptype = integer(0), class = c("vctrs_list_of",
"vctrs_vctr", "list"))), class = c("tbl_df", "tbl", "data.frame"
), row.names = c(NA, -2L), .drop = TRUE))
It gives me the number of men and women by country and year in a particular country, and I would like to generate a table with the sex ratio, using this info: so dividing the number of men/women * 100
I am trying to do it with summarise in dplyr, but I am new to R and it still hasn't worked. I generated the following table with Excel, but could anyone help to generate it in R? Thank you!
structure(list(...1 = c("Argentina", "Bolivia"), `2000` = c(130.948121645796,
93.4731934731935), `2001` = c(124.281984334204, 90.5963302752293
), `2002` = c(143.857331571995, 90.3614457831325), `2003` = c(128.766519823789,
100.762388818297), `2004` = c(127.245508982036, 93.2367149758454
), `2005` = c(120.72027627035, 95.7905544147844), `2006` = c(156.038072575848,
108.026440037771), `2007` = c(154.332700822264, 99.5266272189349
), `2008` = c(148.448385053832, 107.915921288014), `2009` = c(157.430730478589,
102.793885081708), `2010` = c(176.933333333333, 103.193430656934
), `2011` = c(163.636363636364, 94.1605839416058), `2012` = c(158.337689492943,
95.0551813100186), `2013` = c(171.422121896163, 87.2223764575236
), `2014` = c(150.452830188679, 87.1960770959613), `2015` = c(151.637107776262,
84.2784842979496), `2016` = c(149.716016660356, 87.1463077984817
), `2017` = c(147.160940325497, 91.5987882867721), `2018` = c(145.516962843296,
91.933564935992), `2019` = c(156.911764705882, 93.1910097460717
), `2020` = c(141.142557651992, 86.5374317232516), `2021` = c(150.355618776671,
81.2922614575507)), row.names = c(NA, -2L), class = c("tbl_df",
"tbl", "data.frame"))
CodePudding user response:
We can summarise
after grouping and then reshape to 'wide' with pivot_wider
library(dplyr)
library(tidyr)
df1 %>%
group_by(year, country) %>%
summarise(prop = 100 * last(n)/first(n), .groups = 'drop') %>%
pivot_wider(names_from = year, values_from = prop, values_fill = 0)