I am trying to use pivot_wider inside a function. However, I cannot access the new columns once I have changed the format. I get the error message that columns a and b don't exist.
This is my function:
dip <- function(df, a, b, year_min) {
wow <- df %>%
filter(country %in% c("a","b")) %>%
filter(date >= year_min) %>%
group_by(rcid) %>%
pivot_wider(id_cols = "rcid",
names_from = c("a","b"),
values_from = "vote") %>%
na.omit()
percentage <- sum(wow$a == wow$b)*100/nrow(wow_df)
return(percentage)
}
dip(joined_df, "United States", "Germany", 1990-01-01)
When I try to run the code without the function, it works:
joined_df <- un_votes %>%
inner_join(un_roll_calls, by = "rcid")
wow_df <- joined_df %>%
filter(country == "United States" | country == "Germany") %>%
filter(date >= "1990-01-01") %>%
group_by(rcid) %>%
pivot_wider(id_cols = "rcid",
names_from = country,
values_from = "vote") %>%
na.omit()
percentage <- sum(wow_df$"Germany" == wow_df$"United States")*100/nrow(wow_df)
I am fairly new to writing functions and would appreciate any tips. The data I'm using is from the unvotes package.
This is the error message I get:
Error in chr_as_locations():
! Can't subset columns that don't exist.
✖ Column a doesn't exist.
Backtrace:
- global dip(joined_df, "United States", "Germany", 1990 - 1 - 1)
- tidyr:::pivot_wider.data.frame(...)
- tidyr::build_wider_spec(...)
- tidyselect::eval_select(enquo(names_from), data)
- tidyselect:::eval_select_impl(...) ...
- tidyselect:::reduce_sels(node, data_mask, context_mask, init = init)
- tidyselect:::walk_data_tree(new, data_mask, context_mask)
- tidyselect:::as_indices_sel_impl(...)
- tidyselect:::as_indices_impl(x, vars, call = call, strict = strict)
- tidyselect:::chr_as_locations(x, vars, call = call)
Error in chr_as_locations(x, vars, call = call)
CodePudding user response:
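Here we need to unquote a and b inside the function: c("a", "b") is a vector of the literal strings "a" and "b", whereas c(a, b) uses the values passed in. For the same reason, names_from should point at the country column rather than at "a" and "b", which are not columns of df. In addition, replace $ with [[, because wow$a looks for a column literally named a, while wow[[a]] looks up the column whose name is stored in a. Two smaller fixes: pass the date as a quoted string (unquoted, 1990-01-01 is evaluated as arithmetic, as the 1990 - 1 - 1 in the backtrace shows), and divide by nrow(wow) instead of the global wow_df.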
dip <- function(df, a, b, year_min) {
wow <- df %>%
filter(country %in% c(a, b)) %>%
filter(date >= as.Date(year_min)) %>%
pivot_wider(id_cols = "rcid",
names_from = country,
values_from = "vote") %>%
na.omit()
percentage <- sum(wow[[a]] == wow[[b]])*100/nrow(wow)
return(percentage)
}
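To see why $ had to go, here is a minimal sketch of the two lookups. The toy data frame and its values are made up for illustration and are not part of the unvotes data; only base R is used.

```r
# Hypothetical toy data, only to illustrate how `$` and `[[` resolve names
toy <- data.frame("United States" = c(1, 2, 3),
                  "Germany"       = c(1, 5, 3),
                  check.names = FALSE)

a <- "United States"
b <- "Germany"

# `$` takes the name literally: it looks for a column called "a"
is.null(toy$a)    # TRUE, there is no such column

# `[[` evaluates its argument, so the string stored in `a` is used
toy[[a]]          # the "United States" column

# Same calculation as in dip(), on the toy data
agree <- sum(toy[[a]] == toy[[b]]) * 100 / nrow(toy)
```

The same rule is why wow$a == wow$b fails inside the function even after the pivot works: the wide table has columns named after the countries, not a and b.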
-testing
dip(joined_df, "United States", "Germany", "1990-01-01")
[1] 33.33333
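Note that the date is passed as a string here. A small sketch of what happens otherwise, using only base R (the variable names are just for illustration):

```r
# Without quotes, R parses the date as a subtraction: 1990 - 1 - 1
year_min_bad <- 1990-01-01
year_min_bad                      # 1988, a plain number

# With quotes, as.Date() turns the string into a proper Date
year_min_ok <- as.Date("1990-01-01")
class(year_min_ok)                # "Date"

# Comparisons against a Date column then behave as expected
as.Date("2005-05-01") >= year_min_ok   # TRUE
as.Date("1985-05-01") >= year_min_ok   # FALSE
```

Comparing a Date column against the bare number would not raise an error, which makes this easy to miss: the number is treated as a count of days, so the filter would silently use the wrong cutoff.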
data
joined_df <- structure(list(rcid = c(1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L,
1L, 2L, 3L, 4L, 5L), country = c("United States", "United States",
"United States", "United States", "United States", "Germany",
"Germany", "Germany", "Germany", "Germany", "Spain", "Spain",
"Spain", "Spain", "Spain"), date = structure(c(12904, 8066, -25932,
8401, 12843, 11231, 6971, 10470, 9251, 13787, 14304, 14396, 13361,
11566, 8126), class = "Date"), vote = c(73, 88, 25, 73, 76, 73,
91, 25, 88, 31, 45, 34, 80, 66, 60)), row.names = c(NA, -15L),
class = "data.frame")