Let's take this hypothetical code for instance:
```{r}
dataset_custom <- function(top, dataset, variable) {
  {{dataset}} %>%
    count({{variable}}) %>%
    top_n(top, n) %>%
    arrange(-n) %>%
    left_join({{dataset}}, by = "{{variable}}")
}
```
I know this will return an error when I try to run, say, `dataset_custom(5, dataset, variable)`, because of the `by = "{{variable}}"` in `left_join`. How do I get around this issue?
I know that when you left join by a particular variable, you write `by = "variable"`, with `variable` in quotation marks. But how do I do this inside a function, where the name in quotation marks should change depending on the input to the function I'm trying to create?
Thank you!
CodePudding user response:
It is useful if you provide some toy data, like the data found in the examples of `?left_join`. Note that `left_join(df1, df1)` is just `df1`. Instead, we can use a second data argument.
```{r}
library(dplyr)

df1 <- tibble(x = 1:3, y = c("a", "a", "b"))
df2 <- tibble(x = c(1, 1, 2), z = c("first", "second", "third"))

df1 %>% left_join(df2, by = "x")

f <- function(data, data2, variable) {
  # Capture the unevaluated argument and turn it into a string, e.g. x -> "x"
  var <- deparse(substitute(variable))
  data %>%
    count({{ variable }}) %>%
    arrange(-n) %>%
    left_join(data2, by = var)
}

f(df1, df2, x)
#>       x     n z
#>   <dbl> <int> <chr>
#> 1     1     1 first
#> 2     1     1 second
#> 3     2     1 third
#> 4     3     1 NA

# and
f(df2, df1, x)
#>       x     n y
#>   <dbl> <int> <chr>
#> 1     1     2 a
#> 2     2     1 a
```
For this to work we need a defusing operation: `deparse(substitute(variable))` captures the argument as it was typed and converts it to a string, instead of evaluating it. Figuratively speaking, using `{{ }}` as the `by` argument is like using a hammer instead of sandpaper for polishing things: it is a forcing operation where none should happen.
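The same defusing step can also be written with rlang instead of base R. The following is only a minimal sketch, assuming the rlang package is installed (dplyr depends on it) and that the caller always passes a bare column name; `f2` is a hypothetical name used for illustration.

```{r}
library(dplyr)

df1 <- tibble(x = 1:3, y = c("a", "a", "b"))
df2 <- tibble(x = c(1, 1, 2), z = c("first", "second", "third"))

# f2 is a hypothetical variant of f(): the defusing is done with rlang.
f2 <- function(data, data2, variable) {
  # ensym() captures the bare name as a symbol without evaluating it;
  # as_name() converts that symbol to a string, e.g. x -> "x".
  var <- rlang::as_name(rlang::ensym(variable))
  data %>%
    count({{ variable }}) %>%   # {{ }} is still fine where data masking applies
    arrange(desc(n)) %>%
    left_join(data2, by = var)  # `by` expects a plain character string
}

res <- f2(df1, df2, x)
res
```

One possible reason to prefer `ensym()` here is that it raises an error if the argument is not a simple name (for example `x + 1`), whereas `deparse(substitute())` would quietly produce the string `"x + 1"`.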