I am writing a function in R in which I'd like to use data masking, so that a column name can be passed in unquoted. I have read Programming with dplyr and understand how to use my variable inside the function by embracing it. However, I also want to do a join, which requires a string to be passed to it.
The following code is an MWE of what I'm trying to do, and it works. However, the join keys are hard-coded, so it won't work if I want to join by a different variable.
How can I use data-masking to get at the variables easily, but then convert it to a string so it can be used in the join?
Thank you!
library(dplyr)

dat1 <- tibble::tibble(dat1.v1 = 1:10, dat1.v2 = 101:110)
dat2 <- tibble::tibble(dat2.var1 = 1:10, dat2.var2 = 1001:1010)

my.func <- function(df1, df2, my.var){
  df1 <- df1 %>%
    mutate("{{my.var}}.plus.one" := {{my.var}} + 1)
  # The join keys are hard-coded here; this is the part I'd like to generalise
  left_join(df2, df1, by=c("dat2.var1" = "dat1.v1"))
}

my.func(dat1, dat2, dat1.v1)
CodePudding user response:
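I don't know of a clean way to do this natively with the by argument of a join.
As a workaround, you can rename the key column in df2 to match my_var, so that the join picks up the shared column name on its own. This approach works for me: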
library(tidyverse)

dat1 <- tibble::tibble(dat1.v1 = 1:10, dat1.v2 = 101:110)
dat2 <- tibble::tibble(dat2.var1 = 1:10, dat2.var2 = 1001:1010)

my_func <- function(df1, df2, my_var){
  # Rename the key in df2 so both tables share the column name passed as my_var
  df2 <- df2 %>%
    rename({{ my_var }} := dat2.var1)
  # With no by argument, the join uses the common column and prints a message
  df1 <- df1 %>%
    mutate("{{ my_var }}.plus.one" := {{ my_var }} + 1) %>%
    right_join(df2)
  df1
}

my_func(dat1, dat2, dat1.v1)
#> Joining, by = "dat1.v1"
#> # A tibble: 10 × 4
#>    dat1.v1 dat1.v2 dat1.v1.plus.one dat2.var2
#>      <int>   <int>            <dbl>     <int>
#>  1       1     101                2      1001
#>  2       2     102                3      1002
#>  3       3     103                4      1003
#>  4       4     104                5      1004
#>  5       5     105                6      1005
#>  6       6     106                7      1006
#>  7       7     107                8      1007
#>  8       8     108                9      1008
#>  9       9     109               10      1009
#> 10      10     110               11      1010
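If you would rather keep the by argument, one possible alternative is to capture the embraced argument with rlang::enquo() and turn it into a string with rlang::as_name(). The sketch below is only that, a sketch: the function name my_func_by and the local variable key are made up for illustration, and the df2 key ("dat2.var1") is still hard-coded as in the question.

```r
library(dplyr)

dat1 <- tibble::tibble(dat1.v1 = 1:10, dat1.v2 = 101:110)
dat2 <- tibble::tibble(dat2.var1 = 1:10, dat2.var2 = 1001:1010)

# Hypothetical variant of the function in the question
my_func_by <- function(df1, df2, my_var) {
  # Capture the unquoted column and convert its name to a string
  key <- rlang::as_name(rlang::enquo(my_var))

  df1 <- df1 %>%
    mutate("{{ my_var }}.plus.one" := {{ my_var }} + 1)

  # Names are the key columns in df2 (left table), values those in df1
  left_join(df2, df1, by = c("dat2.var1" = key))
}

res <- my_func_by(dat1, dat2, dat1.v1)
```

Because the key columns have different names, the result keeps the df2 key (dat2.var1) rather than dat1.v1, which differs from the rename-based output above.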