I am working in R.
I have two data sets - data1 and data2.
data1 <- data.frame(region_name = c("North", "North", "West"),
                    type = c("big", "small", "big"),
                    gamma_rate = 7:9)
data2 <- data.frame(region_name = c("West", "West", "East"),
                    type = c("small", "big", "big"),
                    beta_rate = 7:9)
Both of these data sets have columns called "region_name" and "type" in them.
I want to left_join data2 onto data1, by "region_name" and "type", but within a function. If I were doing it without a function, this would be my code:
data_final <- data1 %>%
left_join(data2, by = c("region_name" = "region_name", "type" = "type"))
This is my function:
my_function <- function(group1, group2) {
data_final <- data1 %>%
left_join(data2, by = c({{group1}} = {{group1}}, {{group2}} = {{group2}}))
}
output <- my_function(region_name, type)
I know the bit in the "by" argument is incorrect in the function. Can anyone help me work out how to correct it?
This seems similar: join datasets using a quosure as the by argument
But it looks like it is just for one join variable?
CodePudding user response:
If you truly want it as a function, I would pass your data in as well. The function then becomes universal, and you can use it no matter what your two tables are called.
data1 <- data.frame(region_name = c("North", "North", "West"),
type = c("big", "small", "big"),
gamma_rate=7:9)
data2<- data.frame(region_name = c("West", "West", "East"),
type= c("small", "big", "big"),
beta_rate=7:9)
library(dplyr)
# join: character vector of column names present in both data frames
my_function <- function(join, df1, df2) {
  df1 %>% left_join(df2, by = join)
}
my_function(data1, data2, join = c("region_name", "type"))
region_name type gamma_rate beta_rate
1 North big 7 NA
2 North small 8 NA
3 West big 9 8
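If the key columns were named differently in the two tables, the same idea could be extended by building the named "by" vector that the question uses. This is only a sketch: join_by_names, data2_renamed and its column names (area, size) are hypothetical and not part of the original data.

```r
library(dplyr)

data1 <- data.frame(region_name = c("North", "North", "West"),
                    type = c("big", "small", "big"),
                    gamma_rate = 7:9)

# Hypothetical variant of data2 whose key columns have different names
data2_renamed <- data.frame(area = c("West", "West", "East"),
                            size = c("small", "big", "big"),
                            beta_rate = 7:9)

# keys1: key column names in df1; keys2: matching key column names in df2
join_by_names <- function(df1, df2, keys1, keys2 = keys1) {
  # setNames() builds a named vector: names are df1 columns, values are df2 columns
  left_join(df1, df2, by = setNames(keys2, keys1))
}

result <- join_by_names(data1, data2_renamed,
                        keys1 = c("region_name", "type"),
                        keys2 = c("area", "size"))
```

When keys2 is left out, it defaults to keys1, so the call behaves like the simpler function above.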
CodePudding user response:
library(dplyr)
data1 <- data.frame(region = c("A", "B", "C"),
type = c("A", "B", "C"),
value = c(1, 2, 3))
data2 <- data.frame(region = c("A", "B", "C"),
type = c("A", "D", "C"),
value = c(4, 5, 6))
my_join_function <- function(group1, group2){
data1 %>%
left_join(data2,
by = c(group1, group2))
}
my_join_function('region', 'type')
region | type | value.x | value.y |
---|---|---|---|
A | A | 1 | 4 |
B | B | 2 | NA |
C | C | 3 | 6 |
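If the column names should be passed unquoted, as in the question's my_function(region_name, type), one possible approach is to capture each argument as a symbol and turn it into a string before joining. This is a minimal sketch using the question's data; join_unquoted is a hypothetical name, and it assumes each argument is a bare column name rather than an expression.

```r
library(dplyr)

data1 <- data.frame(region_name = c("North", "North", "West"),
                    type = c("big", "small", "big"),
                    gamma_rate = 7:9)
data2 <- data.frame(region_name = c("West", "West", "East"),
                    type = c("small", "big", "big"),
                    beta_rate = 7:9)

join_unquoted <- function(df1, df2, group1, group2) {
  # ensym() captures the bare column name; as_name() converts it to a string
  keys <- c(rlang::as_name(rlang::ensym(group1)),
            rlang::as_name(rlang::ensym(group2)))
  left_join(df1, df2, by = keys)
}

result <- join_unquoted(data1, data2, region_name, type)
```

Passing strings, as in the answer above, is simpler and is usually enough; capturing symbols only matters if the unquoted calling style is required.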
CodePudding user response:
A data.table solution
library(data.table)
Dummy data
data1 <- data.table(region_name = c('a', 'b')
, type = 1:2
); data1
region_name type
1: a 1
2: b 2
data2 <- data.table(region_name = c('b', 'c')
, type = 2:3
); data2
region_name type
1: b 2
2: c 3
Function
my_function <- function(x)
{
return(data2[data1, on=(x)])
}
Run the function
my_function(c('region_name', 'type'))
region_name type
1: a 1
2: b 2
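The dummy tables above have no value columns, so the effect of the join is hard to see. As a rough sketch, the same pattern can be applied to the question's data, with both tables passed in as arguments; join_dt, dt1 and dt2 are hypothetical names.

```r
library(data.table)

dt1 <- data.table(region_name = c("North", "North", "West"),
                  type = c("big", "small", "big"),
                  gamma_rate = 7:9)
dt2 <- data.table(region_name = c("West", "West", "East"),
                  type = c("small", "big", "big"),
                  beta_rate = 7:9)

# keys: character vector of column names present in both tables
join_dt <- function(left, right, keys) {
  # right[left, on = keys] keeps every row of `left`,
  # filling NA where `right` has no matching row
  right[left, on = keys]
}

result <- join_dt(dt1, dt2, c("region_name", "type"))
```

Here the result keeps all three rows of dt1, and beta_rate is NA for the two "North" rows because dt2 has no matching keys.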