I'm trying to join two data sets in R with a left join inside a function. My main data frame, GE_GC, is passed to the function as "df", and I want to join a data frame called GE_GC_Teacher_Names onto it. However, I would like the "GE_GC" part of that object name to be dynamic, because I have multiple data sets, each with its own set of teacher names that needs to be joined. For example, if the "df" passed to my function were EX_EF, the function would join the EX_EF_Teacher_Names data frame onto the EX_EF data frame.
Q2_Table <- function(df){
df %>% select(contains("Q2_")) %>%
gather(var,value) %>%
group_by(var) %>%
summarise(
Mean = round(mean(as.numeric(value), na.rm = TRUE), 2),
Responses = length(value[!is.na(value)]),
"Very Dissatisfied" = paste0(length(value[value == "1" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "1" & !is.na(value)])/Responses*100), ")"),
"Dissatisfied" = paste0(length(value[value == "2" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "2" & !is.na(value)])/Responses*100), ")"),
"Neutral" = paste0(length(value[value == "3" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "3" & !is.na(value)])/Responses*100), ")"),
"Satisfied" = paste0(length(value[value == "4" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "4" & !is.na(value)])/Responses*100), ")"),
"Very Satisfied" = paste0(length(value[value == "5" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "5" & !is.na(value)])/Responses*100), ")")
) %>%
left_join(
as.name(paste0((deparse(substitute(df))),"_Teacher_Names")), #Here works with imputed df name but trying to dynamically name teacher df
by = 'var'
) %>%
rename("Teacher" = teacher) %>%
select(-var, -value) %>%
relocate(Teacher, .before = Mean)
}
Q2_Output <- Q2_Table(GE_GC)
When trying to run this function I get the following error even though a matching column called "var" is present in the GE_GC and GE_GC_Teacher_Names data frames.
Error in `auto_copy()`:
! `x` and `y` must share the same src.
i set `copy` = TRUE (may be slow).
Run `rlang::last_error()` to see where the error occurred.
The following code works fine when I type the teacher data frame name in manually:
Q2_Table <- function(df){
df %>% select(contains("Q2_")) %>%
gather(var,value) %>%
group_by(var) %>%
summarise(
Mean = round(mean(as.numeric(value), na.rm = TRUE), 2),
Responses = length(value[!is.na(value)]),
"Very Dissatisfied" = paste0(length(value[value == "1" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "1" & !is.na(value)])/Responses*100), ")"),
"Dissatisfied" = paste0(length(value[value == "2" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "2" & !is.na(value)])/Responses*100), ")"),
"Neutral" = paste0(length(value[value == "3" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "3" & !is.na(value)])/Responses*100), ")"),
"Satisfied" = paste0(length(value[value == "4" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "4" & !is.na(value)])/Responses*100), ")"),
"Very Satisfied" = paste0(length(value[value == "5" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "5" & !is.na(value)])/Responses*100), ")")
) %>%
left_join(
GE_GC_Teacher_Names, #Here works with imputed df name but trying to dynamically name teacher df
by = 'var'
) %>%
rename("Teacher" = teacher) %>%
select(-var, -value) %>%
relocate(Teacher, .before = Mean)
}
Q2_Output <- Q2_Table(GE_GC)
So this section is the problem:
left_join(
as.name(paste0((deparse(substitute(df))),"_Teacher_Names")), #Here works with imputed df name but trying to dynamically name teacher df
by = 'var'
)
Any help would be appreciated, thank you.
User response:
I suggest you simplify this a little by making the Teacher frame an argument to the function. This does two things:
- Simplifies your logic, since you no longer rely on the name of an object (and on the assumption that other, similarly named variables exist); and
- Makes the function more functional: its output is derived exclusively from the arguments passed to it, with no inference and no guessing.
library(dplyr)  # select, group_by, summarise, left_join, rename, relocate
library(tidyr)  # pivot_longer

# df:   survey responses with columns named "Q2_..."
# tchr: lookup table with a `var` column matching those column names
Q2_Table <- function(df, tchr) {
  df %>%
    select(contains("Q2_")) %>%
    pivot_longer(everything(), names_to = "var") %>%
    group_by(var) %>%
    summarise(
      Mean = round(mean(as.numeric(value), na.rm = TRUE), 2),
      Responses = length(value[!is.na(value)]),
      "Very Dissatisfied" = paste0(length(value[value == "1" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "1" & !is.na(value)])/Responses*100), ")"),
      "Dissatisfied" = paste0(length(value[value == "2" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "2" & !is.na(value)])/Responses*100), ")"),
      "Neutral" = paste0(length(value[value == "3" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "3" & !is.na(value)])/Responses*100), ")"),
      "Satisfied" = paste0(length(value[value == "4" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "4" & !is.na(value)])/Responses*100), ")"),
      "Very Satisfied" = paste0(length(value[value == "5" & !is.na(value)]), " (", sprintf("%1.0f%%", length(value[value == "5" & !is.na(value)])/Responses*100), ")")
    ) %>%
    # join the teacher lookup table that was passed in explicitly
    left_join(tchr, by = 'var') %>%
    rename("Teacher" = teacher) %>%
    select(-var, -value) %>%
    relocate(Teacher, .before = Mean)
}
Q2_Table(GE_GC, GE_GC_Teacher_Names)
# # A tibble: 1 x 8
#   Teacher  Mean Responses `Very Dissatisfied` Dissatisfied Neutral Satisfied `Very Satisfied`
#     <int> <dbl>     <int> <chr>               <chr>        <chr>   <chr>     <chr>
# 1       1   6.5         6 0 (0%)              0 (0%)       0 (0%)  0 (0%)    0 (0%)
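As an aside, the five category lines inside summarise() all repeat the same count-and-percentage expression. One possible tidy-up is to move that expression into a small helper. The sketch below uses base R only; the name count_pct is my own invention, not something from the question, and it assumes value is a character vector of response codes as in the code above.

```r
# Hypothetical helper: formats "count (percent%)" for one response level.
# `value` is a character vector of responses, `level` the code to count.
count_pct <- function(value, level) {
  responses <- sum(!is.na(value))           # non-missing answers
  n <- sum(value == level, na.rm = TRUE)    # answers equal to `level`
  paste0(n, " (", sprintf("%1.0f%%", n / responses * 100), ")")
}

count_pct(c("1", "2", "1", NA), "1")
# [1] "2 (67%)"
```

Inside summarise() each line would then shrink to something like "Neutral" = count_pct(value, "3"), which should give the same strings as the longer form.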
(Notice that I also shifted from gather to pivot_longer. While it adds nothing here, if you use it more and in more-complicated situations, using this newer function will pay off.)
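To make the gather/pivot_longer swap concrete, here is a minimal side-by-side sketch on a made-up two-column frame (the toy data is an assumption for illustration, not the asker's data). As far as I know the two calls produce the same var/value pairs, only in a different row order, which does not matter once the rows are grouped by var.

```r
library(tidyr)

# Toy data; column names and values are invented for illustration.
toy <- data.frame(Q2_1 = c("4", "5"), Q2_2 = c("3", NA))

# Older interface: stack every column into var/value pairs.
long_old <- gather(toy, var, value)

# Newer interface: same reshaping; the value column is called "value"
# by default, so only names_to needs to be set.
long_new <- pivot_longer(toy, everything(), names_to = "var")

# Same columns and the same number of rows either way.
names(long_old)
names(long_new)
nrow(long_old) == nrow(long_new)
```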
Data
GE_GC <- structure(list(Q2_1 = c("6", "7", "6", "6", "7", "7")), row.names = c(NA, -6L), class = "data.frame")
GE_GC_Teacher_Names <- structure(list(var = c("Q2_1", "Q2_2", "Q2_3", "Q2_4"), value = c("Example", "Example", "Example", "Example"), teacher = 1:4), row.names = c(NA, -4L), class = "data.frame")
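If you still want the name-based lookup from the question, my understanding of the original error is that as.name() only builds a symbol; it never retrieves the data frame, so left_join() is handed something that is not a table. Base R's get() fetches the object by name instead. The sketch below is an illustration under assumptions: lookup_teachers is a hypothetical helper name, the EX_EF toy frames are invented, and the deparse(substitute()) trick only works when the function is called directly with a bare object name (not through a pipe or from inside another function).

```r
# Toy objects standing in for a real pair of frames (values are invented).
EX_EF <- data.frame(Q2_1 = c("6", "7"))
EX_EF_Teacher_Names <- data.frame(var = "Q2_1", teacher = 1L)

# Hypothetical helper: find "<name of df>_Teacher_Names" in the caller's scope.
lookup_teachers <- function(df) {
  caller <- parent.frame()                 # where the function was called from
  df_name <- deparse(substitute(df))       # e.g. "EX_EF"
  # get() returns the object itself; as.name() would return only a symbol.
  get(paste0(df_name, "_Teacher_Names"), envir = caller)
}

tchr <- lookup_teachers(EX_EF)
is.data.frame(tchr)                                # TRUE: a real data frame
is.data.frame(as.name("EX_EF_Teacher_Names"))      # FALSE: just a symbol
```

The result could then be passed on as the tchr argument, e.g. Q2_Table(EX_EF, lookup_teachers(EX_EF)), though passing both frames explicitly remains the less fragile option.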