Using strings as variable names in cor.test in R function

How can I use a string as a variable name inside the cor.test function in R? I am trying to write a general function for running correlation tests.

library(tidyverse)
data(economics)

crtest  <- function(valy, valx){
    attach(economics)
    rlnshp  <- cor.test(!!sym(valy), !!sym(valx))
    detach(economics)
}
crtest("uempmed", "pce")

This gives an error message:

Error in !sym(valy) : invalid argument type

Is there a way to avoid the attach() command and still refer to a column such as economics$pce? paste0 did not work, because the string it generates is not converted to a variable name.

Thanks for your patience.

CodePudding user response：

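sym and !! only work inside functions that support tidy evaluation, such as the dplyr verbs. cor.test is a base R function, so !! is parsed there as two logical negations, which is why the error mentions !sym(valy). Here, it is better to extract the data columns with [[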

crtest <- function(data, valy, valx) {
  cor.test(data[[valy]], data[[valx]])
}

There is no need to use attach/detach, which is discouraged anyway.

-testing

> crtest(economics, "uempmed", "pce")

    Pearson's product-moment correlation

data:  data[[valy]] and data[[valx]]
t = 25.32, df = 572, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6859316 0.7633838
sample estimates:
      cor 
0.7269616
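
If the "data:" line should show the real column names instead of data[[valy]], one option is the formula method of cor.test. This is only a sketch: crtest_formula is a made-up name, and mtcars (with its mpg and wt columns) stands in for economics so the snippet runs without ggplot2.

```r
crtest_formula <- function(data, valy, valx) {
  # reformulate() builds the one-sided formula ~ valy + valx from the strings
  f <- reformulate(c(valy, valx))
  # a formula as first argument dispatches to the formula method of cor.test
  cor.test(f, data = data)
}

res <- crtest_formula(mtcars, "mpg", "wt")
res$data.name   # label taken from the formula terms, not from the call
```

For the original data the call would be crtest_formula(economics, "uempmed", "pce").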

If we want to make use of rlang functions, convert the argument to a symbol and unquote it with !! inside summarise. ensym accepts both unquoted and quoted arguments and converts either to a symbol.

library(dplyr)
crtest <- function(data, valy, valx) {
  data %>%
    summarise(rlnshp = list(cor.test(!!rlang::ensym(valy),
                                     !!rlang::ensym(valx)))) %>%
    pull(rlnshp) %>%
    purrr::pluck(1)
}

-testing

> crtest(economics, uempmed, pce)

    Pearson's product-moment correlation

data:  uempmed and pce
t = 25.32, df = 572, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6859316 0.7633838
sample estimates:
      cor 
0.7269616 

> crtest(economics, "uempmed", "pce")

    Pearson's product-moment correlation

data:  uempmed and pce
t = 25.32, df = 572, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6859316 0.7633838
sample estimates:
      cor 
0.7269616
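
When the arguments are always strings, the .data pronoun is an alternative to ensym and !! inside summarise. The following is a sketch under that assumption; crtest_str is a hypothetical name and mtcars replaces economics so that only dplyr is needed.

```r
library(dplyr)

crtest_str <- function(data, valy, valx) {
  # .data[[name]] looks up a column of the current data frame by its string name
  out <- data %>%
    summarise(rlnshp = list(cor.test(.data[[valy]], .data[[valx]])))
  # the htest object sits in a one-element list column
  out$rlnshp[[1]]
}

res <- crtest_str(mtcars, "mpg", "wt")
res$estimate
```

Unlike the ensym version, this one does not accept bare column names; it takes strings only.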

CodePudding user response：

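Another strategy is the embrace operator {{ }}. It forwards unquoted column names into summarise, so the function is called with bare names rather than strings: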

crtest  <- function(df, x, y){
  df %>% 
    summarise(cor_coef = cor.test({{x}}, {{y}})$estimate,
              p_val = cor.test({{x}}, {{y}})$p.value)
}

crtest(economics, uempmed, pce)

# A tibble: 1 x 2
  cor_coef    p_val
     <dbl>    <dbl>
1    0.727 1.92e-95
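
The same two-column summary can also be produced from string names while running the test only once. A minimal base R sketch, with crtest_row as a hypothetical name and mtcars as stand-in data:

```r
crtest_row <- function(data, valy, valx) {
  # run the test once and reuse the result for both columns
  res <- cor.test(data[[valy]], data[[valx]])
  data.frame(cor_coef = unname(res$estimate), p_val = res$p.value)
}

row <- crtest_row(mtcars, "mpg", "wt")
row
```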

CodePudding user response：

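You can replace !!sym with get(). Note that get() looks the names up in the calling environment and then along the search path, so the function below only finds uempmed and pce while economics is attached (or the columns otherwise exist as variables).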

crtest  <- function(valy, valx) {
  cor.test(get(valy), get(valx))
}

crtest("uempmed", "pce")

    Pearson's product-moment correlation

data:  get(valy) and get(valx)
t = 25.32, df = 572, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6859316 0.7633838
sample estimates:
      cor 
0.7269616
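
To keep get() but drop the dependency on attach(), the data frame can be passed in and used as the lookup environment. This is a sketch, not part of the answer above: crtest_get is a hypothetical name and mtcars stands in for economics.

```r
crtest_get <- function(data, valy, valx) {
  # turn the data frame into an environment whose variables are its columns
  env <- as.environment(data)
  # look the names up there instead of on the search path
  cor.test(get(valy, envir = env), get(valx, envir = env))
}

res <- crtest_get(mtcars, "mpg", "wt")
res$estimate
```

A misspelled column name then fails with an "object not found" error instead of silently picking up a global variable of the same name.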