Using strings as variable names in cor.test in R function

Time:02-27

How can I use a string as a variable name inside the cor.test function in R? I am trying to write a general function that runs correlation tests on columns chosen by name.

library(tidyverse)
data(economics)

crtest  <- function(valy, valx){
    attach(economics)
    rlnshp  <- cor.test(!!sym(valy), !!sym(valx))
    detach(economics)
}
crtest("uempmed", "pce")

This gives an error message:

Error in !sym(valy) : invalid argument type

Is there a way to avoid the attach() command and instead build a reference such as economics$pce from the string? paste0 did not work, because the string it produces is not converted into a variable name.

Thanks for your patience.

CodePudding user response:

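sym and !! only work inside functions that support tidy evaluation, such as the tidyverse verbs. Base R's cor.test does not, so !! is parsed there as a double logical negation, which fails on a symbol. Here it is better to extract the data columns with [[: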

crtest <- function(data, valy, valx) {
  cor.test(data[[valy]], data[[valx]])
}

There is no need for attach/detach, and their use is discouraged anyway.

Testing:

crtest(economics, "uempmed", "pce")
Pearson's product-moment correlation

data:  data[[valy]] and data[[valx]]
t = 25.32, df = 572, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6859316 0.7633838
sample estimates:
      cor 
0.7269616 
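
One thing to watch with the [[ approach: on a data frame, a misspelled column name makes [[ return NULL, so the mistake only surfaces as a less direct error from cor.test. Below is a minimal sketch that checks the names first. The helper name crtest_checked and the use of the built-in mtcars data are assumptions made for illustration, not part of the answer above.

```r
# Hypothetical variant of crtest() that validates the column names up front.
crtest_checked <- function(data, valy, valx) {
  missing_cols <- setdiff(c(valy, valx), names(data))
  if (length(missing_cols) > 0) {
    stop("Column(s) not found: ", paste(missing_cols, collapse = ", "))
  }
  cor.test(data[[valy]], data[[valx]])
}

# mtcars ships with base R, so no extra packages are needed here.
res <- crtest_checked(mtcars, "mpg", "wt")
res$estimate   # correlation coefficient
res$p.value    # p-value of the test
```

Because cor.test returns a list-like "htest" object, individual pieces such as the estimate or the p-value can be pulled out with $ as shown.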

If we want to use rlang functions, convert the argument to a symbol and unquote it with !! inside summarise. ensym converts both quoted and unquoted arguments to symbols:

library(dplyr)
crtest <- function(data, valy, valx) {
  data %>%
    summarise(rlnshp = list(cor.test(!!rlang::ensym(valy),
                                     !!rlang::ensym(valx)))) %>%
    pull(rlnshp) %>%
    purrr::pluck(1)
}

Testing:

> crtest(economics, uempmed, pce)

    Pearson's product-moment correlation

data:  uempmed and pce
t = 25.32, df = 572, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6859316 0.7633838
sample estimates:
      cor 
0.7269616 

> 
> crtest(economics, "uempmed", "pce")

    Pearson's product-moment correlation

data:  uempmed and pce
t = 25.32, df = 572, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6859316 0.7633838
sample estimates:
      cor 
0.7269616 
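
To see why !! needs a tidy-evaluation context such as summarise, it can help to look at how base R itself parses the operator. The following is a small illustrative sketch using only base R; the variable names are placeholders.

```r
# In plain R, `!!x` is simply two nested calls to the logical-not operator `!`.
expr <- quote(!!x)
outer_is_not <- identical(expr[[1]], as.name("!"))       # outer call is `!`
inner_is_not <- identical(expr[[2]][[1]], as.name("!"))  # inner call is also `!`
print(c(outer_is_not, inner_is_not))

# Applying `!` to a symbol is an error, which is what happened in the question.
msg <- tryCatch(!as.name("x"), error = function(e) conditionMessage(e))
print(msg)
```

Tidy-evaluation functions capture their arguments unevaluated and give !! its unquoting meaning before anything is run, which is why the same syntax works inside summarise.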

CodePudding user response:

Another option is the embrace operator {{ }}, which works with unquoted column names:

crtest  <- function(df, x, y){
  df %>% 
    summarise(cor_coef = cor.test({{x}}, {{y}})$estimate,
              p_val = cor.test({{x}}, {{y}})$p.value)
}

crtest(economics, uempmed, pce)
# A tibble: 1 x 2
  cor_coef    p_val
     <dbl>    <dbl>
1    0.727 1.92e-95
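
The function above runs cor.test twice, once for each summary column. A possible variant that runs the test only once is sketched below. The name crtest_once and the mtcars example data are assumptions for illustration; it relies on dplyr's pull() accepting an embraced column argument.

```r
library(dplyr)

# Hypothetical variant: extract both columns, run cor.test once, then tabulate.
crtest_once <- function(df, x, y) {
  x_vec <- pull(df, {{ x }})
  y_vec <- pull(df, {{ y }})
  ct <- cor.test(x_vec, y_vec)
  tibble(cor_coef = unname(ct$estimate), p_val = ct$p.value)
}

res <- crtest_once(mtcars, mpg, wt)
print(res)
```

Keeping the test object in a local variable also makes it easy to add further columns, such as the confidence interval, without re-running the test.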

CodePudding user response:

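You can replace !!sym with get(). Note that get() looks the names up as variables in scope, so this version only works while economics is attached, as in the question: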

crtest  <- function(valy, valx) {
  cor.test(get(valy), get(valx))
}

crtest("uempmed", "pce")

    Pearson's product-moment correlation

data:  get(valy) and get(valx)
t = 25.32, df = 572, p-value < 2.2e-16
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 0.6859316 0.7633838
sample estimates:
      cor 
0.7269616
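
The get() lookup can also be pointed at the data frame itself, so that attach() is not needed. This is a minimal sketch under stated assumptions: crtest_get is a hypothetical helper name, and the built-in mtcars data stands in for economics.

```r
# Hypothetical variant: look the names up inside the data frame only.
crtest_get <- function(data, valy, valx) {
  env <- as.environment(data)  # columns become variables in a fresh environment
  cor.test(get(valy, envir = env, inherits = FALSE),
           get(valx, envir = env, inherits = FALSE))
}

res <- crtest_get(mtcars, "mpg", "wt")
print(res$estimate)
```

Setting inherits = FALSE keeps the lookup from falling back to objects of the same name elsewhere; a missing column then raises an error instead of silently picking up an unrelated variable.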