Everything is running fine, but I want to make sure that non-numeric values don't distort the test. I converted the variables to numeric with as.numeric, which gave the warning "NAs introduced by coercion", but it worked!
I'm running this line of code on a file of 2020 presidential election data by county and unemployment data:
cor.test(Unemployment2020, PercentD2020, method = 'spearman', exact = FALSE)
Does exact = FALSE remove the need for both variables to have the same number of numeric values?
CodePudding user response:
This has nothing to do with exact = FALSE.
Since cor.test is an S3 generic, passing two numeric vectors to it invokes the stats:::cor.test.default method. Reviewing the source code of this function, you will see that it silently drops the NA values in lines 10 to 13 of the function body:
OK <- complete.cases(x, y)
x <- x[OK]
y <- y[OK]
n <- length(x)
The complete.cases(x, y) call here flags the positions where neither vector has an NA, and the subsetting that follows drops every other position from both vectors, so only complete pairs are considered.
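To check how many pairs would survive this filter, the same step can be applied by hand. This is a minimal sketch with made-up vectors a and b (hypothetical names, not from the question's data):

```r
# Hypothetical vectors with NA values in different positions
a <- c(10, NA, 30, 40, 50)
b <- c(NA, 2, 3, 4, 5)

# TRUE only where both a and b are non-missing
ok <- complete.cases(a, b)
ok

# Number of complete pairs that a correlation test would actually use
n_pairs <- sum(ok)
n_pairs
```

Note that the two vectors still have to be the same length: cor.test.default stops with an error if length(x) and length(y) differ. Only the number of non-missing values in each is allowed to differ.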
We can see this in action with the following example. Suppose we have an x and a y vector and want to run cor.test, but each has an NA value at a different point:
x <- c(1, 2, NA, 3, 4, 5)
y <- c(1.1, 1.9, 7, 3.3, 4.5, NA)
cor.test(x, y)
#>
#> Pearson's product-moment correlation
#>
#> data: x and y
#> t = 13.671, df = 2, p-value = 0.005308
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> 0.7634907 0.9998944
#> sample estimates:
#> cor
#> 0.9946918
We should get the same result if we drop the third entry from each vector (since x has an NA there) and drop the 6th entry, where y has an NA:
x <- c(1, 2, 3, 4)
y <- c(1.1, 1.9, 3.3, 4.5)
cor.test(x, y)
#>
#> Pearson's product-moment correlation
#>
#> data: x and y
#> t = 13.671, df = 2, p-value = 0.005308
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#> 0.7634907 0.9998944
#> sample estimates:
#> cor
#> 0.9946918
Created on 2022-07-22 by the reprex package (v2.0.1)
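The same holds for the Spearman call in the question. The sketch below is only an illustration under assumptions: unemp_raw and pctd_raw are made-up character vectors standing in for the county columns, and "N/A" / "n/a" stand in for whatever non-numeric entries the real file contains. The warning from as.numeric is suppressed here only to keep the example quiet.

```r
# Hypothetical raw columns containing some non-numeric entries
unemp_raw <- c("3.5", "4.1", "N/A", "5.0", "6.2", "7.4")
pctd_raw  <- c("45.2", "n/a", "51.0", "38.7", "60.1", "55.3")

# as.numeric turns unparseable strings into NA ("NAs introduced by coercion")
unemp <- suppressWarnings(as.numeric(unemp_raw))
pctd  <- suppressWarnings(as.numeric(pctd_raw))

# Pairs where both values are present
ok <- complete.cases(unemp, pctd)

# Test on the full vectors and on the manually filtered vectors
res_all <- cor.test(unemp, pctd, method = "spearman", exact = FALSE)
res_ok  <- cor.test(unemp[ok], pctd[ok], method = "spearman", exact = FALSE)

# Both calls use the same complete pairs, so the estimates agree
same_estimate <- isTRUE(all.equal(unname(res_all$estimate),
                                  unname(res_ok$estimate)))
same_estimate
```

Comparing sum(ok) with length(unemp) shows how many counties were dropped, which is worth checking before trusting the result.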