Calculating a t-test by row with apply returns "data are essentially constant"

I'm trying to run a one-sample t-test on data made up of just two distinct values, 1 and n.

This works:

a <- c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 
     8.97, 8.97, 8.97, 8.97, 8.97, 8.97, 8.97, 8.97, 8.97, 8.97, 8.97, 8.97, 8.97, 8.97, 8.97)

# Equivalent, shorter way to build the same vector:
a <- rep(c(1, 8.97), each = 15)

and:

t.test(x = a, mu = 2.33, alternative = "greater")

    One Sample t-test

data:  a
t = 3.5879, df = 29, p-value = 0.0006046
alternative hypothesis: true mean is greater than 2.33
95 percent confidence interval:
 3.727653      Inf
sample estimates:
mean of x 
    4.985

That is what I expect. But now I build a data frame with one such sample per row:

variavel <- seq(from = 1, to = 9, by = .01)

df <- data.frame(fix1 = 1, fix2 = 1, fix3 = 1, fix4 = 1, fix5 = 1, fix6 = 1, fix7 = 1, fix8 = 1, fix9 = 1, fix10 = 1, fix11 = 1, fix12 = 1, fix13 = 1, fix14 = 1, fix15 = 1, 
               ge1 = variavel, ge2 = variavel, ge3 = variavel, ge4 = variavel, ge5 = variavel, ge6 = variavel, ge7 = variavel, ge8 = variavel, ge9 = variavel, ge10 = variavel, ge11 = variavel, ge12 = variavel, ge13 = variavel, ge14 = variavel, ge15 = variavel
               )

When I run the test on each row to extract the p-value,

apply(X = df, MARGIN = 1, FUN = function(x) {
  t.test(x = x, mu = 2.33, alternative = "greater")$p.value
})

it gives me the error:

Error in t.test.default(x = x, mu = 2.33) : data are essentially constant

What is going wrong?

Answer:

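The first row of df is constant: variavel starts at 1, so all 30 values in that row are 1 and the sample has zero variance. t.test() stops with an error on such data, and because an error inside FUN aborts the whole apply() call, none of the remaining rows are ever tested.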

table(unlist(df[1,]))
#  1 
# 30
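
To see why a constant row cannot be tested, it may help to compute the one-sample t statistic by hand. The following is a minimal sketch, not how t.test() is implemented internally; the names t_stat, const_row and mixed_row are illustrative and do not appear in the original post.

```r
# One-sample t statistic: t = (mean(x) - mu) / (sd(x) / sqrt(n))
t_stat <- function(x, mu) {
  (mean(x) - mu) / (sd(x) / sqrt(length(x)))
}

const_row <- rep(1, 30)                  # same values as row 1 of df
mixed_row <- rep(c(1, 8.97), each = 15)  # the vector `a` from the question

sd(const_row)                 # 0: no variation, so the standard error is 0
t_stat(const_row, mu = 2.33)  # -Inf: division by a zero standard error
t_stat(mixed_row, mu = 2.33)  # about 3.588, in line with the t.test() output
```

With a standard error of zero the statistic is not meaningful, which is why t.test() raises an error for that row instead of returning a p-value.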

You have two options:

Omit row 1 in your calculation:

head(apply(X = df[-1,], MARGIN = 1, FUN = function(x) {
  t.test(x = x, mu = 2.33, alternative = "greater")$p.value
}))
# 2 3 4 5 6 7 
# 1 1 1 1 1 1
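
If the constant rows are not known in advance, they can be located rather than hard-coded. The sketch below assumes a data frame shaped like the one in the question (rebuilt here with matrix() for brevity, so the column names differ) and treats a row as constant when its standard deviation is exactly 0; row_sd, df_ok and p_values are illustrative names.

```r
# Rebuild a data frame shaped like the one in the question:
# 15 columns fixed at 1 followed by 15 copies of `variavel`.
variavel <- seq(from = 1, to = 9, by = 0.01)
n_rows <- length(variavel)
df <- data.frame(matrix(1, nrow = n_rows, ncol = 15),
                 matrix(variavel, nrow = n_rows, ncol = 15))

# Standard deviation of each row; exactly 0 means every value is identical.
row_sd <- apply(X = df, MARGIN = 1, FUN = sd)
which(row_sd == 0)  # only row 1 is constant

# Keep the varying rows with a logical index. A negative index such as
# df[-which(row_sd == 0), ] would drop every row when nothing matches.
df_ok <- df[row_sd > 0, ]
p_values <- apply(X = df_ok, MARGIN = 1, FUN = function(x) {
  t.test(x = x, mu = 2.33, alternative = "greater")$p.value
})
```

Note that t.test() uses a tolerance when it decides that data are "essentially constant", so an exact comparison with 0 may miss rows whose values differ only by floating-point noise.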

Catch the error inside the loop and return NA for that row:

head(apply(X = df, MARGIN = 1, FUN = function(x) {
  tryCatch(t.test(x = x, mu = 2.33, alternative = "greater")$p.value, 
           error = function(e) NA_real_)
}))
# [1] NA  1  1  1  1  1
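
One thing to keep in mind is that a bare tryCatch() turns every error into NA, including ones unrelated to constant data. A narrower variant is sketched below. It assumes the error message is the English text shown in the question (a localized R session may translate it), and safe_p is a hypothetical helper name, not something from the original answer.

```r
# Return NA only for the "essentially constant" error; re-raise anything else.
safe_p <- function(x, mu = 2.33) {
  tryCatch(
    t.test(x = x, mu = mu, alternative = "greater")$p.value,
    error = function(e) {
      if (grepl("essentially constant", conditionMessage(e), fixed = TRUE)) {
        NA_real_
      } else {
        stop(e)  # assumption: other errors should still surface
      }
    }
  )
}

a <- rep(c(1, 8.97), each = 15)
safe_p(rep(1, 30))  # NA: constant data
safe_p(a)           # about 0.0006, as in the question
# safe_p(1)         # still stops, because one observation is not enough
```

It can be passed to apply() in place of the anonymous function, for example apply(X = df, MARGIN = 1, FUN = safe_p).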