I get an error when I try to run t.test()

Time:05-15

I am trying to perform a t.test as follows:

t.test(data = a, score ~ group, paired = T)

However, I get this error:

Error in complete.cases(x, y) : not all arguments have the same length

I think this is because the target group has some NAs in score. How can I tell the t.test() function to run the test regardless (as the NAs are meant to be in there)?

Here is some info about my data:

table(a$group)
target: 96
nontarget: 96   

str(a$score)
num [1:192] 3 4.5 5.75 6.25 6 7 5 5.5 NA 5.25 ...

str(a$group)
Factor w/ 2 levels "nontarget","target": 2 2 2 2 2 2 2 2 2 2 ...

A sample of some of the data:

ResponseId group score
R_XZz2leQjPyBF4OZ target 4.750000
R_yx5aiVCJfpz1Y9b target NA
R_z0RbO2yL1QT3jTX target 6.500000
R_3DnI1SqwhDrourD nontarget 3.250000
R_3e39IHkvt1yh0R8 nontarget 1.833333
R_3e5kUZaUet2HYTw nontarget 2.916667

CodePudding user response:

Responding to the comments: I share the concern behind @JohnGarland's question (are you sure you really have paired data?). Provided that you do, and the different ResponseId values are a red herring, I think you have to discard every pair that contains an NA. You say "the NAs are meant to be in there", but there is no way to include a pair with a missing value in a paired t-test [*see a counterargument below]. One way to do this is to convert the data set to wide format, apply na.omit(), and then use the slightly unusual syntax required for a paired t-test on wide-format data.

Read in sample data (short but it doesn't matter):

dd <- read.table(header=TRUE, text="
ResponseId  group   score
R_XZz2leQjPyBF4OZ   target  4.750000
R_yx5aiVCJfpz1Y9b   target  NA
R_z0RbO2yL1QT3jTX   target  6.500000
R_3DnI1SqwhDrourD   nontarget   3.250000
R_3e39IHkvt1yh0R8   nontarget   1.833333
R_3e5kUZaUet2HYTw   nontarget   2.916667
")

Convert to wide format:

d_wide <- with(dd,
               data.frame(target = score[group == "target"],
                          nontarget = score[group == "nontarget"]))

This is a low-tech method, and it assumes that the rows within each group are already in matching pair order. You can also use reshape() or unstack() from base R (although I have trouble figuring these out), reshape2::dcast(), or tidyr::pivot_wider().
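As an illustration of the base-R alternative, here is a minimal sketch using unstack() on the same sample data. It relies on the same assumption as the low-tech version (rows within each group are listed in pair order), and d_wide2 is just an illustrative name, not something from the question.

```r
# Same sample data as above, redefined so this snippet runs on its own.
dd <- read.table(header = TRUE, text = "
ResponseId group score
R_XZz2leQjPyBF4OZ target 4.750000
R_yx5aiVCJfpz1Y9b target NA
R_z0RbO2yL1QT3jTX target 6.500000
R_3DnI1SqwhDrourD nontarget 3.250000
R_3e39IHkvt1yh0R8 nontarget 1.833333
R_3e5kUZaUet2HYTw nontarget 2.916667
")

# unstack() splits score by group; because both groups have the same
# number of rows, it returns a data frame with one column per group.
# NA scores are kept, so na.omit() is still needed afterwards.
d_wide2 <- unstack(dd, score ~ group)
d_wide2
```

If the two groups had different numbers of rows, unstack() would return a list instead of a data frame, which is a useful warning sign that the data cannot be paired row by row.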

Once you've done this:

t.test(Pair(target, nontarget) ~ 1, data = na.omit(d_wide))
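As far as I know, Pair() is only available from R 4.0.0 onwards. On older versions, a sketch of the same test with the two-vector interface looks like this; that interface drops incomplete pairs by itself, so na.omit() is optional here. The comparison with a one-sample test on the differences is only a sanity check, and res_vec and res_diff are illustrative names.

```r
# Wide-format version of the sample data (one row per pair).
d_wide <- data.frame(target    = c(4.75, NA, 6.5),
                     nontarget = c(3.25, 1.833333, 2.916667))

# Two-vector interface: pairs containing an NA are dropped internally.
res_vec <- t.test(d_wide$target, d_wide$nontarget, paired = TRUE)

# A paired t-test is a one-sample t-test on the within-pair differences.
res_diff <- t.test(d_wide$target - d_wide$nontarget)

# The two t statistics should agree.
c(paired = unname(res_vec$statistic),
  one_sample = unname(res_diff$statistic))
```

With only two complete pairs in the sample, the test has a single degree of freedom, so the output is only useful for checking that the code runs.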

The counterargument is that you can do slightly better with unbalanced data by fitting a linear mixed-effects model (LMM) by restricted maximum likelihood (REML). This can use the observations whose partner is missing to slightly improve the estimates of the two group means; it should (I believe) give the same answer as the classical paired t-test when the groups are balanced.

dd$pair <- factor(rep(1:3, 2))
mm <- nlme::lme(score ~ group,
                random = ~ 1 | pair,
                data = dd, method = "REML",
                na.action = na.omit)
summary(mm)
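To check the balanced-case claim rather than take it on trust, here is a rough sketch on simulated data. Everything in it (the seed, the sample size, the effect sizes, and the names sim, fit and pt) is made up for illustration and has nothing to do with the asker's data. It assumes the nlme package is installed.

```r
set.seed(1)
n <- 20                             # number of complete pairs (arbitrary)
pair_effect <- rnorm(n, sd = 2)     # shared within-pair component

# Long-format, fully balanced paired data.
sim <- data.frame(
  pair  = factor(rep(seq_len(n), 2)),
  group = factor(rep(c("nontarget", "target"), each = n)),
  score = c(5 + pair_effect + rnorm(n),    # nontarget scores
            6 + pair_effect + rnorm(n))    # target scores
)

# Random-intercept LMM fitted by REML.
fit <- nlme::lme(score ~ group, random = ~ 1 | pair,
                 data = sim, method = "REML")

# Classical paired t-test on the same data (target minus nontarget).
pt <- t.test(sim$score[sim$group == "target"],
             sim$score[sim$group == "nontarget"],
             paired = TRUE)

# Compare the t statistic for the group effect with the paired t statistic.
c(lmm = summary(fit)$tTable["grouptarget", "t-value"],
  paired_t = unname(pt$statistic))
```

The two statistics should agree up to optimizer tolerance as long as the estimated between-pair variance is not zero; if the pairs were not positively correlated, the LMM could hit that boundary and the results would differ.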

CodePudding user response:

I am working partly from context and partly from how the data are set up. Most people would not lay out a paired-sample t-test the way the OP does, with a two-level grouping factor and a different ResponseId on every row. A target vs. nontarget split usually describes two separate groups of participants, although it does sometimes occur within participants. Also, note that

a <- data.frame(ResponseID = letters[1:6], 
     group = as.factor(c(rep("target",3),rep("nontarget",3))),
     score = c(4.75,NA,6.5,3.25,1.83,2.92))

t.test(data = a, score ~ group, paired = FALSE)

works just fine on the data structure as given.
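A small sketch that may explain why paired = TRUE fails on this layout while paired = FALSE runs: once rows with a missing score are dropped, the two groups no longer have the same number of observations, so they cannot be matched up one to one. This is my reading of the "not all arguments have the same length" message in the R version the asker appears to be using; newer R versions may reject paired = TRUE with a formula outright. The names ok and group_sizes are illustrative.

```r
a <- data.frame(ResponseID = letters[1:6],
                group = as.factor(c(rep("target", 3), rep("nontarget", 3))),
                score = c(4.75, NA, 6.5, 3.25, 1.83, 2.92))

# Drop rows with a missing score, then count what is left per group.
ok <- !is.na(a$score)
group_sizes <- lengths(split(a$score[ok], a$group[ok]))
group_sizes
```

An unpaired test has no problem with unequal group sizes, which is why the call with paired = FALSE goes through.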
