My data frame is the following:
Df <- structure(list(SES = c("High", "High", "High", "Low", "High",
"Low", "High", "High", "High", "Low", "Low", "Low", "High", "High",
"Low", "High", "High", "Low", "High", "High", "Low", "High",
"Low", "Low", "Low", "Low", "High", "Low", "High", "Low", "High",
"High", "Low", "High", "Low", "High", "High", "High", "Low",
"High", "High", "Low", "Low", "High", "Low", "Low", "Low", "Low",
"High", "High", "Low", "High"), entry_age = c(12, 2.5, 7, 2.5,
2.5, 12, 9, 2.5, 3, 8, 12, 2.5, 5.5, 6, 2.5, 2.5, 2.5, 16, 12,
5, 7, 2.5, 12, 2.5, 2.5, 12, 12, 12, 6, 24, 2.5, 2.5, 2, 3.5,
2.5, 2.5, 2.5, 4, 7, 12, 7, 9, 12, 6, 18, 15, 8, 12, 2.5, 6,
10, 5)), row.names = c(NA, -52L), class = c("tbl_df", "tbl",
"data.frame"))
I have a nice difference in means and would like to test its significance with a t-test using the t.test function as follows:
t.test(Df$SES, Df$entry_age)
Very easy, nothing complicated. However, I get the following error message, which I don't understand:
Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my))) stop("data are essentially constant") :
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(x) :
argument is not numeric or logical: returning NA
2: In var(x) : NAs introduced by coercion
I checked for NAs and there are none.
Could you help me, please? Sorry for this very basic question, but I could not find the meaning of this error message on Google.
You will have my endless gratitude.
CodePudding user response:
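Look at help('t.test') to understand the usage. Called as t.test(Df$SES, Df$entry_age), it treats x = Df$SES and y = Df$entry_age as two numeric samples to compare, which is not what you want: SES is a character vector, so mean(x) returns NA and the test stops with the error you see. Use the formula interface instead, with the numeric outcome on the left and the grouping variable on the right: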
Df <- structure(list(SES = c("High", "High", "High", "Low", "High",
"Low", "High", "High", "High", "Low", "Low", "Low", "High", "High",
"Low", "High", "High", "Low", "High", "High", "Low", "High",
"Low", "Low", "Low", "Low", "High", "Low", "High", "Low", "High",
"High", "Low", "High", "Low", "High", "High", "High", "Low",
"High", "High", "Low", "Low", "High", "Low", "Low", "Low", "Low",
"High", "High", "Low", "High"), entry_age = c(12, 2.5, 7, 2.5,
2.5, 12, 9, 2.5, 3, 8, 12, 2.5, 5.5, 6, 2.5, 2.5, 2.5, 16, 12,
5, 7, 2.5, 12, 2.5, 2.5, 12, 12, 12, 6, 24, 2.5, 2.5, 2, 3.5,
2.5, 2.5, 2.5, 4, 7, 12, 7, 9, 12, 6, 18, 15, 8, 12, 2.5, 6,
10, 5)), row.names = c(NA, -52L), class = c("tbl_df", "tbl",
"data.frame"))
t.test(entry_age~SES, data=Df)
#>
#> Welch Two Sample t-test
#>
#> data: entry_age by SES
#> t = -2.9888, df = 35.479, p-value = 0.005059
#> alternative hypothesis: true difference in means between group High and group Low is not equal to 0
#> 95 percent confidence interval:
#> -6.695627 -1.280563
#> sample estimates:
#> mean in group High  mean in group Low
#>           5.303571           9.291667
Created on 2022-05-17 by the reprex package (v2.0.1)
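To make the cause of the error more concrete, here is a minimal sketch on a small made-up data frame. The name toy and its values are hypothetical and used only for illustration; they are not the asker's data. It shows that the mean of a character column is NA, and that the formula call should give the same Welch test as passing one numeric vector per group to the default interface.

```r
# Small illustrative data frame (hypothetical values, not the original Df)
toy <- data.frame(
  SES = c("High", "High", "High", "Low", "Low", "Low"),
  entry_age = c(2.5, 5, 7, 12, 9, 16),
  stringsAsFactors = FALSE
)

# Why the original call fails: the first argument is a character vector,
# so mean() returns NA (with a warning) and the check inside t.test breaks.
is.character(toy$SES)             # TRUE
suppressWarnings(mean(toy$SES))   # NA

# Formula interface, as in the answer: numeric outcome ~ grouping variable
res_formula <- t.test(entry_age ~ SES, data = toy)

# Default interface: two numeric vectors, one per group
high <- toy$entry_age[toy$SES == "High"]
low  <- toy$entry_age[toy$SES == "Low"]
res_vectors <- t.test(high, low)

# Both calls run the same Welch two-sample test
all.equal(res_formula$p.value, res_vectors$p.value)
```

The formula form is usually more convenient when the data are in long format (one column of values, one column of group labels), as they are here.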