New to posting to Stack so apologies for any issues.
I'm learning to get more comfortable in R and am currently looking at using broom/purrr to run multiple statistical tests at once. An example of my current data looks like this:
Subject | PreScoreTestA | PostScoreTestA | PreScoreTestB | PostScoreTestB | PreScoreTestC | PostScoreTestC |
---|---|---|---|---|---|---|
1 | 30 | 40 | 6 | 8 | 12 | 10 |
2 | 15 | 12 | 9 | 13 | 7 | 7 |
3 | 20 | 22 | 11 | 12 | 9 | 10 |
The real data has many more subjects and more tests. I want to run a dependent (paired) t-test to see whether scores changed over the course of a training program, but I don't want to run a separate test for each score.
I've seen a couple of examples of people using group_by, nest, and map to run multiple t-tests, but their data was in a longer format.
Is there a way to achieve the same goal while in a wide format, or will I need to use pivot_longer to reshape the data?
Thanks in advance!
ETA: I had an edit here, but it was giving incorrect results, so I have removed it. Still looking for some help on the arguments and same length.
CodePudding user response:
Yes, some pivoting is needed. Assuming you have no directional hypotheses and you want to do a pre-post assessment for each test, this might be what you are looking for:
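library(tidyverse)  # provides %>%, pivot_longer(), group_by(), nest(), map()

# Rebuild the example data, using descriptive test names
df <- as.data.frame(rbind(c(1, 30, 40, 6, 8, 12, 10),
                          c(2, 15, 12, 9, 13, 7, 7),
                          c(3, 20, 22, 11, 12, 9, 10)))
names(df) <- c("Subject",
               "PrePushup", "PostPushup",
               "PreRun", "PostRun",
               "PreJump", "PostJump")
df %>%
  # Split each column name into time (Pre/Post) and test name
  pivot_longer(-Subject,
               names_to = c("time", "test"), values_to = "score",
               names_pattern = "(Pre|Post)(.*)") %>%
  group_by(test) %>%
  nest() %>%
  # One paired t-test per test; pairing relies on rows staying in Subject order
  mutate(t_tests = map(data, ~t.test(score ~ time, data = .x, paired = TRUE))) %>%
  pull(t_tests) %>%
  # Names are hard-coded in the order the tests appear in the columns
  purrr::set_names(c("Pushup", "Run", "Jump"))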
$Pushup
Paired t-test
data: score by time
t = 0.79241, df = 2, p-value = 0.5112
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-13.28958 19.28958
sample estimates:
mean of the differences
3
$Run
Paired t-test
data: score by time
t = 2.6458, df = 2, p-value = 0.1181
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.461250 6.127916
sample estimates:
mean of the differences
2.333333
$Jump
Paired t-test
data: score by time
t = -0.37796, df = 2, p-value = 0.7418
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-4.127916 3.461250
sample estimates:
mean of the differences
-0.3333333
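If you would rather stay in the wide format, here is a minimal sketch of one way to do it, assuming every test has a matching `Pre<Test>` / `Post<Test>` column pair as in the example data. It passes the two columns to `t.test()` directly and uses `broom::tidy()` to collect everything in one table. The names `tests` and `results` are just illustrative. As far as I know, newer R versions (4.4.0 onward) reject `paired = TRUE` in the formula interface used above, so passing the two vectors directly may also be the safer route there.

```r
library(dplyr)
library(purrr)
library(broom)

# Example data from the question, kept in wide format
df <- data.frame(
  Subject    = 1:3,
  PrePushup  = c(30, 15, 20), PostPushup = c(40, 12, 22),
  PreRun     = c(6, 9, 11),   PostRun    = c(8, 13, 12),
  PreJump    = c(12, 7, 9),   PostJump   = c(10, 7, 10)
)

# Recover the test names from the "Pre..." column names
tests <- sub("^Pre", "", grep("^Pre", names(df), value = TRUE))

results <- tests %>%
  set_names() %>%                               # name each element after its test
  map(~ tidy(t.test(df[[paste0("Post", .x)]],   # Post minus Pre, paired by row
                    df[[paste0("Pre", .x)]],
                    paired = TRUE))) %>%
  bind_rows(.id = "test")                       # one row per test

results
```

Each row of `results` holds the mean difference (`estimate`), t statistic, p-value, and confidence interval for one test, which is easier to filter or export than a list of printed `htest` objects.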