I have a dataset like Christmas:
library(dplyr)
# tibble() replaces the deprecated data_frame()
Christmas <- tibble(month = c("1", "1", "2", "2"),
                    NP = c(2, 3, 3, 1),
                    ND = c(4, 2, 0, 6),
                    NO = c(1, 5, 2, 4),
                    variable = c("mean", "sd", "mean", "sd"))
and I want to calculate the t-statistic of each column, by month. The formula I want to use is t-statistic = mean/sd. (Note: I want to calculate this for all the value columns; in this case they are only NP, ND, and NO.)
The new dataset should look like t_statistic:
t_statistic <- tibble(
  month = c("1", "2"),
  NP = c(2/3, 3),
  ND = c(4/2, 0),
  NO = c(1/5, 2/4)
)
Any clue?
CodePudding user response:
If the mean/sd values are already computed, then the t-statistic is just the first element divided by the last within each month (as there are only two rows per group, with the mean row before the sd row)
library(dplyr)
out <- Christmas %>%
  group_by(month) %>%
  summarise(across(NP:NO, ~ first(.) / last(.)))
-output
out
# A tibble: 2 × 4
  month    NP    ND    NO
  <chr> <dbl> <dbl> <dbl>
1 1     0.667     2   0.2
2 2     3         0   0.5
-checking with OP's output
> identical(t_statistic, out)
[1] TRUE
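Or, if the mean/sd rows are not ordered within each month, sort by variable first ("mean" sorts before "sd" alphabetically, so the mean is again the first row and the sd the last)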
Christmas %>%
  arrange(month, variable) %>%
  group_by(month) %>%
  summarise(across(NP:NO, ~ first(.) / last(.)))
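Both versions above depend on row position. A possible alternative, sketched here and not part of the original answer, is to pick the rows by their label in the variable column, so the result does not depend on ordering at all. The name Christmas_shuffled is hypothetical: it is the question's data with the two rows of month "2" swapped on purpose, and the sketch assumes each month has exactly one "mean" row and one "sd" row.

```r
library(dplyr)

# Hypothetical test data: same values as in the question, but the
# "sd" row of month "2" now comes before its "mean" row.
Christmas_shuffled <- tibble(
  month    = c("1", "1", "2", "2"),
  NP       = c(2, 3, 1, 3),
  ND       = c(4, 2, 6, 0),
  NO       = c(1, 5, 4, 2),
  variable = c("mean", "sd", "sd", "mean")
)

# Within each month, divide the value on the "mean" row by the value
# on the "sd" row, selecting both rows by label rather than position.
out_by_label <- Christmas_shuffled %>%
  group_by(month) %>%
  summarise(across(NP:NO, ~ .x[variable == "mean"] / .x[variable == "sd"]))

out_by_label
```

If a month had a missing or duplicated label, the indexing would return zero or several values, so checking the input first would be sensible in that case.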