I have a data frame containing these variables:
Group high weigh age col5
row1 A 12 57 18 AA
row2 C 22 80 29 BB
row3 B 17 70 20 CC
row4 A 13 60 26 DD
row5 D 19 69 25 AA
row6 B 10 15 19 BB
row7 C 20 66 22 CC
row8 D 13 53 18 DD
I want to calculate the standard error of all quantitative variables by group (the first column), using the function std.error from the plotrix package or another method (such as computing sd(data[, column]) / sqrt(length(data[, column])) directly). The result I want looks like this:
Group se_high se_weigh se_age
row1 A 0.223 0.023 0.1
row3 B 0.12 0.1 0.12
row7 C 0.1 0.04 0.09
row8 D 0.05 0.12 0.07
I tried to use the dplyr function group_by to group by the first column and then apply std.error, but I don't know how to combine them.
# this is the dplyr code I use to calculate the mean by group
library(dplyr)
data %>%
  group_by(Group) %>%
  summarise_at(vars(high, weigh, age), mean)
I would also like to know how to calculate the standard error by two grouping columns (for example, column 1 and the last column, col5).
Thank you
CodePudding user response:
You were close! summarise_at() has been superseded by across() (since dplyr 1.0.0), so here's what I'd do:
library(dplyr)
data %>%
  group_by(Group) %>%
  summarize(se_high = plotrix::std.error(high),
            se_weigh = plotrix::std.error(weigh),
            se_age = plotrix::std.error(age))
which returns
# A tibble: 4 x 4
Group se_high se_weigh se_age
<chr> <dbl> <dbl> <dbl>
1 A 0.5 1.5 4
2 B 3.5 27.5 0.5
3 C 1 7 3.5
4 D 3 8 3.5
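If plotrix isn't installed, roughly the same thing can be sketched in base R with aggregate(). This is only an illustration, not a replacement for the answer above: the data frame is rebuilt from the table in the question, and se() is a hypothetical helper I'm defining here (it is not part of any package) that computes sd(x) / sqrt(n) after dropping missing values.

```r
# Sample data recreated from the table in the question
data <- data.frame(
  Group = c("A", "C", "B", "A", "D", "B", "C", "D"),
  high  = c(12, 22, 17, 13, 19, 10, 20, 13),
  weigh = c(57, 80, 70, 60, 69, 15, 66, 53),
  age   = c(18, 29, 20, 26, 25, 19, 22, 18),
  col5  = c("AA", "BB", "CC", "DD", "AA", "BB", "CC", "DD")
)

# Hypothetical helper: standard error of the mean, ignoring missing values
se <- function(x) {
  x <- x[!is.na(x)]
  sd(x) / sqrt(length(x))
}

# One row per Group, one column of standard errors per numeric variable
se_by_group <- aggregate(cbind(high, weigh, age) ~ Group, data = data, FUN = se)

# Prefix the value columns so they match the names used above
names(se_by_group)[-1] <- paste0("se_", names(se_by_group)[-1])
se_by_group
```

On the sample data this should give the same numbers as the tibble above (for example 0.5, 1.5 and 4 for group A), since each group has exactly two rows.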
CodePudding user response:
Here is a solution to do it in one go:
library(dplyr)
df %>%
  group_by(Group) %>%
  summarise(across(where(is.numeric),
                   ~ sd(.x) / sqrt(length(.x)),
                   .names = "std_{.col}"))
# A tibble: 4 x 4
Group std_high std_weigh std_age
<chr> <dbl> <dbl> <dbl>
1 A 0.5 1.5 4
2 B 3.5 27.5 0.5
3 C 1 7 3.5
4 D 3 8 3.5
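For the second part of the question (standard error by two grouping columns), a minimal sketch, assuming dplyr 1.0.0 or later: adding col5 to group_by() should be enough. The data frame below is rebuilt from the table in the question, and the n column is something I added only to show the group sizes. Note that in this sample every Group/col5 combination has a single row, so sd() returns NA and every standard error comes out as NA; with real data you would need at least two rows per combination.

```r
library(dplyr)

# Sample data recreated from the table in the question
data <- data.frame(
  Group = c("A", "C", "B", "A", "D", "B", "C", "D"),
  high  = c(12, 22, 17, 13, 19, 10, 20, 13),
  weigh = c(57, 80, 70, 60, 69, 15, 66, 53),
  age   = c(18, 29, 20, 26, 25, 19, 22, 18),
  col5  = c("AA", "BB", "CC", "DD", "AA", "BB", "CC", "DD")
)

# Group by both columns, then compute the standard error of each variable
se_two <- data %>%
  group_by(Group, col5) %>%
  summarise(
    n = n(),  # rows per Group/col5 combination (illustration only)
    across(c(high, weigh, age),
           ~ sd(.x) / sqrt(length(.x)),
           .names = "se_{.col}"),
    .groups = "drop"
  )

se_two
```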