My data is:
X0 X1 X2 X3 category
0 15 4 4 TAH
0 2 5 0 MAT
0 11 9 0 BIO
I want to calculate row-wise normality, skewness and kurtosis. The main reason is that I have categories over different rows (in a dedicated column). Is there a function that can achieve this functionality?
I have been trying to do this using the moments package and the dplyr package, similar to this post:
Function that calculates, mean, variance and skewness at the same time in a dataframe.
But their solution is column-wise, not row-wise.
df3 %>%
gather(category, Val) %>%
group_by(category) %>%
summarise(Mean = mean(Val),
Vari = var(Val),
Skew = skewness(Val))
For normality, I have tried the following command separately for each row:
shapiro.test(df3[1,])
Any help on this would be greatly appreciated.
CodePudding user response:
You can use rowwise():
library(dplyr)
library(tidyr)
df %>%
rowwise() %>%
mutate(Mean = mean(c_across(X0:X3)),
Vari = var(c_across(X0:X3)),
Skew = moments::skewness(c_across(X0:X3))) %>%
ungroup
#      X0    X1    X2    X3 category  Mean   Vari     Skew
#   <int> <int> <int> <int> <chr>    <dbl>  <dbl>    <dbl>
# 1     0    15     4     4 TAH       5.75 41.583 0.84778
# 2     0     2     5     0 MAT       1.75 5.5833 0.68925
# 3     0    11     9     0 BIO       5    34     0.058244
Similar to your attempt, you can also reshape the data to long format and calculate the statistics for each category. This matches the row-wise result as long as each category appears in exactly one row.
df %>%
pivot_longer(cols = -category) %>%
group_by(category) %>%
summarise(Mean = mean(value),
Vari = var(value),
Skew = moments::skewness(value))
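The snippets above cover mean, variance and skewness, but the question also asks for kurtosis and normality. A minimal sketch of how the same rowwise() pattern might be extended is shown below, assuming the four numeric columns are named X0 to X3 as in the question. The column names Kurt and Shapiro_p are made up for illustration. Note that shapiro.test() expects a plain numeric vector, which is probably why shapiro.test(df3[1,]) fails: it passes a one-row data frame that also contains the category column.

```r
library(dplyr)

# Sample data copied from the question
df <- data.frame(
  X0 = c(0, 0, 0),
  X1 = c(15, 2, 11),
  X2 = c(4, 5, 9),
  X3 = c(4, 0, 0),
  category = c("TAH", "MAT", "BIO")
)

res <- df %>%
  rowwise() %>%
  mutate(
    # moments::kurtosis() returns Pearson's kurtosis (a normal distribution gives 3),
    # so subtract 3 if you want excess kurtosis
    Kurt = moments::kurtosis(c_across(X0:X3)),
    # c_across() yields a numeric vector, which is what shapiro.test() needs;
    # Shapiro_p is a hypothetical column name holding the test's p-value
    Shapiro_p = shapiro.test(c_across(X0:X3))$p.value
  ) %>%
  ungroup()

res
```

With only four values per row the Shapiro-Wilk test has very little power, so the p-values should be read with caution. The test also requires at least three values per row that are not all identical.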
CodePudding user response:
You may try
library(dplyr)
library(janitor)               # provides row_to_names()
library(PerformanceAnalytics)  # provides skewness() and kurtosis()
df %>%
select(category, X0:X3) %>%
t %>%
as.data.frame %>%
row_to_names(row_number = 1) %>%
mutate(TAH = as.numeric(TAH),
MAT = as.numeric(MAT),
BIO = as.numeric(BIO)) %>%
sapply(., function(x) list(mean = mean(x), var = var(x), skew = skewness(x), kur = kurtosis(x)))
     TAH        MAT       BIO
mean 5.75       1.75      5
var  41.58333   5.583333  34
skew 0.8477758  0.6892545 0.05824397
kur  -0.8325348 -1.141902 -1.922722