Row wise normality, skewness and kurtosis in one command

Time:09-16

My data is:

X0 X1 X2 X3 category
0  15 4  4  TAH
0  2  5  0  MAT
0  11 9  0  BIO

I want to calculate normality, skewness and kurtosis row-wise, because each row belongs to a different category (stored in a dedicated column). Is there a function that can do this?

I have been trying to do this using the moments package and the dplyr package, similar to this post: "Function that calculates, mean, variance and skewness at the same time in a dataframe". But their solution is column-wise, not row-wise.

df3 %>%
  gather(category, Val) %>% 
  group_by(category) %>% 
  summarise(Mean = mean(Val), 
            Vari = var(Val), 
            Skew = skewness(Val))

For normality, I have tried the following command separately for each row:

shapiro.test(df3[1,])

Any help on this would be greatly appreciated.

CodePudding user response:

You can use rowwise() together with c_across():

library(dplyr)
library(tidyr)

df %>%
  rowwise() %>%
  mutate(Mean = mean(c_across(X0:X3)), 
         Vari = var(c_across(X0:X3)), 
         Skew = moments::skewness(c_across(X0:X3))) %>%
  ungroup

#     X0    X1    X2    X3 category  Mean    Vari     Skew
#  <int> <int> <int> <int> <chr>    <dbl>   <dbl>    <dbl>
#1     0    15     4     4 TAH       5.75 41.583  0.84778 
#2     0     2     5     0 MAT       1.75  5.5833 0.68925 
#3     0    11     9     0 BIO       5    34      0.058244
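
The question also asks for kurtosis, which the call above leaves out. As a rough sketch that avoids extra packages, the same row-wise statistics can be computed in base R with apply(). The helper names skew_moment and kurt_moment are made up for this illustration; they assume the population-moment definitions (m3 / m2^1.5 for skewness and the non-excess ratio m4 / m2^2 for kurtosis), which is what reproduces the Skew column above.

```r
# Example data from the question
df <- data.frame(
  X0 = c(0, 0, 0),
  X1 = c(15, 2, 11),
  X2 = c(4, 5, 9),
  X3 = c(4, 0, 0),
  category = c("TAH", "MAT", "BIO")
)

# Hypothetical helpers: moment-based skewness and non-excess kurtosis
skew_moment <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^1.5
}
kurt_moment <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2
}

# Work on the numeric columns only, one statistic per row
num <- as.matrix(df[, c("X0", "X1", "X2", "X3")])
row_stats <- data.frame(
  category = df$category,
  Mean = apply(num, 1, mean),
  Vari = apply(num, 1, var),
  Skew = apply(num, 1, skew_moment),
  Kurt = apply(num, 1, kurt_moment)
)
print(row_stats)
```

Selecting the numeric columns by name keeps the category column out of the matrix, so apply() never coerces the values to character.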

Similar to your attempt, you can also reshape the data to long format and calculate the statistics for each category. This matches the row-wise result as long as each category appears in exactly one row.

df %>%
  pivot_longer(cols = -category) %>%
  group_by(category) %>%
  summarise(Mean = mean(value), 
            Vari = var(value), 
            Skew = moments::skewness(value))
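
Neither snippet above covers the normality part of the question. shapiro.test() expects a plain numeric vector, so passing a one-row data frame such as df3[1,] (which also still contains the category column) will not work. One possible sketch, using only base R, pulls the numeric columns out as a matrix and runs the test on each row; the names num and shapiro_p are illustrative. Bear in mind that with only four values per row the test has very little power, and that it raises an error when a row has fewer than 3 values or all values are identical, hence the tryCatch().

```r
# Example data from the question
df <- data.frame(
  X0 = c(0, 0, 0),
  X1 = c(15, 2, 11),
  X2 = c(4, 5, 9),
  X3 = c(4, 0, 0),
  category = c("TAH", "MAT", "BIO")
)

# Numeric columns only, so each row is a plain numeric vector
num <- as.matrix(df[, c("X0", "X1", "X2", "X3")])

# Shapiro-Wilk p-value per row; NA when the test cannot be run on that row
shapiro_p <- apply(num, 1, function(x) {
  tryCatch(shapiro.test(x)$p.value, error = function(e) NA_real_)
})

normality <- data.frame(category = df$category, shapiro_p = shapiro_p)
print(normality)
```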

CodePudding user response:

You may try

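library(dplyr)                 # %>%, select(), mutate(), across()
library(janitor)               # row_to_names()
library(PerformanceAnalytics)  # skewness(), kurtosis()

df %>%
  select(category, X0:X3) %>%
  t %>%                                          # transpose: one column per original row
  as.data.frame %>%
  row_to_names(row_number = 1) %>%               # category values become column names
  mutate(across(everything(), as.numeric)) %>%   # t() turned the values into character
  sapply(., function(x) list(mean = mean(x), var = var(x), skew = skewness(x), kur = kurtosis(x)))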

     TAH        MAT       BIO       
mean 5.75       1.75      5         
var  41.58333   5.583333  34        
skew 0.8477758  0.6892545 0.05824397
kur  -0.8325348 -1.141902 -1.922722
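
The negative kur values are presumably excess kurtosis (moment kurtosis minus 3, so that a normal distribution scores 0). As far as I know that is the default of PerformanceAnalytics::kurtosis(), whereas moments::kurtosis() reports the raw moment ratio. A quick base R check on the TAH row, with variable names chosen just for this example, is consistent with that reading:

```r
x <- c(0, 15, 4, 4)                    # TAH row from the question
d <- x - mean(x)

raw_kurt    <- mean(d^4) / mean(d^2)^2  # raw moment kurtosis
excess_kurt <- raw_kurt - 3             # excess kurtosis

print(c(raw = raw_kurt, excess = excess_kurt))
```

The excess value should agree with the TAH entry in the kur row above.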