Calculate internal consistency of items by grouping variables using dplyr/tidyverse

Time:07-04

I’d like to calculate the internal consistency (alpha and omega) of a set of items within each level of some grouping variables (e.g., age and raterType), ideally using dplyr/tidyverse. My question is similar to another question (Using dplyr to nest or group two variables, then perform the Cronbach's alpha function or other statistics to the data); however, I can’t get that solution to work in my case.

Here is a minimal example:

library("tidyverse")
library("psych")
library("MBESS")

mydata <- expand.grid(ID = 1:100,
                      age = 1:5,
                      raterType = c("self",
                                    "friend",
                                    "parent"))

set.seed(12345)

mydata$item1 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item2 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item3 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item4 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item5 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item6 <- sample(1:7, nrow(mydata), replace = TRUE)

mydata$item1[sample(nrow(mydata), 100)] <- NA
mydata$item2[sample(nrow(mydata), 100)] <- NA
mydata$item3[sample(nrow(mydata), 100)] <- NA
mydata$item4[sample(nrow(mydata), 100)] <- NA
mydata$item5[sample(nrow(mydata), 100)] <- NA
mydata$item6[sample(nrow(mydata), 100)] <- NA

itemNames <- paste("item", 1:6, sep = "")

To calculate internal consistency for the entire dataset, I would compute alpha (with psych) and omega (with MBESS), respectively, using the following code:

alpha(mydata[,itemNames])$total$raw_alpha
ci.reliability(mydata[,itemNames], type = "omega", interval.type = "none")$est

However, I want to calculate alpha and omega for each combination of age and raterType.

Here's my attempt:

mydata %>%
  pivot_longer(cols = c(-age, -raterType, -ID)) %>%
  select(-ID) %>% 
  nest_by(age, raterType) %>%
  mutate(alpha = alpha(data)$total$raw_alpha,
         omega = ci.reliability(data, type = "omega", interval.type = "none")$est)

This does not work. When I run the two statistics separately, the omega call returns incorrect estimates and the alpha call throws an error:

> # This provides the wrong estimates:
> mydata %>%
      pivot_longer(cols = c(-age, -raterType, -ID)) %>%
      select(-ID) %>% 
      nest_by(age, raterType) %>%
      mutate(omega = ci.reliability(data, type = "omega", interval.type = "none")$est)
# A tibble: 15 × 4
# Rowwise:  age, raterType
     age raterType               data omega
   <int> <fct>     <list<tibble[,2]>> <dbl>
 1     1 self               [600 × 2] 0.218
 2     1 friend             [600 × 2] 0.257
 3     1 parent             [600 × 2] 0.261
 4     2 self               [600 × 2] 0.196
 5     2 friend             [600 × 2] 0.257
 6     2 parent             [600 × 2] 0.209
 7     3 self               [600 × 2] 0.179
 8     3 friend             [600 × 2] 0.225
 9     3 parent             [600 × 2] 0.247
10     4 self               [600 × 2] 0.224
11     4 friend             [600 × 2] 0.252
12     4 parent             [600 × 2] 0.218
13     5 self               [600 × 2] 0.248
14     5 friend             [600 × 2] 0.218
15     5 parent             [600 × 2] 0.202
> 
> # This throws an error:
> mydata %>%
      pivot_longer(cols = c(-age, -raterType, -ID)) %>%
      select(-ID) %>% 
      nest_by(age, raterType) %>%
      mutate(alpha = alpha(data)$total$raw_alpha)
Number of categories should be increased  in order to count frequencies. 
Error in `mutate()`:
! Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ The error occurred in row 1.
Caused by error in `FUN()`:
! only defined on a data frame with all numeric-alike variables
Run `rlang::last_error()` to see where the error occurred.
Warning messages:
1: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ NAs introduced by coercion
ℹ The warning occurred in row 1. 
2: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ Item = name had no variance and was deleted but still is counted in the score
ℹ The warning occurred in row 1.

The omega values above do not match the values obtained by running alpha() and ci.reliability() directly on the corresponding subset of the data:

> alpha(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames])$total$raw_alpha
[1] -0.3018416
> ci.reliability(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames], type = "omega", interval.type = "none")$est
[1] 0.00836356

Answer:
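
The attempt in the question most likely fails because pivot_longer() stacks the six item columns into two columns, name and value, before nesting. That would explain the [600 × 2] tibbles in the output and the warning about "Item = name": alpha() and ci.reliability() expect one column per item, but they receive a character column plus a single numeric column. The sketch below illustrates the difference on a small made-up data frame (toy, long_nested, and wide_nested are hypothetical names, not part of the question's data):

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: 4 IDs x 2 ages x 2 rater types, with two items
toy <- expand.grid(ID = 1:4, age = 1:2, raterType = c("self", "friend"))
toy$item1 <- rep_len(c(1L, 3L, 5L, 7L), nrow(toy))
toy$item2 <- rep_len(c(2L, 2L, 6L, 4L), nrow(toy))

# Long format first (as in the question): items are stacked into rows
long_nested <- toy %>%
  pivot_longer(cols = c(-age, -raterType, -ID)) %>%
  select(-ID) %>%
  nest_by(age, raterType)

# Wide format: each nested tibble keeps one column per item
wide_nested <- toy %>%
  nest_by(age, raterType)

long_cols <- names(long_nested$data[[1]])  # "name" "value"
wide_cols <- names(wide_nested$data[[1]])  # "ID" "item1" "item2"
long_dim  <- dim(long_nested$data[[1]])
wide_dim  <- dim(wide_nested$data[[1]])

print(long_cols)
print(wide_cols)
```

Keeping the data wide and selecting the item columns inside each group avoids the problem.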

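Perhaps this helps: keep the data in wide format, group it, and pass the item columns to each function inside summarise().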

out1 <- mydata %>%
  group_by(age, raterType) %>%
  summarise(
    alpha = alpha(across(all_of(itemNames)))$total$raw_alpha,
    omega = ci.reliability(across(all_of(itemNames)),
                           type = "omega", interval.type = "none")$est,
    .groups = "drop"
  )

Output:

> out1
# A tibble: 15 × 4
     age raterType   alpha     omega
   <int> <fct>       <dbl>     <dbl>
 1     1 self      -0.135    2.76   
 2     1 friend     0.138    0.231  
 3     1 parent    -0.229  255.     
 4     2 self      -0.421   NA      
 5     2 friend     0.0650  58.7    
 6     2 parent     0.153   NA      
 7     3 self      -0.302    0.00836
 8     3 friend     0.147    0.334  
 9     3 parent     0.196    0.132  
10     4 self      -0.0699  NA      
11     4 friend     0.118    0.214  
12     4 parent    -0.0303  31.1    
13     5 self      -0.0166   0.246  
14     5 friend    -0.192    0.0151 
15     5 parent     0.0847  NA      
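
A note on the omega column: a reliability coefficient is expected to fall between 0 and 1, so estimates such as 255 or NA probably reflect an improper or non-converged factor solution on the purely random example items rather than a problem with the grouping code. A minimal base R sketch for flagging such rows follows; the est data frame simply re-types the omega values printed above, and omega_ok is a hypothetical helper column:

```r
# Omega estimates copied from the output above
est <- data.frame(
  age       = rep(1:5, each = 3),
  raterType = rep(c("self", "friend", "parent"), times = 5),
  omega     = c(2.76, 0.231, 255, NA, 58.7, NA, 0.00836, 0.334,
                0.132, NA, 0.214, 31.1, 0.246, 0.0151, NA)
)

# TRUE only for non-missing estimates inside the expected [0, 1] range
est$omega_ok <- !is.na(est$omega) & est$omega >= 0 & est$omega <= 1

# Rows that deserve a closer look
print(est[!est$omega_ok, c("age", "raterType", "omega")])
```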

Or maybe this, using nest_by() on the wide data and indexing the item columns in each nested tibble:

out2 <- mydata %>%
  nest_by(age, raterType) %>%
  mutate(
    alpha = alpha(data[, itemNames])$total$raw_alpha,
    omega = ci.reliability(data[, itemNames], type = "omega",
                           interval.type = "none")$est
  )

Output:

> out2
# A tibble: 15 × 5
# Rowwise:  age, raterType
     age raterType               data   alpha     omega
   <int> <fct>     <list<tibble[,7]>>   <dbl>     <dbl>
 1     1 self               [100 × 7] -0.135    2.76   
 2     1 friend             [100 × 7]  0.138    0.231  
 3     1 parent             [100 × 7] -0.229  255.     
 4     2 self               [100 × 7] -0.421   NA      
 5     2 friend             [100 × 7]  0.0650  58.7    
 6     2 parent             [100 × 7]  0.153   NA      
 7     3 self               [100 × 7] -0.302    0.00836
 8     3 friend             [100 × 7]  0.147    0.334  
 9     3 parent             [100 × 7]  0.196    0.132  
10     4 self               [100 × 7] -0.0699  NA      
11     4 friend             [100 × 7]  0.118    0.214  
12     4 parent             [100 × 7] -0.0303  31.1    
13     5 self               [100 × 7] -0.0166   0.246  
14     5 friend             [100 × 7] -0.192    0.0151 
15     5 parent             [100 × 7]  0.0847  NA     
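
As a sanity check that does not depend on dplyr, the grouped alpha values can be compared with a hand-rolled computation in base R. The sketch below uses the textbook covariance form of Cronbach's alpha, k/(k-1) * (1 - trace(C)/sum(C)), with pairwise-complete covariances. It is not psych's code, and I am assuming rather than guaranteeing that it agrees with raw_alpha from psych::alpha() when items have missing values. raw_alpha(), toy, and item_cols are hypothetical names for illustration:

```r
# Hand-rolled raw alpha from a pairwise covariance matrix (not psych's code)
raw_alpha <- function(items) {
  C <- cov(items, use = "pairwise.complete.obs")
  k <- ncol(C)
  (k / (k - 1)) * (1 - sum(diag(C)) / sum(C))
}

# Hypothetical toy data: in the age-1 group the three items are identical,
# so alpha there should equal 1
toy <- data.frame(
  age       = rep(1:2, each = 5),
  raterType = "self",
  item1     = c(1, 2, 3, 4, 5, 2, 4, 6, 1, 3),
  item2     = c(1, 2, 3, 4, 5, 3, 5, 7, 2, 4),
  item3     = c(1, 2, 3, 4, 5, 2, 5, 6, 1, NA)
)
item_cols <- c("item1", "item2", "item3")

# One data frame of items per age x raterType combination
groups <- split(toy[item_cols], list(toy$age, toy$raterType), drop = TRUE)

# Named numeric vector with one alpha per group
alpha_by_group <- sapply(groups, raw_alpha)
print(alpha_by_group)
```

With the question's data, the same pattern would be split(mydata[itemNames], list(mydata$age, mydata$raterType), drop = TRUE), and the result can be compared against the alpha column of out1 or out2.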