I’d like to calculate the internal consistency (alpha and omega) of items by grouping variables (e.g., age and raterType). Ideally, I’d be able to do this using dplyr/tidyverse. My question is similar to another question (Using dplyr to nest or group two variables, then perform the Cronbach's alpha function or other statistics to the data); however, I can’t get the solution to work in my case.
Here is a minimal example:
library("tidyverse")
library("psych")
library("MBESS")
mydata <- expand.grid(ID = 1:100,
                      age = 1:5,
                      raterType = c("self",
                                    "friend",
                                    "parent"))
set.seed(12345)
mydata$item1 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item2 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item3 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item4 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item5 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item6 <- sample(1:7, nrow(mydata), replace = TRUE)
mydata$item1[sample(nrow(mydata), 100)] <- NA
mydata$item2[sample(nrow(mydata), 100)] <- NA
mydata$item3[sample(nrow(mydata), 100)] <- NA
mydata$item4[sample(nrow(mydata), 100)] <- NA
mydata$item5[sample(nrow(mydata), 100)] <- NA
mydata$item6[sample(nrow(mydata), 100)] <- NA
itemNames <- paste("item", 1:6, sep = "")
To calculate internal consistency for the entire dataset, I would calculate alpha and omega, respectively, by the following code:
alpha(mydata[,itemNames])$total$raw_alpha
ci.reliability(mydata[,itemNames], type = "omega", interval.type = "none")$est
However, I want to calculate alpha and omega for each combination of age and raterType.
Here's my attempt:
mydata %>%
  pivot_longer(cols = c(-age, -raterType, -ID)) %>%
  select(-ID) %>%
  nest_by(age, raterType) %>%
  mutate(alpha = alpha(data)$total$raw_alpha,
         omega = ci.reliability(data, type = "omega", interval.type = "none")$est)
This does not work: the code returns incorrect estimates for omega and throws an error for alpha:
> # This provides the wrong estimates:
> mydata %>%
pivot_longer(cols = c(-age, -raterType, -ID)) %>%
select(-ID) %>%
nest_by(age, raterType) %>%
mutate(omega = ci.reliability(data, type = "omega", interval.type = "none")$est)
# A tibble: 15 × 4
# Rowwise: age, raterType
age raterType data omega
<int> <fct> <list<tibble[,2]>> <dbl>
1 1 self [600 × 2] 0.218
2 1 friend [600 × 2] 0.257
3 1 parent [600 × 2] 0.261
4 2 self [600 × 2] 0.196
5 2 friend [600 × 2] 0.257
6 2 parent [600 × 2] 0.209
7 3 self [600 × 2] 0.179
8 3 friend [600 × 2] 0.225
9 3 parent [600 × 2] 0.247
10 4 self [600 × 2] 0.224
11 4 friend [600 × 2] 0.252
12 4 parent [600 × 2] 0.218
13 5 self [600 × 2] 0.248
14 5 friend [600 × 2] 0.218
15 5 parent [600 × 2] 0.202
>
> # This throws an error:
> mydata %>%
pivot_longer(cols = c(-age, -raterType, -ID)) %>%
select(-ID) %>%
nest_by(age, raterType) %>%
mutate(alpha = alpha(data)$total$raw_alpha)
Number of categories should be increased in order to count frequencies.
Error in `mutate()`:
! Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ The error occurred in row 1.
Caused by error in `FUN()`:
! only defined on a data frame with all numeric-alike variables
Run `rlang::last_error()` to see where the error occurred.
Warning messages:
1: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ NAs introduced by coercion
ℹ The warning occurred in row 1.
2: Problem while computing `alpha = alpha(data)$total$raw_alpha`.
ℹ Item = name had no variance and was deleted but still is counted in the score
ℹ The warning occurred in row 1.
The omega values above do not match the values obtained by running the ci.reliability() function on the corresponding subset of the data:
> alpha(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames])$total$raw_alpha
[1] -0.3018416
> ci.reliability(mydata[which(mydata$age == 3 & mydata$raterType == "self"), itemNames], type = "omega", interval.type = "none")$est
[1] 0.00836356
Answer:
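The items need to stay in wide format (one column per item) within each group, so the pivot_longer() step should be dropped. Perhaps this helps: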
out1 <- mydata %>%
  group_by(age, raterType) %>%
  summarise(alpha = alpha(across(all_of(itemNames)))$total$raw_alpha,
            omega = ci.reliability(across(all_of(itemNames)),
                                   type = "omega", interval.type = "none")$est,
            .groups = 'drop')
Output:
> out1
# A tibble: 15 × 4
age raterType alpha omega
<int> <fct> <dbl> <dbl>
1 1 self -0.135 2.76
2 1 friend 0.138 0.231
3 1 parent -0.229 255.
4 2 self -0.421 NA
5 2 friend 0.0650 58.7
6 2 parent 0.153 NA
7 3 self -0.302 0.00836
8 3 friend 0.147 0.334
9 3 parent 0.196 0.132
10 4 self -0.0699 NA
11 4 friend 0.118 0.214
12 4 parent -0.0303 31.1
13 5 self -0.0166 0.246
14 5 friend -0.192 0.0151
15 5 parent 0.0847 NA
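On dplyr 1.1.0 or later, the same grouped summary can probably be written with pick(), which returns the selected columns of the current group as a data frame. The following is only a sketch under that assumption: it rebuilds the example data from the question, calls both functions with explicit namespaces to avoid masking problems, and stores the result in out1_pick, a name introduced here purely for illustration.

```r
library(dplyr)

# Rebuild the example data from the question (same seed and call order)
mydata <- expand.grid(ID = 1:100, age = 1:5,
                      raterType = c("self", "friend", "parent"))
set.seed(12345)
itemNames <- paste0("item", 1:6)
for (nm in itemNames) mydata[[nm]] <- sample(1:7, nrow(mydata), replace = TRUE)
for (nm in itemNames) mydata[[nm]][sample(nrow(mydata), 100)] <- NA

# Assumes dplyr >= 1.1.0 (for pick()) and that psych and MBESS are installed.
# `out1_pick` is a hypothetical name, not part of the original answer.
out1_pick <- mydata %>%
  group_by(age, raterType) %>%
  summarise(
    alpha = psych::alpha(pick(all_of(itemNames)))$total$raw_alpha,
    omega = MBESS::ci.reliability(pick(all_of(itemNames)),
                                  type = "omega",
                                  interval.type = "none")$est,
    .groups = "drop"
  )

out1_pick
```

Writing psych::alpha() explicitly also guards against ggplot2::alpha() being picked up instead, which can happen if tidyverse is attached after psych.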
Or maybe this, using nest_by() on the wide data:
out2 <- mydata %>%
  nest_by(age, raterType) %>%
  mutate(alpha = alpha(data[, itemNames])$total$raw_alpha,
         omega = ci.reliability(data[, itemNames], type = "omega",
                                interval.type = "none")$est)
Output:
> out2
# A tibble: 15 × 5
# Rowwise: age, raterType
age raterType data alpha omega
<int> <fct> <list<tibble[,7]>> <dbl> <dbl>
1 1 self [100 × 7] -0.135 2.76
2 1 friend [100 × 7] 0.138 0.231
3 1 parent [100 × 7] -0.229 255.
4 2 self [100 × 7] -0.421 NA
5 2 friend [100 × 7] 0.0650 58.7
6 2 parent [100 × 7] 0.153 NA
7 3 self [100 × 7] -0.302 0.00836
8 3 friend [100 × 7] 0.147 0.334
9 3 parent [100 × 7] 0.196 0.132
10 4 self [100 × 7] -0.0699 NA
11 4 friend [100 × 7] 0.118 0.214
12 4 parent [100 × 7] -0.0303 31.1
13 5 self [100 × 7] -0.0166 0.246
14 5 friend [100 × 7] -0.192 0.0151
15 5 parent [100 × 7] 0.0847 NA
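The [600 × 2] nested tibbles in the question's output, versus [100 × 7] here, point to why the original attempt misbehaved: after pivot_longer(), each group holds a character name column and a single value column, so alpha() and ci.reliability() never see the items as separate columns. A small sketch of that difference, using a hypothetical toy data set (4 IDs, 2 ages, 3 items) rather than the question's data:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: 4 IDs x 2 ages, 3 items
set.seed(1)
toy <- expand.grid(ID = 1:4, age = 1:2)
toy$item1 <- sample(1:7, nrow(toy), replace = TRUE)
toy$item2 <- sample(1:7, nrow(toy), replace = TRUE)
toy$item3 <- sample(1:7, nrow(toy), replace = TRUE)

# Long format: items are stacked into `name` (character) and `value`
long_nested <- toy %>%
  pivot_longer(cols = c(-age, -ID)) %>%
  select(-ID) %>%
  nest_by(age)

# Wide format: one column per item survives inside each nested tibble
wide_nested <- toy %>%
  nest_by(age)

# Compare what a reliability function would receive for the first group
dim(long_nested$data[[1]])    # rows = IDs x items, columns = name + value
names(long_nested$data[[1]])
dim(wide_nested$data[[1]])    # rows = IDs, columns = ID + the items
names(wide_nested$data[[1]])
```

In the wide version the nested tibble still contains ID, which is why the answer subsets with data[, itemNames] before calling the reliability functions.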