I have a set of observations that each contains yes/no answers to a set of detailed questions. What I want to do is aggregate the "yes" scores by a reduced set of categories for the questions. I can do this with a "for" loop but I know there must be a better way. Below is some code that set ups a random data set that should clarify what I am trying to do.
#
# Initial data. 10 observations with no/yes (0/1) answers to 15 questions. Questions are
# grouped into 3 categories, 5 questions each. The category is not included in the data set. I want to compute the sums of each category across all 10 observations.
#
# Set up random test data
#
myData<-data.frame(COL1=sample(rep(c(0,1),5),10),
COL2=sample(rep(c(0,1),5),10),
COL3=sample(rep(c(0,1),5),10),
COL4=sample(rep(c(0,1),5),10),
COL5=sample(rep(c(0,1),5),10),
COL6=sample(rep(c(0,1),5),10),
COL7=sample(rep(c(0,1),5),10),
COL8=sample(rep(c(0,1),5),10),
COL9=sample(rep(c(0,1),5),10),
COL10=sample(rep(c(0,1),5),10),
COL11=sample(rep(c(0,1),5),10),
COL12=sample(rep(c(0,1),5),10),
COL13=sample(rep(c(0,1),5),10),
COL14=sample(rep(c(0,1),5),10),
COL15=sample(rep(c(0,1),5),10))
print(myData)
#
# Allocate storage for category totals
#
catSums<-data.frame(
Cat01=rep(NA,10),
Cat02=rep(NA,10),
Cat03=rep(NA,10))
#
# For loop to aggregate sums in each category
#
for (i in 1:10) {
catSums$Cat01[i]=sum(myData[i,c(1:5)])
catSums$Cat02[i]=sum(myData[i,c(6:10)])
catSums$Cat03[i]=sum(myData[i,c(11:15)])
}
print(catSums)
CodePudding user response:
You could use dplyr
:
library(dplyr)
myData %>%
summarise(Cat01 = rowSums(across(COL1:COL5)),
Cat02 = rowSums(across(COL6:COL10)),
Cat03 = rowSums(across(COL11:COL15)))
This returns
# Cat01 Cat02 Cat03
# 1 2 3 4
# 2 2 4 1
# 3 2 4 2
# 4 2 4 4
# 5 4 3 2
# 6 2 1 4
# 7 4 3 2
# 8 0 1 3
# 9 3 1 1
# 10 4 1 2
CodePudding user response:
You could do:
setNames(as.data.frame(sapply(split(1:15, rep(1:3, each = 5)), function(x) {
rowSums(myData[x])})), c('Cat01', 'Cat02', 'Cat03'))
#> Cat01 Cat02 Cat03
#> 1 4 4 3
#> 2 0 1 1
#> 3 2 2 2
#> 4 3 3 2
#> 5 2 3 2
#> 6 3 2 3
#> 7 1 2 2
#> 8 2 3 3
#> 9 4 4 4
#> 10 4 1 3
CodePudding user response:
setNames(
as.data.frame(sapply(c(1,6,11),function(x) rowSums(myData[x:(x 4)]))),
c("Cat01","Cat02", "Cat03")
)
Output:
Cat01 Cat02 Cat03
1 2 0 4
2 1 3 2
3 1 2 3
4 3 2 2
5 2 4 2
6 1 2 1
7 4 3 3
8 2 3 2
9 4 5 2
10 5 1 4