Is there a nice way to sum and aggregate the following dataframe:
Persons item1 item2 item3 ....
1 0 1 3 ....
1 2 2 4 ....
2 1 2 4 ....
3 0 1 1 ....
1 1 1 1 ....
... ... ... ... ....
To:
Persons Items
1 15
2 7
3 2
... ...
So basically: sum up all the items in each row, and if a person appears in more than one row, add those row sums together so that each person ends up with a single total.
CodePudding user response:
Here's one way:
library(dplyr)
Persons <- c(1, 1, 2, 3, 1)
item1 <- c(0, 2, 1, 0, 1)
item2 <- c(1, 2, 2, 1, 1)
item3 <- c(3, 4, 4, 1, 1)
data_1 <- data.frame(Persons, item1, item2, item3)
# Sum the item columns within each row
data_1$Items_raw <- rowSums(data_1[, 2:4])
# Add up the row totals for each person
data_1 <- data_1 %>% group_by(Persons) %>% summarize(Items = sum(Items_raw))
CodePudding user response:
In base R you could use tapply:
...
tapply(rowSums(df[,-1]), df[,1], sum)
1 2 3
15 7 2
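The answer above does not show how df was built, so here is a self-contained sketch of the same tapply call. It assumes the sample data from the question is stored in a data frame named df (that name, and the result variable, are assumptions for illustration), and it converts the named vector that tapply returns into the two-column layout asked for.

```r
# Sample data from the question; the name `df` is an assumption.
df <- data.frame(
  Persons = c(1, 1, 2, 3, 1),
  item1 = c(0, 2, 1, 0, 1),
  item2 = c(1, 2, 2, 1, 1),
  item3 = c(3, 4, 4, 1, 1)
)

# Sum the item columns within each row, then sum those row totals per person.
totals <- tapply(rowSums(df[, -1]), df[, 1], sum)

# tapply returns a named vector; rebuild the Persons/Items layout from it.
result <- data.frame(
  Persons = as.numeric(names(totals)),
  Items = as.vector(totals)
)
print(result)
```

Using df[, -1] rather than explicit column positions means the call does not change when more item columns are added, as long as Persons stays in the first column.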
CodePudding user response:
You can use the dplyr package.
library(dplyr)
df <- data.frame(
Persons = c(1, 1, 2, 3, 1),
item1 = c(0, 2, 1, 0, 1),
item2 = c(1, 2, 2, 1, 1),
item3 = c(3, 4, 4, 1, 1)
)
df %>%
group_by(Persons) %>%
summarise(items = sum(item1, item2, item3))
And you'll obtain:
# A tibble: 3 x 2
Persons items
<dbl> <dbl>
1 1 15
2 2 7
3 3 2
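Listing item1, item2, item3 by hand gets impractical when there are many item columns. A possible variation, sketched under the assumption that every item column name starts with "item", selects those columns by prefix instead. The intermediate column name row_total is made up for this example.

```r
library(dplyr)

df <- data.frame(
  Persons = c(1, 1, 2, 3, 1),
  item1 = c(0, 2, 1, 0, 1),
  item2 = c(1, 2, 2, 1, 1),
  item3 = c(3, 4, 4, 1, 1)
)

totals <- df %>%
  # Row-wise total over every column whose name starts with "item"
  # (assumes the item columns share that prefix).
  mutate(row_total = rowSums(select(., starts_with("item")))) %>%
  group_by(Persons) %>%
  # Add up the row totals for each person.
  summarise(items = sum(row_total))

print(totals)
```

If the item columns do not share a prefix, select(., -Persons) should serve the same purpose by taking everything except the ID column.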
CodePudding user response:
By combining the responses under my post (thanks, everyone!), I found a solution to my problem. I solved it the following way:
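library(plyr)  # provides ddply() and numcolwise()
# Keep the person ID and the row-wise total of the item columns
People = data.frame(People$Persons,
                    rowSums(People[, 2:522]))
colnames(People) = c("Persons", "Items")
# Sum the row totals for each person
People = ddply(People, "Persons", numcolwise(sum))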
It's not a beautiful solution (I wish I could do it in one line), but it works!
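On the one-line wish: a single statement seems possible in base R with aggregate(). The following is only a minimal sketch. It uses a small stand-in for the People data frame (three item columns instead of the real 521), and the result name Totals is an assumption for illustration.

```r
# Small stand-in for the `People` data frame (the real one has many more item columns).
People <- data.frame(
  Persons = c(1, 1, 2, 3, 1),
  item1 = c(0, 2, 1, 0, 1),
  item2 = c(1, 2, 2, 1, 1),
  item3 = c(3, 4, 4, 1, 1)
)

# One statement: row totals over every column except the first, summed per person.
Totals <- aggregate(
  data.frame(Items = rowSums(People[, -1])),
  by = list(Persons = People$Persons),
  FUN = sum
)
print(Totals)
```

This needs no extra packages, and People[, -1] avoids hard-coding the column range 2:522.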