I'm stuck on a data frame manipulation. I have a large ASV matrix with samples as rows and taxa as columns, and I would like to merge specific rows by summing their values column-wise.
Example data frame (code below):
I would like to merge sample-1, sample-2, and sample-3 into a single row, and likewise sample-4 and sample-5. The merged dataset would have only two rows, each containing the column-wise sums of the rows it replaces. (Specifically, the first three rows would become a single row with new ASV values: ASV1=11, ASV2=14, ASV3=1, ASV4=2, ASV5=8.)
> dput(example.matrix)
structure(list(ASV1 = c(8L, 0L, 3L, 6L, 1L), ASV2 = c(1L, 4L,
9L, 3L, 2L), ASV3 = c(1L, 0L, 0L, 1L, 1L), ASV4 = c(0L, 0L, 2L,
3L, 0L), ASV5 = c(0L, 7L, 1L, 4L, 0L)), class = "data.frame", row.names = c("sample-1",
"sample-2", "sample-3", "sample-4", "sample-5"))
CodePudding user response:
We can use:
library(tidyverse)
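# Label each row with the merged group it belongs to, then sum every ASV column per group.
# Note: the original sample names (row names) are not carried into the result.
example.matrix %>%
  group_by(group = c(1, 1, 1, 2, 2)) %>%
  summarize(across(everything(), sum))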
which gives:
# A tibble: 2 x 6
  group  ASV1  ASV2  ASV3  ASV4  ASV5
  <dbl> <int> <int> <int> <int> <int>
1     1    11    14     1     2     8
2     2     7     5     2     3     4
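If you would rather avoid the tidyverse dependency and keep meaningful row names, base R's `rowsum()` may be worth a look. The sketch below rebuilds the example data from the question and assumes a lookup vector, `merge_map`, that says which merged row each sample belongs to. The name `merge_map` and the labels `merged-1` / `merged-2` are made up for illustration; for a large matrix you would probably build this mapping from your sample metadata instead of typing it by hand.

```r
# Example data from the question
example.matrix <- data.frame(
  ASV1 = c(8L, 0L, 3L, 6L, 1L),
  ASV2 = c(1L, 4L, 9L, 3L, 2L),
  ASV3 = c(1L, 0L, 0L, 1L, 1L),
  ASV4 = c(0L, 0L, 2L, 3L, 0L),
  ASV5 = c(0L, 7L, 1L, 4L, 0L),
  row.names = paste0("sample-", 1:5)
)

# Hypothetical lookup: sample name -> label of the merged row (illustrative names)
merge_map <- c(
  "sample-1" = "merged-1",
  "sample-2" = "merged-1",
  "sample-3" = "merged-1",
  "sample-4" = "merged-2",
  "sample-5" = "merged-2"
)

# Look groups up by row name, so the result does not depend on row order
groups <- unname(merge_map[rownames(example.matrix)])

# rowsum() adds up the rows within each group, column by column;
# the group labels become the row names of the result
merged <- rowsum(example.matrix, group = groups)
merged
```

Looking groups up by row name rather than by position (as in `c(1,1,1,2,2)`) should make this less fragile if the matrix is ever reordered. The `merged-1` row should match the sums stated in the question (11, 14, 1, 2, 8).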