I have a data matrix where row and column names are similar. However, row names are repetitive. I want to sum cell values by a unique combination of row names and column names (for example, sum of all cell values having dimension of Row1 * Col1) and create a matrix in R. The new matrix will be sum of all cells by unique combination or Row and Colum names. Thank you
CodePudding user response:
You can do as such. See the comments trough the code as well. Also you need the pckgs that G. Grothendieck gave in his/her answer.
df <- data.frame(Row = c("r_1", "r_2",etc.),
Col = c("Col1", "Col2",etc.),
Value = c(1, 2, etc.))
row_names <- unique(df$Row)
col_names <- unique(df$Col)
#this will give you all possible combinations
combinations <- expand.grid(Row = row_names, Col = col_names)
result <- df %>% group_by(Row, Col) %>% summarize(Value = sum(Value))
result <- left_join(combinations, result, by = c("Row" = "Row", "Col" = "Col"))
result[is.na(result)] <- 0
names(result)[1] <- "R"# choose the value that suits you best
names(result)[2] <- "C"# choose the value that suits you best
CodePudding user response:
Here is a full tidyverse solution in the single pipe:
library(dplyr)
library(tidyr)
rn <- c("name1", "name2", "name3",
"name1", "name2", "name3") # necessary as duplicate names are not allowed in data.frame
# and will be dropped at type cast
matrix(1:18, ncol=3) %>%
`colnames<-`(c("name1", "name2", "name3")) %>%
`rownames<-`(rn) %>% # here is the sample matrix
# alike one from your question ready for analysis
data.frame() %>%
mutate(rn = rn, .before=1) %>%
# restore the row names
# you can store them from your matrix in a variable before type cast
pivot_longer(-rn) %>%
group_by(rn, name) %>%
summarise(value=sum(value)) %>%
pivot_wider()
It will result in:
# A tibble: 3 x 4
# Groups: rn [3]
rn name1 name2 name3
<chr> <int> <int> <int>
1 name1 5 17 29
2 name2 7 19 31
3 name3 9 21 33