Consider the following toy example, in which I extract all comparisons corresponding to a name called A from a matrix called matr.
### Set up example matrix ###
matr <- matrix(c(2,0,3,0,5,0.7,1,0,0.9,6,11,9,0,1,0.5,2,0,1,0.3,3,6,1,0.31,0,0), nrow = 5, ncol = 5)
dimnames(matr) = list(c("A", "B", "A", "C", "A"), c("A", "B", "A", "C", "A"))
matr
# Pretend the matrix is symmetric - for my real matrix, it is
matr[upper.tri(matr, diag = TRUE)] <- NA # keep only the lower triangle
matr
# Set self-comparisons (row name equals column name) to NA
for (rowLoopCounter in 1:nrow(matr)) {
  for (colLoopCounter in 1:ncol(matr)) {
    if (row.names(matr)[rowLoopCounter] == colnames(matr)[colLoopCounter]) {
      matr[rowLoopCounter, colLoopCounter] <- NA
    }
  }
}
A_row <- c(matr[grepl("A", row.names(matr)), ]) # get comparisons in rows
A_col <- c(matr[, grepl("A", colnames(matr))]) # get comparisons in columns
total <- as.numeric(na.omit(unlist(c(A_row, A_col)))) # combine results
total
#[1] 0 6 3 0 0 1
The above implementation is quite verbose, and it only gets the job done for A. I also need to do this for B and C.
This can be done using a for loop (or apply()).
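A minimal sketch of that loop, assuming exact name matching instead of grepl() (so a name such as "AB" would not be picked up by "A"). The helper name get_comparisons and the outer()-based mask are illustrative choices, not part of the original code:

```r
# Rebuild the example matrix from the question
matr <- matrix(c(2,0,3,0,5, 0.7,1,0,0.9,6, 11,9,0,1,0.5, 2,0,1,0.3,3, 6,1,0.31,0,0),
               nrow = 5, ncol = 5)
nms <- c("A", "B", "A", "C", "A")
dimnames(matr) <- list(nms, nms)

# Keep the lower triangle only, then blank out self-comparisons
matr[upper.tri(matr, diag = TRUE)] <- NA
matr[outer(rownames(matr), colnames(matr), "==")] <- NA

# Hypothetical helper: all non-NA values in the rows and columns named `name`
get_comparisons <- function(m, name) {
  in_row <- c(m[rownames(m) == name, , drop = FALSE])
  in_col <- c(m[, colnames(m) == name, drop = FALSE])
  as.numeric(na.omit(c(in_row, in_col)))
}

# One list element per distinct name
totals <- lapply(setNames(nm = unique(rownames(matr))), get_comparisons, m = matr)
totals
```

Here totals$A reproduces the vector total computed by hand above, and B and C are handled by the same call.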
I naively tried using split(), which only works on vectors and gives strange results (it leaves the value 1 out of A and puts it in C for some reason):
splt <- split(matr, colnames(matr)) # using rownames(matr) is equivalent
#$A
#[1] NA NA NA NA 0 6 NA NA NA NA NA 3 NA NA NA
#$B
#[1] 0 NA NA NA NA
#$C
#[1] 0.0 0.9 1.0 NA NA
$A should contain the same elements as total.
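A likely explanation for the odd grouping, sketched below: split() treats the matrix as a plain column-major vector and recycles the five names along it, so every value is grouped by its row position and the column names never enter. The names recycled, by_row and by_col are illustrative; the setup repeats the example matrix so the snippet runs on its own:

```r
# Rebuild and mask the example matrix from the question
matr <- matrix(c(2,0,3,0,5, 0.7,1,0,0.9,6, 11,9,0,1,0.5, 2,0,1,0.3,3, 6,1,0.31,0,0),
               nrow = 5, ncol = 5)
nms <- c("A", "B", "A", "C", "A")
dimnames(matr) <- list(nms, nms)
matr[upper.tri(matr, diag = TRUE)] <- NA
matr[outer(rownames(matr), colnames(matr), "==")] <- NA

# What the question ran: 5 names recycled over 25 values
recycled <- split(matr, colnames(matr))

# The same grouping written out explicitly: one row name per cell
by_row <- split(matr, rownames(matr)[row(matr)])

# Grouping by column needs one column name per cell instead
by_col <- split(matr, colnames(matr)[col(matr)])

# Pair the two groupings by name and drop the NAs
totals <- Map(function(r, k) as.numeric(na.omit(c(r, k))), by_row, by_col)
totals
```

Comparing recycled with by_row shows that the question's call only ever grouped by row, which is why the column-side values of A were missing.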
I recently discovered the new asplit() function, but I get an error:
asplit(matr, c(1, 2))
#Error in array(newx[, i], d.call, dn.call) : 'dims' cannot be of length 0
What I would like from asplit() is output similar to that returned by split(), where values are stored in a named list. However, from running the examples in the documentation for asplit(), there appears to be no way to do this.
CodePudding user response:
You can use split() on both the row and the column names by swapping the row names of which(!is.na(matr), arr.ind = TRUE). Then use mapply() to combine your two lists.
# Get the array indices (row, col) of the non-NA values of matr
ind <- which(!is.na(matr), arr.ind = TRUE)
# Create a list by splitting on the row names
list_1 <- split(x = matr[ind], f = row.names(ind))
# Then substitute the column names as the row names
row.names(ind) <- colnames(matr)[unname(ind[, 2])]
# Create a second list by splitting on the column names
list_2 <- split(x = matr[ind], f = row.names(ind))
# Combine the lists (SIMPLIFY = FALSE guarantees that a list is returned)
mapply(c, list_1, list_2, SIMPLIFY = FALSE)
Output of the mapply() call:
$A
[1] 0 6 3 0 0 1
$B
[1] 0.0 0.0 0.9 6.0
$C
[1] 0.0 0.9 1.0 3.0
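A variant sketch of the same idea that leaves the row names of ind untouched and fixes the factor levels up front. The assumption behind it is that pairing two lists by position is only safe when both contain the same names in the same order, which shared levels guarantee even if a name has values only in rows or only in columns. The names vals, lv, by_row and by_col are illustrative:

```r
# Rebuild and mask the example matrix from the question
matr <- matrix(c(2,0,3,0,5, 0.7,1,0,0.9,6, 11,9,0,1,0.5, 2,0,1,0.3,3, 6,1,0.31,0,0),
               nrow = 5, ncol = 5)
nms <- c("A", "B", "A", "C", "A")
dimnames(matr) <- list(nms, nms)
matr[upper.tri(matr, diag = TRUE)] <- NA
matr[outer(rownames(matr), colnames(matr), "==")] <- NA

# Positions and values of the non-NA cells
ind  <- which(!is.na(matr), arr.ind = TRUE)
vals <- matr[ind]

# Shared levels keep both lists aligned, empty groups included
lv <- union(rownames(matr), colnames(matr))
by_row <- split(vals, factor(rownames(matr)[ind[, "row"]], levels = lv))
by_col <- split(vals, factor(colnames(matr)[ind[, "col"]], levels = lv))

# Map() always returns a list, so no simplification can occur
totals <- Map(c, by_row, by_col)
totals
```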
CodePudding user response:
I'm not entirely sure what you want to achieve, but if it is a list of vectors, one per letter, containing all matrix values where the row and column letters coincide, you can do this (the output below is from the original matr, before any values are set to NA):
library(dplyr) ## for convenient dataframe manipulation
df <-
  cbind(
    expand.grid(row = dimnames(matr)[[1]],
                col = dimnames(matr)[[2]]),
    value = as.vector(matr)
  )
# > head(df)
# row col value
# 1 A A 2.0
# 2 B A 0.0
# 3 A A 3.0
# 4 C A 0.0
# 5 A A 5.0
# 6 A B 0.7
Filter the above df for coinciding row and column letters, and summarise per letter:
df <- df |>
  filter(row == col) |>
  group_by(row) |>
  summarise(total = list(value))
Convert to a named list:
totals <- setNames(df$total, df$row)
Output:
## > totals
## $A
## [1] 2.00 3.00 5.00 11.00 0.00 0.50 6.00 0.31 0.00
##
## $B
## [1] 1
##
## $C
## [1] 0.3
CodePudding user response:
A one-liner would be:
with(na.omit(as.data.frame.table(matr)), split(c(Freq, Freq), c(Var1, Var2)))
$A
[1] 0 6 3 0 0 1
$B
[1] 0.0 0.0 0.9 6.0
$C
[1] 0.0 0.9 1.0 3.0
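One caveat, as far as I know: the one-liner relies on c() combining two factors into a factor, which is the behaviour from R 4.1.0 onwards; older versions return the integer codes, so the list names would be numbers. A hedged sketch of a variant that converts to character first and should not depend on that (df and totals are illustrative names):

```r
# Rebuild and mask the example matrix from the question
matr <- matrix(c(2,0,3,0,5, 0.7,1,0,0.9,6, 11,9,0,1,0.5, 2,0,1,0.3,3, 6,1,0.31,0,0),
               nrow = 5, ncol = 5)
nms <- c("A", "B", "A", "C", "A")
dimnames(matr) <- list(nms, nms)
matr[upper.tri(matr, diag = TRUE)] <- NA
matr[outer(rownames(matr), colnames(matr), "==")] <- NA

# Long format: one row per non-NA cell, with columns Var1 (row name),
# Var2 (column name) and Freq (the value)
df <- na.omit(as.data.frame.table(matr))

# Each value is listed twice: once under its row name, once under its column name
totals <- split(c(df$Freq, df$Freq),
                c(as.character(df$Var1), as.character(df$Var2)))
totals
```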