I will give an example of my problem using a smaller matrix. Say I have a matrix with row names and column names such as this:
set.seed(10)
a <- matrix(rexp(27), ncol = 9, nrow = 3)
colnames(a) <- paste(rep(c("aaa" , "bbb" , "ccc") , each = 3) , rep(c(1:3) , times = 3) , sep = "")
rownames(a) <- c("aaa" , "bbb" , "ccc")
giving matrix a:
          aaa1      aaa2      aaa3      bbb1      bbb2       bbb3      ccc1      ccc2      ccc3
aaa 0.01495641 1.5750419 2.3276229 0.6722683 1.3165471 1.63298388 1.7447187 0.3469224 1.3981074
bbb 0.92022120 0.2316586 0.7291238 0.4265298 0.4132938 0.07119408 0.2929501 0.7950826 1.1104594
ccc 0.75215894 1.0866730 1.2883101 1.1154219 0.6765753 2.56885161 0.6453052 1.3962992 0.1704216
I would like to find efficient code that, for each column, picks the value from the row whose name equals the column name without its trailing digit, and returns the result as a named vector. In this case:
      aaa1       aaa2       aaa3       bbb1       bbb2       bbb3       ccc1       ccc2       ccc3
0.01495641 1.57504185 2.32762287 0.42652979 0.41329383 0.07119408 0.64530516 1.39629918 0.17042160
I obtained this vector using the following code:
b <- c(a[grepl("aaa", rownames(a)), grepl("aaa", colnames(a))],
       a[grepl("bbb", rownames(a)), grepl("bbb", colnames(a))],
       a[grepl("ccc", rownames(a)), grepl("ccc", colnames(a))])
Is there a way to do this efficiently, even if the matrix is much larger and possibly has a different name structure than this?
CodePudding user response:
An easier option is to reshape the matrix to 'long' format by converting it to a data.frame with the table method (as.data.frame.table), and then subset the rows where 'Var1' (the row name) equals 'Var2' (the column name) with its trailing digits removed:
out <- subset(as.data.frame.table(a), Var1 == sub("\\d+$", "", Var2),
              select = c(Var2, Freq))
with(out, setNames(Freq, Var2))
      aaa1       aaa2       aaa3       bbb1       bbb2       bbb3       ccc1       ccc2       ccc3
0.01495641 1.57504185 2.32762287 0.42652979 0.41329383 0.07119408 0.64530516 1.39629918 0.17042160
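To make the reshape step easier to follow, here is a minimal sketch on a tiny deterministic matrix. The matrix `m` and its names are made up for illustration and are not part of the question's data; the pattern "\\d+$" assumes the digits always sit at the end of the column name.

```r
# Hypothetical 2 x 3 matrix; values 1..6 are filled column by column
m <- matrix(1:6, nrow = 2,
            dimnames = list(c("x", "y"), c("x1", "y1", "x2")))

# Long format: one row per cell, with Var1 = row name,
# Var2 = column name and Freq = cell value
long <- as.data.frame.table(m)

# Keep the cells whose row name equals the column name minus trailing digits
keep <- as.character(long$Var1) == sub("\\d+$", "", as.character(long$Var2))

# Named vector of the selected cells, in column order
res <- setNames(long$Freq[keep], as.character(long$Var2[keep]))
```

Because the long table has one row per cell, this approach builds nrow * ncol rows before filtering, which may matter for a very large matrix.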
Or with row/column indexing:
i1 <- match(sub("\\d+$", "", colnames(a)), rownames(a))
a[cbind(i1, seq_along(i1))]
[1] 0.01495641 1.57504185 2.32762287 0.42652979 0.41329383 0.07119408 0.64530516 1.39629918 0.17042160
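Since the question also asks about larger matrices and other name structures, the indexing idea could be wrapped in a small helper. This is only a sketch: `match_by_name` and its `pattern` argument are hypothetical names, and it assumes each column name maps to a row name once the pattern is stripped. Columns with no matching row are dropped here rather than returned as NA.

```r
# Hypothetical helper: for every column, take the cell in the row whose
# name equals the column name with `pattern` removed
match_by_name <- function(mat, pattern = "\\d+$") {
  key <- sub(pattern, "", colnames(mat))
  i <- match(key, rownames(mat))   # row index per column, NA if no row matches
  ok <- !is.na(i)
  # A two-column index matrix selects one (row, column) cell per row of the index
  out <- mat[cbind(i[ok], which(ok))]
  setNames(out, colnames(mat)[ok])
}

# Small made-up example; column "z1" has no matching row and is dropped
m <- matrix(1:8, nrow = 2,
            dimnames = list(c("x", "y"), c("x1", "y1", "x2", "z1")))
res2 <- match_by_name(m)
```

For a different naming scheme, only the pattern should need to change, for example `match_by_name(mat, "_rep\\d+$")` for columns named like "aaa_rep1". This version does a single `match` and a single indexing step, so it avoids building the long table.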