I'd like to look up matrix cells using row and column indices stored in a data frame, ideally in a vectorized way for best performance. However, the most obvious syntax looks up every possible row-column combination, not only the pairs that come from a single data frame row.
Here is a small example:
> m1 <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), 3, 3)
>
> m1
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9
>
> p1 <- data.frame(row = c(2, 3, 1), column = c(3, 1, 2))
>
> p1
  row column
1   2      3
2   3      1
3   1      2
>
> # vectorized indexing that does not work as intended
> m1[p1$row, p1$column]
     [,1] [,2] [,3]
[1,]    8    2    5
[2,]    9    3    6
[3,]    7    1    4
>
> # this works as intended, but is possibly slow due to R-level looping
> sapply(1 : nrow(p1), function (i) { m1[p1[i, "row"], p1[i, "column"]] })
[1] 8 3 4
The sapply call computes the output I expect (only m1[2, 3], m1[3, 1] and m1[1, 2]), but I expect it to be slow for larger data frames because it loops in R code.
Any thoughts on a better (ideally vectorized) way?
CodePudding user response:
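To pick out specific (row, column) pairs, index the matrix with a two-column matrix instead of two separate vectors. Each row of the index matrix is then treated as one (row, column) pair and selects a single cell. So you can try: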
m1[as.matrix(p1)]
# [1] 8 3 4
Or if you have two vectors:
m1[cbind(row_idx, col_idx)]
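As a rough, self-contained sketch of the above, the following reuses m1 and p1 from the question and compares the matrix-index result with the sapply loop. Here row_idx and col_idx are just placeholder vectors taken from p1, and the matrix is built with 1:9 for brevity. Selecting the columns by name before as.matrix is an extra precaution I'm assuming could matter if the data frame has other columns or a different column order. The linear-index variant is only an equivalent alternative that relies on R's column-major storage.

```r
# Example data from the question (matrix built with 1:9 for brevity)
m1 <- matrix(1:9, nrow = 3, ncol = 3)
p1 <- data.frame(row = c(2, 3, 1), column = c(3, 1, 2))

# Placeholder index vectors standing in for row_idx and col_idx
row_idx <- p1$row
col_idx <- p1$column

# Each row of the two-column index matrix is one (row, column) pair
by_matrix <- m1[cbind(row_idx, col_idx)]

# Selecting columns by name guards against extra or reordered columns
by_frame <- m1[as.matrix(p1[, c("row", "column")])]

# Equivalent linear index, relying on column-major storage
by_linear <- m1[(col_idx - 1) * nrow(m1) + row_idx]

# Loop-based reference from the question
by_loop <- sapply(seq_len(nrow(p1)),
                  function(i) m1[p1[i, "row"], p1[i, "column"]])

print(by_matrix)
# [1] 8 3 4
```

All four results should agree. The matrix-index form avoids the per-row function calls of the loop, which is usually where the time goes on large data frames.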