I have many matrices that all have the same column names but different numbers of rows. I would like them all to be uniform length, and have the rows be ordered in consecutive increasing order by the first column. See two examples of matrices that need alteration below:
count 0 1 2 3 4 5 6 7
[1,] 0 0 2 0 1 0 0 0 0
[2,] 1 1 1 4 0 1 0 1 0
[3,] 2 1 1 2 0 2 0 1 2
[4,] 3 0 1 0 0 0 0 0 0
[5,] 4 0 0 0 4 0 0 0 0
[6,] 5 0 0 0 0 0 3 0 0
[7,] 8 0 0 0 0 0 1 0 0
count 0 1 2 3 4 5 6 7
[1,] 0 0 2 0 1 0 0 0 0
[2,] 1 1 1 4 0 1 0 1 0
[3,] 2 1 1 2 0 2 0 1 2
[4,] 3 0 1 0 0 0 0 0 0
[5,] 4 0 0 0 4 0 0 0 0
[6,] 7 0 0 0 0 0 3 0 0
The max value of the count column should always be 8; I would like to insert vectors into the right locations, each starting with the appropriate consecutive value followed by eight trailing zeroes. The above matrices should look like this:
count 0 1 2 3 4 5 6 7
[1,] 0 0 2 0 1 0 0 0 0
[2,] 1 1 1 4 0 1 0 1 0
[3,] 2 1 1 2 0 2 0 1 2
[4,] 3 0 1 0 0 0 0 0 0
[5,] 4 0 0 0 4 0 0 0 0
[6,] 5 0 0 0 0 0 3 0 0
[7,] 6 0 0 0 0 0 0 0 0 # this row has been inserted
[8,] 7 0 0 0 0 0 0 0 0 # this row has been inserted
[9,] 8 0 0 0 0 0 1 0 0
count 0 1 2 3 4 5 6 7
[1,] 0 0 2 0 1 0 0 0 0
[2,] 1 1 1 4 0 1 0 1 0
[3,] 2 1 1 2 0 2 0 1 2
[4,] 3 0 1 0 0 0 0 0 0
[5,] 4 0 0 0 4 0 0 0 0
[6,] 5 0 0 0 0 0 0 0 0 # this row has been inserted
[7,] 6 0 0 0 0 0 0 0 0 # this row has been inserted
[8,] 7 0 0 0 0 0 3 0 0
[9,] 8 0 0 0 0 0 0 0 0 # this row has been inserted
The matrices are embedded in a long list, and many of them already have 9 rows and do not need to be modified, so ideally this solution could be vectorized to work across the list rather than in a for loop. Below is some gross code to produce a toy list of 4 matrices (with the first two elements corresponding to the problematic matrices shown above).
list(matrix(c(0,1,2,3,4,5,8,0,1,1,0,0,0,0,2,1,1,1,0,0,0,0,4,2,0,0,0,0,1,0,0,0,4,0,0,0,1,2,0,0,0,0,0,0,0,0,0,3,1,0,1,1,0,0,0,0,0,0,2,0,0,0,0),nrow=7,ncol=9,dimnames=(list(character(0),c("count",0:7)))),matrix(c(0,1,2,3,4,7,0,1,1,0,0,0,2,1,1,1,0,0,0,4,2,0,0,0,1,0,0,0,4,0,0,1,2,0,0,0,0,0,0,0,0,3,0,1,1,0,0,0,0,0,2,0,0,0),nrow=6,ncol=9,dimnames=(list(character(0),c("count",0:7)))),matrix(c(0,1,2,3,4,5,6,7,8,0,1,0,0,1,0,0,0,0,2,1,1,1,0,0,0,0,0,4,2,0,0,0,0,0,0,1,0,0,0,0,0,4,0,0,0,1,2,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,0,1,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0),nrow=9,ncol=9,dimnames=(list(character(0),c("count",0:7)))),matrix(c(0,1,2,3,4,5,6,7,8,0,1,0,0,1,0,0,0,0,2,1,1,1,0,0,0,0,0,4,2,0,0,0,0,0,0,1,0,0,0,0,0,4,0,0,0,1,2,0,0,0,0,0,0,0,0,0,0,0,0,0,3,1,0,1,1,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0),nrow=9,ncol=9,dimnames=(list(character(0),c("count",0:7)))))
This is similar to another question I asked recently - hopefully I can keep these matrices in a nice simple empty list and not have to pre-specify the dims as an array-based solution might require. Thanks in advance!
CodePudding user response:
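Create a complete zero matrix m with as many rows as the largest matrix in the list and the same number of columns and colnames (resp. dimnames). Then, for each matrix, append the rows of m whose count is missing (found with match) and sort the result by count (with order).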
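# Row count of the largest matrix; assumes at least one matrix in L is already complete
maxrow <- max(sapply(L, nrow))
colnm <- colnames(el(L))
# Template: counts 0..(maxrow - 1) in the first column, zeros elsewhere
m <- `dimnames<-`(cbind(0:(maxrow - 1L), matrix(0L, maxrow, length(colnm) - 1L)),
                  list(NULL, colnm))
# Append the template rows whose count is missing from x, then sort by count
lapply(L, \(x) rbind(x, m[-match(x[, 'count'], m[, 'count']), ]) |> {\(.) .[order(.[, 'count']), ]}())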
# [[1]]
# count 0 1 2 3 4 5 6 7
# [1,] 0 0 2 0 1 0 0 0 0
# [2,] 1 1 1 4 0 1 0 1 0
# [3,] 2 1 1 2 0 2 0 1 2
# [4,] 3 0 1 0 0 0 0 0 0
# [5,] 4 0 0 0 4 0 0 0 0
# [6,] 5 0 0 0 0 0 3 0 0
# [7,] 6 0 0 0 0 0 0 0 0 ##
# [8,] 7 0 0 0 0 0 0 0 0 ##
# [9,] 8 0 0 0 0 0 1 0 0
#
# [[2]]
# count 0 1 2 3 4 5 6 7
# [1,] 0 0 2 0 1 0 0 0 0
# [2,] 1 1 1 4 0 1 0 1 0
# [3,] 2 1 1 2 0 2 0 1 2
# [4,] 3 0 1 0 0 0 0 0 0
# [5,] 4 0 0 0 4 0 0 0 0
# [6,] 5 0 0 0 0 0 0 0 0 ##
# [7,] 6 0 0 0 0 0 0 0 0 ##
# [8,] 7 0 0 0 0 0 3 0 0
# [9,] 8 0 0 0 0 0 0 0 0 ##
#
# [[3]]
# count 0 1 2 3 4 5 6 7
# [1,] 0 0 2 4 0 1 0 1 0
# [2,] 1 1 1 2 0 2 0 1 2
# [3,] 2 0 1 0 0 0 0 0 0
# [4,] 3 0 1 0 0 0 0 0 0
# [5,] 4 1 0 0 0 0 0 0 0
# [6,] 5 0 0 0 4 0 0 0 0
# [7,] 6 0 0 0 0 0 3 0 0
# [8,] 7 0 0 0 0 0 1 0 0
# [9,] 8 0 0 1 0 0 0 0 0
#
# [[4]]
# count 0 1 2 3 4 5 6 7
# [1,] 0 0 2 4 0 1 0 1 0
# [2,] 1 1 1 2 0 2 0 1 2
# [3,] 2 0 1 0 0 0 0 0 0
# [4,] 3 0 1 0 0 0 0 0 0
# [5,] 4 1 0 0 0 0 0 0 0
# [6,] 5 0 0 0 4 0 0 0 0
# [7,] 6 0 0 0 0 0 3 0 0
# [8,] 7 0 0 0 0 0 1 0 0
# [9,] 8 0 0 1 0 0 0 0 0
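To make the one-liner above easier to follow, here is a step-by-step sketch of the same match / rbind / order idea on a single small matrix. The toy data and the names x, template, present and missing_rows are made up for illustration and are not part of the answer's code.

```r
# Hypothetical toy input: counts 0, 1 and 3 are present, 2 and 4 are missing
x <- matrix(c(0, 1, 3,
              5, 6, 7),
            ncol = 2, dimnames = list(NULL, c("count", "0")))

# Zero-filled template holding every count from 0 to 4
template <- cbind(count = 0:4, `0` = 0)

# Template rows whose count already occurs in x
present <- match(x[, "count"], template[, "count"])

# Negative indexing drops those rows, leaving only the missing counts
missing_rows <- template[-present, , drop = FALSE]

# Append the missing rows, then sort everything by count
out <- rbind(x, missing_rows)
out <- out[order(out[, "count"]), , drop = FALSE]
out
```

One caveat with this pattern: if a matrix had no rows at all, present would be integer(0), and negative indexing with an empty vector selects no rows instead of all of them, so that case would need separate handling.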
Data:
L <- list(structure(c(0, 1, 2, 3, 4, 5, 8, 0, 1, 1, 0, 0, 0, 0, 2,
1, 1, 1, 0, 0, 0, 0, 4, 2, 0, 0, 0, 0, 1, 0, 0, 0, 4, 0, 0, 0,
1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 1, 0, 1, 1, 0, 0, 0, 0, 0,
0, 2, 0, 0, 0, 0), dim = c(7L, 9L), dimnames = list(NULL, c("count",
"0", "1", "2", "3", "4", "5", "6", "7"))), structure(c(0, 1,
2, 3, 4, 7, 0, 1, 1, 0, 0, 0, 2, 1, 1, 1, 0, 0, 0, 4, 2, 0, 0,
0, 1, 0, 0, 0, 4, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0, 0, 0, 3, 0, 1,
1, 0, 0, 0, 0, 0, 2, 0, 0, 0), dim = c(6L, 9L), dimnames = list(
NULL, c("count", "0", "1", "2", "3", "4", "5", "6", "7"))),
structure(c(0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 0, 0, 1, 0,
0, 0, 0, 2, 1, 1, 1, 0, 0, 0, 0, 0, 4, 2, 0, 0, 0, 0, 0,
0, 1, 0, 0, 0, 0, 0, 4, 0, 0, 0, 1, 2, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 3, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0,
0, 2, 0, 0, 0, 0, 0, 0, 0), dim = c(9L, 9L), dimnames = list(
NULL, c("count", "0", "1", "2", "3", "4", "5", "6", "7"
))), structure(c(0, 1, 2, 3, 4, 5, 6, 7, 8, 0, 1, 0,
0, 1, 0, 0, 0, 0, 2, 1, 1, 1, 0, 0, 0, 0, 0, 4, 2, 0, 0,
0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 4, 0, 0, 0, 1, 2, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 1, 0, 1, 1, 0, 0, 0, 0,
0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0), dim = c(9L, 9L), dimnames = list(
NULL, c("count", "0", "1", "2", "3", "4", "5", "6", "7"
))))
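Since the question states that the maximum count is always 8, the template size could also be fixed up front instead of being derived from the largest matrix in the list. The following is only a sketch under that assumption; pad_counts and its max_count argument are hypothetical names, and the toy list is made up (smaller than L) so the block runs on its own.

```r
# Hypothetical helper: pad a matrix so its first column covers 0..max_count
pad_counts <- function(x, max_count = 8) {
  # Zero template with the same columns as x
  out <- matrix(0, nrow = max_count + 1, ncol = ncol(x),
                dimnames = list(NULL, colnames(x)))
  out[, 1] <- 0:max_count
  # Copy each existing row into the template row that has the same count
  out[match(x[, 1], out[, 1]), ] <- x
  out
}

# Made-up toy list: the first matrix lacks counts 1 and 3, the second is complete
toy <- list(
  matrix(c(0, 2, 10, 20), ncol = 2, dimnames = list(NULL, c("count", "0"))),
  matrix(c(0, 1, 2, 3, 1, 1, 1, 1), ncol = 2, dimnames = list(NULL, c("count", "0")))
)

res <- lapply(toy, pad_counts, max_count = 3)
res
```

With the data above, the call would presumably be lapply(L, pad_counts). Because rows are placed by match rather than appended, the input does not have to be sorted by count, but a count outside 0..max_count would produce an NA index and make the assignment fail.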
CodePudding user response:
Another possible solution, which is based on the following idea:
First, create a complete matrix of zeros (m). Then iterate with map2 to assign each matrix of the list to the matrix of zeros, considering only the rows where the count columns of both matrices match.
library(tidyverse)
# mylist is the list of matrices from the question
m <- matrix(0, 9, 9)
colnames(m) <- colnames(mylist[[1]])
m[,1] <- 0:8
map2(mylist, rep(list(m), length(mylist)), ~ {.y[.y[,1] %in% .x[,1]] <- .x; .y})
#> [[1]]
#> count 0 1 2 3 4 5 6 7
#> [1,] 0 0 2 0 1 0 0 0 0
#> [2,] 1 1 1 4 0 1 0 1 0
#> [3,] 2 1 1 2 0 2 0 1 2
#> [4,] 3 0 1 0 0 0 0 0 0
#> [5,] 4 0 0 0 4 0 0 0 0
#> [6,] 5 0 0 0 0 0 3 0 0
#> [7,] 6 0 0 0 0 0 0 0 0
#> [8,] 7 0 0 0 0 0 0 0 0
#> [9,] 8 0 0 0 0 0 1 0 0
#>
#> [[2]]
#> count 0 1 2 3 4 5 6 7
#> [1,] 0 0 2 0 1 0 0 0 0
#> [2,] 1 1 1 4 0 1 0 1 0
#> [3,] 2 1 1 2 0 2 0 1 2
#> [4,] 3 0 1 0 0 0 0 0 0
#> [5,] 4 0 0 0 4 0 0 0 0
#> [6,] 5 0 0 0 0 0 0 0 0
#> [7,] 6 0 0 0 0 0 0 0 0
#> [8,] 7 0 0 0 0 0 3 0 0
#> [9,] 8 0 0 0 0 0 0 0 0
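The assignment .y[.y[,1] %in% .x[,1]] <- .x indexes the matrix with a logical vector that is shorter than the matrix itself, so R recycles it over all cells in column-major order. The base R sketch below, on made-up toy data, compares that form with the more explicit row index [keep, ]; the names m, x, keep, a and b are illustrative only.

```r
# Toy template with counts 0..3 and a toy matrix holding counts 1 and 3
m <- matrix(0, 4, 2, dimnames = list(NULL, c("count", "0")))
m[, 1] <- 0:3
x <- matrix(c(1, 3, 7, 9), ncol = 2, dimnames = list(NULL, c("count", "0")))

# Rows of the template whose count occurs in x
keep <- m[, 1] %in% x[, 1]

# Vector-style index: keep (length 4) is recycled over all 8 cells
a <- m
a[keep] <- x

# Explicit row index: same cells, but the intent is easier to read
b <- m
b[keep, ] <- x

identical(a, b)
```

Both forms fill the selected rows in template order, so they appear to rely on the rows of each input matrix already being sorted by count, as they are in the question's examples.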