I have a large sparse matrix. After populating the matrix with some math, I realized I had some infinite values due to a division by zero error. How can I check this matrix for non-finite values?
Here is a toy matrix.
library(Matrix)
A <- Matrix(nrow = 150000, ncol = 150000, data = 0, sparse = TRUE)
A[1, 1] = Inf
A[1, 3] = NA
A[2, 1] = -Inf
Trying to find its non-finite values gives me an error:
test <- A[!is.finite(A)]
#Error: cannot allocate vector of size 83.8 Gb
I also tried scanning this matrix row by row but it takes forever.
library(magrittr)
for (i in 1:nrow(A)) {
  if ((
    A[i, ] %>% .[!is.finite(.)] %>% length
  ) > 0) print(i)
}
I then tried running it in parallel, but I think it is overkill. What's more, it still takes a long time.
library(parallel)
library(magrittr)
numCores <- detectCores() - 1
cl <- makeCluster(numCores)
clusterExport(cl, c("A"))
clusterEvalQ(cl, library(magrittr))
out <- A %>% nrow %>% seq %>% parLapply(cl, X = ., function(i) A[i, ] %>% .[!is.finite(.)])
How can I proceed?
CodePudding user response:
If we want to know whether a sparse matrix A contains any Inf, -Inf, NaN or NA, we can do
any(!is.finite(A@x))
#[1] TRUE
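This is cheap because the x slot holds only the explicitly stored entries. The implicit zeros are finite, so they never need to be inspected and no dense 150000 x 150000 object is created. Below is a minimal sketch on a smaller matrix; the name B, its dimensions, and the extra finite entry are illustrative assumptions, and it assumes a numeric sparse class that has an x slot (such as dgCMatrix).

```r
library(Matrix)

# Hypothetical small example; B and its size are not from the question.
# sparseMatrix() builds a dgCMatrix directly from (i, j, x) triplets.
B <- sparseMatrix(
  i = c(1, 1, 2, 5),
  j = c(1, 3, 1, 5),
  x = c(Inf, NA, -Inf, 2),
  dims = c(1000, 1000)
)

# Only the stored entries live in the x slot, not the 1e6 cells of B
n_stored <- length(B@x)

# Check and count the non-finite stored entries
has_nonfinite <- any(!is.finite(B@x))
n_nonfinite <- sum(!is.finite(B@x))
```

The work here is proportional to the number of stored entries rather than to nrow(B) * ncol(B).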
If we also want to know their positions, we can do
subset(summary(A), !is.finite(x))
#  i j    x
#1 1 1  Inf
#2 2 1 -Inf
#3 1 3   NA
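If only the affected row indices are needed, which is what the row-by-row loop in the question prints, they can be read off the same summary. This is a sketch on a small stand-in matrix; the names B, bad and rows_with_nonfinite are illustrative assumptions.

```r
library(Matrix)

# Hypothetical small stand-in for A, with the same three non-finite entries
B <- sparseMatrix(
  i = c(1, 1, 2),
  j = c(1, 3, 1),
  x = c(Inf, NA, -Inf),
  dims = c(1000, 1000)
)

# summary() lists the stored entries as (i, j, x) triplets with 1-based indices
bad <- subset(summary(B), !is.finite(x))

# Distinct rows that contain at least one non-finite value
rows_with_nonfinite <- sort(unique(bad$i))
```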
Remark:
See "R: element-wise matrix division" for distinctions between is.infinite, !is.finite, is.na and is.nan.
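As a quick illustration of those distinctions, the four predicates can be compared on a small numeric vector (the vector v is only an example):

```r
# One finite value plus each kind of special value
v <- c(1, Inf, -Inf, NaN, NA)

not_finite  <- !is.finite(v)   # TRUE for Inf, -Inf, NaN and NA
is_infinite <- is.infinite(v)  # TRUE for Inf and -Inf only
is_missing  <- is.na(v)        # TRUE for NaN and NA
is_notnum   <- is.nan(v)       # TRUE for NaN only
```

So !is.finite is the broadest of the four checks, which is why it is the one used above.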