Obtain vector of non-zero elements in sparse matrix, keeping both column and row names

Time:07-30

Suppose I have the following matrix:

mat <- matrix(data = c(1, 2, 3, 0, 0, 0, 0, 0, 0, 
                       0, 0, 0, 2, 3, 4, 0, 0, 0,
                       0, 0, 0, 0, 0, 0, 5, 6, 7),
              nrow = 9, 
              dimnames = list(c(paste0("x", 1:3),
                                paste0("y", 1:3),
                                paste0("z", 1:3)),
                              c("a", "b", "c")))

   a b c
x1 1 0 0
x2 2 0 0
x3 3 0 0
y1 0 2 0
y2 0 3 0
y3 0 4 0
z1 0 0 5
z2 0 0 6
z3 0 0 7

Instead of a matrix, I want a vector that keeps only the non-zero elements.

red <- apply(mat, 1, function(x) x[x != 0])
x1 x2 x3 y1 y2 y3 z1 z2 z3 
 1  2  3  2  3  4  5  6  7 

Is there a way to have the reduced vector keep the column names as well? Preferably the names would follow the pattern "colname, separator character, rowname". See the desired output below. Note that we do not know in advance how many columns or rows there will be, nor how they are named.

a=x1 a=x2 a=x3 b=y1 b=y2 b=y3 c=z1 c=z2 c=z3 
   1    2    3    2    3    4    5    6    7 

Thank you in advance!

CodePudding user response:

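One idea is to reshape the matrix to long format, filter out the zeros, and unite the column and row names (this returns a tibble rather than a named vector):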

library(dplyr)
library(tidyr)

data.frame(mat) %>% 
 tibble::rownames_to_column('id') %>% 
 pivot_longer(-1) %>% 
 filter(value != 0) %>% 
 unite(key, name, id, sep = '=')

# A tibble: 9 x 2
  key   value
  <chr> <dbl>
1 a=x1      1
2 a=x2      2
3 a=x3      3
4 b=y1      2
5 b=y2      3
6 b=y3      4
7 c=z1      5
8 c=z2      6
9 c=z3      7

CodePudding user response:

We could also split the matrix by column with asplit() and unlist() the result, although the names then use "." instead of "=" as the separator:

red <- unlist(asplit(mat, 2))
red <- red[red != 0]

Output:

a.x1 a.x2 a.x3 b.y1 b.y2 b.y3 c.z1 c.z2 c.z3 
   1    2    3    2    3    4    5    6    7 

Or, at the cost of some elegance, you could use a regex, as suggested by @user321797, to get exactly the desired output (this assumes the row and column names contain no dots themselves, since gsub() replaces every dot):

names(red) <- gsub(names(red), pattern = "\\.", replacement = "=")

Output:

a=x1 a=x2 a=x3 b=y1 b=y2 b=y3 c=z1 c=z2 c=z3 
   1    2    3    2    3    4    5    6    7 
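
If the row or column names may themselves contain dots, the names can be built directly instead of being patched with a regex. The following is only a minimal base R sketch of that idea, not part of the answer above; the matrix m and the variable lab are hypothetical names made up for illustration.

```r
# Hypothetical example matrix whose dimnames contain dots
m <- matrix(c(1, 0, 0, 2), nrow = 2,
            dimnames = list(c("x.1", "x.2"), c("a.b", "c")))

# Build every "colname=rowname" label up front. outer() returns a
# character matrix with the same shape as m, so flattening both with
# as.vector() keeps values and labels aligned (column-major order).
lab <- outer(rownames(m), colnames(m),
             function(r, cn) paste(cn, r, sep = "="))

red <- setNames(as.vector(m), as.vector(lab))
red <- red[red != 0]  # drop the zero entries, names are kept
red
```

Because the separator is inserted by paste() rather than substituted afterwards, dots inside the original names are left untouched.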

CodePudding user response:

A base solution with which(..., arr.ind = TRUE):

ind <- which(mat != 0, arr.ind = TRUE)
setNames(
  mat[ind],
  paste(colnames(mat)[ind[, 'col']], rownames(mat)[ind[, 'row']], sep = '=')
)

# a=x1 a=x2 a=x3 b=y1 b=y2 b=y3 c=z1 c=z2 c=z3 
#    1    2    3    2    3    4    5    6    7
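
Since the number and names of rows and columns are not known in advance, the same which(..., arr.ind = TRUE) approach can be wrapped in a small helper. This is a sketch under assumptions: the function name nonzero_named, its sep argument, and the test matrix m2 are hypothetical and not part of the original answer.

```r
# Hypothetical helper: returns the non-zero entries of a matrix as a
# vector named "<colname><sep><rowname>"
nonzero_named <- function(m, sep = "=") {
  # Both sets of dimnames are needed to build the labels
  stopifnot(is.matrix(m), !is.null(rownames(m)), !is.null(colnames(m)))
  # One row per non-zero cell, with columns "row" and "col"
  ind <- which(m != 0, arr.ind = TRUE)
  setNames(
    m[ind],
    paste(colnames(m)[ind[, "col"]], rownames(m)[ind[, "row"]], sep = sep)
  )
}

# Usage on a matrix with a different shape and different names
m2 <- matrix(c(0, 4, 0, 0, 9, 1), nrow = 2,
             dimnames = list(c("r.1", "r.2"), c("k1", "k2", "k3")))
res <- nonzero_named(m2)
res
```

Unlike the apply() attempt in the question, this does not rely on each row having exactly one non-zero entry: a row with several non-zero cells simply contributes several named elements.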

CodePudding user response:

Here is a tidyverse solution using deframe(), which turns the two-column result into a named vector:

library(tidyverse)

data.frame(mat) %>% 
  rownames_to_column() %>% 
  pivot_longer(-rowname) %>% 
  mutate(names = paste(name, rowname, sep = "="), .keep = "unused", .before = 1) %>% 
  filter(value != 0) %>% 
  deframe()
a=x1 a=x2 a=x3 b=y1 b=y2 b=y3 c=z1 c=z2 c=z3 
   1    2    3    2    3    4    5    6    7 