Convert list of matrices to a data.frame with row.names as names of list

Time:03-19

I have a list of matrices called list1 with the following structure:

$ENSLAFT00000000003
     start end
[1,]   360 360
[2,]   394 394

$ENSLAFT00000000011
     start end
[1,]    15  15

$ENSLAFT00000000020
     start end
[1,]    45  45

$ENSLAFT00000000023
     start end

$ENSLAFT00000000024
     start  end
[1,]    13   13
[2,]   369  369
[3,]   602  602
[4,]   775  775
[5,]   983  983
[6,]  1200 1200
[7,]  1491 1491

Some of the elements are empty. I would like to transform this list into a data.frame with two columns and the following structure:

ID                         pos
ENSLAFT00000000003         360
ENSLAFT00000000003         394
ENSLAFT00000000011          15
ENSLAFT00000000020          45
ENSLAFT00000000024          13
ENSLAFT00000000024         369
ENSLAFT00000000024         602
ENSLAFT00000000024         775
ENSLAFT00000000024         983
ENSLAFT00000000024        1200
ENSLAFT00000000024        1491 

ENSLAFT00000000023 was omitted in the output as it was an empty matrix in the initial list.

I can get close to the desired structure with:

as.data.frame(do.call(rbind, list1)[,1])

But this loses the list names, which I need as row identifiers.

Do you have any suggestions for doing this data transformation in R?

Best regards

CodePudding user response:

We extract the 'start' column from each matrix by looping over the list, then stack the result into a two-column data.frame. Empty matrices are kept as a single NA row.

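out <- stack(lapply(list1, \(x) {
  st <- x[, "start"]
  # an empty matrix yields a zero-length vector; replace it with NA
  if (length(st) == 0) st <- NA_real_
  st
}))[2:1]  # reorder columns so the ID comes first
names(out) <- c("ID", "pos")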
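
The question asks for empty matrices to be omitted rather than kept as NA rows. A minimal base R sketch of that variant follows, assuming every element is a matrix with a "start" column. The sample data is rebuilt from the first few elements printed in the question, and the names `n` and `out` are only illustrative.

```r
# Sample data rebuilt from the question's printed list (first few elements)
cols <- list(NULL, c("start", "end"))
list1 <- list(
  ENSLAFT00000000003 = matrix(c(360, 394, 360, 394), nrow = 2, ncol = 2, dimnames = cols),
  ENSLAFT00000000011 = matrix(c(15, 15), nrow = 1, ncol = 2, dimnames = cols),
  ENSLAFT00000000023 = matrix(numeric(0), nrow = 0, ncol = 2, dimnames = cols)
)

# Rows per matrix; an empty matrix has 0 rows, so its name is repeated 0 times
n <- vapply(list1, nrow, integer(1))

out <- data.frame(
  ID  = rep(names(list1), times = n),
  pos = unlist(lapply(list1, function(x) x[, "start"]), use.names = FALSE),
  stringsAsFactors = FALSE
)
out
```

Because each ID is repeated by its matrix's row count, empty elements drop out without a separate filtering step.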

CodePudding user response:

Here is a tidyverse option:

library(tidyverse)

map(list1, ~ select(as.data.frame(.x), start)) %>%
  enframe %>%
  unnest(value, keep_empty = TRUE)

Output

  name               start
  <chr>              <dbl>
1 ENSLAFT00000000003   360
2 ENSLAFT00000000003   394
3 ENSLAFT00000000011    15
4 ENSLAFT00000000023    NA

Data

list1 <- list(ENSLAFT00000000003 = structure(c(360, 394, 360, 394), .Dim = c(2L, 
2L), .Dimnames = list(NULL, c("start", "end"))), ENSLAFT00000000011 = structure(c(15, 
15), .Dim = 1:2, .Dimnames = list(NULL, c("start", "end"))), 
    ENSLAFT00000000023 = structure(logical(0), .Dim = c(0L, 2L
    ), .Dimnames = list(NULL, c("start", "end"))))
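
With `keep_empty = TRUE` the empty element stays in the result as an NA row, whereas the question asks for it to be left out and for the columns to be called ID and pos. A hedged sketch of that variant follows, relying on the default `keep_empty = FALSE` of `unnest()`. The sample data is rebuilt from the question's printed list, and `res` is just an illustrative name.

```r
library(tidyverse)

# Sample data rebuilt from the question's printed list (first few elements)
cols <- list(NULL, c("start", "end"))
list1 <- list(
  ENSLAFT00000000003 = matrix(c(360, 394, 360, 394), nrow = 2, ncol = 2, dimnames = cols),
  ENSLAFT00000000011 = matrix(c(15, 15), nrow = 1, ncol = 2, dimnames = cols),
  ENSLAFT00000000023 = matrix(numeric(0), nrow = 0, ncol = 2, dimnames = cols)
)

res <- map(list1, ~ select(as.data.frame(.x), start)) %>%
  enframe(name = "ID") %>%   # list names become the ID column
  unnest(value) %>%          # default keep_empty = FALSE drops zero-row elements
  rename(pos = start)
res
```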

CodePudding user response:

Another possible solution:

library(tidyverse)

set.seed(123)
l <- list(a = as.data.frame(matrix(sample(1:20,4), 2, 2)),
          b = as.data.frame(matrix(sample(1:20,4), 2, 2)),
          c = as.data.frame(matrix(sample(1:40,8), 4, 2)))
l

#> $a
#>   V1 V2
#> 1 15 14
#> 2 19  3
#> 
#> $b
#>   V1 V2
#> 1 10 11
#> 2 18  5
#> 
#> $c
#>   V1 V2
#> 1 14  5
#> 2 25 37
#> 3 26 28
#> 4 27  9

l %>% enframe %>% unnest(everything()) %>% select(ID = 1, pos = 2)

#> # A tibble: 8 × 2
#>   ID      pos
#>   <chr> <int>
#> 1 a        15
#> 2 a        19
#> 3 b        10
#> 4 b        18
#> 5 c        14
#> 6 c        25
#> 7 c        26
#> 8 c        27