Transform a list of list (containing matrices) into a data.frame

Time:11-13

I have a list of lists that I want to transform into a data.frame. Some elements of the "inner" lists contain a matrix, and I want to preserve its columns and their names.

I made some progress using lapply and ended up with another list that follows this pattern:

[[1]]
[[1]]$X
[1] 0.100

[[1]]$Y
                   VAL   LCI  UCI  
AB                 1000  500  1300 
CD                 30    10   400 

This pattern repeats for about 200 elements.

Now I want to create a data.frame with 4 columns: X, VAL, LCI, UCI. The rows would alternate between "AB" and "CD", since these row names are fixed in my list.

Does anyone have an idea how to do this? Here is a reproducible example. It is not fancy, but the list1 produced by this code is exactly the object that I need to transform into a data.frame:

#Create list x
var1 <- 0.100 
var1 <- list(var1)
names(var1) <- "x"

#Create list y
var2 <- matrix(1:6, nrow = 2)
rownames(var2) <- c("AB","CD")
colnames(var2) <- c("VAL","LCI","UCI")
var2 <- list(var2)
names(var2) <- "y"

#Create a second element
var3 <- 0.200
var3 <- list(var3)
names(var3) <- "x"

var4 <- matrix(7:12, nrow = 2)
rownames(var4) <- c("AB","CD")
colnames(var4) <- c("VAL","LCI","UCI")
var4 <- list(var4)
names(var4) <- "y"

#Create a list
list1 <- list(c(var1,var2),c(var3,var4))

Answer:

I am not sure I completely understand your request, but I will give it a try. Based on your example, please find a reprex below.

Reprex

  • Code of the function DFbindRows
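DFbindRows <- function(x){
  # Convert each inner list to a data.frame, then stack them row-wise
  x <- do.call(rbind, lapply(x, as.data.frame))
  # Drop the literal "y." prefix that as.data.frame() adds to the matrix columns
  names(x) <- sub("^y\\.", "", names(x))
  return(x)
}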
  • Test of the function on your data list1
DF <- DFbindRows(list1)
#>       x   VAL   LCI   UCI
#> AB  0.1     1     3     5
#> CD  0.1     2     4     6
#> AB1 0.2     7     9    11
#> CD1 0.2     8    10    12

NB: R appends the number after AB or CD automatically, because duplicated row names are not allowed in a data frame.
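
To illustrate the NB above, here is a minimal sketch showing where the suffixes come from. The two small data frames a and b are made up for the illustration and are not part of your data:

```r
# Two small data frames that share the row names "AB" and "CD"
a <- data.frame(VAL = 1:2, row.names = c("AB", "CD"))
b <- data.frame(VAL = 3:4, row.names = c("AB", "CD"))

# rbind() keeps the row names unique by appending a numeric suffix
combined <- rbind(a, b)
rownames(combined)
#> [1] "AB"  "CD"  "AB1" "CD1"
```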

  • Check of the object class
class(DF)
#> [1] "data.frame"

Created on 2021-11-10 by the reprex package (v2.0.1)


Workaround for the row names problem

To circumvent the problem I pointed out in the Nota Bene above, it is possible

  • to create an ID column in which the rows AB and CD alternate, and
  • to set numbers as row.names.

The function DFbindRows2 below returns such a data frame.

Reprex

  • Code of the function DFbindRows2
DFbindRows2 <- function(x){
  x <- do.call(rbind, lapply(x, as.data.frame))
  # Assumes every matrix has exactly the two rows "AB" and "CD", in that order
  x$ID <- rep(c("AB", "CD"), times = nrow(x)/2)
  # Move ID to the first column
  x <- x[, c("ID", setdiff(names(x), "ID"))]
  row.names(x) <- seq_len(nrow(x))
  names(x) <- sub("^y\\.", "", names(x))
  return(x)
}
  • Test of the function on your data list1
DF <- DFbindRows2(list1)
#>   ID   x   VAL   LCI   UCI
#> 1 AB 0.1     1     3     5
#> 2 CD 0.1     2     4     6
#> 3 AB 0.2     7     9    11
#> 4 CD 0.2     8    10    12
  • Check of the object class
class(DF)
#> [1] "data.frame"
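
DFbindRows2 hard-codes the labels "AB" and "CD". If the row names might differ between elements, a possible variant is to read the ID from the row names of each matrix. This is only a sketch: DFbindRows3 is a hypothetical name, and it assumes every inner list has a scalar element x and a matrix element y with row names, as in your list1 (rebuilt here in compact form so that the block runs on its own):

```r
# Rebuild the example data in compact form
m1 <- matrix(1:6,  nrow = 2, dimnames = list(c("AB", "CD"), c("VAL", "LCI", "UCI")))
m2 <- matrix(7:12, nrow = 2, dimnames = list(c("AB", "CD"), c("VAL", "LCI", "UCI")))
list1 <- list(list(x = 0.100, y = m1), list(x = 0.200, y = m2))

# Hypothetical helper: one data frame per inner list, ID taken from the matrix
DFbindRows3 <- function(x) {
  pieces <- lapply(x, function(el) {
    # el$x is recycled over the matrix rows; row.names = NULL gives plain
    # integer row names, so rbind() has no duplicates to rename
    data.frame(ID = rownames(el$y), x = el$x, el$y,
               row.names = NULL, stringsAsFactors = FALSE)
  })
  do.call(rbind, pieces)
}

DF3 <- DFbindRows3(list1)
names(DF3)
#> [1] "ID"  "x"   "VAL" "LCI" "UCI"
```

Because the matrix is passed to data.frame() without a name, its columns keep their own names, so no "y." prefix has to be removed afterwards.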
