R: Creating a Column For Each Element in a List

I am working with the R programming language. I have a list ("my_list") that looks something like this - each element in the list (e.g. [[i]]) has a different number of subelements (e.g. [[i]][j]) :

    > my_list
    [[1]]
    [1] "subelement1"   "subelement2"   "subelement3"
    
    [[2]]
    [1] "subelement1"                             "subelement2" "subelement3"         "subelement4"           "subelement5"    
    
    [[3]]
    [1] "subelement1"                           "subelement2"   "subelement3"  "subelement4"         "subelement5"
    
    [[4]]
    [1] "subelement1"                   "subelement2"           "subelement3" "subelement4" "subelement5" 
    
    > summary(my_list)
           Length Class  Mode     
      [1,] 3      -none- character
      [2,] 5      -none- character
      [3,] 5      -none- character
      [4,] 5      -none- character
      [5,] 5      -none- character
      [6,] 5      -none- character
      [7,] 5      -none- character
      [8,] 5      -none- character
      [9,] 5      -none- character
     [10,] 5      -none- character
     [11,] 5      -none- character
     [12,] 6      -none- character

For each element in this list, I want to extract each of these subelements and combine them all into one dataframe (the rows in this dataframe will not necessarily have the same number of filled columns). Since I don't know the maximum number of subelements, I tried to find it out, but some parsing is still involved (many entries in the "Length" column are not numbers for some reason?):

summary = summary(my_list)

> summary

    Var1   Var2      Freq
1      A Length         3
2      B Length         5
3      C Length         5
4      D Length         5
5      E Length         5
6      F Length         5
7      G Length         5
8      H Length         5

####

96    R3 Length         5
97    S3 Length         5
98    T3 Length         5
99    U3 Length         5
100   V3 Length         5

####

101    A  Class    -none-
102    B  Class    -none-
103    C  Class    -none-
104    D  Class    -none-

######

296   R3   Mode character
297   S3   Mode character
298   T3   Mode character
299   U3   Mode character
300   V3   Mode character

    summary = data.frame(summary)
    freq = as.numeric(gsub("([0-9]+).*$", "\\1", summary$Freq))
    freq = freq[!is.na(freq)]

> max(freq)
[1] 6

With this very "roundabout way", I now know there are at most 6 subelements, and I can create 6 corresponding columns:

col1 = sapply(my_list,function(x) x[1])
col2 = sapply(my_list,function(x) x[2])
col3 = sapply(my_list,function(x) x[3])
col4 = sapply(my_list,function(x) x[4])
col5 = sapply(my_list,function(x) x[5])
col6 = sapply(my_list,function(x) x[6])

#final answer : desired output
final_data = data.frame(col1, col2, col3, col4, col5, col6)

My Question: Would there have been an easier way to find out the maximum number of subelements in this list and then create a data frame with the correct number of columns? I.e. Is there an "automatic" way to create a data frame with the same number of columns as subelements in the list and name these columns accordingly (e.g. col1, col2, col3, etc.)?

Thanks!

CodePudding user response：

Your solution is functional, so obviously take this with a grain of salt, but it's possible to find the maximum length of a sublist with one loop.

max_length <- 0
# <<- is needed so the assignment reaches max_length outside the function
invisible(lapply(my_list, \(x) { if (length(x) > max_length) max_length <<- length(x) }))
> max_length
[1] 6
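
The same maximum can also be found without an explicit loop. The sketch below uses base R's `lengths()`, which returns the length of each list element as an integer vector, so no text parsing is needed. The `my_list` here is made-up sample data standing in for the real list, and `sub_lengths` is just an illustrative name.

```r
# Made-up sample data standing in for the asker's my_list
my_list <- list(
  c("subelement1", "subelement2", "subelement3"),
  c("subelement1", "subelement2", "subelement3", "subelement4", "subelement5"),
  c("a", "b", "c", "d", "e", "f")
)

# lengths() gives one integer per list element
sub_lengths <- lengths(my_list)

# The widest element decides how many columns are needed
max_length <- max(sub_lengths)
```

This gets the count directly from the list, so the `summary()` table (which is text rather than numbers) never has to be parsed.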

To make a dataframe with the corresponding columns a similar approach can be used:

#create an empty dataframe to add rows to
df <- data.frame(matrix(ncol = max_length, nrow = 0))
colnames(df) <- sprintf("col%d", seq_len(max_length))

#add rows, padding shorter elements with NA so every row has max_length entries
for (x in my_list) {
  length(x) <- max_length
  df[nrow(df) + 1, ] <- x
}

See this post regarding sprintf. Since you need to know the maximum row length going in, two loops are necessary, one to find the max length, and one to fill the data frame.
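
As a rough sketch of the two-pass idea described above, one option is to allocate a character matrix of the right size up front and fill it row by row, which avoids growing a data frame inside a loop. The sample `my_list` and the name `mat` are assumptions made for illustration; they are not from the original question.

```r
# Made-up sample data standing in for the asker's my_list
my_list <- list(
  c("subelement1", "subelement2", "subelement3"),
  c("subelement1", "subelement2", "subelement3", "subelement4", "subelement5"),
  c("a", "b", "c", "d", "e", "f")
)

# Pass 1: find the widest element
max_length <- max(lengths(my_list))

# Pre-allocate a matrix filled with NA, one row per list element
mat <- matrix(NA_character_, nrow = length(my_list), ncol = max_length)

# Pass 2: copy each element into the leading cells of its row
for (i in seq_along(my_list)) {
  x <- my_list[[i]]
  mat[i, seq_along(x)] <- x
}

# Convert to a data frame and name the columns col1, col2, ...
df <- as.data.frame(mat)
colnames(df) <- sprintf("col%d", seq_len(max_length))
```

Cells that a shorter element does not reach simply keep their initial `NA`.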

CodePudding user response：

Try this

mx <- max(sapply(my_list, length))

df <- do.call(rbind, lapply(my_list, \(x) {
  if (length(x) == mx) x else c(x, rep(NA, mx - length(x)))
}))

df <- data.frame(df)
colnames(df) <- paste0("col", 1:mx)

output

         col1        col2        col3        col4        col5
1 subelement1 subelement2 subelement3        <NA>        <NA>
2 subelement1 subelement2 subelement3 subelement4 subelement5
3 subelement1 subelement2 subelement3 subelement4 subelement5
4 subelement1 subelement2 subelement3 subelement4 subelement5
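
A slightly more compact variant of the same padding idea, offered only as a sketch: assigning a larger length to a vector with `length<-` pads it with `NA`, so the explicit `if`/`else` can be dropped. The sample `my_list` and the name `padded` are illustrative assumptions.

```r
# Made-up sample data standing in for the asker's my_list
my_list <- list(
  c("subelement1", "subelement2", "subelement3"),
  c("subelement1", "subelement2", "subelement3", "subelement4", "subelement5")
)

mx <- max(lengths(my_list))

# `length<-`(x, mx) extends each vector to mx entries, filling with NA
padded <- lapply(my_list, `length<-`, mx)

# Stack the padded vectors as rows and name the columns
df <- as.data.frame(do.call(rbind, padded))
names(df) <- paste0("col", seq_len(mx))
```

Because every padded vector has the same length, `rbind` produces a regular matrix with no recycling.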