lists of tibble to column in data.frame


I want to create a column that is a list of tibbles (with different numbers of rows). The straightforward way fails. Example:

> x <- data.frame('a' = 1:2, 
                  'b' = list(tibble('c' = 1:4, 'd' = 1:4),
                             tibble('c' = 1:3, 'd' = 1:3)))
Error in (function (..., row.names = NULL, check.rows = FALSE, check.names = TRUE,  : 
  arguments imply differing number of rows: 4, 3

I can avoid the error by wrapping the list in I(). However, when I do so, unnest() leaves the column untouched:

> x <- data.frame('a' = 1:2, 
                  'b' = I(list(tibble('c' = 1:4, 'd' = 1:4),
                             tibble('c' = 1:3, 'd' = 1:3))))
> x %>% unnest(cols = b) 
# A tibble: 2 x 2
      a b               
  <int> <I<list>>       
1     1 <tibble [4 x 2]>
2     2 <tibble [3 x 2]>

How can I create a column that is a list of tibbles and that I can later unnest?

CodePudding user response:

It's much easier to create list columns using tibbles instead of data.frames (see, e.g., Hadley's note on this).

You can fix your code by switching from data.frame() to tibble():


library(tibble)
library(tidyr)  # also re-exports %>%

x <- tibble(
  'a' = 1:2,
  'b' = list(
    tibble('c' = 1:4, 'd' = 1:4),
    tibble('c' = 1:3, 'd' = 1:3)
  )
)
x
#> # A tibble: 2 × 2
#>       a b               
#>   <int> <list>          
#> 1     1 <tibble [4 × 2]>
#> 2     2 <tibble [3 × 2]>

x %>% tidyr::unnest(b)
#> # A tibble: 7 × 3
#>       a     c     d
#>   <int> <int> <int>
#> 1     1     1     1
#> 2     1     2     2
#> 3     1     3     3
#> 4     1     4     4
#> 5     2     1     1
#> 6     2     2     2
#> 7     2     3     3

Created on 2022-03-31 by the reprex package (v2.0.1)
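
If the data.frame has already been built with I(), as in the question, one possible workaround is to strip the "AsIs" class from the column afterwards. This is only a sketch: it assumes that the "AsIs" wrapper is what keeps unnest() from recognising the column as a list, which matches the <I<list>> output shown in the question but may depend on the tidyr version. The name `long` is just an illustrative variable.

```r
library(tibble)
library(tidyr)

# Build the data.frame the way the question does, using I() to protect the list
x <- data.frame(a = 1:2,
                b = I(list(tibble(c = 1:4, d = 1:4),
                           tibble(c = 1:3, d = 1:3))))

# unclass() removes the "AsIs" class and leaves a plain list column
x$b <- unclass(x$b)

# The plain list column can now be expanded
long <- unnest(x, cols = b)
dim(long)
```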

CodePudding user response:

You can create the data.frame without the list column first and then add the list:

x <- data.frame(a = 1:2)
x$b <- list(tibble('c' = 1:4, 'd' = 1:4),
            tibble('c' = 1:3, 'd' = 1:3))

str(x)
# 'data.frame': 2 obs. of  2 variables:
# $ a: int  1 2
# $ b:List of 2
#  ..$ : tibble [4 x 2] (S3: tbl_df/tbl/data.frame)
#  .. ..$ c: int  1 2 3 4
#  .. ..$ d: int  1 2 3 4
#  ..$ : tibble [3 x 2] (S3: tbl_df/tbl/data.frame)
#  .. ..$ c: int  1 2 3
#  .. ..$ d: int  1 2 3
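
Because b is stored as a plain list here (no I() wrapper), it should also unnest as the question asks. A minimal sketch, assuming tibble and tidyr are installed; `long` is an illustrative name:

```r
library(tibble)
library(tidyr)

# Create the data.frame first, then attach the list column
x <- data.frame(a = 1:2)
x$b <- list(tibble(c = 1:4, d = 1:4),
            tibble(c = 1:3, d = 1:3))

# Expand the list column: one output row per row of each nested tibble
long <- unnest(x, cols = b)
dim(long)
```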