R language: How can I flatten/expand lists embedded in a data frame?

Time:06-23

I have a data frame in which one column is a list of vectors of varying lengths. This code creates an example:

df1 <- data.frame(
  A=c("able", "baker", "carr"),
  B=c("whiskey", "tango", "foxtrot")
)
df1$C <- list(14, c(2,18,32), c(10,6))

The actual data originated in an SPSS database. (BTW, I could not figure out how to create this example with a single statement.)
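On the single-statement aside: one possibility, sketched here under the assumption that an `AsIs`-classed list column is acceptable, is to wrap the list in base R's `I()` so that `data.frame()` keeps it as one column instead of spreading it across several. The column then carries the extra class `"AsIs"`, which may or may not matter downstream.

```r
# Build the example in one statement; I() stops data.frame() from
# splitting the list into separate columns.
df1 <- data.frame(
  A = c("able", "baker", "carr"),
  B = c("whiskey", "tango", "foxtrot"),
  C = I(list(14, c(2, 18, 32), c(10, 6)))
)

# C is still a list with one entry per row.
is.list(df1$C)
```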

I'd like to convert it to a data frame like the one created by the following code:

df2 <- data.frame(
  A=c("able", rep("baker",3), rep("carr",2)),
  B=c("whiskey", rep("tango",3), rep("foxtrot",2)),
  C=c(14, 2, 18, 32, 10, 6)
)

I don't want to resort to ugly surgery and looping -- been there, done that.

CodePudding user response:

library(tidyr)
df2 <- unnest(df1, cols = "C")

Result:

# A tibble: 6 × 3
  A     B           C
  <chr> <chr>   <dbl>
1 able  whiskey    14
2 baker tango       2
3 baker tango      18
4 baker tango      32
5 carr  foxtrot    10
6 carr  foxtrot     6

CodePudding user response:

No explicit looping, and I'm not sure whether this counts as ugly surgery, but here is a base R approach (much cleaner thanks to @onyambu):

# Repeat each row of df1 once per element in its C entry, then
# overwrite C with the flattened values: res => data.frame
res <- transform(
  df1[rep(seq_len(nrow(df1)), lengths(df1$C)), ],
  C = unlist(df1$C),
  row.names = NULL
)
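To make the row-repetition idea easier to follow, here is the same technique broken into steps. This is only an illustrative sketch: the names `n_per_row` and `row_idx` are mine, not part of the answer above, and it assumes `C` has no empty entries.

```r
# Rebuild the question's example data.
df1 <- data.frame(
  A = c("able", "baker", "carr"),
  B = c("whiskey", "tango", "foxtrot")
)
df1$C <- list(14, c(2, 18, 32), c(10, 6))

# Number of values held in each row's C entry.
n_per_row <- lengths(df1$C)                     # 1 3 2

# Each row index repeated once per value in that row.
row_idx <- rep(seq_len(nrow(df1)), n_per_row)   # 1 2 2 2 3 3

# Take the repeated rows (without the list column), then attach
# the flattened values and reset the row names.
res <- df1[row_idx, c("A", "B")]
res$C <- unlist(df1$C)
rownames(res) <- NULL
```

Because `unlist()` walks the list in row order, its output lines up with the repeated rows.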