I'm (once more) stuck with flattening nested lists.
I have this tibble with some list-columns (originating from a JSON format).
library(tidyr)
library(dplyr)
df = tibble(id = c(1, 2, 3),
            branch = list(NULL, list(colA = 'abc', colB = 'mno'),
                          list(list(colA = 'def', colB = 'uvw'),
                               list(colA = 'ghi', colB = 'xyz'))))
I want to unnest_wider column 'branch'. That works with rows 1 and 2:
df %>%
  slice(1:2) %>%
  unnest_wider(branch)
However, row 3 consists of a list of lists which I have to unnest_longer first:
bind_rows(
  df %>% slice(1, 2),
  df %>% slice(3) %>% unnest_longer(branch)) %>%
  unnest_wider(branch)
The above code gives the desired output, but I'm looking for a generic solution like:
If an element of column 'branch' is of type 'unnamed list' (indicating that it is a list of lists), then unnest_longer it. Afterwards, apply unnest_wider to the whole column 'branch'.
Any help appreciated!
CodePudding user response:
A little bit convoluted but here's a possible solution:
- Iterate through the rows of your df
- Determine if it's a named list by checking names(df$branch[[index]])
- If unnamed --> slice and unnest_longer(); if named --> slice only
- Finally, unnest_wider()
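library(tidyr)
library(dplyr)
library(purrr)
# Process one row at a time; seq_len() is also safe for a zero-row df
map_df(seq_len(nrow(df)), function(x) {
  b <- df$branch[[x]]
  # Skip NULL here: unnest_longer() may drop NULL rows depending on the tidyr version
  if (!is.null(b) && is.null(names(b))) {
    # unnamed list = list of records, so expand to one row per record
    df %>% slice(x) %>% unnest_longer(branch)
  } else {
    # NULL or a single named record: keep the row as is
    df %>% slice(x)
  }
}) %>%
  unnest_wider(branch)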
Which returns:
# A tibble: 4 × 3
     id colA  colB
  <dbl> <chr> <chr>
1     1 NA    NA
2     2 abc   mno
3     3 def   uvw
4     3 ghi   xyz
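If you'd rather avoid iterating row by row, another option is to normalize the column first so that every element is a list of records, and then unnest the whole column in one go. This is only a sketch under the assumptions of the example data: `as_records` and `result` are names made up for illustration, and a "record" is assumed to be a named list while a "list of records" is an unnamed list.

```r
library(tidyr)
library(dplyr)
library(purrr)

df <- tibble(
  id = c(1, 2, 3),
  branch = list(
    NULL,
    list(colA = "abc", colB = "mno"),
    list(
      list(colA = "def", colB = "uvw"),
      list(colA = "ghi", colB = "xyz")
    )
  )
)

# Hypothetical helper: turn every element into a list of records.
as_records <- function(b) {
  if (is.null(b)) {
    list(NULL)   # one empty record, so the row survives unnest_longer()
  } else if (is.null(names(b))) {
    b            # unnamed list: assumed to already be a list of records
  } else {
    list(b)      # single named record: wrap it in a list
  }
}

result <- df %>%
  mutate(branch = map(branch, as_records)) %>%
  unnest_longer(branch) %>%   # one row per record
  unnest_wider(branch)        # one column per field

result
```

Wrapping NULL in a one-element list is just one way to keep the empty row; recent tidyr versions also offer a keep_empty argument in unnest_longer() that should serve the same purpose.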
CodePudding user response:
library(tidyverse)
df <- tibble(
  id = c(1, 2, 3),
  branch = list(
    NULL, list(colA = "abc", colB = "mno"),
    list(
      list(colA = "def", colB = "uvw"),
      list(colA = "ghi", colB = "xyz")
    )
  )
)
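# Unnest one group of rows; `grp` arrives as the split name ("TRUE" or "FALSE")
unnester <- function(x, grp) {
  if (as.logical(grp)) {
    # rows holding a list of records need one row per record first
    x <- x |> unnest_longer(branch)
  }
  unnest_wider(x, branch)
}
df |>
  rowwise() |>
  # heuristic: each record has 2 fields here, so more than 2 leaf names means several records
  mutate(grp = length(names(unlist(branch))) > 2) |>
  ungroup() |>
  split(~grp) |>
  imap_dfr(~ unnester(.x, .y)) |>
  select(-grp)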