I have a nested tibble, that I would like to unnest. Two list columns (street_address and status) contain both character vectors and lists and one list column (country) contains only character vectors. When unnesting the tibble, an error is produced, apparently due to the fact that there are both character vectors and lists in the two columns with both types of entries.
df <- tibble::tribble(
~id, ~country, ~street_address, ~status,
"2008-002231-32-GB", c("United Kingdom", "Netherlands"), list(c(NA, NA)), list(c(NA, NA)),
"2020-001060-28-SE", c("Denmark", "Denmark", "Denmark", "Denmark"), c("Palle Juul Jensens Blvrd 67", "Palle Juul Jensens Boulevard 99", "Palle Juul Jensens Blvrd 67", "Palle Juul Jensens Boulevard 99"), c("Non-Commercial", "Non-Commercial", "Non-Commercial", "Non-Commercial")
)
df
# A tibble: 2 × 4
id country street_address status
<chr> <list> <list> <list>
1 2008-002231-32-GB <chr [2]> <list [1]> <list [1]>
2 2020-001060-28-SE <chr [4]> <chr [4]> <chr [4]>
df %>%
unnest(cols = c(country, street_address, status))
# >Error: Can't combine `..1$street_address` <list> and `..2$street_address` <character>.
Created on 2021-12-14 by the reprex package (v2.0.1)
It seems that it is the presence of the list entries in the columns that is the problem (all in the format of list(c(NA, NA))). One option could perhaps be to change those observations to a character vector (or make them NA, as they seem to all be NAs), but I cannot figure out how to do that or whether that would solve the problem. Any help would be greatly appreciated.
NB Please note, that this is an updated question, as the first data I submitted with the question, which I produced using dpasta() was not a good representation of my actual data.
The desired result should look something like this:
# A tibble: 8 × 4
id country street_address status
<chr> <chr> <chr> <chr>
1 2020-001060-28-SE Denmark Palle Juul Jensens Blvrd 67 Non-Commercial
2 2020-001060-28-SE Denmark Palle Juul Jensens Boulevard 99 Non-Commercial
3 2020-001060-28-SE Denmark Palle Juul Jensens Blvrd 67 Non-Commercial
4 2020-001060-28-SE Denmark Palle Juul Jensens Boulevard 99 Non-Commercial
5 2008-002231-32-GB United Kingdom NA NA
6 2008-002231-32-GB Netherlands NA NA
>
``
CodePudding user response:
You can unnest the data using unnest
function in tidyr
. The code looks like following:
library(tidyr)
df %>%
mutate(r = map(street_address, ~data.frame(t(.))), s = map(status, ~data.frame(t(.)))) %>%
unnest(r, s) %>%
select(-street_address, -status)
And the output looks like this:
# A tibble: 6 x 12
id country t... X1 X2 X3 X4 t...1 X11 X21 X31 X41
<chr> <chr> <list> <chr> <chr> <chr> <chr> <lis> <chr> <chr> <chr> <chr>
1 2008-~ United ~ <lgl ~ NA NA NA NA <lgl~ NA NA NA NA
2 2008-~ Netherl~ <lgl ~ NA NA NA NA <lgl~ NA NA NA NA
3 2020-~ Denmark <NULL> Palle~ Palle~ Palle~ Pall~ <NUL~ Non-~ Non-~ Non-~ Non-~
4 2020-~ Denmark <NULL> Palle~ Palle~ Palle~ Pall~ <NUL~ Non-~ Non-~ Non-~ Non-~
5 2020-~ Denmark <NULL> Palle~ Palle~ Palle~ Pall~ <NUL~ Non-~ Non-~ Non-~ Non-~
6 2020-~ Denmark <NULL> Palle~ Palle~ Palle~ Pall~ <NUL~ Non-~ Non-~ Non-~ Non-~
CodePudding user response:
library(tidyverse)
df <- tibble::tribble(
~ id,
~ country,
~ street_address,
~ status,
"2008-002231-32-GB",
c("United Kingdom", "Netherlands"),
list(c(NA, NA)),
list(c(NA, NA)),
"2020-001060-28-SE",
c("Denmark", "Denmark", "Denmark", "Denmark"),
c(
"Palle Juul Jensens Blvrd 67",
"Palle Juul Jensens Boulevard 99",
"Palle Juul Jensens Blvrd 67",
"Palle Juul Jensens Boulevard 99"
),
c(
"Non-Commercial",
"Non-Commercial",
"Non-Commercial",
"Non-Commercial"
)
)
df %>% mutate(res = map_chr(street_address, class)) %>%
group_split(res) %>%
map(~unnest(data = ., c(country, street_address, status))) %>%
map_df(~unnest(data = ., c(country, street_address, status))) %>%
select(-res)
#> # A tibble: 8 x 4
#> id country street_address status
#> <chr> <chr> <chr> <chr>
#> 1 2020-001060-28-SE Denmark Palle Juul Jensens Blvrd 67 Non-Commerci~
#> 2 2020-001060-28-SE Denmark Palle Juul Jensens Boulevard 99 Non-Commerci~
#> 3 2020-001060-28-SE Denmark Palle Juul Jensens Blvrd 67 Non-Commerci~
#> 4 2020-001060-28-SE Denmark Palle Juul Jensens Boulevard 99 Non-Commerci~
#> 5 2008-002231-32-GB United Kingdom <NA> <NA>
#> 6 2008-002231-32-GB United Kingdom <NA> <NA>
#> 7 2008-002231-32-GB Netherlands <NA> <NA>
#> 8 2008-002231-32-GB Netherlands <NA> <NA>
Created on 2021-12-14 by the reprex package (v2.0.1)