How to merge two nested columns?

Time:08-15

I have the following data set:

> ex
# A tibble: 98 × 5
   Country    pred_data         data             model     prdct     
   <chr>      <list>            <list>           <list>    <list>    
 1 Albania    <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
 2 Angola     <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
 3 Azerbaijan <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
 4 Algeria    <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
 5 Armenia    <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
 6 Laos       <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
 7 Argentina  <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
 8 Austria    <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
 9 Barbados   <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
10 Belgium    <tibble [10 × 2]> <tibble [9 × 3]> <betareg> <dbl [10]>
# … with 88 more rows
# ℹ Use `print(n = ...)` to see more rows

I am trying to create a new column that combines pred_data and prdct (a merge or cbind), so that it shows up in the data set as a <tibble [10 × 3]>. But when I run the following code:

res <- ex %>%
  group_by(Country) %>%
  nest() %>%
  mutate(pr2 = map2(pred_data, prdct, merge)) %>%
  ungroup()

it gives me the following error:

Error in `mutate()`:
! Problem while computing `pr2 = map2(pred_data, prdct, merge)`.
ℹ The error occurred in group 1: Country = "Albania".
Caused by error in `map2()`:
! object 'pred_data' not found
Backtrace:
 1. ... %>% ungroup()
 8. purrr::map2(pred_data, prdct, merge)

Does someone know what I'm doing wrong? Can someone help me obtain the wanted result?

CodePudding user response:

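The issue is that you call group_by() and nest() before map2(), which is not necessary. After nest(), pred_data and prdct are packed inside the new data list-column, so map2() can no longer find them. Also, for your desired result use cbind(), not merge().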
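
To see why pred_data is reported as missing, it can help to inspect the column names after nest(). The sketch below uses a small made-up stand-in for ex (called toy here; the inner columns x and z and all values are invented for illustration). Only the names Country, pred_data, and prdct come from the question.

```r
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# Hypothetical stand-in for `ex`: one tibble and one numeric vector per row
toy <- tibble(
  Country   = c("Albania", "Angola"),
  pred_data = list(tibble(x = 1:3, z = 4:6), tibble(x = 1:3, z = 7:9)),
  prdct     = list(c(0.1, 0.2, 0.3), c(0.4, 0.5, 0.6))
)

# group_by() + nest() packs every non-grouping column into a list-column `data`
nested <- toy %>%
  group_by(Country) %>%
  nest()

names(nested)
#> [1] "Country" "data"

# pred_data and prdct now live inside each element of `data`
names(nested$data[[1]])
#> [1] "pred_data" "prdct"
```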

Using some fake example data based on the gapminder dataset:

library(gapminder)
library(dplyr, warn.conflicts = FALSE)
library(tidyr)
library(purrr)

# example data
gap_nested <- gapminder %>%
  group_by(country) %>%
  nest() %>%
  mutate(prdct = map(country, ~ runif(12))) 

gap_nested <- gap_nested %>%
  mutate(data = map2(data, prdct, ~ cbind(.x, prdct = .y)))

head(gap_nested)
#> # A tibble: 6 × 3
#> # Groups:   country [6]
#>   country     data          prdct     
#>   <fct>       <list>        <list>    
#> 1 Afghanistan <df [12 × 6]> <dbl [12]>
#> 2 Albania     <df [12 × 6]> <dbl [12]>
#> 3 Algeria     <df [12 × 6]> <dbl [12]>
#> 4 Angola      <df [12 × 6]> <dbl [12]>
#> 5 Argentina   <df [12 × 6]> <dbl [12]>
#> 6 Australia   <df [12 × 6]> <dbl [12]>

gap_nested$data[[1]]
#>    continent year lifeExp      pop gdpPercap      prdct
#> 1       Asia 1952  28.801  8425333  779.4453 0.08012722
#> 2       Asia 1957  30.332  9240934  820.8530 0.82086328
#> 3       Asia 1962  31.997 10267083  853.1007 0.80174251
#> 4       Asia 1967  34.020 11537966  836.1971 0.31049317
#> 5       Asia 1972  36.088 13079460  739.9811 0.32704996
#> 6       Asia 1977  38.438 14880372  786.1134 0.68114361
#> 7       Asia 1982  39.854 12881816  978.0114 0.75686794
#> 8       Asia 1987  40.822 13867957  852.3959 0.40456933
#> 9       Asia 1992  41.674 16317921  649.3414 0.83969222
#> 10      Asia 1997  41.763 22227415  635.3414 0.64467924
#> 11      Asia 2002  42.129 25268405  726.7341 0.34743910
#> 12      Asia 2007  43.828 31889923  974.5803 0.54325215
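
Carried over to the column names from the question, the same idea might look like the sketch below. The data are again a made-up stand-in (toy, with invented inner columns x and z), and bind_cols() is swapped in for cbind() here only so that the result stays a tibble. The last call illustrates why merge() is not a substitute: with no shared column to join on, it returns every combination of rows.

```r
library(dplyr, warn.conflicts = FALSE)
library(purrr)

# Hypothetical stand-in for `ex`; all values are invented
toy <- tibble(
  Country   = c("Albania", "Angola"),
  pred_data = list(tibble(x = 1:3, z = 4:6), tibble(x = 1:3, z = 7:9)),
  prdct     = list(c(0.1, 0.2, 0.3), c(0.4, 0.5, 0.6))
)

# No group_by()/nest(): map2() walks the two list-columns in parallel
res <- toy %>%
  mutate(pr2 = map2(pred_data, prdct, ~ bind_cols(.x, tibble(prdct = .y))))

dim(res$pr2[[1]])
#> [1] 3 3

# merge() finds no common column, so it builds the Cartesian product
# (3 rows x 3 values = 9 rows) instead of a side-by-side bind
dim(merge(toy$pred_data[[1]], toy$prdct[[1]]))
#> [1] 9 3
```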

CodePudding user response:

We could also do without map, as long as we wrap the result in list() so that it stays a list column:

library(dplyr)

ex |>
  group_by(country) |>
  # each group holds a single row, so [[1]] pulls out that row's tibble and vector
  mutate(pr2 = list(bind_cols(data[[1]], tibble(prdct = prdct[[1]])))) |>
  ungroup()

Output:

# A tibble: 142 × 4
   country     data              prdct      pr2              
   <fct>       <list>            <list>     <list>           
 1 Afghanistan <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
 2 Albania     <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
 3 Algeria     <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
 4 Angola      <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
 5 Argentina   <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
 6 Australia   <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
 7 Austria     <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
 8 Bahrain     <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
 9 Bangladesh  <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
10 Belgium     <tibble [12 × 5]> <dbl [12]> <tibble [12 × 6]>
# … with 132 more rows
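
A related option, not used in the answer above, is dplyr's rowwise(), which evaluates mutate() one row at a time so that each list-column element arrives already unwrapped. A minimal sketch on made-up data (toy below is hypothetical, not the gapminder-based ex):

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical toy data with the same column layout as `ex`
toy <- tibble(
  country = c("A", "B"),
  data    = list(tibble(year = 1:3), tibble(year = 4:6)),
  prdct   = list(c(0.1, 0.2, 0.3), c(0.4, 0.5, 0.6))
)

res <- toy |>
  rowwise() |>
  # inside rowwise(), `data` is one tibble and `prdct` one numeric vector;
  # list() wraps the combined tibble so it fits back into a list-column
  mutate(pr2 = list(bind_cols(data, tibble(prdct = prdct)))) |>
  ungroup()

res$pr2[[2]]$prdct
#> [1] 0.4 0.5 0.6
```

Compared with group_by(country), this does not rely on every country appearing in exactly one row.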

Data (inspired by @stefan):

library(gapminder)
library(dplyr)

ex <-
  gapminder |>
  group_by(country) |>
  nest() |>
  mutate(prdct = list(runif(12))) |>
  ungroup()