I have a vector and a list of the same length. The list contains vectors of arbitrary lengths, like so:
vec1 <- c("a", "b", "c")
list1 <- list(c(1, 3, 2),
c(4, 5, 8, 9),
c(5, 2))
What is the fastest, most effective way to create a data frame in which each element of vec1 is repeated as many times as the length of the corresponding element of list1, paired with the values from list1?
Expected output:
# col1 col2
# 1 a 1
# 2 a 3
# 3 a 2
# 4 b 4
# 5 b 5
# 6 b 8
# 7 b 9
# 8 c 5
# 9 c 2
I have included a tidy solution as an answer, but I was wondering if there are other ways to approach this task.
CodePudding user response:
In base R, set the names of the list with 'vec1' and use stack to return a two-column data.frame:
stack(setNames(list1, vec1))[2:1]
Output:
ind values
1 a 1
2 a 3
3 a 2
4 b 4
5 b 5
6 b 8
7 b 9
8 c 5
9 c 2
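Note that stack names its columns values and ind, and returns ind as a factor. If the exact layout from the question is needed (col1 as character, col2 as numeric), a minimal sketch could look like this; the name res is only illustrative:

```r
vec1 <- c("a", "b", "c")
list1 <- list(c(1, 3, 2),
              c(4, 5, 8, 9),
              c(5, 2))

# stack() gives columns 'values' and 'ind'; [2:1] puts 'ind' first
res <- stack(setNames(list1, vec1))[2:1]

# rename to match the expected output in the question
names(res) <- c("col1", "col2")

# 'ind' comes back as a factor; convert it if a character column is wanted
res$col1 <- as.character(res$col1)

str(res)
```

Whether the factor conversion matters depends on what is done with the column afterwards.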
If we want a tidyverse approach, use enframe to build a two-column tibble with a list column, then unnest to expand it:
library(tibble)
library(dplyr)
library(tidyr)
list1 %>%
setNames(vec1) %>%
enframe(name = 'col1', value = 'col2') %>%
unnest(col2)
# A tibble: 9 × 2
col1 col2
<chr> <dbl>
1 a 1
2 a 3
3 a 2
4 b 4
5 b 5
6 b 8
7 b 9
8 c 5
9 c 2
CodePudding user response:
This tidy solution repeats each element of vec1 as many times as the length of the corresponding vector in list1, then flattens both results into the columns of a tibble.
library(purrr)
library(tibble)
tibble(col1 = flatten_chr(map2(vec1, map_int(list1, length), function(x, y) rep(x, times = y))),
col2 = flatten_dbl(list1))
# # A tibble: 9 × 2
# col1 col2
# <chr> <dbl>
# 1 a 1
# 2 a 3
# 3 a 2
# 4 b 4
# 5 b 5
# 6 b 8
# 7 b 9
# 8 c 5
# 9 c 2
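The same replicate-then-flatten idea can also be written without purrr. This is only a sketch in base R, assuming every element of list1 is an atomic vector of one type; the name df is illustrative:

```r
vec1 <- c("a", "b", "c")
list1 <- list(c(1, 3, 2),
              c(4, 5, 8, 9),
              c(5, 2))

# lengths() returns the length of each list element: 3, 4, 2
# rep() then repeats each element of vec1 that many times
df <- data.frame(col1 = rep(vec1, times = lengths(list1)),
                 col2 = unlist(list1),
                 stringsAsFactors = FALSE)
df
```

Because rep and unlist are both vectorised, this avoids building one intermediate vector per element.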
CodePudding user response:
A tidyr/tibble approach could also be unnest_longer:
library(dplyr)
library(tidyr)
tibble(vec1, list1) |>
unnest_longer(list1)
Output:
# A tibble: 9 × 2
vec1 list1
<chr> <dbl>
1 a 1
2 a 3
3 a 2
4 b 4
5 b 5
6 b 8
7 b 9
8 c 5
9 c 2
CodePudding user response:
Another possible solution, based on purrr::map2_dfr:
library(purrr)
map2_dfr(vec1, list1, ~ data.frame(col1 = .x, col2 = .y))
#> col1 col2
#> 1 a 1
#> 2 a 3
#> 3 a 2
#> 4 b 4
#> 5 b 5
#> 6 b 8
#> 7 b 9
#> 8 c 5
#> 9 c 2