I have a data frame with a list-column: for each row, that column holds a nested tibble with two columns (two vectors). I would like to add another list-column to the data frame whose elements each hold a single vector (rather than two), namely the first three elements of one of the vectors nested in the existing list-column.
A simple reproducible example is provided below:
library(tidyr)  # nest() comes from tidyr

df <- data.frame(state = c(rep("Alabama", 5), rep("Alaska", 5), rep("Arizona", 5), rep("Arkansas", 5), rep("California", 5)),
                 letter = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k", "l", "m", "n", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y"),
                 freq = c(8, 7, 4, 3, 1, 19, 15, 7, 4, 2, 10, 6, 3, 2, 2, 11, 10, 10, 5, 4, 50, 33, 22, 11, 1))
df <- nest(df, letter_list = c(letter, freq))
In the context of this reprex, I would like to have a third column in df that has, for each state, a list of the first three elements of letter (which is contained in letter_list).
I have attempted to use purrr functions, such as map(), in conjunction with the head() function to mutate a new variable, but this has been unsuccessful: my new column is populated with lists of length 0.
If possible, a solution using the tidyverse would be ideal.
Any help would be greatly appreciated!
Answer:
Use map to loop over the list column, select the letter column, and take the first three values either with Extract ([) or with slice_head:
library(dplyr)
library(purrr)

df %>%
  mutate(letter_new = map(letter_list, ~
    .x %>%
      select(letter) %>%
      slice_head(n = 3) %>%
      pull(letter)))
-output
# A tibble: 5 × 3
  state      letter_list      letter_new
  <chr>      <list>           <list>
1 Alabama    <tibble [5 × 2]> <chr [3]>
2 Alaska     <tibble [5 × 2]> <chr [3]>
3 Arizona    <tibble [5 × 2]> <chr [3]>
4 Arkansas   <tibble [5 × 2]> <chr [3]>
5 California <tibble [5 × 2]> <chr [3]>
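The Extract ([) route mentioned above is not spelled out in the answer, so here is a minimal sketch of how it might look. It rebuilds the reprex data in a shorter form (using the built-in letters vector, which is an assumption of convenience), and the name df_extract is purely illustrative.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Rebuild the nested data from the question (same values, shorter construction)
df <- tibble(
  state  = rep(c("Alabama", "Alaska", "Arizona", "Arkansas", "California"), each = 5),
  letter = letters[1:25],
  freq   = c(8, 7, 4, 3, 1, 19, 15, 7, 4, 2, 10, 6, 3, 2, 2,
             11, 10, 10, 5, 4, 50, 33, 22, 11, 1)
) %>%
  nest(letter_list = c(letter, freq))

# For each nested tibble, pull out the letter vector and index its first three values
df_extract <- df %>%
  mutate(letter_new = map(letter_list, ~ .x$letter[1:3]))

df_extract$letter_new[[1]]
```

One design difference worth keeping in mind: indexing with [1:3] pads the result with NA when a group has fewer than three rows, whereas head(x, 3) and slice_head(n = 3) simply return what is available.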
NOTE: if the result needs to be kept as a tibble, we don't need the last pull step.
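As a sketch of what that note implies, dropping the final pull() leaves each element of the new column as a one-column tibble with three rows rather than a character vector. The data construction below mirrors the reprex in a shorter form, and df_tbl is an illustrative name only.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Rebuild the nested data from the question (same values, shorter construction)
df <- tibble(
  state  = rep(c("Alabama", "Alaska", "Arizona", "Arkansas", "California"), each = 5),
  letter = letters[1:25],
  freq   = c(8, 7, 4, 3, 1, 19, 15, 7, 4, 2, 10, 6, 3, 2, 2,
             11, 10, 10, 5, 4, 50, 33, 22, 11, 1)
) %>%
  nest(letter_list = c(letter, freq))

# Same pipeline as the answer, but without pull(): each element stays a tibble
df_tbl <- df %>%
  mutate(letter_new = map(letter_list, ~
    .x %>%
      select(letter) %>%
      slice_head(n = 3)))

df_tbl$letter_new[[1]]
```

Keeping the tibble form can be convenient if the new column will later be passed to unnest(), while the vector form is lighter when only the values are needed.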
Or using base R (the \(x) lambda shorthand requires R 4.1 or later):
df$letter_new <- lapply(df$letter_list, \(x) head(x$letter, 3))