Create a column with unique values from list columns

I have a dataset in RStudio in which some columns contain lists. Here is an example where column "a" and column "c" hold a list in each row.

What am I looking for?

I need to create a new column that collects the unique values from columns a, b, and c, skipping NA or NULL values.

The expected result is the column "desired_result".

library(tibble)

test <- tibble(a = list(c("x1","x2"), c("x1","x3"),"x3"),
               b = c("x1", NA,NA),
               c = list(c("x1","x4"),"x4","x2"),
               desired_result = list(c("x1","x2","x4"),c("x1","x3","x4"),c("x2","x3")))

What have I tried so far?

I tried the following, but it does not produce the expected result shown in column "desired_result":

test$attempt_1_ <-lapply(apply((test[, c("a","b","c"), drop = T]),
MARGIN = 1, FUN= c, use.names= FALSE),unique)

CodePudding user response:

We may use pmap to loop row by row over the corresponding elements of columns 'a' to 'c', combine them with c(), remove the NA values (na.omit), and keep the sorted unique values, stored as a list column 'desired_result2'.

library(dplyr)
library(purrr)
test <- test %>% 
    mutate(desired_result2 = pmap(across(a:c), ~ sort(unique(na.omit(c(...))))))

Checking against the OP's expected result:

> all.equal(test$desired_result, test$desired_result2)
[1] TRUE
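
For anyone who prefers to avoid the extra packages, the same row-wise idea can be sketched in base R with Map. This is only a minimal illustration, not part of the original answer: the helper name combine_unique and the standalone vectors a, b and c_col are made up here to mirror the columns of 'test'.

```r
# Standalone copies of the example columns (illustrative names, not from the post)
a     <- list(c("x1", "x2"), c("x1", "x3"), "x3")
b     <- c("x1", NA, NA)
c_col <- list(c("x1", "x4"), "x4", "x2")

# Hypothetical helper: flatten one "row" of inputs, drop NA, keep sorted unique values
combine_unique <- function(...) {
  values <- unlist(list(...), use.names = FALSE)
  sort(unique(values[!is.na(values)]))
}

# Map walks the three inputs in parallel, one element per row;
# unname() guards against names being attached to the result
result <- unname(Map(combine_unique, a, b, c_col))
```

With a tibble like 'test', the same call could be written as Map(combine_unique, test$a, test$b, test$c) and assigned to a new column.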