Create a Column with Unique Values from List Columns

Time:03-09

I have a dataset in RStudio in which some columns contain lists. Here is an example where columns "a" and "c" hold a list element in each row.

What am I looking for?

I need to create a new column that collects the unique values from columns a, b, and c for each row, skipping NA or NULL values.

The expected result is shown in column "desired_result".

library(tibble)
test <- tibble(a = list(c("x1","x2"), c("x1","x3"),"x3"),
               b = c("x1", NA,NA),
               c = list(c("x1","x4"),"x4","x2"),
               desired_result = list(c("x1","x2","x4"),c("x1","x3","x4"),c("x2","x3")))

What have I tried so far?

I tried the following, but it does not produce the expected result shown in column "desired_result":

test$attempt_1_ <-lapply(apply((test[, c("a","b","c"), drop = T]),
MARGIN = 1, FUN= c, use.names= FALSE),unique)

CodePudding user response:

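We can use pmap() to loop over the corresponding elements of columns 'a' to 'c' row by row, drop the NA values with na.omit(), and keep the unique values. The result is stored as a list column, 'desired_result2'. The sort() call puts each row's values in the same order as the expected column, so the two can be compared directly.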

library(dplyr)
library(purrr)
test <- test %>% 
    mutate(desired_result2 = pmap(across(a:c), ~ sort(unique(na.omit(c(...))))))
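
To make the per-row logic easier to follow, here is a small sketch of what the expression inside pmap() does for one row. It uses the values from the second row of the example data, and the variable names (row_values, no_na, result) are only illustrative.

```r
# Values of a, b and c for the second row of the example data
row_values <- c(c("x1", "x3"), NA, "x4")

# na.omit() drops the missing value contributed by column b
no_na <- na.omit(row_values)

# unique() removes duplicates; sort() puts the values in alphabetical order
result <- sort(unique(no_na))
print(result)
```

This should give "x1" "x3" "x4", which matches the second element of 'desired_result'.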

Checking against the OP's expected result:

> all.equal(test$desired_result, test$desired_result2)
[1] TRUE
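
If you prefer to avoid the extra packages, a base R version along the same lines might look like the sketch below. This is an alternative, not part of the answer above. The names c_col, combine_row and base_result are hypothetical, and the columns are rebuilt here as plain lists so the snippet runs on its own.

```r
# Same example columns as in the question, as plain R objects (no tibble needed)
a <- list(c("x1", "x2"), c("x1", "x3"), "x3")
b <- c("x1", NA, NA)
c_col <- list(c("x1", "x4"), "x4", "x2")

# Combine one row's values, drop NA, de-duplicate and sort
combine_row <- function(x, y, z) sort(unique(na.omit(c(x, y, z))))

# Map() walks the three columns in parallel, one row at a time,
# and always returns a list, which suits a list column
base_result <- unname(Map(combine_row, a, b, c_col))
print(base_result)
```

With the tibble from the question, the same idea would be Map(combine_row, test$a, test$b, test$c), with the result assigned to a new column.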