Consider the mtcars data and something like:
mtcars %>% select(mpg,cyl,disp,hp) %>% mutate_all(distinct())
I want to keep only the distinct values in each column. I understand this will make the data frame's columns unequal in length, so I would also like to know whether the gaps can be filled with NAs.
In short, I want to apply unique() to every column separately, without having to write something like unique(mtcars$cyl) for each of the columns.
CodePudding user response:
A base R solution:
lapply(mtcars, unique)
Here, unique() accepts a vector x and returns a (possibly shorter) vector consisting of the unique values. As you noted, the lengths of the unique collections will differ, so we use lapply() to obtain the answer as a list.
Given what I think you're trying to do, this might be a more sensible approach than padding with NA entries, because it seems like the only thing you want is the list of unique values.
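If you do want the NA padding from the question, here is a minimal sketch that builds on the lapply() result. The names cols, uniq, n_max and padded are just illustrative, and it assumes that padding at the bottom of each column is acceptable. It relies on the fact that assigning a larger length() to a vector fills the new slots with NA.

```r
# Columns of interest, taken from the question
cols <- c("mpg", "cyl", "disp", "hp")

# One vector of unique values per column (the lengths differ)
uniq <- lapply(mtcars[cols], unique)

# Length of the longest vector of unique values
n_max <- max(lengths(uniq))

# Extend every vector to n_max; `length<-` pads the tail with NA
padded <- as.data.frame(lapply(uniq, function(v) {
  length(v) <- n_max
  v
}))

head(padded)
```

Note that the rows of padded no longer correspond to individual cars: each column is simply its own list of unique values, followed by NA.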
CodePudding user response:
If I understand correctly, this is what you are looking for.
First, convert the data frame columns to a list of vectors.
Then replace the duplicates in each vector with NA, so that every column keeps its original length, and use map_dfr to combine the results into a single data frame:
library(tidyverse)
mtcars %>%
  dplyr::select(mpg, cyl, disp, hp) %>%
  as.list() %>%
  map_dfr(~replace(., duplicated(.), NA))
     mpg   cyl  disp    hp
   <dbl> <dbl> <dbl> <dbl>
 1  21       6  160    110
 2  NA      NA   NA     NA
 3  22.8     4  108     93
 4  21.4    NA  258     NA
 5  18.7     8  360    175
 6  18.1    NA  225    105
 7  14.3    NA   NA    245
 8  24.4    NA  147.    62
 9  NA      NA  141.    95
10  19.2    NA  168.   123
# ... with 22 more rows
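The same replacement can also be written with across(), which is closer to the mutate_all() attempt in the question. This is only a sketch, assuming dplyr 1.0.0 or later is installed; the name result is illustrative. Unlike the map_dfr version, it returns an ordinary data frame and keeps the row names of mtcars.

```r
library(dplyr)

# In every selected column, keep the first occurrence of each value
# and overwrite later repeats with NA (row count stays at 32)
result <- mtcars %>%
  select(mpg, cyl, disp, hp) %>%
  mutate(across(everything(), ~ replace(.x, duplicated(.x), NA)))

head(result, 10)
```

Here the NA values sit wherever a duplicate occurred, so each remaining value stays in the row of the car where it first appeared.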