How to extract unique values WITH NAs

Time:06-26

I need to extract unique values from a column, but I need to keep the NAs. When I use unique(), it only returns values different from NA. Is there a way to filter unique values while keeping the NAs?

My column consists of articles' DOIs, but there are some NAs in it. I'm using unique() to account for duplicate articles in my sample, but if it keeps ignoring NAs, it automatically excludes the articles that haven't provided a DOI.

I've seen similar questions here, but I wasn't able to solve my problem with the solutions from other posts.

my column

In the function's documentation, I read:

Missing values ("NA") are regarded as equal, numeric and complex ones differing from NaN; character strings will be compared in a “common encoding”; for details, see match (and duplicated) which use the same concept.

What should I do?

CodePudding user response:

unique() on the column returns each distinct value once, and that includes NA: if there are 100 NAs, it returns a single NA. To keep every NA row while still dropping duplicated DOIs, combine duplicated() with is.na():

subset(dat, !duplicated(D1) | is.na(D1))

!duplicated(D1) returns TRUE for the first occurrence of each value and FALSE for every later repeat. is.na(D1) returns TRUE for each NA and FALSE otherwise. Combining them with | gives TRUE when either condition is TRUE, so the logical vector is TRUE for first occurrences as well as for every NA element, and subset() uses it to filter the rows.
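To see the difference on a small case, here is a minimal sketch. The names dat and D1 follow the answer above; the DOI strings and the title column are made up for illustration and are not from the original question.

```r
# Hypothetical sample data: `dat` and `D1` follow the answer,
# the DOI values and the `title` column are invented for illustration.
dat <- data.frame(
  D1    = c("10.1/a", "10.1/a", NA, "10.1/b", NA, "10.1/b"),
  title = c("A", "A again", "No DOI 1", "B", "No DOI 2", "B again"),
  stringsAsFactors = FALSE
)

# unique() keeps one copy of each value, so the two NAs collapse into one.
unique_dois <- unique(dat$D1)
print(unique_dois)

# Keep the first occurrence of each DOI, plus every row whose DOI is missing.
result <- subset(dat, !duplicated(D1) | is.na(D1))
print(result)
```

In this sketch, result keeps rows 1, 3, 4 and 5: the first row for each DOI and both rows without a DOI. Note that articles without a DOI cannot be checked for duplication this way, so they would need to be compared on another column such as the title.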
