R: How to get factor levels found in data

Time:10-03

How can I list factor levels that exist in the data?

Let's initialize data:

df <- data.frame(
    v1 = c( 'b', 'c', 'a', NA ),
    v2 = c( 2, 1, 3, 1 )
)
df$v1 <- factor(
    df$v1,
    levels = c( 'a', 'b', 'c', 'd' ),
    ordered = TRUE
)

Now, levels() gives all the levels, including those not found in the data:

> levels( df$v1 )
[1] "a" "b" "c" "d"

And unique() gives all the unique values, including NA:

> unique( df$v1 )
[1] b    c    a    <NA>
Levels: a < b < c < d

What I'm looking for is something along the lines of:

> levels( df$v1, indata = TRUE )
[1] "a" "b" "c"

CodePudding user response:

We can use droplevels() to drop the unused levels first:

> levels(droplevels(df$v1))
[1] "a" "b" "c"

Or just wrap it in factor() again, which resets the levels to those present in the data:

> levels( factor(df$v1) )
[1] "a" "b" "c"
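If this is needed in more than one place, the droplevels() approach could be wrapped in a small helper. This is only a sketch: levels_in_data is a made-up name, not a base R function, and the vector v1 just recreates the question's column so the snippet runs on its own.

```r
# Hypothetical helper (not part of base R): return only the levels
# that actually occur in the factor x.
levels_in_data <- function(x) {
    stopifnot(is.factor(x))
    levels(droplevels(x))
}

# Same data as df$v1 in the question, rebuilt here so the example is self-contained.
v1 <- factor(
    c('b', 'c', 'a', NA),
    levels = c('a', 'b', 'c', 'd'),
    ordered = TRUE
)

used <- levels_in_data(v1)
print(used)

# droplevels() returns a factor, so it can also replace the column itself
# when the unused level should be removed for good.
trimmed <- droplevels(v1)
print(is.ordered(trimmed))
```

Note that NA is never reported, because it is a missing value here rather than one of the factor's levels.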

CodePudding user response:

Why not just use unique(), then filter out the NAs? (The \(x) lambda shorthand needs R 4.1 or later.)

Filter(\(x) !is.na(x), unique(df$v1))

We can also use base subsetting:

unique(df$v1[!is.na(df$v1)])

Or we can use purrr::discard():

library(purrr)

unique(df$v1) %>% discard(is.na)
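One thing to keep in mind with the unique() route: the result is still a factor that carries all four levels, and the values come back in order of first appearance rather than in level order. A minimal sketch of getting a plain character vector in level order, assuming base R only (the names present and present_levels are just for illustration):

```r
# Same data as df$v1 in the question, rebuilt here so the example is self-contained.
v1 <- factor(
    c('b', 'c', 'a', NA),
    levels = c('a', 'b', 'c', 'd'),
    ordered = TRUE
)

# Values present in the data, in order of first appearance.
# This is still a factor and still carries the unused level 'd'.
present <- unique(v1[!is.na(v1)])
print(present)

# sort() orders by level and drops NA by default;
# as.character() then gives plain strings instead of a factor.
present_levels <- as.character(sort(unique(v1)))
print(present_levels)
```

If level order does not matter, as.character(present) is enough.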