R: Is There a Better Way Than ifelse to Eval a Number Over Multiple Ranges?

Time:08-28

Consider the following code, used to identify an age range:

library(dplyr)  # between() is not base R; dplyr (or data.table) provides it

ages <- c(5,10,15,20,25,30,35,40,45,50,55,60)
get_age_group <- function(age) {
  label <- ifelse(between(age, 0, 12), "Kid", 
              ifelse(between(age, 13, 19), "Teenager", 
                ifelse(between(age, 20, 28), "Young Adult", 
                       "Older Than That") 
              )
  )
  label
}

get_age_group(ages)

It works fine...

 [1] "Kid"             "Kid"             "Teenager"        "Young Adult"     "Young Adult"     "Older Than That" "Older Than That"
 [8] "Older Than That" "Older Than That" "Older Than That" "Older Than That" "Older Than That"

But that's a nested ifelse statement from hell, and I plan to extend this to accommodate ages into the 80s.

This is not my first version. The earlier function looked like this...

get_age_group <- function(age) {
  label <- ""
  if (between(age, 0, 12)) { label <- "Kid" }
  else if (between(age, 13, 19)) { label <- "Teenager" }
  ... # And so on
  label
}

That works fine for a single value, such as get_age_group(10), but I found out the hard way that an if statement cannot evaluate a condition with more than one element (i.e., a vector); doing so raises the dreaded "the condition has length > 1" error.
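To illustrate the difference, here is a minimal base-R sketch. It assumes the comparison `ages >= 0 & ages <= 12` as a stand-in for between(ages, 0, 12), and the label "Not a kid" is only a placeholder. The comparison produces one logical per element, which is more than `if` accepts, whereas ifelse() works element by element.

```r
ages <- c(5, 15, 25)

# One TRUE/FALSE per element of `ages`
cond <- ages >= 0 & ages <= 12

# `if` wants a single TRUE/FALSE, but this condition has three
n_conditions <- length(cond)

# ifelse() is vectorized, so it handles the whole vector at once
labels <- ifelse(cond, "Kid", "Not a kid")
print(labels)
```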

My question is: is there a more elegant solution than nested ifelse statements?

CodePudding user response:

The cut function is designed for this.

ages <- c(5,10,15,20,25,30,35,40,45,50,55,60)
get_age_group <- function(age) {
  cut(age, c(0, 13, 20, 29, Inf), 
      labels = c('Kid', 'Teenager', 'Young Adult', 'Older Than That'), 
      right = FALSE)
}

get_age_group(ages)

This gives:

 [1] Kid             Kid             Teenager        Young Adult     Young Adult     Older Than That
 [7] Older Than That Older Than That Older Than That Older Than That Older Than That Older Than That
Levels: Kid Teenager Young Adult Older Than That

Note that this returns a factor. If you don't want that, wrap the call in as.character() to convert it.
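A sketch of that variant, keeping the same breaks and labels and only adding the as.character() wrapper. With right = FALSE each interval is closed on the left and open on the right, so 13 falls in [13, 20) and 29 falls in [29, Inf). The sample ages below are made up for illustration.

```r
get_age_group <- function(age) {
  breaks <- c(0, 13, 20, 29, Inf)
  labels <- c("Kid", "Teenager", "Young Adult", "Older Than That")
  # right = FALSE gives [0, 13), [13, 20), [20, 29), [29, Inf)
  as.character(cut(age, breaks = breaks, labels = labels, right = FALSE))
}

# Illustrative ages, including cut points and a value in the 80s
result <- get_age_group(c(5, 13, 19.5, 28, 29, 85))
print(result)
```

Ages below the first break (negative values here) fall outside every interval and come back as NA, so extend the breaks if such values can occur.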

CodePudding user response:

Instead of nested ifelse, one option is cut with breaks:

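# Default right = TRUE: intervals are (-Inf, 12], (12, 19], (19, 28], (28, Inf],
# so the first break must be 12 (not 13) for age 13 to count as "Teenager"
out <- as.character(cut(ages, breaks = c(-Inf, 12, 19, 28, Inf),
    labels = c("Kid", "Teenager", "Young Adult", "Older Than That")))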

Testing against the OP's ifelse version:

> all.equal(out, get_age_group(ages))
[1] TRUE
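The sample ages contain none of the cut points, so it may be worth checking the boundaries as well. The sketch below assumes integer ages and cut's default right = TRUE (intervals closed on the right), and uses base-R comparisons in place of between(). The helper names label_ifelse and label_cut are made up for this comparison.

```r
# Reference labels: the question's nested ifelse, with inclusive base-R comparisons
label_ifelse <- function(age) {
  ifelse(age >= 0 & age <= 12, "Kid",
    ifelse(age >= 13 & age <= 19, "Teenager",
      ifelse(age >= 20 & age <= 28, "Young Adult", "Older Than That")))
}

# cut() with right = TRUE builds (-Inf, 12], (12, 19], (19, 28], (28, Inf]
label_cut <- function(age) {
  as.character(cut(age, breaks = c(-Inf, 12, 19, 28, Inf),
    labels = c("Kid", "Teenager", "Young Adult", "Older Than That")))
}

# Ages sitting exactly on each range boundary
boundary_ages <- c(0, 12, 13, 19, 20, 28, 29)
same <- all.equal(label_cut(boundary_ages), label_ifelse(boundary_ages))
print(same)
```

The two versions still differ outside the integer case: a negative age is "Kid" under these breaks but "Older Than That" under the nested ifelse.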

CodePudding user response:

If age comes from a column in a data frame, consider merging with an age_label lookup data frame.

age_label <- data.frame(
    age = c(0:12, 13:19, 20:28, 29:100),
    label = c(
        rep("child", 13), 
        rep("teenager", 7), 
        rep("young adult", 9), 
        rep("older adult", 72)
    )
)

mydataframe <- merge(
    mydataframe, age_label, by = "age", all.x = TRUE
)
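A self-contained sketch of the same idea. The contents of mydataframe (the names and ages) are invented for illustration, and the lookup table is built with rep(..., times = ...) as a shorter equivalent of the rep() calls above. With all.x = TRUE, rows whose age has no entry in the lookup table are kept and get an NA label.

```r
# Lookup table: one row per integer age from 0 to 100
age_label <- data.frame(
  age = 0:100,
  label = rep(c("child", "teenager", "young adult", "older adult"),
              times = c(13, 7, 9, 72))
)

# Hypothetical input data; age 120 is deliberately outside the lookup table
mydataframe <- data.frame(
  name = c("Ana", "Ben", "Cy", "Di"),
  age = c(34, 8, 16, 120)
)

# Left join on age: every row of mydataframe is kept
merged <- merge(mydataframe, age_label, by = "age", all.x = TRUE)
print(merged)
```

Note that merge() sorts the result by the key column by default, so the row order of mydataframe is not preserved.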