Consider the following code, used to identify an age range:
library(dplyr)  # assumed source of between(); data.table also provides one
ages <- c(5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60)
get_age_group <- function(age) {
  label <- ifelse(between(age, 0, 12), "Kid",
    ifelse(between(age, 13, 19), "Teenager",
      ifelse(between(age, 20, 28), "Young Adult",
        "Older Than That")
    )
  )
  label
}
get_age_group(ages)
It works fine...
[1] "Kid" "Kid" "Teenager" "Young Adult" "Young Adult" "Older Than That" "Older Than That"
[8] "Older Than That" "Older Than That" "Older Than That" "Older Than That" "Older Than That"
But that's a nested ifelse statement from hell, and I plan to extend this to accommodate ages into the 80s.
This is not my first version. The earlier function looked like this...
get_age_group <- function(age) {
  label <- ""
  if (between(age, 0, 12)) { label <- "Kid" }
  else if (between(age, 13, 19)) { label <- "Teenager" }
  ... # And so on
  label
}
That works fine for a single integer, such as get_age_group(10), but I found out the hard way that if statements cannot evaluate conditions with more than one element (i.e., vectors), which raises the dreaded "the condition has length > 1" error.
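A minimal illustration of the difference, using a made-up three-element vector (the names `small_ages`, `vectorised`, and `result` are only for this sketch). ifelse tests every element, while if expects a single TRUE or FALSE:

```r
small_ages <- c(5, 15, 25)  # hypothetical sample, not the full ages vector

# ifelse() is vectorised: it evaluates the test for every element
vectorised <- ifelse(small_ages <= 12, "Kid", "Not a kid")
print(vectorised)

# if() needs a length-one condition. In R 4.2.0 and later a longer condition
# is an error; earlier versions issue a warning and use only the first element.
# tryCatch() captures the message in either case.
result <- tryCatch(
  if (small_ages <= 12) "Kid" else "Not a kid",
  error = function(e) conditionMessage(e),
  warning = function(w) conditionMessage(w)
)
print(result)
```

This is why the if / else if version only works when it is called with one age at a time.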
My question is: is there a more elegant solution than nested ifelse statements?
CodePudding user response:
The cut function is designed for this.
ages <- c(5,10,15,20,25,30,35,40,45,50,55,60)
get_age_group <- function(age) {
  # right = FALSE makes each interval closed on the left: [0, 13), [13, 20), ...
  cut(age, c(0, 13, 20, 29, Inf),
      labels = c('Kid', 'Teenager', 'Young Adult', 'Older Than That'),
      right = FALSE)
}
get_age_group(ages)
This gives:
[1] Kid Kid Teenager Young Adult Young Adult Older Than That
[7] Older Than That Older Than That Older Than That Older Than That Older Than That Older Than That
Levels: Kid Teenager Young Adult Older Than That
Note that this returns a factor. If you don't want that, you can wrap the result in as.character to convert it.
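As a rough sketch of that conversion, here is the same cut call applied to a handful of boundary ages (the vector `check_ages` and the names `groups` and `labels_chr` are made up for this example):

```r
# Hypothetical ages sitting on the edges of each group
check_ages <- c(5, 13, 19, 20, 28, 29)

groups <- cut(check_ages, c(0, 13, 20, 29, Inf),
              labels = c("Kid", "Teenager", "Young Adult", "Older Than That"),
              right = FALSE)
print(class(groups))   # cut() returns a factor

# Convert the factor to a plain character vector
labels_chr <- as.character(groups)
print(labels_chr)
```

Keeping the factor can still be useful, for example when the groups should appear in age order in a table or plot rather than alphabetically.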
CodePudding user response:
Instead of nested ifelse, an option is cut with breaks:
# right = TRUE is the default, so each break is the upper bound of its group
out <- as.character(cut(ages, breaks = c(-Inf, 12, 19, 28, Inf),
  labels = c("Kid", "Teenager", "Young Adult", "Older Than That")))
Testing against the OP's ifelse version:
> all.equal(out, get_age_group(ages))
[1] TRUE
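Since the sample ages never land exactly on a boundary, it may be worth checking the edges as well. The sketch below assumes the groups from the question (0-12, 13-19, 20-28, 29 and up) and compares the two ways of closing the intervals; `boundary_ages`, `group_labels`, `right_closed`, and `left_closed` are names invented for this example:

```r
boundary_ages <- c(12, 13, 19, 20, 28, 29)
group_labels <- c("Kid", "Teenager", "Young Adult", "Older Than That")

# Default right = TRUE: intervals are (-Inf, 12], (12, 19], (19, 28], (28, Inf]
right_closed <- cut(boundary_ages, breaks = c(-Inf, 12, 19, 28, Inf),
                    labels = group_labels)

# right = FALSE: intervals are [0, 13), [13, 20), [20, 29), [29, Inf)
left_closed <- cut(boundary_ages, breaks = c(0, 13, 20, 29, Inf),
                   labels = group_labels, right = FALSE)

print(data.frame(age = boundary_ages, right_closed, left_closed))
same_for_integers <- identical(as.character(right_closed), as.character(left_closed))
print(same_for_integers)

# The two conventions disagree for a fractional age such as 12.5
frac_right <- as.character(cut(12.5, breaks = c(-Inf, 12, 19, 28, Inf), labels = group_labels))
frac_left <- as.character(cut(12.5, breaks = c(0, 13, 20, 29, Inf),
                              labels = group_labels, right = FALSE))
print(c(frac_right, frac_left))
```

For whole-number ages the two sets of breaks agree. If ages can be fractional, which side of each interval is closed starts to matter.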
CodePudding user response:
If age comes from a column in a data frame, consider merging with an age_label lookup data frame.
age_label <- data.frame(
  age = c(0:12, 13:19, 20:28, 29:100),
  label = c(
    rep("child", 13),
    rep("teenager", 7),
    rep("young adult", 9),
    rep("older adult", 72)
  )
)
mydataframe <- merge(
  mydataframe, age_label, by = "age", all.x = TRUE
)
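To see what the merge does, here is a small self-contained sketch. The contents of `mydataframe` (an `id` column and four ages, one of them outside the 0-100 lookup range) are invented purely for illustration:

```r
# Lookup table: one row per whole-number age from 0 to 100
age_label <- data.frame(
  age = 0:100,
  label = c(
    rep("child", 13),        # ages 0-12
    rep("teenager", 7),      # ages 13-19
    rep("young adult", 9),   # ages 20-28
    rep("older adult", 72)   # ages 29-100
  )
)

# Hypothetical input data; 120 has no matching row in the lookup table
mydataframe <- data.frame(id = 1:4, age = c(35, 5, 15, 120))

# all.x = TRUE keeps every row of mydataframe, filling label with NA
# where the age has no match
merged <- merge(mydataframe, age_label, by = "age", all.x = TRUE)
print(merged)
```

Two things to keep in mind with this approach: merge may return the rows in a different order from the input, and any age missing from the lookup table (including fractional ages) ends up with an NA label.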