I have a dataset like this.
ID Yr Month
1 3 NA
2 4 23
3 NA 46
4 1 19
5 NA NA
I would like to create a new column, Age, where:
Case 1: Age = Year, if Month is missing
Case 2: Age = Year + Month/12, if Year and Month are both present
Case 3: Age = Month/12, if Year is missing
Case 4: Age = NA, if both Year and Month are missing
The final expected dataset should look like this.
ID Yr Month Age
1 3 NA 3
2 4 23 5.91
3 NA 46 3.83
4 1 19 2.58
5 NA NA NA
I can accomplish this with about 30 lines of code, but I am looking for a simpler and more efficient solution. Any suggestions are much appreciated. Thanks in advance.
CodePudding user response:
You can put the conditions in a case_when statement. The conditions are checked in order and the first match wins, so the both-missing case goes first.
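library(dplyr)
# Example data from the question
df <- data.frame(ID = 1:5, Yr = c(3, 4, NA, 1, NA), Month = c(NA, 23, 46, 19, NA))
df %>%
  mutate(Age = case_when(is.na(Month) & is.na(Yr) ~ NA_real_,    # Case 4: both missing
                         is.na(Month) ~ as.numeric(Yr),          # Case 1: only Month missing
                         is.na(Yr) ~ Month/12,                   # Case 3: only Yr missing
                         TRUE ~ Yr + Month/12))                  # Case 2: both present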
# ID Yr Month Age
#1 1 3 NA 3.000000
#2 2 4 23 5.916667
#3 3 NA 46 3.833333
#4 4 1 19 2.583333
#5 5 NA NA NA
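If you would rather avoid the dplyr dependency, the same four cases can probably be handled in base R by summing the two parts and ignoring whichever one is missing. This is only a sketch, not part of the answer above: it assumes the columns are numeric and named Yr and Month as in the question, and the helper matrix `parts` is a name made up for illustration.

```r
# Example data from the question
df <- data.frame(ID = 1:5, Yr = c(3, 4, NA, 1, NA), Month = c(NA, 23, 46, 19, NA))

# Put the year part and the month part (converted to years) side by side
parts <- cbind(df$Yr, df$Month / 12)

# Sum across each row, skipping whichever part is missing
df$Age <- rowSums(parts, na.rm = TRUE)

# rowSums returns 0 when every part is NA, so restore NA for those rows
df$Age[rowSums(!is.na(parts)) == 0] <- NA
```

The last step matters: without it, a row with both Yr and Month missing would get Age = 0 instead of NA.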