Dealing with NA value while calculating age from the Date of Birth

Time:10-27

I am stuck on a problem. I have to calculate age from the date of birth and add it as a new column to the following table. I tried the eeptools package, but I cannot get it to handle the NA values.

id DOB
1  5/22/1951
2  NA
3  8/18/1984
4  5/1/1994
5  NA

I tried the following code, but it fails with an error. How can I handle the NA values?

Age= age_calc(as.Date(na.omit(Merged_data$DOB),"%m/%d/%Y"),units = "years")

Error in if (any(enddate < dob)) { : 
  missing value where TRUE/FALSE needed

Please help, I have a deadline today :(

CodePudding user response:

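Do you mean age as of today? If so, you could try the following base R approach. To compute age as of another date, replace Sys.Date() with that date as a Date object, for example as.Date("2022-01-01"). Note that dividing the day count by 365.25 uses an average year length, so the result is an approximation: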

df$age_years <- as.numeric((Sys.Date() - as.Date(df$DOB, "%m/%d/%Y")) / 365.25)

Output:

  id       DOB age_years
1  1 5/22/1951  71.43053
2  2      <NA>        NA
3  3 8/18/1984  38.18754
4  4  5/1/1994  28.48734
5  5      <NA>        NA
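
If whole, completed years are needed rather than a fractional approximation, a minimal base R sketch along the following lines may help. The helper name age_completed_years and the fixed reference date are assumptions made for illustration; they are not part of the original answer. The idea is to compare calendar fields directly, so NA dates simply stay NA.

```r
# Example data from the question; NA marks a missing date of birth.
df <- data.frame(
  id  = 1:5,
  DOB = c("5/22/1951", NA, "8/18/1984", "5/1/1994", NA)
)

# Hypothetical helper (not from the original answer):
# completed years between `dob` and `ref`, both of class Date.
age_completed_years <- function(dob, ref = Sys.Date()) {
  dob_lt <- as.POSIXlt(dob)
  ref_lt <- as.POSIXlt(ref)
  years <- ref_lt$year - dob_lt$year
  # Subtract one year if the birthday has not yet occurred in the reference year.
  before_birthday <- (ref_lt$mon < dob_lt$mon) |
    (ref_lt$mon == dob_lt$mon & ref_lt$mday < dob_lt$mday)
  years - as.integer(before_birthday)
}

dob <- as.Date(df$DOB, format = "%m/%d/%Y")

# Fixed reference date (an assumption) so the result is reproducible;
# drop the second argument to use today's date instead.
df$age_years <- age_completed_years(dob, as.Date("2022-10-26"))
print(df)
```

Rows with a missing DOB propagate NA through every step, so no na.omit() is needed and the new column keeps the same length as the table.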

CodePudding user response:

Here is one way to do it. I added an enddate column to the data. First, convert the character columns to dates with the mdy() function from lubridate. Then use the %--% operator, which creates a time interval from the date of birth to the end date. Dividing that interval by years(1) gives the age in years, and trunc() drops the fractional part:

library(dplyr)
library(lubridate)
df %>% 
  mutate(across(-id, mdy),
         age = trunc((DOB %--% enddate) / years(1)))

Output:

     id DOB        enddate      age
  <int> <date>     <date>     <dbl>
1     1 1951-05-22 2022-01-01    70
2     2 NA         2022-01-01    NA
3     3 1984-08-18 2022-01-01    37
4     4 1994-05-01 2022-01-01    27
5     5 NA         2022-01-01    NA
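
The answer does not show how the enddate column was created. A self-contained sketch of the same idea, using only lubridate, might look like this. It assumes enddate is stored as month/day/year text like DOB, with the value 1/1/2022 taken from the output shown above.

```r
library(lubridate)

# Example data from the question plus an assumed enddate column
# (stored as m/d/Y text, like DOB).
df <- data.frame(
  id      = 1:5,
  DOB     = c("5/22/1951", NA, "8/18/1984", "5/1/1994", NA),
  enddate = "1/1/2022"
)

# Parse both character columns into Date; NA input stays NA.
df$DOB     <- mdy(df$DOB)
df$enddate <- mdy(df$enddate)

# Interval from birth to end date, expressed in whole years.
df$age <- trunc((df$DOB %--% df$enddate) / years(1))
print(df)
```

Because the interval is NA wherever DOB is NA, the age column is NA for those rows and no rows are dropped.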

CodePudding user response:

library(lubridate)
df$age <- time_length(interval(as.Date(df$DOB, "%m/%d/%Y"), today()), unit = "years")

Output:

  id       DOB      age
1  1 5/22/1951 71.43014
2  2      <NA>       NA
3  3 8/18/1984 38.18904
4  4  5/1/1994 28.48767
5  5      <NA>       NA