R conditional statements: how to add conditional labels given two sets of dates

Time:11-16

library(dplyr)  # provides %>% and mutate()

employee <- c("Alfred", "Mary", "Susie", "Joan", "Dave")
startdate <- as.Date(c("2019-05-12", "2020-09-23", "2020-07-12", "2021-11-10", "2021-09-12"))
endate <- as.Date(c("2021-11-15", "2021-11-15", "2021-11-15", "2021-11-15", "2021-11-15"))

date_R <- data.frame(employee, startdate, endate)

# Difference in days between the two dates
date_R2 <- date_R %>%
  mutate(date_dif = endate - startdate)

Newbie here, apologies in advance.

I am working with this table, which has two date columns, and I can calculate the number of days between them. That part is straightforward, all good so far:

date_R2
  employee  startdate     endate date_dif
1   Alfred 2019-05-12 2021-11-15 918 days
2     Mary 2020-09-23 2021-11-15 418 days
3    Susie 2020-07-12 2021-11-15 491 days
4     Joan 2021-11-10 2021-11-15   5 days
5     Dave 2021-09-12 2021-11-15  64 days

But I need to add a label to each row, based on the number of days.

Let's say: less than 10 days would get one specific label;

more than 30 days another label;

more than 100 days another, and so on.

Is it possible to do this with the mutate verb? (I ask only because it seems much simpler than creating a function.) If so, how should I go about it?

If I do need to dive into writing functions, can anyone give a newbie like me an example of how to achieve this, or something close to it?

Many thanks

CodePudding user response:

We could use cut(), defining the breaks and labelling them. The trick is to extract the number from date_dif. Note that with the default right = TRUE the intervals are (0,10], (10,30] and (30,1000]:

library(dplyr)
library(readr)
date_R2 %>%
  mutate(category = cut(parse_number(as.character(date_dif)), 
                        breaks = c(0,10,30,1000),
                        labels = c("<10", "10-30","30-1000")
                        ))
  employee  startdate     endate date_dif category
1   Alfred 2019-05-12 2021-11-15 918 days  30-1000
2     Mary 2020-09-23 2021-11-15 418 days  30-1000
3    Susie 2020-07-12 2021-11-15 491 days  30-1000
4     Joan 2021-11-10 2021-11-15   5 days      <10
5     Dave 2021-09-12 2021-11-15  64 days  30-1000
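
As a variation on the above, the day counts can also be obtained with as.numeric() rather than by parsing the printed value. The following is a minimal sketch in base R that assumes the same breaks and labels; the right = FALSE setting (so that exactly 10 days falls into "10-30") is an illustrative choice, not part of the answer above:

```r
# Rebuild the example data from the question
employee  <- c("Alfred", "Mary", "Susie", "Joan", "Dave")
startdate <- as.Date(c("2019-05-12", "2020-09-23", "2020-07-12",
                       "2021-11-10", "2021-09-12"))
endate    <- as.Date(rep("2021-11-15", 5))
date_R2   <- data.frame(employee, startdate, endate)
date_R2$date_dif <- date_R2$endate - date_R2$startdate

# Convert the difftime column to a plain number of days
days <- as.numeric(date_R2$date_dif, units = "days")

# right = FALSE gives the intervals [0,10), [10,30), [30,1000)
date_R2$category <- cut(days,
                        breaks = c(0, 10, 30, 1000),
                        labels = c("<10", "10-30", "30-1000"),
                        right  = FALSE)
```

For this data the resulting categories match the output shown above; the two versions only differ for values that sit exactly on a break.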

CodePudding user response:

You can use case_when() inside mutate():

date_R2 <- date_R %>%
  mutate(date_dif = endate - startdate,
         label = case_when(
           date_dif < 30 ~ "lower_30",
           date_dif >= 30 ~ "upper_30"
         ))
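
The same pattern extends to more than two tiers, which is closer to what the question describes. The sketch below is only an illustration: the label names and the exact cut-offs (10, 30, 100) are assumptions, and date_dif is converted to a plain number of days here, which differs from the difftime column used above. case_when() evaluates its conditions in order and uses the first one that matches, so each later condition only sees rows not caught earlier:

```r
library(dplyr)

# Example data from the question
date_R <- data.frame(
  employee  = c("Alfred", "Mary", "Susie", "Joan", "Dave"),
  startdate = as.Date(c("2019-05-12", "2020-09-23", "2020-07-12",
                        "2021-11-10", "2021-09-12")),
  endate    = as.Date(rep("2021-11-15", 5))
)

date_R2 <- date_R %>%
  mutate(
    # numeric day count instead of a difftime
    date_dif = as.numeric(endate - startdate, units = "days"),
    label = case_when(
      date_dif < 10   ~ "under_10",   # hypothetical label names
      date_dif <= 30  ~ "10_to_30",
      date_dif <= 100 ~ "31_to_100",
      TRUE            ~ "over_100"    # everything else
    )
  )
```

With this data, Joan (5 days) gets "under_10", Dave (64 days) gets "31_to_100", and the other three get "over_100".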