R dplyr | Categorize days of a month into four factors: Mon, Weekdays, Fri, Weekend

Time:10-10

I'm working on a timestamp data frame. An excerpt of the date-related variables from the January sample of the data frame:

sample_dates <- data.frame(date = c("2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07", "2021-01-08", "2021-01-09", "2021-01-10", "2021-01-11", "2021-01-12", "2021-01-13", "2021-01-14", "2021-01-15", "2021-01-16", "2021-01-17", "2021-01-18", "2021-01-19", "2021-01-20", "2021-01-21", "2021-01-22", "2021-01-23", "2021-01-24", "2021-01-25", "2021-01-26", "2021-01-27", "2021-01-28", "2021-01-29", "2021-01-30", "2021-01-31"))

library(dplyr)

sample_dates <- sample_dates %>%
    mutate(date = as.POSIXct(date)) %>%
    mutate(day = factor(format(date, "%a")))

I want to add a new factor variable day_cat, the pseudo-code for which could be something like this:

sample_dates <- sample_dates %>% 
    # the month could start on any day and this function should identify it
    # for the sample, I know January 2021 started on Friday
    
    mutate(day_cat = while(month is not over)
        
        if(day == "Fri") {"Fri1"},
        else if(day == "Sat" | day == "Sun") {"Weekend1"},
        else if(day == "Mon") {"Mon1"},
        else if(day == "Tue" | day == "Wed" | day == "Thu") {"Weekdays1"},
        
        # now we're onto the next Friday of the month
        else if(day == "Fri") {"Fri2"},
        else if(day == "Sat" | day == "Sun") {"Weekend2"},
        else if(day == "Mon") {"Mon2"},
        else if(day == "Tue" | day == "Wed" | day == "Thu") {"Weekdays2"},
        ...
        ...
        
        # reached the end of month
        )

    mutate(day_cat = factor(day_cat, levels = c("Mon", "Weekdays", "Fri", "Weekend")))

So, the categories are Mon = {Mon}; Weekdays = {Tue, Wed, Thu}; Fri = {Fri}; Weekend = {Sat, Sun}. I want to number them within the month as Mon1, Weekdays1, Fri1, Weekend1, Mon2, Weekdays2, Fri2, Weekend2, Mon3, and so on, in the day_cat variable (say, if the month started on a Monday).

The levels of the day_cat variable should be in that same chronological order (for plotting purposes).

If a month starts on a Wednesday, day_cat would take only that Wednesday and Thursday (the next day) as "Weekdays1". If the month ends on a Saturday, day_cat would take only that Saturday as "Weekend4" or "Weekend5", whichever it might be.

Answer:

Here, day_cat is a factor whose levels are in chronological order. As specified, the three midweek days (Tue, Wed, Thu) share one factor level within each week, and so do the two weekend days. Is that what you want?

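library(dplyr); library(lubridate); library(forcats)
sample_dates %>%
  mutate(day = wday(date, label = TRUE),
         group = case_when(day == "Mon" ~ "Mon",
                           day == "Fri" ~ "Fri",
                           day %in% c("Sat", "Sun") ~ "Weekend",
                           TRUE ~ "Weekday"),
         # week of the month, counted in 7-day blocks starting from the 1st
         weeknum = (day(date) - 1) %/% 7 + 1,
         # fct_inorder() (forcats) orders levels by first appearance
         day_cat = paste0(group, weeknum) %>% fct_inorder())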

Result

         date day   group weeknum  day_cat
1  2021-01-01 Fri     Fri       1     Fri1
2  2021-01-02 Sat Weekend       1 Weekend1
3  2021-01-03 Sun Weekend       1 Weekend1
4  2021-01-04 Mon     Mon       1     Mon1
5  2021-01-05 Tue Weekday       1 Weekday1
6  2021-01-06 Wed Weekday       1 Weekday1
7  2021-01-07 Thu Weekday       1 Weekday1
8  2021-01-08 Fri     Fri       2     Fri2
9  2021-01-09 Sat Weekend       2 Weekend2
10 2021-01-10 Sun Weekend       2 Weekend2
11 2021-01-11 Mon     Mon       2     Mon2
12 2021-01-12 Tue Weekday       2 Weekday2
13 2021-01-13 Wed Weekday       2 Weekday2
14 2021-01-14 Thu Weekday       2 Weekday2
15 2021-01-15 Fri     Fri       3     Fri3
16 2021-01-16 Sat Weekend       3 Weekend3
17 2021-01-17 Sun Weekend       3 Weekend3
18 2021-01-18 Mon     Mon       3     Mon3
19 2021-01-19 Tue Weekday       3 Weekday3
20 2021-01-20 Wed Weekday       3 Weekday3
21 2021-01-21 Thu Weekday       3 Weekday3
22 2021-01-22 Fri     Fri       4     Fri4
23 2021-01-23 Sat Weekend       4 Weekend4
24 2021-01-24 Sun Weekend       4 Weekend4
25 2021-01-25 Mon     Mon       4     Mon4
26 2021-01-26 Tue Weekday       4 Weekday4
27 2021-01-27 Wed Weekday       4 Weekday4
28 2021-01-28 Thu Weekday       4 Weekday4
29 2021-01-29 Fri     Fri       5     Fri5
30 2021-01-30 Sat Weekend       5 Weekend5
31 2021-01-31 Sun Weekend       5 Weekend5
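
One caveat with the 7-day-block numbering: it lines up with the requested labels here because January 2021 starts on a Friday, but for a month that starts on a Wednesday, the Tuesday on the 7th would get weeknum 1 and be lumped into the same "Weekday1" as the 1st and 2nd. A possible alternative, sketched below, numbers each run of consecutive same-category days instead. This is only a sketch under some assumptions: add_day_cat is a hypothetical helper name (not from the answer above), the data has a date column with no missing days within a month, and the weekday comes from base R's as.POSIXlt()$wday to avoid locale-dependent day names.

```r
library(dplyr)

# Hypothetical helper: number each run of consecutive days that share a
# category, separately per calendar month. Assumes no missing days.
add_day_cat <- function(df) {
  df %>%
    arrange(date) %>%
    mutate(
      wday_num = as.POSIXlt(date)$wday,   # 0 = Sunday, ..., 6 = Saturday
      group = case_when(
        wday_num == 1 ~ "Mon",
        wday_num == 5 ~ "Fri",
        wday_num %in% c(0, 6) ~ "Weekend",
        TRUE ~ "Weekdays"
      ),
      month = format(date, "%Y-%m")
    ) %>%
    group_by(month) %>%
    # TRUE on the first row of every run of identical categories
    mutate(run_start = group != lag(group, default = "")) %>%
    group_by(month, group) %>%
    # running count of runs within each category
    mutate(run_num = cumsum(run_start)) %>%
    ungroup() %>%
    mutate(day_cat = paste0(group, run_num),
           # levels in order of first appearance, i.e. chronological
           day_cat = factor(day_cat, levels = unique(day_cat))) %>%
    select(-wday_num, -month, -run_start)
}

# July 2020 starts on a Wednesday
july <- data.frame(date = as.Date("2020-07-01") + 0:30)
res <- add_day_cat(july)
head(res, 10)
```

With this approach the 1st and 2nd of July 2020 form "Weekdays1" on their own, and the following Tuesday to Thursday become "Weekdays2". If the data spans several months, the same labels recur in each month, so it may be preferable to facet or group by month when plotting.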