I'm working on a timestamp data frame. An excerpt of the date-related variables from the January sample of the data frame:
sample_dates <- data.frame(date = c("2021-01-01", "2021-01-02", "2021-01-03", "2021-01-04", "2021-01-05", "2021-01-06", "2021-01-07", "2021-01-08", "2021-01-09", "2021-01-10", "2021-01-11", "2021-01-12", "2021-01-13", "2021-01-14", "2021-01-15", "2021-01-16", "2021-01-17", "2021-01-18", "2021-01-19", "2021-01-20", "2021-01-21", "2021-01-22", "2021-01-23", "2021-01-24", "2021-01-25", "2021-01-26", "2021-01-27", "2021-01-28", "2021-01-29", "2021-01-30", "2021-01-31"))
library(dplyr)
sample_dates <- sample_dates %>%
  mutate(date = as.POSIXct(date)) %>%
  mutate(day = factor(format(date, "%a")))
I want to add a new factor variable day_cat, the pseudo-code for which could be something like this:
sample_dates <- sample_dates %>%
# the month could start on any day and this function should identify it
# for the sample, I know January 2021 started on Friday
mutate(day_cat = while(month is not over)
if(day == "Fri") {"Fri1"},
else if(day == "Sat" | day == "Sun") {"Weekend1"},
else if(day == "Mon") {"Mon1"},
else if(day == "Tue" | day == "Wed" | day == "Thu") {"Weekdays1"},
# now we're onto the next Friday of the month
else if(day == "Fri") {"Fri2"},
else if(day == "Sat" | day == "Sun") {"Weekend2"},
else if(day == "Mon") {"Mon2"},
else if(day == "Tue" | day == "Wed" | day == "Thu") {"Weekdays2"},
...
...
# reached the end of month
)
mutate(day_cat = factor(day_cat, levels = c("Mon", "Weekdays", "Fri", "Weekend")))
So the categories are Mon = {Mon}; Weekdays = {Tue, Wed, Thu}; Fri = {Fri}; Weekend = {Sat, Sun}. I want to number these categories within the month as Mon1, Weekdays1, Fri1, Weekend1, Mon2, Weekdays2, Fri2, Weekend2, Mon3, and so on in the day_cat variable (this example assumes the month started on a Monday).
The levels of the day_cat variable should be in the same order (for plotting purposes).
If a month starts on a Wednesday, day_cat would label only that Wednesday and the following Thursday as "Weekdays1". If the month ends on a Saturday, day_cat would label only that Saturday as "Weekend4" or "Weekend5", whichever it might be.
Answer:
Here, day_cat is a factor whose levels are in chronological order. As specified, though, the three midweek days (Tue, Wed, Thu) share one level within each week, and so do the two weekend days. Is that what you want?
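library(dplyr); library(lubridate)
library(forcats)  # provides fct_inorder()
sample_dates %>%
  mutate(day = wday(date, label = TRUE),
         # collapse the seven days into four categories
         group = case_when(day == "Mon" ~ "Mon",
                           day == "Fri" ~ "Fri",
                           day %in% c("Sat", "Sun") ~ "Weekend",
                           TRUE ~ "Weekday"),
         # 7-day block within the month: days 1-7 -> 1, days 8-14 -> 2, ...
         weeknum = (day(date) - 1) %/% 7 + 1,
         # levels follow order of first appearance, i.e. chronological order
         day_cat = paste0(group, weeknum) %>% fct_inorder())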
Result
date day group weeknum day_cat
1 2021-01-01 Fri Fri 1 Fri1
2 2021-01-02 Sat Weekend 1 Weekend1
3 2021-01-03 Sun Weekend 1 Weekend1
4 2021-01-04 Mon Mon 1 Mon1
5 2021-01-05 Tue Weekday 1 Weekday1
6 2021-01-06 Wed Weekday 1 Weekday1
7 2021-01-07 Thu Weekday 1 Weekday1
8 2021-01-08 Fri Fri 2 Fri2
9 2021-01-09 Sat Weekend 2 Weekend2
10 2021-01-10 Sun Weekend 2 Weekend2
11 2021-01-11 Mon Mon 2 Mon2
12 2021-01-12 Tue Weekday 2 Weekday2
13 2021-01-13 Wed Weekday 2 Weekday2
14 2021-01-14 Thu Weekday 2 Weekday2
15 2021-01-15 Fri Fri 3 Fri3
16 2021-01-16 Sat Weekend 3 Weekend3
17 2021-01-17 Sun Weekend 3 Weekend3
18 2021-01-18 Mon Mon 3 Mon3
19 2021-01-19 Tue Weekday 3 Weekday3
20 2021-01-20 Wed Weekday 3 Weekday3
21 2021-01-21 Thu Weekday 3 Weekday3
22 2021-01-22 Fri Fri 4 Fri4
23 2021-01-23 Sat Weekend 4 Weekend4
24 2021-01-24 Sun Weekend 4 Weekend4
25 2021-01-25 Mon Mon 4 Mon4
26 2021-01-26 Tue Weekday 4 Weekday4
27 2021-01-27 Wed Weekday 4 Weekday4
28 2021-01-28 Thu Weekday 4 Weekday4
29 2021-01-29 Fri Fri 5 Fri5
30 2021-01-30 Sat Weekend 5 Weekend5
31 2021-01-31 Sun Weekend 5 Weekend5
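One caveat: weeknum above counts fixed 7-day blocks from the first of the month. That matches the January 2021 sample, but for a month starting on a Wednesday it would put the Tuesday on day 7 into the same "Weekday1" level as the opening Wednesday and Thursday. If the numbering should instead follow consecutive runs of the same category, as the question's Wednesday example suggests, a minimal sketch could look like the following. The September 2021 dates, the column names dow, run and n, and the object names dates and run_numbered are all made up for illustration, and the category is called "Weekdays" as in the question.

```r
library(dplyr)
library(lubridate)
library(forcats)

# Hypothetical example month: September 2021 starts on a Wednesday.
dates <- data.frame(
  date = seq(as.Date("2021-09-01"), as.Date("2021-09-30"), by = "day")
)

run_numbered <- dates %>%
  arrange(date) %>%
  mutate(
    # numeric day of week, 1 = Monday ... 7 = Sunday (independent of locale)
    dow = wday(date, week_start = 1),
    group = case_when(dow == 1 ~ "Mon",
                      dow == 5 ~ "Fri",
                      dow >= 6 ~ "Weekend",
                      TRUE     ~ "Weekdays"),
    # a new run starts whenever the category differs from the previous row
    run = cumsum(group != lag(group, default = ""))
  ) %>%
  # number the runs separately for each category within each month
  group_by(month_start = floor_date(date, "month"), group) %>%
  mutate(n = dense_rank(run)) %>%
  ungroup() %>%
  # levels in order of first appearance, i.e. chronological order
  mutate(day_cat = fct_inorder(paste0(group, n)))
```

With this numbering, a partial run at the start or end of a month gets its own number, so the opening Wednesday and Thursday form one level and the following Tuesday to Thursday form the next. Grouping by month_start restarts the count each month if the data frame spans several months.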