Home > Mobile >  Converting to Date Month Format without the actual day
Converting to Date Month Format without the actual day

Time:12-22

I have a dataframe that consist of the Month, Week of the Month and the Days of the Week (i.e Monday, Tuesday, etc)

I would like to convert them into Date Month format (e.g 02-Dec). How do I retrieve the date with just the week of the month and the days?

dput

structure(list(Month = c("Dec", "Jan", "Oct", "Jun", "Jan", "Oct", 
"Feb", "Nov", "Dec", "Apr", "Mar", "Mar", "Jan", "Jan", "Jan", 
"Aug", "Apr", "Jul", "May", "Apr"), WeekOfMonth = c(5, 3, 5, 
2, 5, 4, 1, 1, 4, 3, 2, 5, 3, 5, 5, 4, 4, 5, 4, 4), DayOfWeek = c("Wednesday", 
"Wednesday", "Friday", "Saturday", "Monday", "Friday", "Saturday", 
"Friday", "Saturday", "Tuesday", "Sunday", "Monday", "Friday", 
"Friday", "Monday", "Tuesday", "Thursday", "Sunday", "Thursday", 
"Monday"), Year = c(1994, 1994, 1994, 1994, 1994, 1994, 1994, 
1994, 1994, 1994, 1994, 1994, 1994, 1994, 1994, 1994, 1994, 1994, 
1994, 1994)), row.names = c(NA, -20L), class = c("tbl_df", "tbl", 
"data.frame"))

CodePudding user response:

Try this:

dat |>
  transform(
    DayOfWeekNum = match(DayOfWeek, wdays), 
    date01 = as.Date(paste(Year, Month, "01", sep = "-"), format = "%Y-%b-%d")
  ) |>
  transform(weekday01 = as.integer(format(date01, format = "%w"))   1L) |>
  transform(realdate = date01   (WeekOfMonth - 1L) * 7L   (DayOfWeekNum - weekday01))
#    Month WeekOfMonth DayOfWeek Year DayOfWeekNum     date01 weekday01   realdate
# 1    Dec           5 Wednesday 1994            4 1994-12-01         5 1994-12-28
# 2    Jan           3 Wednesday 1994            4 1994-01-01         7 1994-01-12
# 3    Oct           5    Friday 1994            6 1994-10-01         7 1994-10-28
# 4    Jun           2  Saturday 1994            7 1994-06-01         4 1994-06-11
# 5    Jan           5    Monday 1994            2 1994-01-01         7 1994-01-24
# 6    Oct           4    Friday 1994            6 1994-10-01         7 1994-10-21
# 7    Feb           1  Saturday 1994            7 1994-02-01         3 1994-02-05
# 8    Nov           1    Friday 1994            6 1994-11-01         3 1994-11-04
# 9    Dec           4  Saturday 1994            7 1994-12-01         5 1994-12-24
# 10   Apr           3   Tuesday 1994            3 1994-04-01         6 1994-04-12
# 11   Mar           2    Sunday 1994            1 1994-03-01         3 1994-03-06
# 12   Mar           5    Monday 1994            2 1994-03-01         3 1994-03-28
# 13   Jan           3    Friday 1994            6 1994-01-01         7 1994-01-14
# 14   Jan           5    Friday 1994            6 1994-01-01         7 1994-01-28
# 15   Jan           5    Monday 1994            2 1994-01-01         7 1994-01-24
# 16   Aug           4   Tuesday 1994            3 1994-08-01         2 1994-08-23
# 17   Apr           4  Thursday 1994            5 1994-04-01         6 1994-04-21
# 18   Jul           5    Sunday 1994            1 1994-07-01         6 1994-07-24
# 19   May           4  Thursday 1994            5 1994-05-01         1 1994-05-26
# 20   Apr           4    Monday 1994            2 1994-04-01         6 1994-04-18

where realdate is (I believe) the date you're looking for.

(Since you have a tibble, I'm inferring dplyr, to which this can be translated trivially, a naive start would replace all transform(.) to mutate(.), and then reduced to a single mutate call.)


Verification

Using the calendar below (in linux, cal 1994), I'll randomly verify rows 5, 14, and 17 (chosen by sort(sample(nrow(dat), size = 3))).

  • Row 5: Jan, week 5, Monday, 1994: the 5th week starts on Jan 23 (Sunday-based), that Monday is 1994-01-24, check.
  • Row 14: Jan, week 5, Friday, 1994: same start, Friday is 1994-01-28, check.
  • Row 17: Apr, week 4, Thursday, 1994: the fourth week starts on Apr 17, so Thursday is 1994-04-21, check.

I'll leave it as an exercise to verify the remaining rows.

Disclaimer: while I didn't find anything here, there may be room for off-by-one (seven, sire) errors if the locale used to determine WeekOfMonth is different (i.e., not Sunday-based), in which case we might need to adjust ...

                            1994
      January               February               March
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
                   1         1  2  3  4  5         1  2  3  4  5
 2  3  4  5  6  7  8   6  7  8  9 10 11 12   6  7  8  9 10 11 12
 9 10 11 12 13 14 15  13 14 15 16 17 18 19  13 14 15 16 17 18 19
16 17 18 19 20 21 22  20 21 22 23 24 25 26  20 21 22 23 24 25 26
23 24 25 26 27 28 29  27 28                 27 28 29 30 31
30 31

       April                  May                   June
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
                1  2   1  2  3  4  5  6  7            1  2  3  4
 3  4  5  6  7  8  9   8  9 10 11 12 13 14   5  6  7  8  9 10 11
10 11 12 13 14 15 16  15 16 17 18 19 20 21  12 13 14 15 16 17 18
17 18 19 20 21 22 23  22 23 24 25 26 27 28  19 20 21 22 23 24 25
24 25 26 27 28 29 30  29 30 31              26 27 28 29 30


        July                 August              September
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
                1  2      1  2  3  4  5  6               1  2  3
 3  4  5  6  7  8  9   7  8  9 10 11 12 13   4  5  6  7  8  9 10
10 11 12 13 14 15 16  14 15 16 17 18 19 20  11 12 13 14 15 16 17
17 18 19 20 21 22 23  21 22 23 24 25 26 27  18 19 20 21 22 23 24
24 25 26 27 28 29 30  28 29 30 31           25 26 27 28 29 30
31

      October               November              December
Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa  Su Mo Tu We Th Fr Sa
                   1         1  2  3  4  5               1  2  3
 2  3  4  5  6  7  8   6  7  8  9 10 11 12   4  5  6  7  8  9 10
 9 10 11 12 13 14 15  13 14 15 16 17 18 19  11 12 13 14 15 16 17
16 17 18 19 20 21 22  20 21 22 23 24 25 26  18 19 20 21 22 23 24
23 24 25 26 27 28 29  27 28 29 30           25 26 27 28 29 30 31
30 31

CodePudding user response:

Some definitions:

  • assuming year "2022" "1994"
  • counting weeks within months by start of the week day (if first day in a month is Sunday it counts as the first week)
  • choosing "Monday" as start of the week, so this might be off by a day.

A simple validity check would be line 9:

  • December - Week 4 - Saturday: 24-Dec
library(dplyr)
library(lubridate)

df %>% 
  rowwise() %>% 
  mutate(Year = 1994, start = ym(paste(Year, Month)), 
    end_date = start   (days_in_month(start) - 1), 
    weekdays = list(format(seq(start, end_date, by="day"), "%A")), 
    week_seg = list(cumsum(c(T, (weekdays == "Monday")[2:length(weekdays)]))), 
    first = grep(WeekOfMonth, week_seg)[1], 
    act = grep(DayOfWeek, weekdays[week_seg == WeekOfMonth]), 
    date = format(ymd(paste(Year, Month, (first   act) - 1)), "%d-%b")) %>% 
  ungroup() %>% 
  select(1:4, date)
# A tibble: 20 × 5
   Month WeekOfMonth DayOfWeek  Year date  
   <chr>       <dbl> <chr>     <dbl> <chr> 
 1 Dec             5 Wednesday  1994 28-Dec
 2 Jan             3 Wednesday  1994 12-Jan
 3 Oct             5 Friday     1994 28-Oct
 4 Jun             2 Saturday   1994 11-Jun
 5 Jan             5 Monday     1994 24-Jan
 6 Oct             4 Friday     1994 21-Oct
 7 Feb             1 Saturday   1994 05-Feb
 8 Nov             1 Friday     1994 04-Nov
 9 Dec             4 Saturday   1994 24-Dec
10 Apr             3 Tuesday    1994 12-Apr
11 Mar             2 Sunday     1994 13-Mar
12 Mar             5 Monday     1994 28-Mar
13 Jan             3 Friday     1994 14-Jan
14 Jan             5 Friday     1994 28-Jan
15 Jan             5 Monday     1994 24-Jan
16 Aug             4 Tuesday    1994 23-Aug
17 Apr             4 Thursday   1994 21-Apr
18 Jul             5 Sunday     1994 31-Jul
19 May             4 Thursday   1994 19-May
20 Apr             4 Monday     1994 18-Apr
  • Related