Here is my dataframe
structure(list(Date = structure(c(19016, 19017, 19018, 19019,
19020, 19021, 19022, 19023, 19024, 19025, 19026, 19027, 19028,
19029, 19030, 19031, 19032, 19033, 19034, 19035, 19036, 19037,
19038, 19039, 19040, 19041, 19042, 19043, 19044, 19045, 19046,
19047, 19048, 19049, 19050, 19051, 19052, 19053, 19054, 19055,
19056, 19057, 19058, 19059, 19060, 19061, 19062, 19063, 19064,
19065, 19066, 19067, 19068, 19069, 19070, 19071, 19072, 19073,
19074, 19075, 19076, 19077, 19078, 19079, 19080, 19081, 19082
), class = "Date"), Sales = c(17230.95, 15356.85, 21123.41, 19811.86,
24899.7, 18670.08, 14129.03, 18454.97, 20942.28, 23324.83, 20284.81,
24456.1, 19884.73, 15473.03, 19809.02, 16871.57, 0, 25492.58,
27801.39, 21106.03, 16089.1, 20173.34, 18964.01, 37685.96, 19639.33,
22188.08, 18944.28, 18166.25, 21962.05, 22927.46, 22181.48, 22115.81,
24914.86, 22782.64, 19192.26, 25431.16, 24078.59, 24288, 22021.68,
28372.21, 20201.48, 20209.54, 23325.9, 21977.64, 23368.25, 21879.24,
0, 21430.3, 18440.64, 24777.07, 22030.41, 25575.51, 23089.05,
23139.19, 28243.81, 21652.69, 21952.06, 18940.91, 24881.95, 19458.29,
25724.01, 19723.64, 20525.86, 22713.51, 0, 23620.43, 23755.2),
Day = structure(c(2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L,
5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L,
7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L, 6L, 7L,
1L, 2L, 3L, 4L, 5L, 6L, 7L, 1L, 2L, 3L, 4L, 5L), .Label = c("Sunday",
"Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"
), class = c("ordered", "factor")), Month = structure(c(18993,
18993, 18993, 18993, 18993, 18993, 18993, 18993, 19024, 19024,
19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024,
19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024,
19024, 19024, 19024, 19024, 19024, 19024, 19024, 19024, 19052,
19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052,
19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052,
19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052, 19052,
19052, 19052, 19052), class = "Date")), row.names = c(NA,
-67L), class = c("tbl_df", "tbl", "data.frame"))
And a screenshot so you can see it at a glance.
I originally had the Date
and Sales
columns. By using these lines of code I was able to get the Day
and Month
columns.
time_series_df_sample$Day <- wday(time_series_df_sample$Date, label=TRUE, abbr=FALSE)
time_series_df_sample <- time_series_df_sample %>%
mutate(Month = floor_date(Date, 'month'))
I'm trying to create another column called Day of Week Position in Month
, or some other name. For Date == 2022-02-01
, for example, Day of Week Position in Month
should say 1
because this is the first Tuesday in the month and where Date == 2022-02-08
it should say 2
because this is the second Tuesday in the month. So this new column will always have values 1-5
.
The tally()
function might not be the right approach, as least as I'm using it. It just says 4 or 5 all the way through, except for the first month where I don't have as much data. And the min()
function isn't doing the job - it just returns the day of week value.
time_series_and_other_stories_fl %>%
group_by(Month, Day) %>%
tally()
time_series_and_other_stories_fl %>%
group_by(Month, Day) %>%
mutate(min = min(Day))
CodePudding user response:
out <- time_series_df_sample%>%
group_by(Month)%>%
arrange(Date)%>%
mutate(`Day of Week Position in Month`=ave(Day==Day,Day,FUN=cumsum))