I have data like this:
library(lubridate)
library(dplyr)
set.seed(2021)
gen_date <- seq(ymd_h("2021-01-01-00"), ymd_h("2021-09-30-23"), by = "hours")
hourx <- hour(gen_date)
datex <- date(gen_date)
sales <- round(runif(length(datex), 10, 50), 0)*100
mydata <- data.frame(datex, hourx, sales)
How do I get the last three months of data using dplyr? Or the last six months? What I want is the full data from "2021-06-01" to "2021-09-30". Thank you.
CodePudding user response:
We may get the max value of 'datex', create a sequence of 6 or 3 months going backwards with seq, and create a logical vector with 'datex' to filter:
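library(dplyr)
library(lubridate)  # for floor_date()
n <- 6
# n + 1 month starts, stepping back from the first day of the latest month
out <- mydata %>%
  filter(datex >= seq(floor_date(max(datex), 'month'),
                      length.out = n + 1, by = '-1 month'))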
Checking:
> head(out)
datex hourx sales
1 2021-03-01 4 5000
2 2021-03-01 11 3200
3 2021-03-01 18 1500
4 2021-03-02 1 4400
5 2021-03-02 8 4400
6 2021-03-02 15 4400
> max(mydata$datex)
[1] "2021-09-30"
For 3 months:
n <- 3
out2 <- mydata %>%
  filter(datex >= seq(floor_date(max(datex), 'month'),
                      length.out = n + 1, by = '-1 month'))
> head(out2)
datex hourx sales
1 2021-06-01 3 2100
2 2021-06-01 7 1300
3 2021-06-01 11 4800
4 2021-06-01 15 1500
5 2021-06-01 19 3200
6 2021-06-01 23 3400
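One thing to watch in the filter above: seq() returns n + 1 dates, so `datex >= seq(...)` compares each row against a recycled vector of cutoffs rather than a single date. That is consistent with the printed rows being several hours apart. A minimal sketch that compares against one cutoff instead, assuming "last n months" means the month containing max(datex) plus the n months before it (the same window the seq call spans). The names `cutoff` and `out_full` are introduced here only for illustration:

```r
library(dplyr)
library(lubridate)

# Rebuild the example data from the question
set.seed(2021)
gen_date <- seq(ymd_h("2021-01-01-00"), ymd_h("2021-09-30-23"), by = "hours")
mydata <- data.frame(datex = date(gen_date),
                     hourx = hour(gen_date),
                     sales = round(runif(length(gen_date), 10, 50), 0) * 100)

n <- 3
# Single cutoff: first day of the month that lies n months before the latest month
cutoff <- floor_date(max(mydata$datex), "month") %m-% months(n)

# Keep every row on or after the cutoff (no recycling, since cutoff has length 1)
out_full <- mydata %>%
  filter(datex >= cutoff)

range(out_full$datex)
```

With n <- 3 the range runs from "2021-06-01" to "2021-09-30", which is the window asked for in the question. Setting n <- 6 moves the cutoff to "2021-03-01".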
CodePudding user response:
You may try
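library(xts)
# xts::last(x, n) returns the last n elements; called with its namespace
# so the result does not depend on whether dplyr or xts was loaded last
x <- mydata %>%
  mutate(month = month(datex)) %>%
  filter(month %in% xts::last(unique(month), 3))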
unique(x$month)
[1] 7 8 9
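If loading xts just for last() is not desirable, a similar filter can be sketched with base R's tail(). The version below keys on the first day of each month rather than the bare month number, on the assumption that the data might span more than one year (in which case month numbers alone would repeat). `demo`, `ym` and `x2` are hypothetical names, and `demo` is a small stand-in for mydata:

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in data: one row per day, crossing a year boundary
demo <- data.frame(datex = seq(ymd("2020-11-01"), ymd("2021-02-28"), by = "day"))

x2 <- demo %>%
  mutate(ym = floor_date(datex, "month")) %>%   # year-month key as a Date
  filter(ym %in% tail(sort(unique(ym)), 3))     # keep the 3 most recent months

unique(x2$ym)
```

Here the three months kept are December 2020, January 2021 and February 2021. Filtering on month numbers alone would not distinguish months from different years.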