My problem:
I have a time series that looks like this:
v1 v2 t v3 day
1 46 33 2005-06-04 00:00:00 13 2005-06-04
2 25 24 2005-06-04 01:00:00 15 2005-06-04
3 18 9 2005-06-04 02:00:00 11 2005-06-04
4 11 22 2005-06-04 03:00:00 1 2005-06-04
5 11 31 2005-06-04 04:00:00 0 2005-06-04
6 12 27 2005-06-04 05:00:00 3 2005-06-04
7 46 33 2005-06-04 06:00:00 13 2005-06-04
8 25 24 2005-06-04 07:00:00 15 2005-06-04
9 18 9 2005-06-04 08:00:00 11 2005-06-04
10 11 22 2005-06-04 09:00:00 1 2005-06-04
11 11 31 2005-06-04 10:00:00 12 2005-06-04
12 12 27 2005-06-04 11:00:00 13 2005-06-04
13 46 33 2005-06-04 12:00:00 13 2005-06-04
14 25 24 2005-06-04 13:00:00 15 2005-06-04
15 18 9 2005-06-04 14:00:00 11 2005-06-04
16 11 22 2005-06-04 15:00:00 1 2005-06-04
17 11 31 2005-06-04 16:00:00 0 2005-06-04
18 12 27 2005-06-04 17:00:00 3 2005-06-04
I want the dates (in the format "2005-06-04") of the days on which "v3" is greater than 10 at each of 10:00:00, 11:00:00, 12:00:00 and 13:00:00.
I have no idea how to implement this.
Thanks a lot.
CodePudding user response:
Using dplyr and lubridate; see the comments in the code for an explanation.
library(dplyr)
library(lubridate)
df1 %>%
  # flag rows between 10:00 and 13:00 whose v3 exceeds 10
  mutate(outcome = if_else(between(hour(t), 10, 13) & v3 > 10, 1, 0)) %>%
  group_by(day) %>%
  # sum the flags per day and keep the days where all 4 hours qualify
  summarise(outcome = sum(outcome)) %>%
  filter(outcome == 4) %>%
  select(day)
# A tibble: 1 × 1
day
<date>
1 2005-06-04
data:
df1 <- structure(list(v1 = c(46L, 25L, 18L, 11L, 11L, 12L, 46L, 25L,
18L, 11L, 11L, 12L, 46L, 25L, 18L, 11L, 11L, 12L), v2 = c(33L,
24L, 9L, 22L, 31L, 27L, 33L, 24L, 9L, 22L, 31L, 27L, 33L, 24L,
9L, 22L, 31L, 27L), t = structure(c(1117843200, 1117846800, 1117850400,
1117854000, 1117857600, 1117861200, 1117864800, 1117868400, 1117872000,
1117875600, 1117879200, 1117882800, 1117886400, 1117890000, 1117893600,
1117897200, 1117900800, 1117904400), class = c("POSIXct", "POSIXt"
), tzone = "UTC"), v3 = c(13L, 15L, 11L, 1L, 0L, 3L, 13L, 15L,
11L, 1L, 12L, 13L, 13L, 15L, 11L, 1L, 0L, 3L), day = structure(c(12938,
12938, 12938, 12938, 12938, 12938, 12938, 12938, 12938, 12938,
12938, 12938, 12938, 12938, 12938, 12938, 12938, 12938), class = "Date")), row.names = c(NA,
-18L), class = "data.frame")
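For comparison, the same check can probably be done in base R without extra packages. This is only a sketch: `sample_df` and its second day are made-up data (not from the question), and it assumes one row per hour and timestamps in UTC, as in `df1`.

```r
# Hypothetical two-day sample: day 1 passes at every hour, day 2 fails at 12:00.
t <- as.POSIXct(c("2005-06-04 10:00:00", "2005-06-04 11:00:00",
                  "2005-06-04 12:00:00", "2005-06-04 13:00:00",
                  "2005-06-05 10:00:00", "2005-06-05 11:00:00",
                  "2005-06-05 12:00:00", "2005-06-05 13:00:00"),
                tz = "UTC")
sample_df <- data.frame(t = t,
                        v3 = c(12L, 13L, 13L, 15L, 12L, 13L, 2L, 15L),
                        day = as.Date(t))

# Keep only the rows for the hours 10 to 13.
hrs <- as.integer(format(sample_df$t, "%H", tz = "UTC"))
window <- sample_df[hrs >= 10 & hrs <= 13, ]

# A day qualifies when all four hours are present and every v3 exceeds 10.
ok <- sapply(split(window$v3, window$day),
             function(x) length(x) == 4 && all(x > 10))
result <- as.Date(names(ok)[ok])
print(result)
# [1] "2005-06-04"
```

The `length(x) == 4` test plays the same role as `filter(outcome == 4)` above: a day with a missing hour is not reported.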