I am looking for an efficient way to replicate a process below, for example using dplyr.
Suppose I have this df:
df <- data.frame(id = c("A", "A", "A", "A", "A", "A", "A", "A"),
date = as.Date(c("2022-01-01", "2022-01-01", "2022-01-01", "2022-01-01", "2022-01-02", "2022-01-02", "2022-01-02", "2022-01-02")),
timestamp = c("08:59:00", "09:00:00", "09:01:00", "09:02:00", "08:59:00", "09:00:00", "09:01:00", "09:02:00"),
result = c(55, 56, 54, 52, 58, 57, 60, 62))
> df
id date timestamp result
1 A 2022-01-01 08:59:00 55
2 A 2022-01-01 09:00:00 56
3 A 2022-01-01 09:01:00 54
4 A 2022-01-01 09:02:00 52
5 A 2022-01-02 08:59:00 58
6 A 2022-01-02 09:00:00 57
7 A 2022-01-02 09:01:00 60
8 A 2022-01-02 09:02:00 62
I would like to group by date
and create a new column result1
based on the value of result
at timestamp
== "09:00:00" such that resulting data look like this:
> df
id date timestamp result result1
1 A 2022-01-01 08:59:00 55 56
2 A 2022-01-01 09:00:00 56 56
3 A 2022-01-01 09:01:00 54 56
4 A 2022-01-01 09:02:00 52 56
5 A 2022-01-02 08:59:00 58 57
6 A 2022-01-02 09:00:00 57 57
7 A 2022-01-02 09:01:00 60 57
8 A 2022-01-02 09:02:00 62 57
CodePudding user response:
Something like this?
df %>%
group_by(date) %>%
mutate(result1 = result[timestamp=="09:00:00"])
# A tibble: 8 × 5
# Groups: date [2]
id date timestamp result result1
<chr> <date> <chr> <dbl> <dbl>
1 A 2022-01-01 08:59:00 55 56
2 A 2022-01-01 09:00:00 56 56
3 A 2022-01-01 09:01:00 54 56
4 A 2022-01-01 09:02:00 52 56
5 A 2022-01-02 08:59:00 58 57
6 A 2022-01-02 09:00:00 57 57
7 A 2022-01-02 09:01:00 60 57
8 A 2022-01-02 09:02:00 62 57