In my new dataset, I would like each observation for New to continue until a new number occurs. I would like this to happen by group.
DF1:
Date | Group | New |
---|---|---|
2021-04-20 | 1001 | 0 |
2021-04-21 | 1001 | 0 |
2021-04-22 | 1001 | 9 |
2021-04-23 | 1001 | 0 |
2021-04-24 | 1001 | 0 |
2021-04-25 | 1001 | 12 |
2021-04-26 | 1001 | 0 |
2021-04-27 | 1001 | 0 |
2021-04-28 | 1001 | 0 |
2021-04-20 | 1002 | 0 |
2021-04-22 | 1002 | 1 |
2021-04-23 | 1002 | 0 |
2021-04-24 | 1002 | 0 |
2021-04-25 | 1002 | 3 |
2021-04-26 | 1002 | 0 |
Dfdesired:
Date | Group | New |
---|---|---|
2021-04-20 | 1001 | 0 |
2021-04-21 | 1001 | 0 |
2021-04-22 | 1001 | 9 |
2021-04-23 | 1001 | 9 |
2021-04-24 | 1001 | 9 |
2021-04-25 | 1001 | 12 |
2021-04-26 | 1001 | 12 |
2021-04-27 | 1001 | 12 |
2021-04-28 | 1001 | 12 |
2021-04-20 | 1002 | 0 |
2021-04-22 | 1002 | 1 |
2021-04-23 | 1002 | 1 |
2021-04-24 | 1002 | 1 |
2021-04-25 | 1002 | 3 |
2021-04-26 | 1002 | 3 |
CodePudding user response:
An option is to replace the 0 with NA
, then use fill
library(dplyr)
library(tidyr)
DF1 %>%
mutate(New = na_if(New, 0)) %>%
group_by(Group) %>%
fill(New) %>%
ungroup %>%
mutate(New = replace_na(New, 0))
-output
# A tibble: 15 × 3
Date Group New
<chr> <int> <int>
1 2021-04-20 1001 0
2 2021-04-21 1001 0
3 2021-04-22 1001 9
4 2021-04-23 1001 9
5 2021-04-24 1001 9
6 2021-04-25 1001 12
7 2021-04-26 1001 12
8 2021-04-27 1001 12
9 2021-04-28 1001 12
10 2021-04-20 1002 0
11 2021-04-22 1002 1
12 2021-04-23 1002 1
13 2021-04-24 1002 1
14 2021-04-25 1002 3
15 2021-04-26 1002 3
If the values are already ordered, then can use cummax
DF1 %>%
group_by(Group) %>%
mutate(New = cummax(New)) %>%
ungroup
-output
# A tibble: 15 × 3
Date Group New
<chr> <int> <int>
1 2021-04-20 1001 0
2 2021-04-21 1001 0
3 2021-04-22 1001 9
4 2021-04-23 1001 9
5 2021-04-24 1001 9
6 2021-04-25 1001 12
7 2021-04-26 1001 12
8 2021-04-27 1001 12
9 2021-04-28 1001 12
10 2021-04-20 1002 0
11 2021-04-22 1002 1
12 2021-04-23 1002 1
13 2021-04-24 1002 1
14 2021-04-25 1002 3
15 2021-04-26 1002 3
data
DF1 <- structure(list(Date = c("2021-04-20", "2021-04-21", "2021-04-22",
"2021-04-23", "2021-04-24", "2021-04-25", "2021-04-26", "2021-04-27",
"2021-04-28", "2021-04-20", "2021-04-22", "2021-04-23", "2021-04-24",
"2021-04-25", "2021-04-26"), Group = c(1001L, 1001L, 1001L, 1001L,
1001L, 1001L, 1001L, 1001L, 1001L, 1002L, 1002L, 1002L, 1002L,
1002L, 1002L), New = c(0L, 0L, 9L, 0L, 0L, 12L, 0L, 0L, 0L, 0L,
1L, 0L, 0L, 3L, 0L)), class = "data.frame", row.names = c(NA,
-15L))