I have a data frame that looks somewhat like this:
a = c(seq(as.Date("2020-08-01"), as.Date("2020-11-01"), by="months"), seq(as.Date("2021-08-01"), as.Date("2021-11-01"), by="months"),
seq(as.Date("2022-08-01"), as.Date("2022-11-01"), by="months"))
b = rep(LETTERS[1:3], each = 4)
library(tibble)  # data_frame() is deprecated; tibble() is its replacement
df = tibble(ID = b, Date = a)
> df
ID Date
<chr> <date>
1 A 2020-08-01
2 A 2020-09-01
3 A 2020-10-01
4 A 2020-11-01
5 B 2021-08-01
6 B 2021-09-01
7 B 2021-10-01
8 B 2021-11-01
9 C 2022-08-01
10 C 2022-09-01
11 C 2022-10-01
12 C 2022-11-01
I want to create a new variable, NewDate, that holds the smallest value of Date for each ID, repeated on every row of that ID. The resulting data frame should look like this:
c = c(rep(as.Date("2020-08-01"), each = 4), rep(as.Date("2021-08-01"), each = 4), rep(as.Date("2022-08-01"), each = 4))
df$NewDate = c
> df
# A tibble: 12 × 3
ID Date NewDate
<chr> <date> <date>
1 A 2020-08-01 2020-08-01
2 A 2020-09-01 2020-08-01
3 A 2020-10-01 2020-08-01
4 A 2020-11-01 2020-08-01
5 B 2021-08-01 2021-08-01
6 B 2021-09-01 2021-08-01
7 B 2021-10-01 2021-08-01
8 B 2021-11-01 2021-08-01
9 C 2022-08-01 2022-08-01
10 C 2022-09-01 2022-08-01
11 C 2022-10-01 2022-08-01
12 C 2022-11-01 2022-08-01
Can someone please help me do it? Thank you very much in advance.
CodePudding user response:
First group by ID, then use mutate() with min():
library(dplyr)
df %>%
  group_by(ID) %>%
  mutate(NewDate = min(Date)) %>%
  ungroup()
#> # A tibble: 12 × 3
#> ID Date NewDate
#> <chr> <date> <date>
#> 1 A 2020-08-01 2020-08-01
#> 2 A 2020-09-01 2020-08-01
#> 3 A 2020-10-01 2020-08-01
#> 4 A 2020-11-01 2020-08-01
#> 5 B 2021-08-01 2021-08-01
#> 6 B 2021-09-01 2021-08-01
#> 7 B 2021-10-01 2021-08-01
#> 8 B 2021-11-01 2021-08-01
#> 9 C 2022-08-01 2022-08-01
#> 10 C 2022-09-01 2022-08-01
#> 11 C 2022-10-01 2022-08-01
#> 12 C 2022-11-01 2022-08-01
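If you would rather not load dplyr, the same grouped minimum can probably be done in base R with ave(). This is only a sketch of an alternative, not part of the answer above. It rebuilds the question's data as a plain data.frame (an assumption, since the original uses a tibble) and relies on ave() keeping the Date class when it writes the group results back.

```r
# Rebuild the example data from the question using base R only.
df <- data.frame(
  ID = rep(LETTERS[1:3], each = 4),
  Date = c(
    seq(as.Date("2020-08-01"), as.Date("2020-11-01"), by = "months"),
    seq(as.Date("2021-08-01"), as.Date("2021-11-01"), by = "months"),
    seq(as.Date("2022-08-01"), as.Date("2022-11-01"), by = "months")
  )
)

# ave() applies FUN within each level of ID and returns a vector
# of the same length as Date, so it can be assigned as a new column.
df$NewDate <- ave(df$Date, df$ID, FUN = min)

print(df)
```

Because ave() returns one value per input row, no ungroup() step is needed. The dplyr version is easier to extend if you later need several grouped columns.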