I want to select the second lowest DATE grouped per ID (if more than one DATE) and then mutate the result into an new column (NEW).
DF<-tibble::tribble(
~ID, ~DATE, ~TYPE,
1L, "2001-01-01", "P",
2L, "2002-02-02", "P",
2L, "2005-05-05", "R",
3L, "2001-10-10", "R",
4L, "2008-08-08", "P",
4L, "2010-10-10", "R"
)
# A tibble: 6 × 3
ID DATE TYPE
<int> <date> <chr>
1 1 2001-01-01 P
2 2 2002-02-02 P
3 2 2005-05-05 R
4 3 2001-10-10 R
5 4 2008-08-08 P
6 4 2010-10-10 R
Desired output:
ID DATE TYPE NEW
<int> <date> <chr> <chr>
1 1 2001-01-01 P NA
2 2 2002-02-02 P 2005-05-05
3 2 2005-05-05 R 2005-05-05
4 3 2001-10-10 R NA
5 4 2008-08-08 P 2010-10-10
6 4 2010-10-10 R 2010-10-10
I´ve tried the code below, but can´t figure how to mutate NEW column with NA if just one DATE.
DF%>%group_by(ID)%>%arrange(DATE)%>%slice_head(n=2)%>%slice_max(order_by=DATE,n=1)%>%pull(DATE)
Best regards, H
CodePudding user response:
If your dates are arranged in order, you can select the second DATE
as DATE[2]
within a group.
library(tidyverse)
DF %>%
mutate(DATE = as.Date(DATE)) %>%
group_by(ID) %>%
arrange(ID, DATE) %>%
mutate(NEW = DATE[2])
Output
ID DATE TYPE NEW
<int> <date> <chr> <date>
1 1 2001-01-01 P NA
2 2 2002-02-02 P 2005-05-05
3 2 2005-05-05 R 2005-05-05
4 3 2001-10-10 R NA
5 4 2008-08-08 P 2010-10-10
6 4 2010-10-10 R 2010-10-10
CodePudding user response:
Using ave
and order
with a case handling to produce the NA's
.
transform(DF, NEW=ave(DATE, ID, FUN=\(x) {
if (NROW(x) == 1) NA
else x[order(as.Date(x)) == 2]
}))
# ID DATE TYPE NEW
# 1 1 2001-01-01 P <NA>
# 2 2 2002-02-02 P 2005-05-05
# 3 2 2005-05-05 R 2005-05-05
# 4 3 2001-10-10 R <NA>
# 5 4 2008-08-08 P 2010-10-10
# 6 4 2010-10-10 R 2010-10-10
Data:
DF <- structure(list(ID = c(1L, 2L, 2L, 3L, 4L, 4L), DATE = c("2001-01-01",
"2002-02-02", "2005-05-05", "2001-10-10", "2008-08-08", "2010-10-10"
), TYPE = c("P", "P", "R", "R", "P", "R")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -6L))