library(tidyverse)
dat <- data.frame(id = c(1, 2, 3, 4, 5),
class = c("A", "A", "B", "C", "C"),
value = c(12, 34, 24, 24, 11),
date = c(as.Date("02/27/92", "%m/%d/%y"), as.Date("02/27/99", "%m/%d/%y"),
as.Date("03/01/91", "%m/%d/%y"), as.Date("02/27/99", "%m/%d/%y"), as.Date("06/28/91", "%m/%d/%y")))
> dat
id class value date
1 1 A 12 1992-02-27
2 2 A 34 1999-02-27
3 3 B 24 1991-03-01
4 4 C 24 1999-02-27
5 5 C 11 1991-06-28
I would like to do the following:
- for
class = A
, keep the rows wherevalue > 10
- for
class = C
, keep the rows wherevalue > 11
- for each distinct
class
, keep the row with the largest (or more recent)date
Here's my attempt and error:
> dat %>% filter((class %in% "A" & value > 10) | (class %in% "C" & value > 11)) %>%
group_by(class) %>% filter(max(date))
Error in `filter()`:
! Problem while computing `..1 = max(date)`.
x Input `..1` must be a logical vector, not a date.
ℹ The error occurred in group 1: class = "A".
Run `rlang::last_error()` to see where the error occurred.
CodePudding user response:
max
is not logical. We may need date == max(date)
to get the logical vector
library(dplyr)
dat %>%
filter((class %in% "A" & value > 10) | (class %in% "C" & value > 11)) %>%
group_by(class) %>%
filter(date == max(date))
-output
# A tibble: 2 × 4
# Groups: class [2]
id class value date
<dbl> <chr> <dbl> <date>
1 2 A 34 1999-02-27
2 4 C 24 1999-02-27
Or use slice_max
dat %>%
filter((class %in% "A" & value > 10) | (class %in% "C" & value > 11)) %>%
group_by(class) %>%
slice_max(n = 1, order_by = date)
# A tibble: 2 × 4
# Groups: class [2]
id class value date
<dbl> <chr> <dbl> <date>
1 2 A 34 1999-02-27
2 4 C 24 1999-02-27
CodePudding user response:
filter(max(date)
should be filter(date == max(date))