If this is my dataset
Id Group1 Col1 Col2
1 Red 2/1/1999 3
1 Red 4/11/1998 4
2 Black 7/8/1995 NA
2 Black 11/2/2000 1
3 Black 11/1/2994 2
3 Black 5/18/1997 6
How can I implement slice_min
OR filter(Col2= min(Col2)|is.na(Col2))
to retain only rows with minimum of Col2 only where Group1=Red ?
Expected output
Id Group1 Col1 Col2
1 Red 2/1/1999 3
2 Black 7/8/1995 NA
2 Black 11/2/2000 1
3 Black 11/1/2994 2
3 Black 5/18/1997 6
Thanks in advance for any suggestions.
CodePudding user response:
You can use slice_min
only on the subset you want, and rbind
the result to the rest of the df
rbind(
df %>% filter(Group1 == "Red") %>% slice_min(Col2),
df %>% filter(Group1 != "Red")
)
Output:
Id Group1 Col1 Col2
<int> <chr> <chr> <int>
1 1 Red 2/1/1999 3
2 2 Black 7/8/1995 NA
3 2 Black 11/2/2000 1
4 3 Black 11/1/2994 2
5 3 Black 5/18/1997 6
CodePudding user response:
You could use a slightly more complex filter
condition:
library(dplyr)
df %>%
filter((Group1 == "Red" & (Col2 == min(.[.$Group1 == "Red",]$Col2) | is.na(Col2))) | Group1 != "Red")
This returns
# A tibble: 5 x 4
Id Group1 Col1 Col2
<dbl> <chr> <chr> <dbl>
1 1 Red 2/1/1999 3
2 2 Black 7/8/1995 NA
3 2 Black 11/2/2000 1
4 3 Black 11/1/2994 2
5 3 Black 5/18/1997 6
Or a more elegant way as proposed by Ritchie Sacramento:
df %>%
group_by(Group1) %>%
filter(Col2 == min(Col2, na.rm = TRUE) | is.na(Col2) | Group1 != "Red")
Data
structure(list(Id = c(1, 1, 2, 2, 3, 3), Group1 = c("Red", "Red",
"Black", "Black", "Black", "Black"), Col1 = c("2/1/1999", "4/11/1998",
"7/8/1995", "11/2/2000", "11/1/2994", "5/18/1997"), Col2 = c(3,
4, NA, 1, 2, 6)), class = c("spec_tbl_df", "tbl_df", "tbl", "data.frame"
), row.names = c(NA, -6L), spec = structure(list(cols = list(
Id = structure(list(), class = c("collector_double", "collector"
)), Group1 = structure(list(), class = c("collector_character",
"collector")), Col1 = structure(list(), class = c("collector_character",
"collector")), Col2 = structure(list(), class = c("collector_double",
"collector"))), default = structure(list(), class = c("collector_guess",
"collector")), skip = 1L), class = "col_spec"))