In an experiment, I'm trying to find the time to first birth. There are four animals, identified by id and rep (A1, A2, B1, B2), each with a recorded age and babies. For each id and rep, I want to retain only the row where babies were first born:
id <- c("A","A","A","A","A","A","B","B","B","B","B","B","B","B","B")
rep <- c(1,1,1,2,2,2,1,1,1,1,2,2,2,2,2)
age <- c(0,1,2,0,1,2,0,1,2,3,0,1,2,3,4)
babies <- c(0,0,1,0,1,0,0,0,0,1,0,0,0,1,1)
df <- data.frame(id,rep,age,babies)
So the final data frame should look like this:
id <- c("A","A","B","B")
rep <- c(1,2,1,2)
age <- c(2,1,3,3)
babies <- c(1,1,1,1)
df <- data.frame(id,rep,age,babies)
CodePudding user response:
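library(dplyr)

# Option 1: take the row with the largest babies value in each group;
# with_ties = FALSE keeps one row per group. Assumes babies is 0/1 and rows are sorted by age.
df %>%
  group_by(id, rep) %>%
  slice_max(babies, n = 1, with_ties = FALSE) %>%
  ungroup()

# Option 2: keep the first row in each group where babies > 0.
df %>%
  group_by(id, rep) %>%
  filter(row_number() == which(babies > 0)[1]) %>%
  ungroup()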
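If some id/rep never gives birth, or the rows are not already sorted by age, a slightly more defensive variant might help. This is only a sketch under those assumptions (the name first_birth is mine, not from the question): it drops the rows without births first and then takes the youngest remaining age per group, so groups with no births simply disappear from the result.

```r
library(dplyr)

# Data from the question
df <- data.frame(
  id     = c("A","A","A","A","A","A","B","B","B","B","B","B","B","B","B"),
  rep    = c(1,1,1,2,2,2,1,1,1,1,2,2,2,2,2),
  age    = c(0,1,2,0,1,2,0,1,2,3,0,1,2,3,4),
  babies = c(0,0,1,0,1,0,0,0,0,1,0,0,0,1,1)
)

first_birth <- df %>%
  filter(babies > 0) %>%                        # keep only rows with a birth
  group_by(id, rep) %>%
  slice_min(age, n = 1, with_ties = FALSE) %>%  # earliest age with a birth
  ungroup()

first_birth
```

Because the selection is made on age rather than on row position, it does not depend on the order of the rows in df.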
CodePudding user response:
Here is one with arrange:
library(dplyr)
df %>%
group_by(id, rep) %>%
arrange(-babies, .by_group = TRUE) %>%
slice(1)
id rep age babies
<chr> <dbl> <dbl> <dbl>
1 A 1 2 1
2 A 2 1 1
3 B 1 3 1
4 B 2 3 1
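For comparison, the same idea can probably be written without dplyr. The following is a rough base R sketch, not part of the approach above; the names births and first_birth are made up for illustration. It keeps the rows with a birth, sorts them by id, rep and age, and then keeps the first row of each id/rep pair.

```r
# Data from the question
df <- data.frame(
  id     = c("A","A","A","A","A","A","B","B","B","B","B","B","B","B","B"),
  rep    = c(1,1,1,2,2,2,1,1,1,1,2,2,2,2,2),
  age    = c(0,1,2,0,1,2,0,1,2,3,0,1,2,3,4),
  babies = c(0,0,1,0,1,0,0,0,0,1,0,0,0,1,1)
)

# Keep only rows where a birth was recorded
births <- df[df$babies > 0, ]

# Sort so the earliest age comes first within each id/rep
births <- births[order(births$id, births$rep, births$age), ]

# duplicated() flags every id/rep pair after its first occurrence
first_birth <- births[!duplicated(births[c("id", "rep")]), ]

first_birth
```

The row names in the result are the original row numbers of df, which can be reset with rownames(first_birth) <- NULL if needed.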