I have a dataset that looks somewhat as follows.
data <- data.frame(
  id = c(1,1,1,2,2,2,3,3,3,4,4,4),
  death = c(0,0,1,0,0,0,0,1,0,0,0,0),
  other = letters[1:12])
I need to create a new data frame that includes all rows for every ID that has at least one death, like this:
ID | Death | Other |
---|---|---|
1 | 0 | a |
1 | 0 | b |
1 | 1 | c |
3 | 0 | g |
3 | 1 | h |
3 | 0 | i |
I feel like I'm missing something simple, but any time I try to subset by ID, I get error messages about length and not being able to subset with longer/shorter vectors. Any help would be much appreciated!
CodePudding user response:
Here's a `dplyr` approach. It treats `id` as a group and filters out any group that has no row with `death > 0`.
library(dplyr)
data %>% group_by(id) %>% filter(any(death > 0))
# A tibble: 6 x 3
# Groups:   id [2]
     id death other
  <dbl> <dbl> <chr>
1     1     0 a
2     1     0 b
3     1     1 c
4     3     0 g
5     3     1 h
6     3     0 i
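If dplyr is not available, the same group-wise idea can be sketched in base R with `ave()`. This is only an illustration, assuming `death` is coded as 0/1 as in the question; the names `group_max` and `result` are introduced here and are not part of the original answer.

```r
data <- data.frame(
  id = c(1,1,1,2,2,2,3,3,3,4,4,4),
  death = c(0,0,1,0,0,0,0,1,0,0,0,0),
  other = letters[1:12])

# For every row, compute the maximum of death within that row's id group.
# ave() returns a vector with one value per row of the input.
group_max <- ave(data$death, data$id, FUN = max)

# Keep the rows whose group contains at least one death (group maximum > 0).
result <- data[group_max > 0, ]
```

Because `ave()` returns one value per row, the logical index has the same length as the data frame, which avoids length mismatches when subsetting.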
CodePudding user response:
As @thelatemail points out in the comments, this can be done in base R with:
data <- data.frame(id = c(1,1,1,2,2,2,3,3,3,4,4,4), death = c(0,0,1,0,0,0,0,1,0,0,0,0), other = letters[1:12])
data[data$id %in% data$id[data$death == 1], ]
#>   id death other
#> 1  1     0     a
#> 2  1     0     b
#> 3  1     1     c
#> 7  3     0     g
#> 8  3     1     h
#> 9  3     0     i
Created on 2022-02-18 by the reprex package (v2.0.1)
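To see why the one-liner works, it may help to split it into steps. The sketch below is a hedged breakdown, not part of the original answer; `ids_with_death`, `keep`, and `result` are illustrative names. The key point is that `%in%` returns one TRUE/FALSE per row of `data`, whereas comparing `data$id` with `==` against a shorter vector of ids recycles that vector element by element and is not a membership test.

```r
data <- data.frame(
  id = c(1,1,1,2,2,2,3,3,3,4,4,4),
  death = c(0,0,1,0,0,0,0,1,0,0,0,0),
  other = letters[1:12])

# Step 1: collect the ids that have at least one row with death == 1.
ids_with_death <- unique(data$id[data$death == 1])

# Step 2: build a logical vector with one entry per row,
# TRUE when that row's id is among the ids found in step 1.
keep <- data$id %in% ids_with_death

# Step 3: subset the rows; the trailing comma keeps all columns.
result <- data[keep, ]
```

Since `keep` always has exactly `nrow(data)` elements, this form of subsetting should not run into the length problems described in the question.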