Home > Software design >  How can filter group id those have only single value
How can filter group id those have only single value

Time:10-02

I have the following data and I need to filter group id those have only single yes value for column category.

data <- data.frame(id=c(1,1,1,1,1,1,2,2,2,2,2,3,3,3,3,3,4,4,4,4,4,5,5,5,5,6,6,6,6,6),
category=c(NA,'no',NA,'yes',NA,'yes',NA,NA,'no',NA,'no','yes',NA,'no','yes','no','no',NA,'no',NA,'yes','no',NA,'yes','no',NA,'no','yes','no','yes'))

Expected output:

  id category
1  4       no
2  4     <NA>
3  4       no
4  4     <NA>
5  4      yes
6  5       no
7  5     <NA>
8  5      yes
9  5       no

I try it using

library(dplyr)
data1 <- data %>% 
  group_by(id) %>% 
  filter((cumsum(category == 'yes')<= 1), na.rm=TRUE) %>%
  ungroup

CodePudding user response:

You can follow your original approach with a minor edit:

data %>% 
  group_by(id) %>% 
  filter(sum(category=="yes",na.rm=T)==1)

Output:

     id category
  <dbl> <chr>   
1     4 no      
2     4 NA      
3     4 no      
4     4 NA      
5     4 yes     
6     5 no      
7     5 NA      
8     5 yes     
9     5 no  

CodePudding user response:

FYI, here is a base R approach:

#find counts for the yes/no
out<-table(data$category, data$id)

#extract the ids meeting the criteria
ids <- which(out["yes",]==1)

#Get the values
data[data$id %in% ids, ]
  • Related