Filter groups where all rows in a column are the same dplyr

Time:11-03

I was trying to do something kind of simple. My dataframe looks like this:

ID    value
1       a
2       b
2       c
3       d
3       d
4       e
4       e
4       e

What I want is to keep only the groups that have more than one row and in which every entry of the value column is the same:

df %>% group_by(ID) %>% filter(n() > 1 & all(mysterious_condition))

So mysterious_condition is what I'm lacking. What I'm trying to achieve is this:

ID    value
3       d
3       d
4       e
4       e
4       e

Any thoughts on how to accomplish this?

Thanks!

CodePudding user response:

We can use n_distinct() to count the unique values of value within each group and keep only the groups where that count is 1.

library(dplyr)
df %>%
    group_by(ID) %>%
    # keep groups with more than one row and exactly one distinct value
    filter(n() > 1, n_distinct(value) == 1) %>%
    ungroup()

Output:

# A tibble: 5 × 2
     ID value
  <int> <chr>
1     3 d    
2     3 d    
3     4 e    
4     4 e    
5     4 e    
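
For the all(mysterious_condition) form asked about in the question, one possible way to fill it in is to compare every value in the group with the group's first value. This is only a sketch run on the sample data from the question; the name result is introduced here for illustration.

```r
library(dplyr)

# Sample data from the question
df <- data.frame(
  ID = c(1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L),
  value = c("a", "b", "c", "d", "d", "e", "e", "e"),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ID) %>%
  # keep groups with more than one row whose values all equal the first one
  filter(n() > 1, all(value == first(value))) %>%
  ungroup()

print(result)
```

On this data it should keep the same rows as the n_distinct() version. The two variants can differ when value contains NA: a comparison with NA yields NA, which filter() treats as FALSE, whereas n_distinct() by default counts NA as a value of its own.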

Data:

df <- structure(list(ID = c(1L, 2L, 2L, 3L, 3L, 4L, 4L, 4L), value = c("a", 
"b", "c", "d", "d", "e", "e", "e")), class = "data.frame", row.names = c(NA, 
-8L))