Subset groups that have each observation from another column

Time:11-04

My data are as follows:

group observation
A     red
A     blue
A     green
B     red
B     red
B     green
C     blue
C     red
C     green

I would like to keep only the groups that contain every distinct observation value at least once. My desired output is as follows:

group observation 
A     red
A     blue
A     green
C     blue
C     red
C     green

CodePudding user response:

We can group by the 'group' column and check whether all the unique 'observation' values from the whole data are %in% the 'observation' values of each group's rows, then filter to keep only the groups for which that is true.

library(dplyr)
df1 %>%
    group_by(group) %>%
    # keep a group only if it contains every distinct observation in df1
    filter(all(unique(df1$observation) %in% observation)) %>%
    ungroup()

-output

# A tibble: 6 × 2
  group observation
  <chr> <chr>      
1 A     red        
2 A     blue       
3 A     green      
4 C     blue       
5 C     red        
6 C     green    
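
To see why group B is dropped, the condition inside filter() can be evaluated once per group on its own. This is a minimal base R sketch, assuming the same df1 as in the "data" section (rebuilt here so the snippet runs by itself); the names all_obs and has_all are illustrative only.

```r
# Same data as df1 in the answer, rebuilt so the snippet is self-contained
df1 <- data.frame(
  group = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  observation = c("red", "blue", "green", "red", "red", "green",
                  "blue", "red", "green")
)

# Every distinct observation in the whole data
all_obs <- unique(df1$observation)

# One TRUE/FALSE per group: does the group contain all of them?
has_all <- sapply(split(df1$observation, df1$group),
                  function(x) all(all_obs %in% x))
print(has_all)
```

Group B has "red" twice but no "blue", so its entry is FALSE and all of its rows are removed by the filter.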

data

df1 <- structure(list(group = c("A", "A", "A", "B", "B", "B", "C", "C", 
"C"), observation = c("red", "blue", "green", "red", "red", "green", 
"blue", "red", "green")), class = "data.frame", row.names = c(NA, 
-9L))

CodePudding user response:

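Here's a solution in base R. It counts the distinct observations per group and keeps the groups whose count matches the overall number of distinct observations. Because ave() returns a character vector when it is given a character column, the count is converted to integer before the comparison:

df1[with(df1, as.integer(ave(observation, group, 
                  FUN = function(x) length(unique(x)))) >= length(unique(observation))), ]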
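
The same counting idea can also be written with a contingency table, which may be easier to inspect. This is a sketch of an alternative, not the answerer's method; it assumes the same df1 as above, and tab, complete and res are illustrative names.

```r
# Same data as df1 in the answer, rebuilt so the snippet is self-contained
df1 <- data.frame(
  group = c("A", "A", "A", "B", "B", "B", "C", "C", "C"),
  observation = c("red", "blue", "green", "red", "red", "green",
                  "blue", "red", "green")
)

# Rows are groups, columns are observation values, cells are counts
tab <- table(df1$group, df1$observation)

# A group is complete when every column has a count above zero
complete <- rownames(tab)[rowSums(tab > 0) == ncol(tab)]

# Keep only the rows belonging to complete groups
res <- df1[df1$group %in% complete, ]
print(res)
```

Printing tab shows a zero in the "blue" column for group B, which is what excludes it.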