Suppose we have a data frame:
Event <- c("A", "A", "A", "B", "B", "C", "C", "C")
Model <- c(1, 2, 3, 1, 2, 1, 2, 3)
df <- data.frame(Event, Model)
print(df)
Which looks like this:
Event | Model |
---|---|
A | 1 |
A | 2 |
A | 3 |
B | 1 |
B | 2 |
C | 1 |
C | 2 |
C | 3 |
We can see that event B has only 2 models. The actual data frame I am using has thousands of rows and 17 columns, so how can I remove all events that do not have all 3 models? My guess is to use subset(), but I am not sure how to do it with more than one condition.
I tried:
filtered_df = subset(df, Event & Model = "1", Event & Model = "2", Event & Model = "3")
However, no joy.
Many thanks.
CodePudding user response:
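Using dplyr:
library(dplyr)
# Keep only events whose highest model number is 3
# (this assumes models are numbered 1, 2, 3 with no gaps)
df %>% group_by(Event) %>%
  filter(max(Model) == 3)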
The result would be:
# A tibble: 6 × 2
# Groups: Event [2]
Event Model
<chr> <dbl>
1 A 1
2 A 2
3 A 3
4 C 1
5 C 2
6 C 3
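Note that max(Model) == 3 only checks the highest model number, so an event that has model 3 but is missing model 1 or 2 would still be kept. If that can happen in the real data, a minimal sketch that counts distinct models per event instead might look like this (filtered_df is just an illustrative name, and the threshold of 3 is taken from the question):

```r
library(dplyr)

# Example data from the question
Event <- c("A", "A", "A", "B", "B", "C", "C", "C")
Model <- c(1, 2, 3, 1, 2, 1, 2, 3)
df <- data.frame(Event, Model)

# Keep only events that have exactly 3 distinct models
filtered_df <- df %>%
  group_by(Event) %>%
  filter(n_distinct(Model) == 3) %>%
  ungroup()

print(filtered_df)
```

On the example data this keeps events A and C, the same as the max() version; the two only differ when model numbers have gaps or duplicates.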
Or using data.table:
library(data.table); setDT(df)  # convert df to a data.table by reference
df[df[, .I[max(Model) == 3], by = Event]$V1]
The result is the same:
Event Model
1: A 1
2: A 2
3: A 3
4: C 1
5: C 2
6: C 3
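Since the question asks about subset(), here is a sketch of a base R version that needs no extra packages. It assumes "has 3 models" means three distinct Model values per Event; n_models and filtered_df are illustrative names, not anything from the original code.

```r
# Example data from the question
Event <- c("A", "A", "A", "B", "B", "C", "C", "C")
Model <- c(1, 2, 3, 1, 2, 1, 2, 3)
df <- data.frame(Event, Model)

# For every row, the number of distinct models in that row's event
n_models <- ave(df$Model, df$Event, FUN = function(m) length(unique(m)))

# Keep only rows whose event has all 3 models
filtered_df <- subset(df, n_models == 3)
print(filtered_df)
```

ave() returns a vector as long as the data frame, so it can be used directly as a row condition in subset(), and all other columns of the real 17-column data frame are carried along unchanged.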