Suppose we have a data frame:
Event <- c("A", "A", "A", "B", "B", "C" , "C", "C")
Model <- c( 1, 2, 3, 1, 2, 1, 2, 3)
df <- data.frame(Event, Model)
Which looks like this:
Event | Model |
---|---|
A | 1 |
A | 2 |
A | 3 |
B | 1 |
B | 2 |
C | 1 |
C | 2 |
C | 3 |
We can see that event B has only 2 models of data. The actual data frame I am using has thousands of rows and 17 columns, so how can I remove all events that do not have 3 models? My guess is to use subset, but I am not sure how to do it when there is more than one condition.
I tried the suggested code from YH Jang below:
df %>% group_by(Event) %>%
filter(max(Model)==3)
However, this would fail to remove events that are missing a model but still have a maximum of 3, such as this one:
Event | Model |
---|---|
A | 1 |
A | 3 |
For example, event A is still kept in the result:
# A tibble: 6 × 2
# Groups: Event [2]
Event Model
<chr> <dbl>
1 A 1
2 A 3
4 C 1
5 C 2
6 C 3
CodePudding user response:
Using dplyr,
library(dplyr)
df %>% group_by(Event) %>%
filter(max(Model)==3)
the result would be
# A tibble: 6 × 2
# Groups: Event [2]
Event Model
<chr> <dbl>
1 A 1
2 A 2
3 A 3
4 C 1
5 C 2
6 C 3
or using data.table (this needs library(data.table) and df converted with setDT(df) first),
df[df[,.I[max(Model)==3],by=Event]$V1]
the result is the same:
Event Model
1: A 1
2: A 2
3: A 3
4: C 1
5: C 2
6: C 3
EDIT
I misunderstood the question.
Here's the edited answer.
# with dplyr
df %>% group_by(Event) %>%
filter(length(Model)>=3)
or
# with data.table
df[df[,.I[length(Model)>=3],by=Event]$V1]
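The data.table line only works when the object is a data.table rather than a plain data frame. Here is a minimal, self-contained sketch of the same idea on the toy data from the question. The names `dt`, `keep_idx` and `result_dt` are made up for illustration, and `.N` (the group row count) is swapped in for `length(Model)`.

```r
library(data.table)

# Same toy data as in the question
Event <- c("A", "A", "A", "B", "B", "C", "C", "C")
Model <- c(1, 2, 3, 1, 2, 1, 2, 3)

# Convert to a data.table so that .I, .N and by= are available
dt <- as.data.table(data.frame(Event, Model))

# .N is the number of rows in the current Event group.
# .I[.N >= 3] returns the group's row indices when the group has
# at least 3 rows, and an empty vector otherwise.
keep_idx <- dt[, .I[.N >= 3], by = Event]$V1

# Subset the original table by the collected row indices
result_dt <- dt[keep_idx]
print(result_dt)
```

On this data, only the rows for events A and C should remain.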
CodePudding user response:
Try this:
library(dplyr)
df %>% group_by(Event) %>%
filter(length(Model) >= 3)
This removes every Event group that has fewer than three Model rows.
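Counting rows assumes each model appears at most once per event. If that might not hold in the real data, a stricter check is to require that models 1, 2 and 3 are all present. Below is a rough sketch, assuming the models are always coded 1 to 3. The data frame `df_gap` is a made-up variant of the question's data in which event A is missing model 2.

```r
library(dplyr)

# Hypothetical data: event A has models 1 and 3 only, B has 1 and 2
df_gap <- data.frame(
  Event = c("A", "A", "B", "B", "C", "C", "C"),
  Model = c(1, 3, 1, 2, 1, 2, 3)
)

# Keep an event only when every one of the models 1, 2, 3 occurs in it
complete_events <- df_gap %>%
  group_by(Event) %>%
  filter(all(1:3 %in% Model)) %>%
  ungroup()

print(complete_events)
```

Here only event C should survive, because A and B are each missing one model. If the model labels are not fixed in advance, `n_distinct(Model) == 3` inside `filter()` is a similar check on the number of distinct models.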