Removing rows based on column conditions R-Studio

Time:04-22

Suppose we have a data frame:

Event <- c("A", "A", "A", "B", "B", "C" , "C", "C")
Model <- c( 1, 2, 3, 1, 2, 1, 2, 3)

df <- data.frame(Event, Model)

print(df)

Which looks like this:

Event Model
A 1
A 2
A 3
B 1
B 2
C 1
C 2
C 3

We can see that event B has only 2 models. The actual data frame I am using has thousands of rows and 17 columns, so how can I remove all events that do not have 3 models? My guess is to use subset(), but I am not sure how to do it with more than one condition.

I tried:

filtered_df = subset(df, Event & Model = "1", Event & Model = "2", Event & Model = "3")

However, no joy.

Many thanks.

CodePudding user response:

Using dplyr,

library(dplyr)

# Keep every row of an Event whose highest Model number is 3
df %>% group_by(Event) %>%
  filter(max(Model) == 3)

the result would be

# A tibble: 6 × 2
# Groups:   Event [2]
  Event Model
  <chr> <dbl>
1 A         1
2 A         2
3 A         3
4 C         1
5 C         2
6 C         3

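or using data.table (df has to be converted to a data.table first),

library(data.table); setDT(df)  # setDT() converts df by reference
df[df[, .I[max(Model) == 3], by = Event]$V1]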

the result is the same:

   Event Model
1:     A     1
2:     A     2
3:     A     3
4:     C     1
5:     C     2
6:     C     3
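
Both answers above test max(Model) == 3, which works when the models of each event are numbered 1, 2, 3 without gaps. If that cannot be guaranteed in the real data (an assumption about the data, not something stated in the question), counting the distinct models per event may be safer. A minimal base R sketch of that idea, where n_models is just an illustrative helper name:

```r
# Same example data as in the question
Event <- c("A", "A", "A", "B", "B", "C", "C", "C")
Model <- c(1, 2, 3, 1, 2, 1, 2, 3)
df <- data.frame(Event, Model)

# For every row, the number of distinct Model values within its Event
n_models <- ave(df$Model, df$Event, FUN = function(x) length(unique(x)))

# Keep only the rows whose Event has exactly 3 distinct models
filtered_df <- df[n_models == 3, ]
print(filtered_df)
```

On the example data this keeps the six rows of events A and C and drops event B. With dplyr, the same condition could be written as filter(n_distinct(Model) == 3) after group_by(Event).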