Filtering out groups based on one condition in R

Time:11-21

I am trying to filter grouped data so that only the groups containing neither a submit nor a cancel event are kept. Please see the following dataset:

df <- structure(list(
  session = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3), 
  event = c("pg1", "click1", "submit", "pg2", "click1", "click2", "cancel", "pg1", "click1", "click3")),
  .Names = c("session", "event"),
  row.names = c(NA, -10L),
  class = "data.frame")

session event
1 pg1
1 click1
1 submit
2 pg2
2 click1
2 click2
2 cancel
3 pg1
3 click1
3 click3

I would like to remove every session that contains a submit or a cancel. The resulting dataset should look like this:

session event
3 pg1
3 click1
3 click3

This code does not work:

df %>%
  group_by(session) %>%
  filter(any(event != "submit" | event != "cancel"))

CodePudding user response:

You may try:

library(dplyr)

df %>%
  group_by(session) %>%
  # keep a session only if none of its events is "submit" or "cancel"
  filter(!any(event %in% c("submit", "cancel")))

  session event 
    <dbl> <chr> 
1       3 pg1   
2       3 click1
3       3 click3
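
As for why the attempt in the question keeps every row: any single value is always different from at least one of "submit" and "cancel", so the OR of the two inequalities is TRUE for every element and any() is TRUE for every group. A minimal base R sketch of the difference, using the events of session 1 from the question (the variable names always_true and keep_group are only illustrative):

```r
# Events of session 1, copied from the question's data
event <- c("pg1", "click1", "submit")

# No value can equal both strings at once, so this is TRUE everywhere
always_true <- event != "submit" | event != "cancel"
print(always_true)
# [1] TRUE TRUE TRUE

# Negated membership test: TRUE only if no event is "submit" or "cancel"
keep_group <- !any(event %in% c("submit", "cancel"))
print(keep_group)
# [1] FALSE
```

Because session 1 contains a submit, keep_group is FALSE and the whole group is dropped, which is the behaviour the question asks for.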

CodePudding user response:

Using ave (the \(x) lambda shorthand requires R 4.1 or later):

df[with(df, as.logical(ave(event, session, FUN=\(x) !any(grepl('submit|cancel', x))))), ]
#    session  event
# 8        3    pg1
# 9        3 click1
# 10       3 click3
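
One caveat, as far as I can tell: grepl('submit|cancel', x) matches substrings, so a hypothetical event such as "resubmit" would also count. If exact matches are wanted, a possible variant is to pass a logical vector to ave, which also avoids the round trip through character and as.logical. This is only a sketch; is_terminal and session_has_terminal are names I made up for illustration:

```r
# Rebuild the data from the question
df <- data.frame(
  session = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  event = c("pg1", "click1", "submit", "pg2", "click1", "click2",
            "cancel", "pg1", "click1", "click3")
)

# Flag rows whose event is exactly "submit" or "cancel"
is_terminal <- df$event %in% c("submit", "cancel")

# Spread the per-session "any flagged?" answer back onto every row
session_has_terminal <- ave(is_terminal, df$session, FUN = any)

# Keep only the rows of sessions without a flagged event
result <- df[!session_has_terminal, ]
print(result)
```

For the sample data this keeps rows 8 to 10, the same rows as the grepl version.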

CodePudding user response:

A solution using square-bracket indexing:

df[!df$session %in% df[df$event %in% c("submit","cancel"),"session"], ]
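
The one-liner above can be easier to follow when split into two steps. The following is just a sketch of the same idea; bad_sessions is an illustrative name, not something from the original answer:

```r
# Rebuild the data from the question
df <- data.frame(
  session = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  event = c("pg1", "click1", "submit", "pg2", "click1", "click2",
            "cancel", "pg1", "click1", "click3")
)

# Step 1: sessions that contain at least one submit or cancel event
bad_sessions <- unique(df$session[df$event %in% c("submit", "cancel")])

# Step 2: keep the rows whose session is not in that set
# (%in% binds tighter than !, so this reads as !(session %in% bad_sessions))
result <- df[!df$session %in% bad_sessions, ]
print(result)
```

Storing bad_sessions separately also makes it easy to inspect which sessions were removed.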