Filter in one group only instead of all groups

Time:09-22

I was wondering if there is a way to filter/exclude rows inside only one group after using the group_by function in dplyr. At the moment the filter function is run on all groups, although I only want to exclude certain values inside one group.

CodePudding user response:

This keeps only rows for which Petal.Length > 1.5 for the setosa Species and keeps all rows for the other Species.

1) cur_group() returns a one-row tibble with a column for each grouping variable giving its current value.

library(dplyr)

iris %>%
  group_by(Species) %>%
  filter(Petal.Length > 1.5 | cur_group()$Species != "setosa") %>%
  ungroup()

2) Within group_modify(), when the function is given as a formula, .x refers to the rows of the current group without the grouping columns, and .y is a one-row tibble holding the group keys, similar to cur_group() in (1).

iris %>%
  group_by(Species) %>%
  group_modify(
    ~ if (.y$Species == "setosa") filter(.x, Petal.Length > 1.5) else .x
  ) %>%
  ungroup()

CodePudding user response:

Maybe in this case you can formulate it without group_by()?

iris |> filter(Species != 'setosa' | Petal.Length > 1.5)

# Perhaps clearer:
iris |> filter(!(Species == 'setosa' & Petal.Length <= 1.5))
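
As a quick sanity check, the sketch below compares the grouped cur_group() version with the two ungrouped filters on the built-in iris data. It assumes dplyr is installed and R is 4.1 or later (for the native pipe); the object names grouped, ungrouped and negated are made up for illustration.

```r
library(dplyr)

# Grouped version: the condition is only enforced inside the setosa group
grouped <- iris %>%
  group_by(Species) %>%
  filter(Petal.Length > 1.5 | cur_group()$Species != "setosa") %>%
  ungroup()

# Ungrouped version: same condition written directly on the Species column
ungrouped <- iris |> filter(Species != "setosa" | Petal.Length > 1.5)

# Negated form: drop the rows that are setosa AND have a short petal
negated <- iris |> filter(!(Species == "setosa" & Petal.Length <= 1.5))

# Compare row counts and the filtered column across the three results
same_n <- nrow(grouped) == nrow(ungrouped) && nrow(ungrouped) == nrow(negated)
same_values <- identical(grouped$Petal.Length, ungrouped$Petal.Length) &&
  identical(ungrouped$Petal.Length, negated$Petal.Length)

print(c(same_n = same_n, same_values = same_values))
```

If both flags are TRUE, the three formulations select the same rows here. The grouped forms are still the ones to reach for when the condition depends on a per-group summary, which a plain filter cannot express.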