Conditional filtering based on grouped data in R using dplyr

Time:03-19

Consider the following dataset:

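library(dplyr)  # provides tibble(), filter() and %>%

data <- tibble(
  group = rep(1:4, 40),
  year = rep(1980:2019, 4),
  col = rnorm(160)  # random values, so results differ between runs
)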

I want to filter the data as follows:

Keep the rows where col is greater than zero for groups 1 and 2, and less than zero for groups 3 and 4.

CodePudding user response:

One way to do this is:

data %>% filter((col > 0 & group %in% c(1, 2)) | (col < 0 & group %in% c(3, 4)))
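
The same per-group condition can probably also be written with dplyr's if_else(), which picks one test or the other for each row. The following is only a sketch: the set.seed() call and the names res_if_else and res_explicit are assumptions added for illustration, and it relies on the data containing only groups 1 to 4.

```r
library(dplyr)

set.seed(1)  # assumption: seed added only so the sketch is reproducible
data <- tibble(
  group = rep(1:4, 40),
  year  = rep(1980:2019, 4),
  col   = rnorm(160)
)

# Rows in groups 1-2 are tested with col > 0, all other rows with col < 0
res_if_else <- data %>%
  filter(if_else(group %in% c(1, 2), col > 0, col < 0))

# The explicit version from the answer above, for comparison
res_explicit <- data %>%
  filter((col > 0 & group %in% c(1, 2)) | (col < 0 & group %in% c(3, 4)))

# Compare the two results
all.equal(res_if_else, res_explicit)
```

Note that if_else() treats every group other than 1 and 2 as "must be negative", so the two forms would diverge if a fifth group were present.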

CodePudding user response:

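Here's another method that selects rows using arithmetic rather than %in%. The expression sign((group < 3) - 0.5) is 1 for groups 1 and 2 and -1 for groups 3 and 4, so the product with col is positive exactly for the rows we want: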

data %>% filter(col * sign((group < 3) - 0.5) > 0)
#> # A tibble: 76 x 3
#>    group  year    col
#>    <int> <int>  <dbl>
#>  1     2  1985  2.20 
#>  2     3  1986 -0.205
#>  3     4  1991 -2.10 
#>  4     3  1994 -0.113
#>  5     2  1997  1.90 
#>  6     1  2000  1.37 
#>  7     3  2002 -0.805
#>  8     4  2003 -0.535
#>  9     1  2004  0.792
#> 10     3  2006 -1.28 
#> # ... with 66 more rows
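
To check the arithmetic on a small case, here is a minimal base R sketch. The vectors multiplier, col and keep are made up for illustration and are not part of the original data.

```r
group <- 1:4

# (group < 3) is TRUE, TRUE, FALSE, FALSE; subtracting 0.5 gives
# 0.5, 0.5, -0.5, -0.5, and sign() maps that to 1 or -1
multiplier <- sign((group < 3) - 0.5)
multiplier
#> [1]  1  1 -1 -1

# One illustrative value per group
col <- c(0.5, -0.5, 0.5, -0.5)

# Kept: positive values in groups 1-2, negative values in groups 3-4
keep <- col * multiplier > 0
keep
#> [1]  TRUE FALSE FALSE  TRUE
```

Rows where col is exactly zero are dropped, which matches the strict > and < comparisons in the %in% version.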