Using dplyr
functions, I want to remove rows in which only column b
equals 1
and the rest of columns are all 0
.
Although I can do this:
library(dplyr, warn.conflicts = FALSE)
trb <-
tribble(~a, ~b, ~c,
1, 1, 1,
1, 1, 0,
1, 0, 1,
0, 1, 0, # <~~~ remove this
0, 0, 0,
0, 1, 0 # <~~~ remove this
)
trb %>%
filter(!(b == 1 & a == 0 & c == 0))
#> # A tibble: 4 x 3
#> a b c
#> <dbl> <dbl> <dbl>
#> 1 1 1 1
#> 2 1 1 0
#> 3 1 0 1
#> 4 0 0 0
I'm looking for a more scalable solution to account for data such as:
trb_2 <-
tibble::tribble(
~a, ~b, ~c, ~d, ~e, ~f, ~g, ~h, ~i, ~j, ~k, ~l, ~m, ~n, ~o, ~p, ~q, ~r, ~s, ~t, ~u, ~v, ~w, ~x, ~y, ~z,
0, 0, 1, 0, 1, 1, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 0, 0, 0,
1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 0, 1, 1, 1,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1,
0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
1, 1, 0, 0, 0, 1, 0, 1, 1, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 1,
1, 0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 1, 1, 0, 0, 0, 0, 0,
0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0,
0, 0, 1, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0,
0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1, 1
)
In trb_2
I still want to remove the rows in which b
equals 1
and all the rest are 0
.
Is there a scalable way to achieve this using dplyr::filter()
?
CodePudding user response:
Yes, using the new helper function dplyr::if_all()
you can do this for no matter how many columns you have:
trb %>%
filter(!(b == 1 & if_all(-b, ~ .x == 0)))
Result:
# A tibble: 4 x 3
a b c
<dbl> <dbl> <dbl>
1 1 1 1
2 1 1 0
3 1 0 1
4 0 0 0
Breakdown of !(b == 1 & if_all(-b, ~ .x == 0))
:
b == 1
will match rows where b is 1if_all(-b, ~ .x == 0)
will match rows where all columns except b are exactly 0!(b == 1 & if_all(-b, ~ .x == 0))
combines these two expressions and removes the rows where both are true
CodePudding user response:
trb %>%
filter(b != 1 | rowSums(. == 1) != 1)
# # A tibble: 4 x 3
# a b c
# <dbl> <dbl> <dbl>
# 1 1 1 1
# 2 1 1 0
# 3 1 0 1
# 4 0 0 0