I have
df = structure(list(`Q4-21` = c(0L, 1L, 0L, 1L, 0L, 1L, 0L, 1L, 0L,
1L, 0L, 1L, 0L, 1L, 0L, 1L), `Q1-22` = c(0L, 0L, 1L, 1L, 0L,
0L, 1L, 1L, 0L, 0L, 1L, 1L, 0L, 0L, 1L, 1L), `Q2-22` = c(0L,
0L, 0L, 0L, 1L, 1L, 1L, 1L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L),
`Q3-22` = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L), Name = c("A", "B", "C", "D", "E", "F", "G",
"H", "I", "J", "K", "L", "M", "N", "O", "P")), row.names = c(NA,
-16L), class = "data.frame")
I want
- to filter where columns 3,4, and 5 are not simultaneously 0
- to refer to each column by position, as column names will change repeatedly
I have tried:
cols_of_interest = colnames(df)[2:4]
df %>%
filter_at(which(colnames(df) %in% cols_of_interest), all_vars(. !=0))
but this filters on each column separately not being 0
I need to filter out rows A
and B
Am aware of df[df[,2] !=0 | df[,3] !=0 | df[,4]!= 0,]
, but would prefer tidyverse method
Any suggestions?
CodePudding user response:
Use if_all
:
library(dplyr)
df %>%
filter(!if_all(2:4, ~ .x == 0))
CodePudding user response:
Use rowSums
.
df[rowSums(df[2:4]) != 0, ]
Or
subset(df, rowSums(df[2:4]) != 0)
# Q4-21 Q1-22 Q2-22 Q3-22 Name
# 3 0 1 0 0 C
# 4 1 1 0 0 D
# 5 0 0 1 0 E
# 6 1 0 1 0 F
# 7 0 1 1 0 G
# 8 1 1 1 0 H
# 9 0 0 0 1 I
# 10 1 0 0 1 J
# 11 0 1 0 1 K
# 12 1 1 0 1 L
# 13 0 0 1 1 M
# 14 1 0 1 1 N
# 15 0 1 1 1 O
# 16 1 1 1 1 P
CodePudding user response:
A possible approach would be to calculate a sum of these 3 columns and then filter the rows whose sum is greater than 0, with the following code:
# in a single line of code
filter(df, rowSums(df[,cols_of_interest]) > 0)
The same, but in several lines and with apply (keeping track of the col' created for filter out) =>
df$sum_of_3_cols = apply(df[,cols_of_interest],
MARGIN = 1, FUN = sum, na.rm = T)
# ↑ compute a sum of these 3 col of interest
df %>% filter(sum_of_3_cols > 0 ) %>% select(-sum_of_3_cols)
# ↑ filter out ↑ remove the column used to filter
CodePudding user response:
You can use the pmax
:
library(dplyr)
df %>%
filter(do.call(pmax, .[2:4])>0)
Q4-21 Q1-22 Q2-22 Q3-22 Name
1 0 1 0 0 C
2 1 1 0 0 D
3 0 0 1 0 E
4 1 0 1 0 F
5 0 1 1 0 G
6 1 1 1 0 H
7 0 0 0 1 I
8 1 0 0 1 J
9 0 1 0 1 K
10 1 1 0 1 L
11 0 0 1 1 M
12 1 0 1 1 N
13 0 1 1 1 O
14 1 1 1 1 P