I want to apply a condition to different columns (specified in a vector) of a data frame.
For example if I have
trial_m <- as.data.frame(cbind(c(0,1,1), c(0, 1, 0), c(0, 1, 1), c(0, 1, 0)))
aux_v <- c("V1","V2","V3")
I would like to know, for each row of trial_m, whether all of the values in the columns listed in aux_v are 1.
I know I can do this:
trial_m[, aux_v[1]] == 1 & trial_m[, aux_v[2]] == 1 & trial_m[, aux_v[3]] == 1
But aux_v can have different lengths, for example aux_v = c("V1"). How can I apply a condition to all columns specified in the aux_v vector without typing out each condition as above? The condition is the same for every column in the vector: the value must equal 1.
CodePudding user response:
We can use if_all
library(dplyr)
trial_m %>%
filter(if_all(all_of(aux_v), ~ .x == 1))
-output
  V1 V2 V3 V4
1  1  1  1  1
CodePudding user response:
Maybe with data.table, in two steps.
library(data.table)
setDT(trial_m)
temp <- trial_m[, lapply(.SD, function(x) x == 1), .SDcols = aux_v]
# a row passes when every selected column is TRUE, so compare with ncol(), not nrow()
temp <- rowSums(temp) == ncol(temp)
temp
[1] FALSE TRUE FALSE
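The same row-wise count can also be sketched in base R, without data.table. This is only a minimal sketch on the sample data from the question; the names hits and all_one are made up for illustration.

```r
# Sample data from the question
trial_m <- as.data.frame(cbind(c(0, 1, 1), c(0, 1, 0), c(0, 1, 1), c(0, 1, 0)))
aux_v <- c("V1", "V2", "V3")

# trial_m[aux_v] stays a data frame even for a single column,
# so `== 1` gives a logical matrix with one column per name in aux_v
hits <- trial_m[aux_v] == 1

# a row qualifies when the number of TRUE values equals the number of selected columns
all_one <- unname(rowSums(hits) == length(aux_v))
all_one
#[1] FALSE TRUE FALSE
```

Comparing against length(aux_v) keeps the check correct when the number of selected columns differs from the number of rows.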
CodePudding user response:
I'm not sure what your expected output would be, but to check the condition in your sample (all values in a row == 1 across a variable number of columns), you could try:
trial_m <- as.data.frame(cbind(c(0,1,1), c(0, 1, 0), c(0, 1, 1), c(0, 1, 0)))
aux_v <- c("V1","V2","V3")
apply(trial_m[aux_v], 1, function(x) all(x == 1))
#[1] FALSE TRUE FALSE
# change to only one column
aux_v <- c("V1")
apply(trial_m[aux_v], 1, function(x) all(x == 1))
#[1] FALSE TRUE TRUE
If you want to store the result as a new column in the data frame, named after the columns that were checked:
# cols 1 and 3
aux_v <- c("V1","V3")
trial_m[, paste0(paste(aux_v, collapse = "_"), "_check")] <- apply(trial_m[aux_v], 1, function(x) all(x == 1))
#   V1 V2 V3 V4 V1_V3_check
# 1  0  0  0  0       FALSE
# 2  1  1  1  1        TRUE
# 3  1  0  1  0        TRUE
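Another option, closer to the hand-written chain of & in the question, might be to build one logical vector per column and fold them together with Reduce. This is a rough sketch on the sample data; checks and all_one are illustrative names only.

```r
# Sample data from the question
trial_m <- as.data.frame(cbind(c(0, 1, 1), c(0, 1, 0), c(0, 1, 1), c(0, 1, 0)))
aux_v <- c("V1", "V2", "V3")

# one logical vector per selected column
checks <- lapply(trial_m[aux_v], function(x) x == 1)

# combine them pairwise with `&`, like cond1 & cond2 & cond3
all_one <- Reduce(`&`, checks)
all_one
#[1] FALSE TRUE FALSE
```

With a single column in aux_v, Reduce simply returns that column's comparison, so the same code should cover the aux_v = c("V1") case.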