I want to select rows in which (a or b) == (c or d) without having to write out all the combinations. For example:
a b c d
1 2 3 4
1 1 2 2
1 2 1 3
2 5 3 2
4 5 5 4
df$equal <- df$a == df$c | df$a == df$d | df$b == df$c | df$b == df$d
would result in:
a b c d equal
1 2 3 4 FALSE
1 1 2 2 FALSE
1 2 1 3 TRUE
2 5 3 2 TRUE
4 5 5 4 TRUE
Is there a way to condense the statement (a or b) == (c or d) so that one does not have to write out all four combinations? I need this for more complicated situations in which there are more combinations, e.g., (a or b) == (c or d) == (e or f) == (g or h).
CodePudding user response:
We could select the columns of interest and compare them with ==:
df$equal <- Reduce(`|`, lapply(df[1:2], \(x) rowSums(df[3:4] == x) > 0))
-output
> df
a b c d equal
1 1 2 3 4 FALSE
2 1 1 2 2 FALSE
3 1 2 1 3 TRUE
4 2 5 3 2 TRUE
5 4 5 5 4 TRUE
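To see what the Reduce one-liner is doing, it can be unrolled into its steps. This is only a sketch: the intermediate names hits_a and hits_b are made up for illustration, and the data are the same as in the question.

```r
df <- data.frame(
  a = c(1L, 1L, 1L, 2L, 4L),
  b = c(2L, 1L, 2L, 5L, 5L),
  c = c(3L, 2L, 1L, 3L, 5L),
  d = c(4L, 2L, 3L, 2L, 4L)
)

# For each left-hand column, flag the rows where it equals c or d.
# df[3:4] == df$a compares every column of df[3:4] with a, row by row.
hits_a <- rowSums(df[3:4] == df$a) > 0
hits_b <- rowSums(df[3:4] == df$b) > 0

# Reduce(`|`, ...) in the one-liner ORs these per-column flags together.
equal <- hits_a | hits_b
```

With more left-hand columns, lapply produces one such flag vector per column and Reduce folds them with `|`, so nothing has to be written out by hand.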
Or using if_any
library(dplyr)
df %>%
  mutate(equal = if_any(a:b, ~ .x == c | .x == d))
a b c d equal
1 1 2 3 4 FALSE
2 1 1 2 2 FALSE
3 1 2 1 3 TRUE
4 2 5 3 2 TRUE
5 4 5 5 4 TRUE
If there are more columns and every other column should be compared against the 'a' and 'b' columns:
df %>%
  mutate(equal = if_any(-c(a, b), ~ .x == a | .x == b))
data
df <- structure(list(a = c(1L, 1L, 1L, 2L, 4L), b = c(2L, 1L, 2L, 5L, 5L),
    c = c(3L, 2L, 1L, 3L, 5L), d = c(4L, 2L, 3L, 2L, 4L)),
    class = "data.frame", row.names = c(NA, -5L))
CodePudding user response:
Or, as a further variation on @akrun's answer:
df <- data.frame(
a = c(1L, 1L, 1L, 2L, 4L),
b = c(2L, 1L, 2L, 5L, 5L),
c = c(3L, 2L, 1L, 3L, 5L),
d = c(4L, 2L, 3L, 2L, 4L)
)
# For each row, check whether any value in columns 1:2 also appears in columns 3:4
cbind(df, equal = sapply(seq_len(nrow(df)), \(i) any(df[i, 1:2] %in% df[i, 3:4])))
resulting in:
a b c d equal
1 1 2 3 4 FALSE
2 1 1 2 2 FALSE
3 1 2 1 3 TRUE
4 2 5 3 2 TRUE
5 4 5 5 4 TRUE
CodePudding user response:
Hopefully this base R option with apply, Reduce, split, and intersect gives a solution for general cases, e.g., columns a, b, c, d, e, and f:
df$equal <- apply(
  df,
  1,
  function(v) {
    length(
      Reduce(
        intersect,
        split(v, gl(length(v) / 2, 2, length(v)))
      )
    ) > 0
  }
)
such that
> df
a b c d equal
1 1 2 3 4 FALSE
2 1 1 2 2 FALSE
3 1 2 1 3 TRUE
4 2 5 3 2 TRUE
5 4 5 5 4 TRUE
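As a rough illustration of the general case, here is the same idea applied to three pairs. The data frame df6 and the helper shares_value are hypothetical, made up for this sketch, and it assumes that the chained condition (a or b) == (c or d) == (e or f) means "some value occurs in every pair", which is what Reduce(intersect, ...) checks.

```r
# Hypothetical six-column data: the pairs are (a, b), (c, d), (e, f)
df6 <- data.frame(
  a = c(1L, 1L, 4L),
  b = c(2L, 1L, 5L),
  c = c(2L, 2L, 5L),
  d = c(3L, 2L, 4L),
  e = c(2L, 1L, 6L),
  f = c(9L, 1L, 4L)
)

# TRUE when at least one value is present in every consecutive group
# of `size` columns within a row (size = 2 gives pairs).
shares_value <- function(v, size = 2) {
  groups <- split(v, gl(length(v) / size, size))
  length(Reduce(intersect, groups)) > 0
}

equal <- apply(df6, 1, shares_value)
```

Here the first row shares the value 2 across all three pairs and the third row shares 4, while the second row has nothing in common between (1, 1) and (2, 2). Note that apply should only be given the columns to compare, so an existing equal column would need to be dropped first.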