Is there a way in R to write out (a or b) == (c or d)?

Time:09-12

I want to select rows in which (a or b) == (c or d) without having to write out all the combinations. For example:

a  b  c  d
1  2  3  4
1  1  2  2
1  2  1  3
2  5  3  2
4  5  5  4

df$equal <- df$a == df$c | df$a == df$d | df$b == df$c | df$b == df$d

would result in:

a  b  c  d equal
1  2  3  4 FALSE
1  1  2  2 FALSE
1  2  1  3 TRUE
2  5  3  2 TRUE
4  5  5  4 TRUE

Is there a way to condense the statement (a or b) == (c or d) so that one does not have to write out all four combinations? I need this for more complicated situations in which there are more combinations, e.g., (a or b) == (c or d) == (e or f) == (g or h).

CodePudding user response:

We could select the columns of interest, compare each of the first two columns against the last two with ==, and combine the per-column results with |

df$equal <- Reduce(`|`, lapply(df[1:2], \(x) rowSums(df[3:4] == x) > 0))

-output

> df
  a b c d equal
1 1 2 3 4 FALSE
2 1 1 2 2 FALSE
3 1 2 1 3  TRUE
4 2 5 3 2  TRUE
5 4 5 5 4  TRUE
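
To make the one-liner easier to follow, here is a minimal sketch that spells out its two steps on the same data. The intermediate names (a_hits, b_hits, equal_steps, equal_reduce) are only illustrative and are not part of the original answer.

```r
df <- data.frame(
  a = c(1L, 1L, 1L, 2L, 4L),
  b = c(2L, 1L, 2L, 5L, 5L),
  c = c(3L, 2L, 1L, 3L, 5L),
  d = c(4L, 2L, 3L, 2L, 4L)
)

# Step 1: for each left-hand column, check row by row whether it equals c or d.
# Comparing a data frame with a vector of nrow(df) values compares every
# column against that vector and returns a logical matrix.
a_hits <- rowSums(df[c("c", "d")] == df$a) > 0
b_hits <- rowSums(df[c("c", "d")] == df$b) > 0

# Step 2: a row qualifies if either left-hand column produced a match.
equal_steps <- a_hits | b_hits

# The same computation written with lapply() and Reduce().
equal_reduce <- Reduce(`|`, lapply(df[1:2], function(x) rowSums(df[3:4] == x) > 0))

print(unname(equal_steps))
```

Both versions should agree with the hand-written df$a == df$c | df$a == df$d | df$b == df$c | df$b == df$d from the question.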

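Or using if_any from dplyr, which is TRUE for a row when the condition holds for at least one of the selected columns

library(dplyr)
df %>%
  mutate(equal = if_any(a:b, ~ .x == c | .x == d))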
  a b c d equal
1 1 2 3 4 FALSE
2 1 1 2 2 FALSE
3 1 2 1 3  TRUE
4 2 5 3 2  TRUE
5 4 5 5 4  TRUE

If there are more columns and each of them should be compared against the 'a' and 'b' columns

df %>%
    mutate(equal = if_any(-c(a, b), ~ .x == a | .x == b))
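
For readers who prefer to stay in base R, the same "everything else against a and b" comparison could be sketched as a small helper. The function name any_match, the data frame df2, and the extra column e are hypothetical and only serve the illustration.

```r
# Hypothetical helper: TRUE for rows where any column in `left`
# equals any column in `right`.
any_match <- function(data, left, right) {
  hits <- lapply(data[left], function(x) rowSums(data[right] == x) > 0)
  unname(Reduce(`|`, hits))
}

# The question's data plus a made-up fifth column e.
df2 <- data.frame(
  a = c(1L, 1L, 1L, 2L, 4L),
  b = c(2L, 1L, 2L, 5L, 5L),
  c = c(3L, 2L, 1L, 3L, 5L),
  d = c(4L, 2L, 3L, 2L, 4L),
  e = c(9L, 1L, 9L, 9L, 9L)
)

left  <- c("a", "b")
right <- setdiff(names(df2), left)  # every column except a and b

res <- any_match(df2, left, right)
print(res)
```

Passing the column names explicitly keeps the helper independent of column order. Note that this sketch assumes there are no missing values; with NA in the data, rowSums() would return NA for the affected rows unless na.rm = TRUE is added.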

data

df <- structure(list(a = c(1L, 1L, 1L, 2L, 4L), b = c(2L, 1L, 2L, 5L, 
5L), c = c(3L, 2L, 1L, 3L, 5L), d = c(4L, 2L, 3L, 2L, 4L)),
 class = "data.frame", row.names = c(NA, 
-5L))

CodePudding user response:

Or, as a further variation on @akrun's answer, checking row by row whether any value in the first two columns also appears in the last two:

df <- data.frame(
  a = c(1L, 1L, 1L, 2L, 4L), 
  b = c(2L, 1L, 2L, 5L, 5L), 
  c = c(3L, 2L, 1L, 3L, 5L), 
  d = c(4L, 2L, 3L, 2L, 4L)
)

cbind(df, equal = sapply(seq_len(nrow(df)), \(i) any(unlist(df[i, 1:2]) %in% unlist(df[i, 3:4]))))

resulting in:

  a b c d equal
1 1 2 3 4 FALSE
2 1 1 2 2 FALSE
3 1 2 1 3  TRUE
4 2 5 3 2  TRUE
5 4 5 5 4  TRUE

CodePudding user response:

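This base R option, combining apply, Reduce, split, and intersect, could work for general cases, e.g., columns a, b, c, d, e, and f. It splits each row into consecutive pairs of values and checks whether at least one value is common to all pairs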

df$equal <- apply(
  df,
  1,
  function(v) {
    length(
      Reduce(
        intersect,
        split(v, gl(length(v) / 2, 2, length(v)))
      )
    ) > 0
  }
)

such that

> df
  a b c d equal
1 1 2 3 4 FALSE
2 1 1 2 2 FALSE
3 1 2 1 3  TRUE
4 2 5 3 2  TRUE
5 4 5 5 4  TRUE
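
As a sketch of the general case, the same idea can be tried on six columns, i.e. (a or b) == (c or d) == (e or f). The data frame df6, the helper name shares_value, and the vector value_cols are made up for this illustration.

```r
# Hypothetical six-column data: the pairs are (a, b), (c, d), (e, f).
df6 <- data.frame(
  a = c(1L, 1L, 7L),
  b = c(2L, 1L, 8L),
  c = c(2L, 3L, 8L),
  d = c(9L, 4L, 7L),
  e = c(2L, 1L, 5L),
  f = c(5L, 1L, 7L)
)

# TRUE when at least one value occurs in every consecutive pair of a row.
shares_value <- function(v) {
  pairs <- split(v, gl(length(v) / 2, 2, length(v)))
  length(Reduce(intersect, pairs)) > 0
}

# Restrict apply() to the value columns so that re-running the line after
# `equal` has been added does not feed that column back into the check.
value_cols <- c("a", "b", "c", "d", "e", "f")
df6$equal <- apply(df6[value_cols], 1, shares_value)

print(df6)
```

Under this reading, a row counts as a match only if a single value is shared by all pairs; for exactly two pairs this coincides with the four-way comparison in the question.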