How to use a condition on a whole data.frame with a vector as comparison?

Time:05-12

This might be quite simple but it's been bugging me for a while. I lack the terminology for this type of indexing/conditional check - so it was hard for me to look it up.

I have a data frame dat:

X1 X2 X3
1   1  2
1   1  3
3   1  2

dat == 1 gives me:

 X1    X2    X3
TRUE  TRUE FALSE
TRUE  TRUE FALSE
FALSE TRUE FALSE

dat == 1 | dat == 3 gives me:

 X1   X2   X3
TRUE TRUE FALSE
TRUE TRUE TRUE
TRUE TRUE FALSE

However, dat %in% c(1,3) returns a single one-dimensional vector rather than a matrix, so it doesn't work "as intended". How can I check every cell against a vector without some fancy loop?

Thank you in advance.

CodePudding user response:

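apply(dat, 2, function(x) x %in% c(1,3))

       X1   X2    X3
[1,] TRUE TRUE FALSE
[2,] TRUE TRUE  TRUE
[3,] TRUE TRUE FALSE

Note that MARGIN = 2 runs the check column by column. With MARGIN = 1 (row by row), apply returns the result transposed.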

CodePudding user response:

Here's an answer with a full reprex and some further explanation.

Create the data.

dat <- data.frame(
  x1 = c(1, 1, 3),
  x2 = c(1, 1, 1),
  x3 = c(2, 3, 2)
)

As with other answers, using the apply family works.

sapply(dat, function(x) x %in% c(1, 3))
#       x1   x2    x3
# [1,] TRUE TRUE FALSE
# [2,] TRUE TRUE  TRUE
# [3,] TRUE TRUE FALSE
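
If you would rather skip the apply family, here is one possible sketch that goes through a matrix instead. It assumes every column is numeric (otherwise as.matrix() coerces everything to character and the comparison may no longer behave the same way). The names m and mask are just placeholders for this example.

```r
# Same data as in the reprex above.
dat <- data.frame(
  x1 = c(1, 1, 3),
  x2 = c(1, 1, 1),
  x3 = c(2, 3, 2)
)

# Assumption: all columns are numeric, so this is a numeric matrix.
m <- as.matrix(dat)

# %in% returns a plain logical vector (in column-major order) and drops
# the dim attribute, so restore the shape and names from the matrix.
mask <- array(m %in% c(1, 3), dim = dim(m), dimnames = dimnames(m))
mask
```

The result has the same shape as dat == 1 | dat == 3, so it can be used anywhere that logical matrix would be.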

The reason %in% does not work as you might expect is that a data frame is a list of columns, and the documentation for %in% states:

"Factors, raw vectors and lists are converted to character vectors,
and then x and table are coerced to a common type"

We can check this with our reprex.

as.character(dat)
# [1] "c(1, 1, 3)" "c(1, 1, 1)" "c(2, 3, 2)"

as.character(dat) %in% c("c(1, 1, 3)")
#[1]  TRUE FALSE FALSE

This gives the same result as:

dat %in% c("c(1, 1, 3)")
#[1]  TRUE FALSE FALSE
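
To tie this back to the original question, the following small check (a sketch using the same reprex data, with res as a placeholder name) shows that %in% treats the data frame as a list with one element per column, which is why you get one value per column rather than one per cell.

```r
dat <- data.frame(
  x1 = c(1, 1, 3),
  x2 = c(1, 1, 1),
  x3 = c(2, 3, 2)
)

is.list(dat)   # a data frame is a list of columns
length(dat)    # one list element per column

# Each column is matched as a whole, so the result has one entry
# per column, and no whole column equals 1 or 3.
res <- dat %in% c(1, 3)
length(res)
any(res)
```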