Strange result in vector comparison

I am a beginner with just two weeks of experience.

My current setup is R 4.1.2 on Windows 11.

When I run the code below:

y <- c(1, 2, 3, 4)
z <- c(1, 4)
y == z

The output is TRUE FALSE FALSE TRUE, which is what I expected.

However, when I change the order of the elements in y to

y <- c(2, 1, 3, 4) 
z <- c(1, 4) 
y == z

Then, surprisingly, the output is FALSE FALSE FALSE TRUE, whereas I supposed the result would be TRUE FALSE FALSE TRUE.

What am I doing wrong? Can anyone explain how vector comparison works and why I get the result above?

CodePudding user response:

In your example the shorter vector z is being 'recycled' to the length of y before the element-by-element comparison, and this is leading to your "strange" result, i.e.

y <- c(1, 2, 3, 4)
z <- c(1, 4)
y == z
#> [1]  TRUE FALSE FALSE  TRUE

y <- c(2, 1, 3, 4) 
z <- c(1, 4) 
y == z
#> [1] FALSE FALSE FALSE  TRUE

When the smaller vector is 'manually' recycled you get the same result:

y <- c(1, 2, 3, 4)
z <- c(1, 4, 1, 4)
y == z
#> [1]  TRUE FALSE FALSE  TRUE

y <- c(2, 1, 3, 4) 
z <- c(1, 4, 1, 4) 
y == z
#> [1] FALSE FALSE FALSE  TRUE
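
The same recycling can be written out with rep(), which may make it easier to see what R does behind the scenes. This is only a small sketch: the names z_recycled and w are made up for illustration, and the last part assumes you also want to see what happens when the longer length is not a multiple of the shorter one.

```r
y <- c(2, 1, 3, 4)
z <- c(1, 4)

# Recycle z explicitly until it is as long as y.
z_recycled <- rep(z, length.out = length(y))
z_recycled
#> [1] 1 4 1 4

# The implicit and the explicit comparison give the same answer.
identical(y == z, y == z_recycled)
#> [1] TRUE

# When length(y) is not a multiple of the shorter length, R still recycles
# (here w is treated as 1 4 3 1) but raises a warning, silenced here.
w <- c(1, 4, 3)
partial <- suppressWarnings(y == w)
partial
#> [1] FALSE FALSE  TRUE FALSE
```

So == always compares position by position; it never checks whether a value of y occurs somewhere in z.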

I suspect you are looking for the %in% value matching operator (i.e. for each number in the vector "y", does it exist in "z"?):

y <- c(2, 1, 3, 4) 
z <- c(1, 4) 
y %in% z
#> [1] FALSE  TRUE FALSE  TRUE
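
If it helps to see why %in% behaves differently from ==, here is a rough sketch built on base R's match(). The variable name in_z is made up for illustration; the idea is that %in% asks "was a match found anywhere in z?" rather than comparing position by position.

```r
y <- c(2, 1, 3, 4)
z <- c(1, 4)

# match() gives the position in z of each element of y, or NA if it is absent.
match(y, z)
#> [1] NA  1 NA  2

# %in% amounts to asking whether match() found anything.
in_z <- match(y, z, nomatch = 0) > 0
identical(in_z, y %in% z)
#> [1] TRUE

# Use the logical result to keep only the elements of y that occur in z.
y[y %in% z]
#> [1] 1 4
```

Because the lookup is by value, the order of the elements in y or z does not change which elements are flagged.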

Does that solve your issue?

CodePudding user response:

z gets recycled to match the length of y.

This has implications for when you use the tidyverse, for example:

tibble::tibble(a = c(1,2,3,4), b = c(1,2))

This throws an error, as tibble() will only recycle vectors of length one.
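
For contrast, here is a small sketch comparing base R's data.frame() with tibble(). The names df, res, tibble_failed and ok are made up for illustration, and the tibble part is guarded so that the snippet still runs if the tibble package is not installed.

```r
# Base R data.frame() recycles a shorter column a whole number of times.
df <- data.frame(a = c(1, 2, 3, 4), b = c(1, 2))
df$b
#> [1] 1 2 1 2

if (requireNamespace("tibble", quietly = TRUE)) {
  # A length-two column next to a length-four column is rejected;
  # the error is caught and returned instead of stopping the script.
  res <- tryCatch(
    tibble::tibble(a = c(1, 2, 3, 4), b = c(1, 2)),
    error = function(e) e
  )
  tibble_failed <- inherits(res, "error")

  # A length-one column is recycled to every row.
  ok <- tibble::tibble(a = c(1, 2, 3, 4), b = 1)
}
```

The stricter rule is meant to catch mistakes like the one in the question, where silent recycling hides a length mismatch.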

In the rare case that you don't want recycling to happen, see Moody_Mudskippers' answer.