I am a beginner with just 2 weeks of experience.
My current system is R 4.1.2 on Windows 11.
When I run the code below,
y <- c(1, 2, 3, 4)
z <- c(1, 4)
y == z
The output is TRUE FALSE FALSE TRUE, which is what I expected.
However, when I changed the order of the elements in y to
y <- c(2, 1, 3, 4)
z <- c(1, 4)
y == z
Then, surprisingly, the output is FALSE FALSE FALSE TRUE, while I expected the result to be TRUE FALSE FALSE TRUE.
What am I doing wrong? Can anyone explain how vector comparison works and why I get the above result?
CodePudding user response:
In your example the shorter vector is being 'recycled' (repeated until it matches the length of the longer one, then compared element by element), and this leads to your "strange" result, i.e.
y <- c(1, 2, 3, 4)
z <- c(1, 4)
y == z
#> [1] TRUE FALSE FALSE TRUE
y <- c(2, 1, 3, 4)
z <- c(1, 4)
y == z
#> [1] FALSE FALSE FALSE TRUE
When the shorter vector is 'manually' recycled you get the same result:
y <- c(1, 2, 3, 4)
z <- c(1, 4, 1, 4)
y == z
#> [1] TRUE FALSE FALSE TRUE
y <- c(2, 1, 3, 4)
z <- c(1, 4, 1, 4)
y == z
#> [1] FALSE FALSE FALSE TRUE
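The recycling step can also be written out with rep(), which may make it easier to see what == is comparing. This is only a sketch of the idea; the names z_recycled, recycled_result and warned are made up for illustration:

```r
y <- c(2, 1, 3, 4)
z <- c(1, 4)

# Repeat z until it is as long as y (this mimics what recycling does)
z_recycled <- rep(z, length.out = length(y))
z_recycled
#> [1] 1 4 1 4

# The comparison is then done position by position
recycled_result <- y == z_recycled
identical(recycled_result, y == z)
#> [1] TRUE

# If the longer length is not a multiple of the shorter one,
# R still recycles but raises a warning; here we just record that it did
warned <- tryCatch({
  c(1, 2, 3) == c(1, 2)
  FALSE
}, warning = function(w) TRUE)
warned
#> [1] TRUE
```

So y == z asks "is y[i] equal to the recycled z[i]?" at each position, not "does y[i] appear anywhere in z?".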
I suspect you are looking for the %in% value matching operator (i.e. for each number in the vector "y", does it exist in "z"?):
y <- c(2, 1, 3, 4)
z <- c(1, 4)
y %in% z
#> [1] FALSE TRUE FALSE TRUE
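If it helps, %in% is built on match(), which reports where in "z" each element of "y" was found (NA when it was not found). A small sketch, with in_z and pos as made-up names:

```r
y <- c(2, 1, 3, 4)
z <- c(1, 4)

# Position in z of each element of y, NA when there is no match
pos <- match(y, z)
pos
#> [1] NA  1 NA  2

# %in% only reports whether a match exists, regardless of position
in_z <- y %in% z
identical(in_z, !is.na(pos))
#> [1] TRUE

# The logical vector can be used to pull out the matching values
y[in_z]
#> [1] 1 4
```

Because the result does not depend on where a value sits in "z", reordering "y" only reorders the output.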
Does that solve your issue?
CodePudding user response:
This has implications for when you use the tidyverse, for example:
tibble::tibble(a = c(1,2,3,4), b = c(1,2))
This throws an error, as tibbles will only recycle vectors of length one.
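For contrast, here is a rough sketch of how base data.frame() treats the same input. The tibble part only runs if the tibble package happens to be installed, and df, tb and tibble_failed are made-up names:

```r
# Base R recycles b to fill the four rows
df <- data.frame(a = c(1, 2, 3, 4), b = c(1, 2))
df$b
#> [1] 1 2 1 2

# Assumes tibble is installed; the block is skipped otherwise
if (requireNamespace("tibble", quietly = TRUE)) {
  # Record whether the length-two column is rejected
  tibble_failed <- tryCatch({
    tibble::tibble(a = c(1, 2, 3, 4), b = c(1, 2))
    FALSE
  }, error = function(e) TRUE)
  print(tibble_failed)

  # A length-one vector is still recycled to every row
  tb <- tibble::tibble(a = c(1, 2, 3, 4), b = 1)
  print(tb$b)
}
```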
In the rare case that you don't want recycling to happen, see Moody_Mudskippers' answer.