Strange behaviour of == with formula

Time:12-17

I am a bit puzzled by the following. I have two formulas and would like to check whether they are the same. Here I expect to get FALSE returned.

fm1 <- formula(schades ~ termijn + zipcode + provincie + regionvormgemeente + energielabel + trede)
fm2 <- formula(schades ~ termijn + zipcode + provincie + regionvormgemeente + energielabel)
fm1 == fm2
#> [1] TRUE

identical(fm1, fm2)
#> [1] FALSE

What is the reason that fm1 == fm2 returns TRUE?

Created on 2021-12-17 by the reprex package (v2.0.1)

Answer:

== is designed to compare values in atomic vectors, not formulas.

Even for atomic vectors it can be surprising; see the following example from ?"==":

x1 <- 0.5 - 0.3
x2 <- 0.3 - 0.1
x1 == x2                   # FALSE on most machines
isTRUE(all.equal(x1, x2))  # TRUE everywhere

Applied to your example, you get:

> fm1 <- formula(schades ~ termijn + zipcode + provincie + regionvormgemeente + energielabel + trede)
> fm2 <- formula(schades ~ termijn + zipcode + provincie + regionvormgemeente + energielabel)
> fm1 == fm2
[1] TRUE
> 
> all.equal(fm1, fm2)
[1] "formulas differ in contents"
> isTRUE(all.equal(fm1,fm2))
[1] FALSE

With fewer predictors, however, == does return the expected result. This illustrates that == should not be used for this type of comparison, because its behaviour on formulas is inconsistent:

> fm1 <- formula(schades ~ termijn + zipcode + provincie)
> fm2 <- formula(schades ~ termijn + zipcode)
> fm1 == fm2
[1] FALSE
> isTRUE(all.equal(fm1,fm2))
[1] FALSE
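
A possible explanation, offered here as an assumption rather than something established in this thread: ?Comparison documents that language objects such as calls are deparsed to character strings before comparison, and the results above would be consistent with only the first deparsed line being compared in the R version used. The two long formulas only differ at the very end, while the two short ones differ within a single line. The sketch below compares the complete deparsed text instead; same_formula_text is a hypothetical helper name, not a base R function.

```r
# Same formulas as in the question.
fm1 <- formula(schades ~ termijn + zipcode + provincie + regionvormgemeente + energielabel + trede)
fm2 <- formula(schades ~ termijn + zipcode + provincie + regionvormgemeente + energielabel)

# deparse() splits long expressions into several strings (its width.cutoff
# argument defaults to 60), so a long formula can span more than one line.
deparse(fm1)

# Hypothetical helper (not part of base R): compare the complete deparsed
# text, so that terms beyond the first deparsed line are not ignored.
same_formula_text <- function(a, b) {
  identical(paste(deparse(a), collapse = " "),
            paste(deparse(b), collapse = " "))
}

same_formula_text(fm1, fm2)  # FALSE: fm1 has the extra term trede
same_formula_text(fm1, fm1)  # TRUE
identical(fm1, fm2)          # FALSE, as in the question
```

In practice identical() or isTRUE(all.equal()) is usually enough. Note that identical() also compares attributes, including the environment attached to a formula, so the text comparison is mainly useful when only the terms matter.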