Can someone please explain Intellij's default equals implementation?-CodePudding

I get this suggestion from IntelliJ IDEA when using @Data annotation from lombok.

The class in question is an @Entity. Can someone explain:

what does it do exactly (especially the part with Hibernate)
Is this method preferred over comparing every field one-by-one? If yes, why?

    @Override
    public boolean equals(Object o) {
        if (this == o)
            return true;
        if (o == null || Hibernate.getClass(this) != Hibernate.getClass(o))
            return false;
        MyObject that = (MyObject ) o;
        return id != null && Objects.equals(id, that.id);
    }

The project contains/uses Spring boot, Hibernate, Lombok.

Thank you

CodePudding user response：

There's a fundamental problem at work, one inherent to JPA/Hibernate. For this example, let's say we have a db table named User, and we have a class also named User that models it.

The problem boils down to simply this:

What does the java class User represent? Does it represent 'a row in the database table "User"', or does it represent a User?

Depending on your answer, you get a wildly different requirement for the equals method. Depending on which equals method you chose, answering this question incorrectly leads to code bugs. As far as I know, there is no actual 'standard', people just sort of do something and most aren't aware that this is a fundamental problem.

It represents a row in the DB

Such an interpretation would then suggest the following implementation of your equals method:

If all fields that model the primary key columns in the DB definition are equal between the two instances, then they are equal, even if the other (non-primary-key) fields are different. After all, that's how the DB determines equality, so java code should match it.
The java code should be like SQL when dealing with NULLs. That is to say, quite unlike just about every equality definition, equals method code generator (including lombok, intellij, and eclipse), and even the Objects.equals method, in this mode, null == null should be FALSE, as it is in SQL! Specifically, if any of the primary key fields have a null value, that object cannot be equal to any other, even a carbon copy of itself; to stick to java rules, it can (must, really) be equal to its own reference.

In other words:

Any 2 objects are equal if either [A] they are literally the same object (this == other), or [B] BOTH object's unid field is initialized and equal. Whether you use null or 0 to track 'not written to DB yet', that value instantly disqualifies that row from being equal to any other, even another one with 100% identical values.

After all, if you make 2 separate new objects and save() them both, they would turn into 2 separate rows.

It represents a user object

Then what happens is that the equals rules do a 180. The primary key, assuming its an unid style primary key and not a natural primary key, are inherently an implementation detail. Imagine that somehow in your DB you end up with 2 rows for the exact same user (presumably somebody messed up and failed to add a UNIQUE constraint on username, perhaps). In the semantic model of users on the system, users are uniquely identified by their username, therefore, equality is defined by username alone. 2 objects with identical username but different unid values are nevertheless equal.

So which one do I take?

I have no idea. Fortunately, your question asked for explanation and not an answer!

What IntelliJ is telling you is to go with the first interpretation (row in the DB), and even applies the wonky null stuff correctly, so whomever wrote the suggestion tool in intellij at least seems to understand what's going on.

For what its worth, I think 'represents a row in the DB' is the more 'useful' interpretation (because not doing this involves invoking getters which make equality checks incredibly pricey, as it may result in hundreds of SELECT calls and a gigantic bunch of heap mem as you pull half the DB in!), however, the 'an instance of class User represents a user in the system' is the more java-like interpretation and the one that most java programmers would (erroneously then, if you use intellij's suggestion here) silently presume.

I've solved this problem in my own programming endeavours by never using hibernate/JPA in the first place, and using tools like JOOQ or JDBI instead. But, the downside is that generally you end up with more code – you really do sometimes have an object, e.g. called UserRow, representing a user row, and an object e.g. called User that represents a user on-system.

Another trick could be to decide to name all your Hibernate model classes as XRow. Names are important and the best documentation around: This makes no bones about it and clues in all users of this code about how they are to interpret its semantic meaning: Row in DB. Thus, the intellij suggestion would then be your equals implementation.

NB: Lombok is java and not Hibernate specific, so it makes the 'represents a user in the system' choice. You can try to push lombok towards the 'row in DB' interpretation by telling lombok to only use the id field (stick an @EqualsAndHashCode.Include on that field), but lombok would still consider 2 null values / 2 0 values identical even though it shouldn't. This is on hibernate, as it is breaking all sorts of rules and specs.

(NB: Added due to a comment on another answer)

Why is `.getClass()` being invoked?

Java has sensible rules about what equals is supposed to mean. This is in the javadoc of the equals method and these rules can be relied upon (and are, by e.g. HashSet and co). The rules are:

If aequals(b) is true , a.hashCode() == b.hashCode() must also be true.
a.equals(a) must be true.
If a.equals(b) then b.equals(a) must also be true.
If a.equals(b) and b.equals(c) then a.equals(c) must also be true.

Sensible and simple, right?

Nope. That's actually really complex.

Imagine you make a subclass of ArrayList: You decide to give lists a colour. You can have a blue list of strings and a red list of strings.

Right now the equality method of ArrayList checks if the that is a list and if so, compares elements. Seems sensible, right? We can see it in action:

List<String> a = new ArrayList<String>();
a.add("Hello");
List<String> b = new LinkedList<String>();
b.add("Hello");
System.out.println(a.equals(b));

This prints true.

Let's now make our coloured arraylist implementation: class ColoredList<T> extends ArrayList<T> { .. }. Surely, a red empty list is no longer equal to a blue empty list right?

Nope, you'd be breaking rules if you do that!

List<String> a = new ArrayList<String>();
List<String> b = new ColoredList<String>(Color.RED);
List<String> c = new ColoredList<String>(Color.BLUE);
System.out.println(a.equals(b));
System.out.println(a.equals(c));
System.out.println(b.equals(c));

That prints true/true/false which is invalid. The conclusion is that it is in fact impossible to make any list subclass that adds some semantically relevant information. The only subclasses that can exist are those which either actively break spec (bad idea), or whose additions have no impact on equality.

There is a different view of things which says that you ought to be able to make such classes. Again we're struggling, just like with the JPA/Hibernate case, about what equals is even supposed to mean.

A more common and far better default behaviour for your equals implementations is to simply state that any 2 objects can only be equal if they are of the exact same type: An instance of Dog cannot be equal to an instance of Animal.

The only way to accomplish this, given that the rule a.equals(b)? Then b.equals(a) exists, is that animal checks the class of that and returns false if it isn't exactly Animal. In other words:

Animal a = new Animal("Betsy");
Cow c = new Cow("Betsy");
a.equals(c); // must return false!!

The .getClass() check accomplishes this.

Lombok gives you the best of both worlds. It can't perform miracles, so it won't take away the rule that at the type level you need to choose extensibility, but lombok has the canEqual system to deal with this: The equals code of Animal will ask the that code if the two things can be equal. In this mode, if you have some non-semantically-different subclass of animal (such as ArrayList, which is a subclass of AbstractList and doesn't change the semantics at all, it just adds implementation details that have no bearing on equality), it can say that it can be equal, whereas if you have one that is semantically different, such as your coloured list, it can say that none are.

In other words, going back to the coloured lists, IF ArrayList and co were written with lombok's canEqual system, this could have worked out, you could have had the results (where a is an arraylist, b is a red list, and c is a blur list):

a.equals(b); // false, even though same items
a.equals(c); // false, same reason.
b.equals(c); // false and now it's not a conflict.

Lombok's default behaviour is that all subtypes add semantic load and therefore any X cannot be equal to any Y where Y is a subclass of X, but you can override this by writing out the canEqual method in Y. You would do that if you write a subclass that doesn't add semantic load.

This isn't going to help you in the slightest with the problems above about hibernate.

Who knew something as seemingly simple as equality is hiding 2 intractably difficult philosophical treatises, huh?

For more info on canEqual,

The IntelliJ default uses a file 'equalsHelper.vm'. I found a possible source of that file version at https://github.com/JetBrains/intellij-community/blob/master/java/java-impl/src/com/intellij/codeInsight/generation/equalsHelper.vm

It contains this:

#macro(addInstanceOfToText)
  #if ($checkParameterWithInstanceof)
  if(!($paramName instanceof $classname)) return false;
  #else
  if($paramName == null || getClass() != ${paramName}.getClass()) return false;
  #end
#end

So apparently you have a different version of that file? Or you use a different template? Maybe some plugin changed it?

CodePudding user response：

Two objects are not equal if they are of different class.

For 'preferred', it depends on what an 'id' is. The last line seems a little redundant; it could have been

return Objects.equals(id, that.id);

since the null case is handled by Objects.equals. But to my taste, it's clearer to write

return id != null && id.equals(that.id);

The extra layer adds nothing that I can see in the example.

It represents a row in the DB

It represents a user object

So which one do I take?

Why is .getClass() being invoked?

Why is `.getClass()` being invoked?