equals and hashCode with many fields in Java?

Time:01-11

In Java apps, I prefer to use unique fields in equals and hashCode methods instead of adding only id field or all the fields. However, I am confused about the following points:

1. By considering object states in Hibernate, I think it is good practice not using id field in equals and hashCode methods, right?

2. When there is a unique field in a class, is it enough to use only one of the unique fields in equals and hashCode methods (except from id field)?

3. Should I add all the fields except from id field when there is not any unique field except from id field in a class? Or should I only add some numeric field instead of adding text fields?

CodePudding user response:

JPA and Hibernate don't specify or rely on any particular semantics for entities' equals() and hashCode() methods, so you can do what you want.

Good alternatives

With that said, there is a handful of alternatives for equality that make much more sense to me than any others:

  1. Equality corresponds to object identity. This is of course the default provided by Object.equals(), and it can serve perfectly well for entities. OR

  2. Equality corresponds to persistent identity. That is, entities are equal if and only if they have the same entity type and primary key. OR

  3. Equality corresponds to (only) value equality. That is, equality of all corresponding persistent fields except the ID. There are additional variations around how that applies to mapped relationships. OR

  4. Equality corresponds to persistent identity AND value equality. Again, there are variations around how the value equality part applies to mapped relationships.

General advice

In general, you would do well to follow a few rules of thumb:

  1. As with most other classes, especially mutable ones, default to just inheriting Object.equals() and Object.hashCode(). Have a specific purpose and plan before you do otherwise, and remember that you get only one choice for this, and that the choice has far-reaching effects.

  2. If you do override equals() (and therefore hashCode() as well) then do it in a consistent way across all your entities.

  3. Think carefully before you go with an option involving value equality. This is usually a poor choice for mutable classes in general, and entities are no exception.
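To illustrate the third rule, here is a minimal sketch of what can go wrong when a mutable object with value-based equality sits in a hash-based collection. The Tag class and its name field are hypothetical, made up purely for this illustration:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical mutable class with value-based equality (not from the question).
class Tag {
    String name;

    Tag(String name) { this.name = name; }

    @Override public boolean equals(Object other) {
        if (other == this) return true;
        if (!(other instanceof Tag)) return false;
        return Objects.equals(name, ((Tag) other).name);
    }

    @Override public int hashCode() {
        return Objects.hashCode(name);
    }
}

class MutableValueEqualityDemo {
    // Reports whether the set still finds the element after one of its fields changed.
    static boolean foundAfterMutation() {
        Set<Tag> tags = new HashSet<>();
        Tag tag = new Tag("draft");
        tags.add(tag);
        tag.name = "final";   // the hash code changes while the object is in the set
        return tags.contains(tag);
    }

    public static void main(String[] args) {
        System.out.println(foundAfterMutation());
    }
}
```

The lookup fails here because the set recorded the hash computed from "draft", while the lookup uses the hash computed from "final". An entity whose fields are edited and saved again would be exposed to the same problem.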

Specific Questions

1. By considering object states in Hibernate, I think it is good practice not using id field in equals and hashCode methods, right?

I think using the ID is fine. It's simply a question of what you want equality to represent for your entities. You absolutely can have distinct entity objects with the same type and ID, and you might want to be able to detect that with equals(). The other persistent fields might or might not factor into that.

In particular, an equals() method based solely on entity ID might make sense for entities that appear on the "many" side of a one-to-many relationship when that is mapped to a Set.

2. When there is a unique field in a class, is it enough to use only one of the unique fields in equals and hashCode methods (except from id field)?

I see no good reason to consider only a proper subset of the unique fields, with two exceptions: the subset consisting only of the entity ID or, if all the fields are unique, the subset consisting of all the fields except the ID. The logic that suggests that you might be able to consider other proper subsets revolves around the persistent identity of the entity, which is completely and best represented by its ID.

3. Should I add all the fields except from id field when there is not any unique field except from id field in a class? Or should I only add some numeric field instead of adding text fields?

If your sense of equality is to be based on entity value then I don't see how it makes much sense to omit any persistent fields except, possibly, the ID. Do not arbitrarily omit the ID -- it may very well be something you want to include. Again, it depends on what equals() is intended to mean for your entities.
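As a rough sketch of the value-equality option (alternative 3 above), the following compares every persistent field except the ID. The Product class and its fields are hypothetical; for alternative 4, the id would simply be added to both methods:

```java
import java.util.Objects;

// Hypothetical entity; the class and field names are illustrative only.
class Product {
    Long id;          // generated primary key, deliberately left out of equality here
    String sku;
    String name;
    int priceCents;

    Product(Long id, String sku, String name, int priceCents) {
        this.id = id;
        this.sku = sku;
        this.name = name;
        this.priceCents = priceCents;
    }

    @Override public boolean equals(Object other) {
        if (other == this) return true;
        if (other == null || other.getClass() != getClass()) return false;
        Product that = (Product) other;
        // Every persistent field except the id takes part in the comparison.
        return priceCents == that.priceCents
            && Objects.equals(sku, that.sku)
            && Objects.equals(name, that.name);
    }

    @Override public int hashCode() {
        // Must use the same fields as equals().
        return Objects.hash(sku, name, priceCents);
    }
}
```

Note that text fields and numeric fields are treated alike: once equality means "same value", leaving out a field only because of its type would make unequal values compare as equal.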

CodePudding user response:

That's a tricky question that hibernate itself doesn't have a clear answer on.

John Bollinger's answer covers your specific question, but there is some additional context about how to think about equality and hibernate that should help figure out what to do. After all, given that hibernate doesn't require you to do anything particular, you can do whatever you want, which leads to the obvious question: ... okay, so what should I do, then?

That question boils down to the one below. (I'm using Person as an arbitrary example of a model class with an associated table; furthermore, let's say the person table has a single unique ID that is generated: a random UUID or an auto-sequenced integer value.)

What does an instance of Person represent?

There are in broad strokes 3 answers:

  • It represents a person. A row in the person table also represents a person; these 2 things aren't related.
  • It represents a row in the person table.
  • It represents a state in my application, nothing more.

Even though these things sound quite similar, they result in opposite meanings as to equality.

Which choice is correct? That's up to you.

When reading on, remember:

Any Person instance which isn't "saved" yet has a null value for id, because only upon insertion does hibernate ask the DB to generate a value for it (or generate one itself) and then fill it in.

An instance represents a row

  • Equality under the second model (an instance of Person represents a row in the table) should look only at the id column, because that defines row uniqueness; any 2 representations of a row in the person table are guaranteed to be referring to the same row (hence, equal) if and only if the id is equal. That is a necessary and sufficient condition: If they are equal the 2 objects are necessarily referring to the same row, and if they aren't equal, then they are necessarily referring to different rows.
  • Notably, if id is still null, then two distinct objects cannot be equal, even if every other field matches. More generally, the question "Is this object-representing-a-row equal to this other object-representing-a-row" is meaningless if these objects represent rows-to-be (unsaved rows): if you invoke save() on each object, you end up with 2 rows. Ideally, invoking equals on such an object would be a failure, but equals is not supposed to throw, so false is the best answer. This would mean you want:
import java.util.UUID;

class Person {
  private UUID id; // null until hibernate has saved the row
  // other fields

  @Override public boolean equals(Object other) {
    if (other == this) return true;
    if (other == null || other.getClass() != Person.class) return false;
    UUID otherId = ((Person) other).id;
    // An unsaved instance (null id) is never equal to any other instance.
    return id != null && id.equals(otherId);
  }

  // Constant per class, so the hash does not change when save() assigns the id.
  @Override public int hashCode() { return Person.class.hashCode(); }
}

This defines your equals method as 'ends up representing the same row'. This holds even if you change meaningful state:

  • Change the name and save the object? It's... still the same row, and this equality implementation reflects this.
  • Call save() on each in the comparison when they were unsaved? Then you get 2 rows - and this equality implementation reflects this before and after attempting to save it.
  • If invoking on self (a.equals(a)) this returns true as the equality spec demands; it also works out in the 'modelling a row' view: If you invoke save() on the same object twice, it's still just one row.
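The three cases above can be checked with a small self-contained sketch. PersonRow and assignId are hypothetical names; assignId only stands in for what hibernate does when it fills in a generated id on insert, and the constant hashCode is one possible choice that keeps the hash stable across that assignment:

```java
import java.util.UUID;

// Sketch of a row-identity entity; PersonRow and assignId are hypothetical names.
class PersonRow {
    private UUID id;   // null until the row is "saved"
    String name;

    PersonRow(String name) { this.name = name; }

    // Stands in for hibernate filling in a generated id on insert.
    void assignId(UUID generated) { this.id = generated; }

    @Override public boolean equals(Object other) {
        if (other == this) return true;
        if (other == null || other.getClass() != PersonRow.class) return false;
        return id != null && id.equals(((PersonRow) other).id);
    }

    // Constant hash, so the value does not change when the id is assigned.
    @Override public int hashCode() { return PersonRow.class.hashCode(); }
}

class RowEqualityDemo {
    public static void main(String[] args) {
        PersonRow a = new PersonRow("Ann");
        PersonRow b = new PersonRow("Ann");
        System.out.println(a.equals(b));   // false: two unsaved objects, two future rows
        System.out.println(a.equals(a));   // true: same object, one row

        UUID rowId = UUID.randomUUID();
        a.assignId(rowId);
        b.assignId(rowId);
        b.name = "Anna";
        System.out.println(a.equals(b));   // true: same row, although the name differs
    }
}
```

The constant hash code trades hashing performance for stability: every PersonRow lands in the same bucket, but an object that was added to a HashSet before being saved can still be found afterwards.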

An instance represents a person

The nature of what a person is is entirely unrelated to the autosequence/autogen ID it gets; the fact that we're using hibernate is an implementation detail that should play no part at all in considering equality; after all, this object represents the notion of a person, and that notion exists entirely independent of the database. The database is one thing that is modelling persons; instances of this class are another.

In this model you should do the exact opposite: Find something that uniquely identifies a person itself, and compare against that. After all, if you have 2 rows in a database that both contain the same social security number, then you have only 1 person, and you just happen to have 2 rows that are both referring to that person. Given that we chose our instance to imply that it represents a person, an instance loaded from row A and an instance loaded from row B ought to be considered equal - after all, they are representing the same individual.

In this case, you write an equals method that considers all relevant fields except the autoseq/autogen ID field! If there is a separate unique id such as social security number, use that. If there isn't, essentially it boils down to an equals method that compares all fields, except ID. Because that's the one field that definitely has zero bearing on what defines a person.
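A minimal sketch of this model, assuming a social security number is available as the separate unique identifier. NaturalKeyPerson and its field names are hypothetical:

```java
import java.util.Objects;

// Hypothetical class modelling a person rather than a row; names are illustrative.
class NaturalKeyPerson {
    Long id;        // autogenerated row id, has no bearing on who the person is
    String ssn;     // assumed here to identify the person uniquely
    String name;

    NaturalKeyPerson(Long id, String ssn, String name) {
        this.id = id;
        this.ssn = ssn;
        this.name = name;
    }

    @Override public boolean equals(Object other) {
        if (other == this) return true;
        if (other == null || other.getClass() != getClass()) return false;
        // Only the natural key is compared; the row id is ignored.
        return Objects.equals(ssn, ((NaturalKeyPerson) other).ssn);
    }

    @Override public int hashCode() {
        return Objects.hashCode(ssn);
    }
}

class NaturalKeyDemo {
    public static void main(String[] args) {
        NaturalKeyPerson fromRowA = new NaturalKeyPerson(1L, "123-45-6789", "Ann");
        NaturalKeyPerson fromRowB = new NaturalKeyPerson(2L, "123-45-6789", "Ann B.");
        System.out.println(fromRowA.equals(fromRowB));   // true: same individual, two rows
    }
}
```

If no such natural key exists, the same shape applies with every field except id compared in equals and hashed in hashCode.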

An instance defines a state in your application

This is almost a cop-out, and in general means equality is irrelevant / not applicable. It's like asking how to implement an equals method for an InputStream implementation: mostly, you don't.

Here, the default behaviour (Object's own impls) is what you want, and therefore you don't implement either hashCode or equals. Any instance of Person is equal to itself (as in, a.equals(a), same reference), and not equal to any other, even if the other has identical values for each and every field, and even if the id field isn't null and matches (so both represent the same row).

Such an object cannot meaningfully be used as a value object. For example, it would be pointless to use such things as keys in a HashMap. At best, you can put them in an IdentityHashMap, as those semantics would apply: the only way to do a lookup is to call .get() with a reference that was passed to .put() before.
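A short sketch of those identity semantics, using a hypothetical SessionPerson class that keeps Object's equals and hashCode:

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical class that does not override equals or hashCode.
class SessionPerson {
    final Long id;
    final String name;

    SessionPerson(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

class IdentityLookupDemo {
    public static void main(String[] args) {
        SessionPerson original = new SessionPerson(1L, "Ann");
        SessionPerson lookalike = new SessionPerson(1L, "Ann");

        Map<SessionPerson, String> notes = new IdentityHashMap<>();
        notes.put(original, "first login");

        System.out.println(notes.get(original));    // found: same reference
        System.out.println(notes.get(lookalike));   // null: different reference
        System.out.println(original.equals(lookalike));   // false, despite identical fields
    }
}
```

A plain HashMap would behave the same way for this class, since the inherited equals and hashCode are identity-based as well; IdentityHashMap just makes that intent explicit.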

Which one is right? Up to you. But document it clearly, because in my experience, lots of hibernate users are absolutely convinced either the first or second model is the one, and only, right answer, and consider the other answer utterly bonkers. This is problematic - they'd be writing their code assuming all hibernate model classes work precisely as they want, and would therefore not even be thinking of checking docs/impl to know how it actually works.

For what it's worth, objects are objects, and database rows do not neatly map to the notion of an object. SQL's and java's notions of null are utterly incompatible, and the notion of 'a query' does not neatly map to tables (between selecting expressions, selecting on views, and JOINs, that should be obvious) - hibernate is tilting at windmills. It is a leaky abstraction and this is one of its many, many leaks. Leaky abstractions can be useful; just be aware that at the 'edges', the principle hibernate tries to sell you (that objects can represent query results and rows) has limits you will run into. A lot.
