LINQ Distinct does not invoke IEquatable<T>.Equals

Time:12-04

I have a set of domain objects deriving from a base class, where I've overridden Equals, IEquatable<T>.Equals and the equality operators. I've successfully used Contains, but now I am trying to use Distinct, and it behaves differently. Here's some sample code:

var a = new Test { Id = 1 };
var a2 = new Test { Id = 1 };
var list = new List<Test> { a, a2 };

var distinct = list.Distinct().ToList(); // returns both objects; the Equals implementations are not called
var containsA = list.Contains(a); // true; the Equals implementations are called
var containsA2 = list.Contains(a2); // true
var containsNewObjectWithSameId = list.Contains(new Test { Id = 1 }); // true

public class Test : IEquatable<Test>
{
    public int Id { get; init; }
    public bool Equals(Test other)
    {
        if (ReferenceEquals(null, other))
            return false;
        if (ReferenceEquals(this, other))
            return true;
        if (this.GetType() != other.GetType())
            return false;
        return this.Id == other.Id;
    }

    public override int GetHashCode() => base.GetHashCode() + this.Id;
}

Contains finds matches, but Distinct is feeling very inclusive and keeps them both. From MS docs:

The first search does not specify any equality comparer, which means FindFirst uses EqualityComparer<T>.Default to determine equality of boxes. That in turn uses the implementation of the IEquatable<T>.Equals method in the Box class.

What am I missing?

Answer:

Thanks @JonSkeet for your insight in the comments.

The problem in this case is the way I wrote my GetHashCode method. It has nothing to do with LINQ, as I originally thought.

Explanation

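GetHashCode has to return the same value for objects that compare as equal. Distinct collects items in a hash-based set, which compares hash codes first and calls Equals only when they match. In my case a and a2 are two separate objects, and the base implementation, object.GetHashCode, is tied to the object instance rather than to its contents, so base.GetHashCode returned different values for them. Distinct therefore treated the two objects as not equal without ever calling Equals. List<T>.Contains, by contrast, walks the list and calls Equals on each element without consulting hash codes, which is why it found the matches.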
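To see the effect in isolation, here is a minimal sketch. BrokenKey, FixedKey and DistinctDemo are hypothetical names that do not appear in the question; the static EqualsCalls counter is there only to observe whether Equals gets invoked.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical type: Equals compares Id, but the hash code mixes in the
// instance-based object.GetHashCode(), so equal objects hash differently.
public sealed class BrokenKey : IEquatable<BrokenKey>
{
    public static int EqualsCalls;            // observation aid only
    public int Id { get; init; }

    public bool Equals(BrokenKey other)
    {
        EqualsCalls++;
        return other is not null && Id == other.Id;
    }

    public override bool Equals(object obj) => Equals(obj as BrokenKey);
    public override int GetHashCode() => base.GetHashCode() + Id;
}

// Same type, but the hash code depends only on what Equals compares.
public sealed class FixedKey : IEquatable<FixedKey>
{
    public static int EqualsCalls;            // observation aid only
    public int Id { get; init; }

    public bool Equals(FixedKey other)
    {
        EqualsCalls++;
        return other is not null && Id == other.Id;
    }

    public override bool Equals(object obj) => Equals(obj as FixedKey);
    public override int GetHashCode() => Id;
}

public static class DistinctDemo
{
    // Typically 2: the hash codes almost always differ, so Equals is skipped.
    public static int CountDistinctBroken() =>
        new List<BrokenKey> { new() { Id = 1 }, new() { Id = 1 } }.Distinct().Count();

    // 1: the hash codes match, so Equals is consulted and the duplicate is dropped.
    public static int CountDistinctFixed() =>
        new List<FixedKey> { new() { Id = 1 }, new() { Id = 1 } }.Distinct().Count();
}
```

Calling DistinctDemo.CountDistinctFixed() and then reading FixedKey.EqualsCalls should show at least one call, whereas BrokenKey.EqualsCalls normally stays at zero after CountDistinctBroken().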

Solution

In this case, simply returning the Id value is enough as is shown in MS docs:

One of the simplest ways to compute a hash code for a numeric value that has the same or a smaller range than the Int32 type is to simply return that value.

So changing the above code sample like this:

public override int GetHashCode() => this.Id;

would solve the issue. Please keep in mind that if Id alone does not uniquely identify an object, this will misbehave. In such cases Equals will need to compare another property as well, and you will have to compose GetHashCode from ALL of the properties that Equals compares. For further info refer to the MS docs.
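As a sketch of the multi-property case, assuming equality depends on Id plus a second property: the class TestWithName and its Name property are hypothetical and not part of the original code. System.HashCode.Combine is one way to build the hash from the same properties that Equals compares.

```csharp
using System;

// Hypothetical variant of Test in which equality depends on two properties.
public class TestWithName : IEquatable<TestWithName>
{
    public int Id { get; init; }
    public string Name { get; init; }   // assumed extra property, not in the original

    public bool Equals(TestWithName other)
    {
        if (other is null)
            return false;
        if (ReferenceEquals(this, other))
            return true;
        if (GetType() != other.GetType())
            return false;
        return Id == other.Id && Name == other.Name;
    }

    public override bool Equals(object obj) => Equals(obj as TestWithName);

    // Combine exactly the properties that Equals compares, and nothing else.
    public override int GetHashCode() => HashCode.Combine(Id, Name);
}
```

The values produced by HashCode.Combine are not stable across processes, so they are fine for in-memory use such as Distinct or dictionary keys but should not be persisted.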
