Boxing type equality and dictionary keys

Time:11-24

I am a bit confused about how a Dictionary compares keys when it comes to boxed types.

using System;
using System.Collections.Generic;

public class Program
{
    public static void Main()
    {
        int i = 5;
        int n = 5;

        object boxedI = i;
        object boxedN = n;

        Console.WriteLine("i == n ? " + (i == n)); // true
        Console.WriteLine("bI == bN ? " + (boxedI == boxedN)); // false

        Dictionary<object, int> _dict = new Dictionary<object, int>();
        _dict.Add(boxedI, 5);

        Console.WriteLine("_dict contains boxedI? " + _dict.ContainsKey(boxedI)); // true
        Console.WriteLine("_dict contains boxedN? " + _dict.ContainsKey(boxedN)); // !! also true, which surprised me

        _dict.Add(boxedN, 5); // exception: the key already exists
    }
}

I expected that since the equality operator "failed" (AFAIK it's based on GetHashCode, the same method the dictionary uses to build its internal hash table from objects), the dictionary would also "fail" the comparison of boxedI and boxedN, but that's not the case.

Here is the fiddle I used: https://dotnetfiddle.net/DW54nN

So I am asking if someone can explain to me what happens here and what I am missing in my mental model.

CodePudding user response:

TLDR: Comparing boxed values using == uses reference equality of the boxing object, but comparing boxed values using Equals() uses the underlying values' Equals().


When a value type is boxed, the boxing object's GetHashCode() and Equals() method implementations call the boxed value's versions.

That is, given:

  • A value type VT that implements GetHashCode() and Equals() correctly.
  • An instance x of VT.
  • An instance y of VT with the same value as x.
  • A boxed instance of x: bx.
  • A boxed instance of y: by.

The following will be the case:

x.Equals(y)      == true             // Original values are equal
bx.Equals(by)    == true             // Boxed values are equal
x.GetHashCode()  == y.GetHashCode()  // Original hashes are equal
bx.GetHashCode() == by.GetHashCode() // Boxed hashes are equal
bx.GetHashCode() == x.GetHashCode()  // Original hash code == boxed hash code

However, the == operator is NOT delegated to the boxed value. Operators are resolved at compile time from the static type of the operands, which here is object, so == is implemented using reference equality:

(x == y)   == true  // Original values are equal using "=="
(bx == by) == false // Boxed values are not equal using "=="
ReferenceEquals(bx, by) == false // References differ
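
The same point can be sketched with plain int values, as in the question. This is only an illustrative example; the class name BoxedEqualityDemo is made up for the sketch.

```csharp
using System;

class BoxedEqualityDemo
{
    static void Main()
    {
        object bx = 5; // boxes the int into a new heap object
        object by = 5; // a second, separate box

        // == is bound at compile time against the static type object,
        // so it compares references.
        Console.WriteLine(bx == by);              // False

        // Equals is virtual, so Int32.Equals runs and compares values.
        Console.WriteLine(bx.Equals(by));         // True
        Console.WriteLine(object.Equals(bx, by)); // True

        // Unboxing first brings back the int == operator.
        Console.WriteLine((int)bx == (int)by);    // True
    }
}
```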

The Dictionary uses GetHashCode() and Equals() for comparing keys (through its default equality comparer), and because those delegate to the underlying values, it works correctly.
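
If you actually want the behaviour the question expected (each box is its own key), one option is to pass a reference-based comparer to the dictionary. The sketch below assumes .NET 5 or later, where ReferenceEqualityComparer is available; the class and variable names are made up for the example.

```csharp
using System;
using System.Collections.Generic;

class ComparerDemo
{
    static void Main()
    {
        object boxedI = 5;
        object boxedN = 5;

        // Default comparer: calls the virtual GetHashCode()/Equals().
        var byValue = new Dictionary<object, int>();
        byValue.Add(boxedI, 5);
        Console.WriteLine(byValue.ContainsKey(boxedN));     // True

        // ReferenceEqualityComparer compares object identity instead.
        var byReference = new Dictionary<object, int>(ReferenceEqualityComparer.Instance);
        byReference.Add(boxedI, 5);
        Console.WriteLine(byReference.ContainsKey(boxedN)); // False

        byReference.Add(boxedN, 5); // no exception: a different box is a different key
        Console.WriteLine(byReference.Count);               // 2
    }
}
```

With the reference comparer, looking up a key only succeeds if you still hold the exact same boxed object, which is rarely what you want for boxed values.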


This is demonstrated by the following program:

using System;

namespace Demo
{
    struct MyStruct: IEquatable<MyStruct>
    {
        public int X;

        public bool Equals(MyStruct other)
        {
            return X == other.X;
        }

        public override bool Equals(object obj)
        {
            if (obj is not MyStruct other)
                return false;

            return X == other.X;
        }

        public override int GetHashCode()
        {
            return -X;
        }
    }

    class Program
    {
        static void Main()
        {
            var x = new MyStruct { X = 42 };
            var y = new MyStruct { X = 42 };
            object bx = x;
            object by = y;

            Console.WriteLine(bx.GetHashCode()); // -42
            Console.WriteLine(y.GetHashCode()); // -42

            Console.WriteLine(bx.Equals(by)); // True
            Console.WriteLine(bx == by); // False
            Console.WriteLine(object.ReferenceEquals(bx, by)); // False
        }
    }
}

CodePudding user response:

This is a reference vs value type thing:

int i = 5;
int n = 5;

These are value types and, as local variables, they get put on the stack, so when we compare them we go to the stack and can say that the values of i and n are both 5, which makes them "equal".

object boxedI = i;
object boxedN = n;

When you put these values in an object you create a "reference" type, which means that the value gets copied to the heap and a reference to it gets put on the stack, so you can imagine that on the stack you have:

#0005 -> boxedI
#0006 -> boxedN

Now when you use == you are comparing #0005 == #0006, which are not the same.

But when you pass boxedI or boxedN into ContainsKey, that method calls GetHashCode() and Equals() on the key, which follow the reference (or pointer) to the value that is on the heap (5).

So when you ask for ContainsKey(boxedI), what you're doing is asking for ContainsKey(5) (in rough terms).

That's why those two are "equal".
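
The "in rough terms" part can be checked directly. The sketch below is only an illustration with made-up names; it also shows that the boxed type still matters, because Int32.Equals(object) only returns true for another boxed int.

```csharp
using System;
using System.Collections.Generic;

class LookupDemo
{
    static void Main()
    {
        object boxedI = 5;
        var dict = new Dictionary<object, int>();
        dict.Add(boxedI, 5);

        // The literal is boxed into a brand-new object, but its value is equal.
        Console.WriteLine(dict.ContainsKey(5));   // True

        // A boxed long is not equal to a boxed int, even with the same number.
        Console.WriteLine(dict.ContainsKey(5L));  // False

        // A string is a different type altogether.
        Console.WriteLine(dict.ContainsKey("5")); // False
    }
}
```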
