Compare two lists of nodes with total match

Time:09-09

I create two lists of objects, but I cannot check whether they are a total match by value. The lists are:

var inputNodes = new List<node>()
{
    new node() { nodeName = "D100", DataLength = 1 },
    new node() { nodeName = "D101", DataLength = 1 },
    new node() { nodeName = "D102", DataLength = 1 },
    new node() { nodeName = "D103", DataLength = 1 },
    new node() { nodeName = "D104", DataLength = 1 },
    new node() { nodeName = "D105", DataLength = 1 },
    new node() { nodeName = "D106", DataLength = 1 }
};

var inputNodes2 = new List<node>()
{
    new node() { nodeName = "D100", DataLength = 1 },
    new node() { nodeName = "D101", DataLength = 1 },
    new node() { nodeName = "D102", DataLength = 1 },
    new node() { nodeName = "D103", DataLength = 1 },
    new node() { nodeName = "D104", DataLength = 1 },
    new node() { nodeName = "D105", DataLength = 1 },
    new node() { nodeName = "D106", DataLength = 1 }
};

I tried var isEqual = inputNodes.SequenceEqual(inputNodes2), but it returns false. I don't want to use a loop or the list's Select function. Any ideas?

CodePudding user response:

Use an IEqualityComparer<node>, like the one below.

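using System;
using System.Collections.Generic;
using System.Diagnostics.CodeAnalysis;

class NodeComparer : IEqualityComparer<node>
{
    public bool Equals(node? x, node? y)
    {
        // Two nulls are equal; a null never equals a non-null node.
        if (x == null && y == null)
        {
            return true;
        }

        if (x == null || y == null)
        {
            return false;
        }

        return string.Equals(x.nodeName, y.nodeName) && x.DataLength == y.DataLength;
    }

    public int GetHashCode([DisallowNull] node obj)
    {
        // Combine the same two members that Equals compares.
        // HashCode.Combine also copes with a null nodeName.
        return HashCode.Combine(obj.nodeName, obj.DataLength);
    }
}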

and then pass it to SequenceEqual:

inputNodes.SequenceEqual(inputNodes2, new NodeComparer());

CodePudding user response:

It seems to me that you are not familiar with the concept of equality, and how you can change the definition of equality to your definition. Hence I'll explain default equality and how to write an equality comparer that holds your idea of equality.

By default, equality of class instances is reference equality: two variables are equal if they refer to the same object:

Node A = new Node {...};
Node X = A;
Node Y = A;

Objects X and Y refer to the same object, and thus:

Assert(X == Y);
IEqualityComparer<Node> nodeComparer = EqualityComparer<Node>.Default;
Assert(nodeComparer.Equals(X, Y));

However, in your case inputNodes[0] and inputNodes2[0] do not refer to the same object. Hence they are not equal Nodes, and thus SequenceEqual will return false.
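
To make the default behaviour concrete, here is a minimal sketch. The Node class in it is my assumption of what your class looks like (two properties, no Equals override), and it uses top-level statements, so it needs C# 9 or later:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Two separately constructed nodes with identical property values.
var first = new Node { NodeName = "D100", DataLength = 1 };
var second = new Node { NodeName = "D100", DataLength = 1 };
var alias = first; // refers to the same object as 'first'

bool sameObject = first.Equals(alias);   // reference equality: same object
bool sameValues = first.Equals(second);  // different objects, equal values

var list1 = new List<Node> { first };
var list2 = new List<Node> { second };
bool listsEqual = list1.SequenceEqual(list2); // uses the default comparer

Console.WriteLine(sameObject);  // True
Console.WriteLine(sameValues);  // False
Console.WriteLine(listsEqual);  // False

// Assumed stand-in for the class in the question; it does not override Equals.
public class Node
{
    public string NodeName { get; set; } = "";
    public int DataLength { get; set; }
}
```

The last line is exactly the situation in the question: every element pair has equal values, but no pair is the same object.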

You don't want to use the standard equality comparison, you want a special one. According to your definition, two Nodes are equal if the properties of the Nodes are equal. This definition of equality is called "value equality", as opposed to "reference equality".

Because you don't want to use the default reference equality, you'll have to write the equality comparer yourself. The easiest way to do this is to derive a class from EqualityComparer<Node>.

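using System;
using System.Collections.Generic;

public class NodeComparer : EqualityComparer<Node>
{
    public static IEqualityComparer<Node> ValueComparer { get; } = new NodeComparer();

    // TODO: implement; both methods are worked out in the sections below
    public override bool Equals(Node x, Node y) => throw new NotImplementedException();
    public override int GetHashCode(Node x) => throw new NotImplementedException();
}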

Usage will be as follows:

IEnumerable<Node> inputNodes1 = ...
IEnumerable<Node> inputNodes2 = ...

IEqualityComparer<Node> nodeComparer = NodeComparer.ValueComparer;
bool equalInputNodes = inputNodes1.SequenceEqual(inputNodes2, nodeComparer);

Equals

The definition depends on YOUR definition of equality. You can use any definition you need. In your case, you chose a straightforward "compare by value":

public override bool Equals(Node x, Node y)
{
    // The following statements are almost always the same for every equality
    if (x == null) return y == null;               // true if both null
    if (y == null) return false;                   // because x not null
    if (Object.ReferenceEquals(x, y)) return true; // because same object
    if (x.GetType() != y.GetType()) return false;  // different types

On some occasions, these statements might be different. For example, if you want to create a string comparer where a null string equals an empty string:

string x = null;
string y = String.Empty;
IEqualityComparer<string> stringComparer = MyStringComparer.EmptyEqualsNull;
Assert(stringComparer.Equals(x, y));

Or, if you consider Teachers to be Persons, then in some cases, when you compare a Teacher with a Person, you might not want to check the type.

But all in all, most comparers will use these four initial lines.
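
As an aside, a comparer like the MyStringComparer.EmptyEqualsNull mentioned above could look roughly like this. It is only a sketch of one possible implementation: the class does not exist in the framework, and I left out nullable annotations for brevity, so a project with nullable reference types enabled may show warnings:

```csharp
using System;
using System.Collections.Generic;

IEqualityComparer<string> comparer = MyStringComparer.EmptyEqualsNull;

bool nullVsEmpty = comparer.Equals(null, string.Empty);
bool emptyVsText = comparer.Equals(string.Empty, "D100");
bool sameHash = comparer.GetHashCode(null) == comparer.GetHashCode(string.Empty);

Console.WriteLine(nullVsEmpty); // True
Console.WriteLine(emptyVsText); // False
Console.WriteLine(sameHash);    // True

// Hypothetical comparer: treats null and the empty string as equal.
public sealed class MyStringComparer : EqualityComparer<string>
{
    public static IEqualityComparer<string> EmptyEqualsNull { get; } = new MyStringComparer();

    public override bool Equals(string x, string y)
    {
        // Normalize null to "" first, then compare ordinally.
        return string.Equals(x ?? string.Empty, y ?? string.Empty, StringComparison.Ordinal);
    }

    public override int GetHashCode(string obj)
    {
        // Must agree with Equals: null and "" get the same hash code.
        return StringComparer.Ordinal.GetHashCode(obj ?? string.Empty);
    }
}
```

Note that the usual "null check" lines are replaced by a normalization step here, which is why they are "almost always" the same and not always.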

Continuing your equality:

    return x.NodeName == y.NodeName
        && x.DataLength == y.DataLength;
}

To be prepared for the future, consider the following:

// Ordinal behaves the same as == on strings
private static readonly IEqualityComparer<string> nodeNameComparer = StringComparer.Ordinal;

and in your equals method:

return nodeNameComparer.Equals(x.NodeName, y.NodeName)
    && x.DataLength == y.DataLength;

So if, in the future, you want to do a case-insensitive string comparison, you only have to change the static declaration of your nodeNameComparer:

private static readonly IEqualityComparer<string> nodeNameComparer = StringComparer.OrdinalIgnoreCase;
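
To see what such a switch changes, here is a small sketch that compares the two built-in comparers on a pair of names that differ only in case. The sample strings are made up for illustration:

```csharp
using System;
using System.Collections.Generic;

IEqualityComparer<string> ordinal = StringComparer.Ordinal;
IEqualityComparer<string> ignoreCase = StringComparer.OrdinalIgnoreCase;

bool ordinalMatch = ordinal.Equals("D100", "d100");
bool ignoreCaseMatch = ignoreCase.Equals("D100", "d100");

// Equal strings must produce equal hash codes under the same comparer.
bool ignoreCaseSameHash =
    ignoreCase.GetHashCode("D100") == ignoreCase.GetHashCode("d100");

Console.WriteLine(ordinalMatch);       // False
Console.WriteLine(ignoreCaseMatch);    // True
Console.WriteLine(ignoreCaseSameHash); // True
```

The third check is the reason to use the same comparer object for hash codes as well: only the comparer that declares two names equal is guaranteed to hash them identically.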

GetHashCode

GetHashCode is meant to be a fast way to separate most unequal objects. This is useful if, for example, your Node has two hundred properties and you know that if two Nodes have the same value for property Id, then very likely all other properties will be equal too.

Note that I use "very likely". It is not 100% guaranteed that if X has the same hashcode as Y, then X will equal Y. But you can be certain of the reverse: if X has a different hashcode than Y, then they are not equal.

The only requirement for GetHashCode is that if X equals Y, then MyComparer.GetHashCode(X) equals MyComparer.GetHashCode(Y).

If X is not equal to Y, then you don't know whether their hashcodes will be different, although it would be nice if they were, because the code will be more efficient.

GetHashCode is meant to be fast. It doesn't have to check everything; it is handy if it separates most elements, but it does not have to be a complete equality check.

How about this one:

public override int GetHashCode(Node x)
{
    if (x == null) return 874283;      // just a number

    // for HashCode only use the NodeName:
    return x.NodeName.GetHashCode();
}

Or, if you use a string comparer in method Equals for NodeName:

private static readonly IEqualityComparer<string> nodeNameComparer = StringComparer.OrdinalIgnoreCase;

// this comparer is used in Equals

public override int GetHashCode(Node x)
{
    if (x == null) return 874283;      // just a number
    return nodeNameComparer.GetHashCode(x.NodeName);
}

So if, in the future, you change the comparison method for the NodeName to CurrentCulture, then both Equals and GetHashCode will use the same comparer.

Node a = new Node { NodeName = "X", DataLength = 1 };
Node b = new Node { NodeName = "X", DataLength = 1 };
Node c = new Node { NodeName = "X", DataLength = 2 };
Node d = new Node { NodeName = "Y", DataLength = 1 };

It is easy to see that b equals a, while c and d are different from a.

Although c is different, the comparer will return the same hashcode as for a. So GetHashCode is not enough for exact equality, but a good GetHashCode will separate most different objects.
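
Putting the pieces together, the whole comparer could be assembled as in the sketch below, which also runs the four example nodes through it. The Node class is my assumption of your class, the hash code deliberately uses only the NodeName, and GetHashCode assumes NodeName is never null:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var a = new Node { NodeName = "X", DataLength = 1 };
var b = new Node { NodeName = "X", DataLength = 1 };
var c = new Node { NodeName = "X", DataLength = 2 };
var d = new Node { NodeName = "Y", DataLength = 1 };

IEqualityComparer<Node> comparer = NodeComparer.ValueComparer;

bool aEqualsB = comparer.Equals(a, b);
bool aEqualsC = comparer.Equals(a, c);
bool aEqualsD = comparer.Equals(a, d);
bool sameHashAC = comparer.GetHashCode(a) == comparer.GetHashCode(c);

// Two lists holding different objects with pairwise equal values.
var list1 = new List<Node> { a, c, d };
var list2 = new List<Node>
{
    b,
    new Node { NodeName = "X", DataLength = 2 },
    new Node { NodeName = "Y", DataLength = 1 }
};
bool listsMatch = list1.SequenceEqual(list2, comparer);

Console.WriteLine(aEqualsB);   // True
Console.WriteLine(aEqualsC);   // False
Console.WriteLine(aEqualsD);   // False
Console.WriteLine(sameHashAC); // True: same NodeName, so same hash code
Console.WriteLine(listsMatch); // True

// Assumed stand-in for the class in the question.
public class Node
{
    public string NodeName { get; set; } = "";
    public int DataLength { get; set; }
}

public class NodeComparer : EqualityComparer<Node>
{
    private static readonly IEqualityComparer<string> nodeNameComparer = StringComparer.Ordinal;

    public static IEqualityComparer<Node> ValueComparer { get; } = new NodeComparer();

    public override bool Equals(Node x, Node y)
    {
        if (x == null) return y == null;               // true if both null
        if (y == null) return false;                   // because x not null
        if (object.ReferenceEquals(x, y)) return true; // because same object
        if (x.GetType() != y.GetType()) return false;  // different types

        return nodeNameComparer.Equals(x.NodeName, y.NodeName)
            && x.DataLength == y.DataLength;
    }

    public override int GetHashCode(Node x)
    {
        if (x == null) return 874283; // just a number
        // Only the NodeName is hashed; assumes NodeName is not null.
        return nodeNameComparer.GetHashCode(x.NodeName);
    }
}
```

SequenceEqual itself only calls Equals; the hash code matters once the same comparer is handed to something hash-based, such as a HashSet<Node> or Distinct.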
