C# Compare two Arrays and perform Action on Matches

Time:10-13

My question is: how can I compare two arrays and perform an action on the elements that are in both? I use C# / LINQ.

What I'm trying to do: loop through an array of users. Another array contains rules for some specific users. For each user that has a rule in the rules array, increment a field on the user object.

I already tried using Linq:

var array1 = context.SomeSecret.ToArray();
var array2 = anotherContext.AnotherSecret.ToArray();

(from rule in array2
 from user in array1
 where user.ID == rule.UserID
 select user).ToObservable().Subscribe<User>(x => x.MaxRules++);

This was the original code:

var userDic = context.SomeSecret.ToDictionary(u => u.ID);
var rules = anotherContext.AnotherSecret.ToList();

foreach(var rule in rules)
{
    if(userDic.ContainsKey(rule.UserID))
    {
        userDic[rule.UserID].MaxRules++;
    }
}

user.ID and rule.UserID hold the same value.

Note:
This is "meaningless" sample code.

Is there an "elegant" way to solve this?

Thanks in advance.

Answer:

You are trying to do too much in a few statements. This makes your code difficult to read, difficult to reuse, difficult to change and difficult to unit test. Consider making it a habit to write small, reusable methods.

IEnumerable<Secret> GetSecrets() {...}
IEnumerable<Secret> GetOtherSecrets() {...}

"How can I compare two arrays, and perform an action on the elements that are in both?"

LINQ is meant for querying: it extracts data from your source data and should not be used to change that data. To change the source data, enumerate the result of the LINQ query and make the change there. This is usually done using foreach.
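Applied to the users and rules from the question, that split between querying and updating could look like the sketch below. The User and Rule classes and the RuleCounter name are assumptions made for illustration; the question only mentions ID, UserID and MaxRules.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shapes; the question only mentions ID, UserID and MaxRules.
public class User
{
    public int ID { get; set; }
    public int MaxRules { get; set; }
}

public class Rule
{
    public int UserID { get; set; }
}

public static class RuleCounter
{
    // Query part: yields one User per matching rule and changes nothing.
    public static IEnumerable<User> UsersWithRules(
        IEnumerable<User> users, IEnumerable<Rule> rules)
    {
        return rules.Join(users,
            rule => rule.UserID,   // key taken from each rule
            user => user.ID,       // key taken from each user
            (rule, user) => user); // keep only the matched user
    }

    // Update part: a plain foreach performs the change.
    public static void IncrementMaxRules(
        IEnumerable<User> users, IEnumerable<Rule> rules)
    {
        foreach (User user in UsersWithRules(users, rules))
        {
            user.MaxRules++;
        }
    }
}
```

A user with two rules is yielded twice, so MaxRules goes up by two, which matches the behaviour of the dictionary loop in the question. Rules without a matching user are skipped.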

So you have two sequences of Secrets, and you want to extract all Secrets that are in both sequences.

Define equality

First of all, you need to specify when a Secret counts as being in both sequences:

Secret a = new Secret();
Secret b = a;
Secret c = (Secret)a.Clone();

It is clear that a and b refer to the same object. Although the values of all properties and fields in Secret a and Secret c are equal, they are different instances.

The effect is that if you change the value of one of the properties of Secret a, the value is also changed in Secret b. However, Secret c remains unchanged.
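A small sketch of this behaviour, assuming a minimal Secret class that implements ICloneable through a shallow copy. The real Secret class is not shown, so the properties and the ReferenceDemo helper are hypothetical:

```csharp
using System;

// Hypothetical minimal Secret; the real class is not shown in the question.
public class Secret : ICloneable
{
    public int Id { get; set; }
    public string Description { get; set; }

    // Shallow copy: a new instance with the same property values.
    public object Clone() => MemberwiseClone();
}

public static class ReferenceDemo
{
    public static (bool aIsB, bool aIsC, string bText, string cText) Run()
    {
        Secret a = new Secret { Id = 1, Description = "old" };
        Secret b = a;                  // second reference to the same object
        Secret c = (Secret)a.Clone();  // separate object with equal values

        a.Description = "new";         // visible through b, not through c

        return (object.ReferenceEquals(a, b), object.ReferenceEquals(a, c),
                b.Description, c.Description);
    }
}
```

Run() reports that a and b are the same instance and a and c are not; b sees the new description, while c still holds the old one.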

Secret d = new Secret();
Secret e = new Secret();

IEnumerable<Secret> array1 = new Secret[] {a, d};
IEnumerable<Secret> array2 = new Secret[] {a, b, c, e};

It is clear that you want a in your end result. You also want b, because a and b refer to the same object. It is also clear that you want neither d nor e in your end result. But, in your opinion, are a and c equal?

Another ambiguity in your requirements:

IEnumerable<Secret> array1 = new Secret[] {a};
IEnumerable<Secret> array2 = new Secret[] {a, a, a, a, a};

How many times do you want a in your end result?

Equality comparers

By default a and c are different objects: unless Secret overloads the == operator, a == c yields false.

However, if you want to treat them as equal, you need to say in your LINQ: do not use the standard definition of equality, use my definition of equality.

For this we need to write an equality comparer. Or, to be more precise: create an object of a class that implements IEqualityComparer<Secret>.

Luckily this is usually quite straightforward.

Definition: Two objects of type Secret are equal if all properties return the same value.

class SecretComparer : EqualityComparer<Secret>
{
    public static IEqualityComparer<Secret> ByValue {get;} = new SecretComparer();

    public override bool Equals (Secret x, Secret y)
    {
        ... // TODO: implement
    }

    public override int GetHashCode (Secret x)
    {
        ... // TODO: implement
    }
}

The implementation of both methods is given below.

The reason that I derive from class EqualityComparer<Secret>, and not just implement IEqualityComparer<Secret>, is that class EqualityComparer also gives me the property Default, which might be useful if you want to use the default definition when comparing two Secrets.

LINQ: get objects that are in two sequences

Once you have the equality comparer, LINQ will be straightforward. To extract the Secrets that are in both x and y, I use the overload of Enumerable.Intersect that uses an equality comparer:

IEnumerable<Secret> ExtractDuplicateSecrets(IEnumerable<Secret> x, IEnumerable<Secret> y)
{
    return  x.Intersect(y, SecretComparer.ByValue);
}
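Enumerable.Intersect has set semantics, which settles the earlier question of how many times an element ends up in the result: each matching element is returned once, taken from the first sequence. The sketch below shows this with plain strings and the built-in StringComparer.OrdinalIgnoreCase standing in for a custom comparer; the IntersectDemo name is only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IntersectDemo
{
    // Returns the elements of x that also occur in y, ignoring case.
    // Duplicates are dropped; the surviving elements come from x.
    public static List<string> Common(IEnumerable<string> x, IEnumerable<string> y)
    {
        return x.Intersect(y, StringComparer.OrdinalIgnoreCase).ToList();
    }
}
```

For x = {"a", "B", "a"} and y = {"A", "a", "b", "c"}, Common returns "a" and "B": the second "a" in x is dropped, and "c" is missing because it only occurs in y.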

That's all. To perform an action on every remaining Secret, use foreach:

void PerformSecretAction(IEnumerable<Secret> secrets)
{
    foreach (Secret secret in secrets)
    {
        secret.Process();
    }
}

So your complete code:

IEnumerable<Secret> x = GetSecrets();
IEnumerable<Secret> y = GetOtherSecrets();
IEnumerable<Secret> secretsInXandY = ExtractDuplicateSecrets(x, y);
PerformSecretAction(secretsInXandY);

Or, if you want, do this in one statement, although I am not sure it improves readability:

PerformSecretAction(ExtractDuplicateSecrets(GetSecrets(), GetOtherSecrets()));

The nice thing about making small methods (creating x and y, creating a SecretComparer, extracting the common Secrets, performing the action on all remaining Secrets) is that most procedures will be quite small, hence easy to read. All procedures can also be reused for other purposes. They are easy to change (a different definition of equality: just write a different comparer!) and easy to unit test.

Implement Secret Equality

public override bool Equals (Secret x, Secret y)
{
    // almost all equality comparers start with the following lines:
    if (x == null) return y == null;               // True if x and y both null
    if (y == null) return false;                   // because x not null
    if (Object.ReferenceEquals(x, y)) return true; // same object

Most of the time we don't want objects of different derived classes to be equal, so a TopSecret (derived from Secret) is not equal to a Secret.

    if (x.GetType() != y.GetType()) return false;

The rest depends on your definition of when two Secrets are equal. Most of the time you check all properties. Sometimes you only check a subset.

    return x.Id == y.Id
        && x.Description == y.Description
        && x.Date == y.Date
        && ...

Here you can see that the code depends on your definition of equality. Maybe the Description check is case insensitive:

private static IEqualityComparer<string> descriptionComparer {get;}
    = StringComparer.CurrentCultureIgnoreCase;

return x.Id == y.Id
    && descriptionComparer.Equals(x.Description, y.Description)
    && ...

Implement GetHashCode

This method mainly serves as a fast way to determine that two objects are not equal. A good GetHashCode is fast and rules out most unequal objects.

There is only one requirement: if x and y are considered equal, they must return the same hash code. The reverse is not required: different objects may have the same hash code, although it is better if they have different ones.
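Because this requirement is easy to break when Equals and GetHashCode are written separately, it can be worth checking in a unit test. The sketch below is one possible helper; HashContract and BrokenComparer are hypothetical names, and BrokenComparer is deliberately wrong to show what a violation looks like.

```csharp
using System;
using System.Collections.Generic;

public static class HashContract
{
    // True when the pair does not violate "equal implies same hash code".
    public static bool Holds<T>(IEqualityComparer<T> comparer, T x, T y)
    {
        return !comparer.Equals(x, y)
            || comparer.GetHashCode(x) == comparer.GetHashCode(y);
    }
}

// Deliberately broken: compares case-insensitively,
// but hashes the first character case-sensitively.
public class BrokenComparer : IEqualityComparer<string>
{
    public bool Equals(string x, string y) =>
        string.Equals(x, y, StringComparison.OrdinalIgnoreCase);

    public int GetHashCode(string s) => s.Length == 0 ? 0 : s[0];
}
```

With the built-in StringComparer.OrdinalIgnoreCase, the pair "abc" and "ABC" passes the check. With BrokenComparer the same pair fails, because the two strings are reported as equal but hash to the codes of 'a' and 'A'. Hash-based operations such as Intersect may then miss matches.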

How about this:

public override int GetHashCode (Secret x)
{
    if (x == null)
        return 8744523; // just a number;
    else
        return x.Id.GetHashCode();  // only check Id
}

In the code above, I assume that the Id of a Secret is fairly unique. You will probably only find two non-equal Secrets with the same Id while updating a Secret:

Secret existingSecret = this.FindSecretById(42);
Secret secretToEdit = (Secret)existingSecret.Clone();
secretToEdit.Description = this.ReadNewDescription();

Now existingSecret and secretToEdit have the same value for Id, but a different Description. Hence they are not equal. Yet they have the same HashCode.

Still, since by far most Secrets will have a unique Id, GetHashCode will be a very fast way to detect that two Secrets are different.
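Putting the pieces together, a complete comparer might look like the sketch below. It assumes a Secret with only the three properties Id, Description and Date, and it swaps in StringComparer.OrdinalIgnoreCase for the culture-sensitive comparer so that results do not depend on the current culture. Adjust both to your own definition of equality.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical Secret with the three properties used in the answer.
public class Secret
{
    public int Id { get; set; }
    public string Description { get; set; }
    public DateTime Date { get; set; }
}

public class SecretComparer : EqualityComparer<Secret>
{
    public static IEqualityComparer<Secret> ByValue { get; } = new SecretComparer();

    // Assumption: descriptions are compared case-insensitively (ordinal).
    private static readonly StringComparer DescriptionComparer =
        StringComparer.OrdinalIgnoreCase;

    public override bool Equals(Secret x, Secret y)
    {
        if (x == null) return y == null;               // true if both are null
        if (y == null) return false;                   // x is not null here
        if (ReferenceEquals(x, y)) return true;        // same object
        if (x.GetType() != y.GetType()) return false;  // different derived types

        return x.Id == y.Id
            && DescriptionComparer.Equals(x.Description, y.Description)
            && x.Date == y.Date;
    }

    public override int GetHashCode(Secret x)
    {
        if (x == null) return 8744523;  // arbitrary constant for null
        return x.Id.GetHashCode();      // equal Secrets always share an Id
    }
}
```

Hashing only the Id keeps GetHashCode consistent with Equals: two Secrets that Equals considers equal always have the same Id, so they always get the same hash code, regardless of how Description is cased.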
