Home > database >  Is there a faster way to do this ForEach and Linq.Any?
Is there a faster way to do this ForEach and Linq.Any?

Time:11-22

This run gets me every first appearance of an object with certain conditions.

I run this code 900 times in a application run. It takes 10 minutes for 300.000 objects. I need to run it with a lot more objects (> 10 million objects) So it'll take a really long time.

This is the best code I could do:

string PlayerUrl { get; }

Position Position { get; }

public enum Position
    {
        GK,
        ...
    }

int ChemistryAmount { get; }
public static List<IFieldPlayerStatsForPosition> GetFirstFieldPlayerStatsForPositionInList(List<IFieldPlayerStatsForPosition> auxFieldPlayerStatsForPositions)
{
    List<IFieldPlayerStatsForPosition> fieldPlayerStatsForPositions = new List<IFieldPlayerStatsForPosition>();
    foreach (IFieldPlayerStatsForPosition fieldPlayerStatsForPosition in auxFieldPlayerStatsForPositions)
    {
        if (!fieldPlayerStatsForPositions.Any(cc => fieldPlayerStatsForPosition.FieldPlayerChemistryApplied.FieldPlayer.PlayerUrl == cc.FieldPlayerChemistryApplied.FieldPlayer.PlayerUrl &&
                                                    fieldPlayerStatsForPosition.Position == cc.Position &&
                                                    fieldPlayerStatsForPosition.FieldPlayerChemistryApplied.ChemistryAmount == cc.FieldPlayerChemistryApplied.ChemistryAmount))
        {
            fieldPlayerStatsForPositions.Add(fieldPlayerStatsForPosition);
        }
    }
    return fieldPlayerStatsForPositions;
}

I need to make it faster... What should I do? Is there a faster alternative to a foreach and Linq.Any?

CodePudding user response:

Your algorithm time complexity is O(n^2). In order to speed it up you need to use a different container. You can reduce it to O(n) using a HashSet and a custom IEqualityComparer:

public class FieldPlayerStatsForPositionComparer : IEqualityComparer<IFieldPlayerStatsForPosition> {
    public bool Equals(IFieldPlayerStatsForPosition x, IFieldPlayerStatsForPosition y) {
        if (ReferenceEquals(x, y))
            return true;
        if (ReferenceEquals(x, null))
            return false;
        if (ReferenceEquals(y, null))
            return false;
        if (x.GetType() != y.GetType())
            return false;
        return x.PlayerUrl == y.PlayerUrl && x.Position == y.Position && x.ChemistryAmount == y.ChemistryAmount;
    }

    public int GetHashCode(IFieldPlayerStatsForPosition obj) {
        return HashCode.Combine(obj.PlayerUrl, (int)obj.Position, obj.ChemistryAmount);
    }
}

public static List<IFieldPlayerStatsForPosition> GetFirstFieldPlayerStatsForPositionInList(List<IFieldPlayerStatsForPosition> auxFieldPlayerStatsForPositions)
{
    var fieldPlayerStatsForPositions = new HashSet<IFieldPlayerStatsForPosition>(auxFieldPlayerStatsForPositions, new FieldPlayerStatsForPositionComparer());
    return fieldPlayerStatsForPositions.ToList();
}

Please note that hash set does not keep the original order of items. If original order is a must for you, please leave a comment.

UPDATE

As Joel Coehoorn pointed out, to avoid double memory allocation you can use IEnumerable instead of creating a new List:

public static IEnumerable<IFieldPlayerStatsForPosition> GetFirstFieldPlayerStatsForPositionInList(IEnumerable<IFieldPlayerStatsForPosition> auxFieldPlayerStatsForPositions) {
    return auxFieldPlayerStatsForPositions.Distinct(new FieldPlayerStatsForPositionComparer());
}

The .Distinct() method also keeps the original order of the items.

Or alternatively, you can return the HashSet itself or ICollection interface:

public static ICollection<IFieldPlayerStatsForPosition> GetFirstFieldPlayerStatsForPositionInList(IEnumerable<IFieldPlayerStatsForPosition> auxFieldPlayerStatsForPositions) {
    return new HashSet<IFieldPlayerStatsForPosition>(auxFieldPlayerStatsForPositions, new FieldPlayerStatsForPositionComparer());
}

I would also suggest to use BenchmarkDotNet NuGet package to test runtime performance and memory consumption.

CodePudding user response:

Assuming you use at least .NET 6 :

All you function seems to be doing is remove elements that are duplicate by the 3 properties

You can replace you function by a DistinctBy:

public static List<IFieldPlayerStatsForPosition> GetFirstFieldPlayerStatsForPositionInList(List<IFieldPlayerStatsForPosition> auxFieldPlayerStatsForPositions)
{
    return auxFieldPlayerStatsForPositions.DistinctBy(p => new 
    { 
        p.FieldPlayerChemistryApplied.FieldPlayer.PlayerUrl,
        p.Position,
        p.FieldPlayerChemistryApplied.ChemistryAmount
    }).ToList();
}
  • Related