Home > other >  Check if a set exactly includes a subset using Linq taking into account duplicates
Check if a set exactly includes a subset using Linq taking into account duplicates

Time:09-27

var subset = new[] { 9, 3, 9 };
var superset = new[] { 9, 10, 5, 3, 3, 3 };
subset.All(s => superset.Contains(s))

This code would return true, because 9 is included in the superset,but only once, I want an implementation that would take into account the duplicates, so it would return false

CodePudding user response:

My thought was that you could group both sets by count, then test that the super group list contained every key from the sub group list and, in each case, the super count was greater than or equal to the corresponding subcount. I think that I've achieved that with the following:

var subset = new[] { 9, 3, 9 };
var superset = new[] { 9, 10, 5, 3, 3, 3 };

var subGroups = subset.GroupBy(n => n).ToArray();
var superGroups = superset.GroupBy(n => n).ToArray();

var basicResult = subset.All(n => superset.Contains(n));
var advancedResult = subGroups.All(subg => superGroups.Any(supg => subg.Key == supg.Key && subg.Count() <= supg.Count()));

Console.WriteLine(basicResult);
Console.WriteLine(advancedResult);

I did a few extra tests and it seemed to work but you can test some additional data sets to be sure.

CodePudding user response:

Here is another solution :

            var subset = new[] { 9, 3, 9 };
            var superset = new[] { 9, 10, 5, 3, 3, 3 };

            var subsetGroup = subset.GroupBy(x => x).Select(x => new { key = x.Key, count = x.Count() });
            var supersetDict = superset.GroupBy(x => x).ToDictionary(x => x.Key, y => y.Count());

            Boolean results = subsetGroup.All(x => supersetDict[x.key] >= x.count);

CodePudding user response:

This works for me:

var subsetLookup = subset.ToLookup(x => x);
var supersetLookup = superset.ToLookup(x => x);

bool flag =
    subsetLookup
        .All(x => supersetLookup[x.Key].Count() >= subsetLookup[x.Key].Count());

CodePudding user response:

That's not how sets and set operations work. Sets cannot contain duplicates.

You should treat the two arrays not as sets, but as (unordered) sequences. A possible algorithm would be: make a list from the sequence superset, then remove one by one each element of the sequence subset from the list until you are unable to find such an element in the list.

bool IsSubList(IEnumerable<int> sub, IEnumerable<int> super)
{
    var list = super.ToList();
    foreach (var item in sub)
    {
        if (!list.Remove(item))
            return false; // not found in list, so sub is not a "sub-list" of super
    }
    return true; // all elements of sub were found in super
}

var subset = new[] { 9, 3 };
var superset = new[] { 9, 10, 5, 3,1, 3, 3 };

var isSubSet = IsSubList(subset, superset);
  • Related