How to implement a specialized overload of the List.RemoveAll method, with an index parameter in the

Time:01-06

List<T>.RemoveAll is quite a useful method that allows removing multiple items from a list efficiently. Unfortunately, in some scenarios I need some extra features that the method doesn't have, and some guarantees that the documentation doesn't provide. It also has questionable behavior in case the match predicate fails, which causes me PTSD and anxiety. So in this question I am asking for an implementation of the same method, in the form of an extension method, with these features and characteristics:

  1. Instead of a Predicate<T> it accepts a Func<T, int, bool> delegate, where the int is the zero-based index of the T item.
  2. It guarantees that the predicate will be invoked exactly once for each item, in a strictly ascending order.
  3. In case the predicate returns true for some items and then fails for another item, the items that have been elected for removal are removed from the list before the propagation of the exception.

Here is the signature of the extension method that I am trying to implement:

public static int RemoveAll<T>(this List<T> list, Func<T, int, bool> predicate);

It returns the number of elements that were removed.

I attempted to implement it using the existing implementation as a starting point, but it has some performance optimizations that make it quite complex, and injecting the desirable "exceptional" behavior is not obvious. I am interested in an implementation that is simple and reasonably efficient. Using LINQ in the implementation is not desirable, because it implies memory allocations that I would like to avoid.


Regarding the behavior of the built-in List<T>.RemoveAll that I called "questionable" earlier, here is what happens. In case the predicate fails for an item in the middle of the list, the items that have already been elected for removal are either not removed, or they are replaced with duplicates of other elements. In any case the size of the list remains the same as before. Here is a minimal demonstration of this behavior:

List<int> list = new(Enumerable.Range(1, 15));
Console.WriteLine($"Before RemoveAll: [{String.Join(", ", list)}]");
try
{
    list.RemoveAll(item =>
    {
        if (item == 10) throw new Exception();
        bool removeIt = item % 2 == 1;
        if (removeIt) Console.WriteLine($"Removing #{item}");
        return removeIt;
    });
}
catch { } // Ignore the error for demonstration purposes
finally
{
    Console.WriteLine($"After RemoveAll: [{String.Join(", ", list)}]");
}

The list has 15 numbers, and the intention is to remove the odd numbers from the list. The predicate fails for the 10th number.

Output:

Before RemoveAll: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
Removing #1
Removing #3
Removing #5
Removing #7
Removing #9
After RemoveAll: [2, 4, 6, 8, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]

Online demo.

As you can see the numbers 1 and 3 have been removed, the 5, 7 and 9 are still there, and the numbers 6 and 8 have been duplicated (there are two occurrences of each). This is not a desirable behavior for my scenarios. The desirable output should be:

After RemoveAll: [2, 4, 6, 8, 10, 11, 12, 13, 14, 15]

I have reported this behavior to Microsoft, and the feedback that I've got is that in case of failure any outcome is equally corruptive. From their point of view there is no difference between the above outputs. Both represent a state that is neither the original nor the final/expected/correct. So they don't think that there is any bug that needs to be fixed. They also believe that this behavior is not surprising or unexpected, so there is no need to document it.

CodePudding user response:

This solution is based on the idea to separate the selection of the items to be removed from the removal itself.

This has the following advantages:

  • If during the selection process, an exception occurs, the list will be left untouched
  • The removal process can only fail in catastrophic cases (OutOfMemoryException etc.)

But of course also some disadvantages:

  • it requires extra memory to store the intermediate selection result
  • some optimizations might not be as effective

Because of the mentioned optimizations, I chose to base the selection result on ranges instead of individual indexes, so we can use List.RemoveRange, which is more efficient than individual RemoveAt calls (assuming that there are in fact ranges with more than one element).

public static List<(int start, int count)> GetIndexRanges<T>(this List<T> list, 
    Func<T, int, bool> predicate)
{
    var result = new List<(int start, int count)>();
    int start = -1;
    for (var i = 0; i < list.Count; i++)
    {
        // see note 1 below
        bool toBeRemoved = predicate(list[i], i);
        if (toBeRemoved)
        {
            if (start < 0)
                start = i; // new range starts
        }
        else if (start >= 0)
        {
            // range finished
            result.Add((start, i - start));
            start = -1;
        }
    }
    if (start >= 0)
    {
        // orphan range at the end
        result.Add((start, list.Count - start));
    }
    return result;
}

public static int RemoveIndexRanges<T>(this List<T> list, 
    List<(int start, int count)> ranges)
{
    var removed = 0;
    foreach (var range in ranges)
    {
        // the "- removed" is there to take into account 
        // that deletion moves the indexes.
        list.RemoveRange(range.start - removed, range.count);
        removed += range.count;
    }
    return removed;
}

Usage:

var ranges = list.GetIndexRanges((item, index) =>
    {
        //if (item == 10) throw new Exception();
        return item % 2 == 1;
    });
// See note 2 below
list.RemoveIndexRanges(ranges);

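Note 1: As is, this code does not fulfill requirement #3 (which I would doubt to be a useful requirement). If wanted, this can be approximated by wrapping the predicate call in a try...catch. Be aware that the empty catch below swallows the exception and keeps evaluating the remaining items, so the error is not propagated to the caller: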

bool toBeRemoved = false;
try { toBeRemoved = predicate(list[i], i); } catch { }
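
Requirement #3 asks for the exception to propagate after the already-selected items are removed, which an empty catch does not do. Below is a minimal sketch of one way to get that with the range-based approach. It is not part of the original answer: the names RemoveAllByRanges and ListRangeExtensions are hypothetical, and the sketch merges selection and removal into one method so the list cannot change between the two steps.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.ExceptionServices;

// Hypothetical helper class; the names are chosen for illustration only.
public static class ListRangeExtensions
{
    public static int RemoveAllByRanges<T>(this List<T> list, Func<T, int, bool> predicate)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));

        var ranges = new List<(int start, int count)>();
        ExceptionDispatchInfo? failure = null;
        int start = -1;
        int i = 0;
        for (; i < list.Count; i++)
        {
            bool toBeRemoved;
            try
            {
                toBeRemoved = predicate(list[i], i);
            }
            catch (Exception ex)
            {
                // Remember the failure (with its stack trace) and stop selecting.
                failure = ExceptionDispatchInfo.Capture(ex);
                break;
            }

            if (toBeRemoved)
            {
                if (start < 0) start = i; // new range starts
            }
            else if (start >= 0)
            {
                ranges.Add((start, i - start)); // range finished
                start = -1;
            }
        }

        // Close a range that is still open, either at the end of the list
        // or at the index where the predicate failed.
        if (start >= 0) ranges.Add((start, i - start));

        int removed = 0;
        foreach (var (rangeStart, count) in ranges)
        {
            // "- removed" compensates for indexes shifted by earlier removals.
            list.RemoveRange(rangeStart - removed, count);
            removed += count;
        }

        // Rethrow only after the selected items are gone.
        failure?.Throw();
        return removed;
    }
}
```

With the list and predicate from the question (odd numbers removed, failure on item 10), this should leave [2, 4, 6, 8, 10, 11, 12, 13, 14, 15] in the list and then rethrow the original exception.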

Note 2: As this is now a two-step process, it will fail if the list changes between the calls.

CodePudding user response:

"So they don't think that there is any bug that needs to be fixed. They also believe that this behavior is not surprising or unexpected, so there is no need to document it."

They're correct. The method is documented as:

"Removes all the elements that match the conditions defined by the specified predicate."

This supports two scenarios: the predicate returning true, removing an element, or false for leaving it as-is. A predicate throwing an exception is not a use case intended to be supported.

If you want to be able to pass a predicate that may throw, you could wrap it like this:

public static int RemoveAll<T>(this List<T> list, Func<T, int, bool> predicate)
{
    Exception? caught = null;
    int index = 0;
    int removed = 0;

    list.RemoveAll(item =>
    {
        // Ignore the rest of the list once thrown
        if (caught != null) return false;

        try
        {
            // Post-increment so that each item is passed its own zero-based index
            var remove = predicate(item, index++);
            if (remove)
            {
                removed++;
            }

            return remove;
        }
        catch (Exception e)
        {
            caught = e;
            return false;
        }
    });

    if (caught != null)
    {
        throw caught;
    }

    return removed;
}
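
The wrapper above relies on the built-in List<T>.RemoveAll invoking the predicate once per item in ascending order, which the question notes is not guaranteed by the documentation. As an alternative, here is a minimal sketch that does not call the built-in method at all: a single pass that compacts the kept items in place and trims the stale slots in a finally block. It allocates nothing. The class name ListRemoveAllExtensions is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; only the method signature comes from the question.
public static class ListRemoveAllExtensions
{
    public static int RemoveAll<T>(this List<T> list, Func<T, int, bool> predicate)
    {
        if (list == null) throw new ArgumentNullException(nameof(list));
        if (predicate == null) throw new ArgumentNullException(nameof(predicate));

        int read = 0;  // index of the item currently being examined
        int write = 0; // next slot for an item that is kept
        try
        {
            for (; read < list.Count; read++)
            {
                T item = list[read];
                // May throw; the finally block below then trims the list.
                if (predicate(item, read)) continue;
                if (write < read) list[write] = item; // shift the kept item left
                write++;
            }
        }
        finally
        {
            // Slots [write, read) hold removed items or stale copies of items
            // that were already shifted left. Items from 'read' onward have
            // not been examined yet and are preserved.
            if (write < read) list.RemoveRange(write, read - write);
        }

        return read - write; // number of removed items
    }
}
```

Run against the demonstration from the question (odd numbers removed, failure on item 10), this sketch should leave [2, 4, 6, 8, 10, 11, 12, 13, 14, 15] in the list before the exception reaches the caller. It assumes the predicate does not modify the list while the removal is in progress.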

CodePudding user response:

I don't know how Microsoft wrote this method.

I tried a few code snippets and found a case that works.

The actual problem is your throw new Exception(). If you remove that line, your code runs perfectly. The exception triggers some other code path, but I don't know what it is.

if (item >= 10) return false;
bool removeIt = item % 2 == 1;
if (removeIt) Console.WriteLine($"Removing #{item}");
return removeIt;

EDIT: I found this.

Actually, the Func<T, int, bool> delegate does not delete any item itself. It only returns a boolean. If it returns true, the item is removed from the list. If it returns false, the item is not removed from the list.
