Home > Net >  What's the danger of adding to a list in one thread and removing from another without locking i
What's the danger of adding to a list in one thread and removing from another without locking i

Time:01-12

Preamble: (I don't know if this is the right word for it)

I know that the List is not thread-safe and am aware of the existence of concurrent collections such as ConcurrentBag, ConcurrentQueue, etc., and know how to use locks, to some extent. I just want to know what's the danger(s) would be if I do such a thing.

Say, if I have 2 threads running concurrently, one adding value to a List, the other removing elements from the same List, what's the danger(s) would be?

I know that the danger might be critical if I add elements in both threads since the List internally resizing and reindexing the collection, and the "race" is going to corrupt that, but Add and Remove are essentially opposite operations, I don't see there is any "conflict" in it.

I also know that Remove internally resizing and reindexing the collection too, but to be honest I don't know how they internally work, I don't really know if "2 opposite operations will still corrupt the data". If yes, in what way? (Except for the obvious danger of "there might be nothing to be removed because the element is not being added yet")

So, could somebody please so kind enlighten me?

Much appreciated!

CodePudding user response:

if I have 2 threads running concurrently, one adding value to a List, the other removing elements from the same List, what's the danger(s) would be?

The most likely faults would be that added values are lost, or that removed values are not actually removed. There can also be exceptions since the list might get into an invalid state.

While add and remove could be considered opposite operations, they are not atomic, meaning they will do multiple things, and will not work correctly if something changes in the middle of the operation. Contrast with Interlocked.Increment & Interlocked.Decrement that are atomic, so can safely be used from multiple threads.

In the end the only thing you should be concerned about is if it is documented as threadsafe or not. If it is thread safe you can just use it. If it is not you need to use locks whenever it is used from multiple threads. The correct behavior of the program is much more important than minor things like the performance impact of locks. If correct behavior is not required you can gain an near infinite speedup by just replacing your whole program with return 0;

There is one possible exception, concurrently reading memory is safe, so if you are sure the method do not mutate the object in any way it could be used concurrently without any locking. However, most classes do not specify if a method is guaranteed to be read-only, so use as your own risk.

CodePudding user response:

It's easy to demonstrate that this use of List<T> will cause problems.

Consider the following program (also on DotNetFiddle):

public class Program
{
    public static void Main()
    {
        List<int> list = new();
        Task.Run(() => addItems(list));
        Task.Run(() => removeItems(list));

        Console.WriteLine("Started. Press <return> to exit");
        Console.ReadLine();
    }

    static void addItems(List<int> list)
    {
        for (int i = 0; i < 100_000;   i)
        {
            list.Add(i);
        }

        Volatile.Write(ref finished, true);

        Console.WriteLine("Finished writing items");
    }

    static void removeItems(List<int> list)
    {
        int n = 0;

        while (true)
        {
            if (list.Count > 0)
            {
                  n;
                list.RemoveAt(0);
            }
            else
            {
                bool done = Volatile.Read(ref finished);

                if (done)
                    break;
            }
        }

        Console.WriteLine($"Finished reading {n} items");
    }

    static bool finished;
}

This starts two threads: One which writes 100K items to a list, and another that reads the items from that list.

The finished field is just there to indicate that the addItems() task has completed adding items.

The readItems() task counts the number of items that it's read from the list. Clearly, the resulting number should equal the number of items added to the list.

However, you'll find that it sometimes reads more items than were written, because the multithreaded access to the list has caused a bug. (You may have to run it several times to see this problem.)

The moral of this story is DON'T DO THAT!


Just as a sidenote, if you add a lock around the list access, it will work correctly and print the correct number of items:

public class Program
{
    public static void Main()
    {
        List<int> list = new();
        Task.Run(() => addItems(list));
        Task.Run(() => removeItems(list));

        Console.WriteLine("Started. Press <return> to exit");
        Console.ReadLine();
    }

    static void addItems(List<int> list)
    {
        for (int i = 0; i < 100_000;   i)
        {
            lock (locker)
            {
                list.Add(i);
            }
        }

        Volatile.Write(ref finished, true);

        Console.WriteLine("Finished writing items");
    }

    static void removeItems(List<int> list)
    {
        int n = 0;

        while (true)
        {
            lock (locker)
            {
                if (list.Count > 0)
                {
                      n;
                    list.RemoveAt(0);
                }
                else
                {
                    bool done = Volatile.Read(ref finished);

                    if (done)
                        break;
                }
            }
        }

        Console.WriteLine($"Finished reading {n} items");
    }

    static object locker = new();

    static bool finished;
}
  • Related