Parallel nested operations return weird results

Time:02-26

I'm trying to use the Parallel library in my code and I'm facing a strange issue. I wrote a short program to demonstrate the behavior. In short, I have two loops, one nested inside the other. The first loop generates a random array of 200 integers and the second loop adds all the arrays to one big list. The issue is that, in the end, I don't get a multiple of 200 integers; instead, some runs don't seem to wait for the random array to be fully loaded. It's difficult to explain, so here is the sample code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace TestParallel
{
    class Program
    {
        static int RecommendedDegreesOfParallelism = 8;
        static int DefaultMaxPageSize = 200;

        static void Main(string[] args)
        {
            int maxPage = 50;
            List<int> lstData = new List<int>();
            Parallel.For(0, RecommendedDegreesOfParallelism, new ParallelOptions() { MaxDegreeOfParallelism = RecommendedDegreesOfParallelism },
                (index) =>
                {
                    int cptItems = 0;
                    int cptPage = 1 - RecommendedDegreesOfParallelism + index;
                    int idx = index;
                    do
                    {
                        cptPage += RecommendedDegreesOfParallelism;
                        if (cptPage > maxPage) break;

                        int Min = 0;
                        int Max = 20;
                        Random randNum = new Random();
                        int[] test2 = Enumerable
                            .Repeat(0, DefaultMaxPageSize)
                            .Select(i => randNum.Next(Min, Max))
                            .ToArray();
                        var lstItems = new List<int>();
                        lstItems.AddRange(test2);
                        var lstRes = new List<int>();
                        lstItems.AsParallel().WithDegreeOfParallelism(8).ForAll((item) =>
                        {
                            lstRes.Add(item);
                        });

                        Console.WriteLine($"{Task.CurrentId} = {lstRes.Count}");
                        lstData.AddRange(lstRes);
                        cptItems = lstRes.Count;
                    } while (cptItems == DefaultMaxPageSize);
                }
            );
            Console.WriteLine($"END: {lstData.Count}");
            Console.ReadKey();
        }
    }
}

And here is an execution log:

4 = 200
1 = 200
2 = 200
3 = 200
6 = 200
5 = 200
7 = 200
8 = 200
1 = 200
6 = 194
2 = 191
5 = 200
7 = 200
8 = 200
4 = 200
5 = 200
3 = 182
4 = 176
8 = 150
7 = 200
5 = 147
1 = 200
7 = 189
1 = 200
1 = 198
END: 4827

We can see that some loops return fewer than 200 items. How is that possible?

Answer:

This part is not thread-safe:

lstItems.AsParallel().WithDegreeOfParallelism(8).ForAll((item) =>
{
    lstRes.Add(item);
});

From the documentation for List<T>:

It is safe to perform multiple read operations on a List, but issues can occur if the collection is modified while it's being read. To ensure thread safety, lock the collection during a read or write operation. To enable a collection to be accessed by multiple threads for reading and writing, you must implement your own synchronization.

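The documentation doesn't mention it explicitly, but .Add() is itself a modification, so it can also fail when called simultaneously by multiple threads. Concurrent calls can silently lose items (or throw), which is why some of your runs end up with fewer than 200 items.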
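To see how items can go missing, here is a rough sketch of what a growable list does when you add to it. TinyList is a hypothetical, simplified model written for illustration; it is not the real List<T> source, and the field names are assumptions.

```csharp
using System;

// Simplified, hypothetical model of a growable list; NOT the real List<T> source.
class TinyList
{
    private int[] _items = new int[4];
    private int _size;

    public int Count => _size;

    public void Add(int item)
    {
        // Step 1: grow the backing array if it is full.
        if (_size == _items.Length)
            Array.Resize(ref _items, _items.Length * 2);

        // Step 2: write into the slot at the current size.
        // Two threads that read the same _size write to the same slot,
        // so one of the two items is overwritten.
        _items[_size] = item;

        // Step 3: increment the size. This read-modify-write is not atomic,
        // so two concurrent increments can collapse into one.
        _size++;
    }
}
```

None of the three steps is protected, so when two threads interleave them, one item can overwrite another and the count can end up lower than the number of Add() calls.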

One solution is to lock the calls to List<T>.Add() in the loop above, but the lock serializes every add, so it will likely be slower than simply adding the items in a single-threaded loop.

var locker = new object();

lstItems.AsParallel().WithDegreeOfParallelism(8).ForAll((item) =>
{
    lock (locker)
    {
        lstRes.Add(item);
    }
});
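If you'd rather avoid the explicit lock, two other approaches are possible; the following is only a sketch. The class name ParallelCollectSketch and its method names are made up for illustration, while ToList() on a parallel query and ConcurrentBag<T> are standard .NET APIs. Neither option guarantees the original item order.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper class, for illustration only.
static class ParallelCollectSketch
{
    // Option 1: let PLINQ build the result list itself,
    // so no list is shared between worker threads.
    public static List<int> CollectWithToList(List<int> lstItems)
    {
        return lstItems
            .AsParallel()
            .WithDegreeOfParallelism(8)
            .Select(item => item) // placeholder for real per-item work
            .ToList();
    }

    // Option 2: keep ForAll, but add into a thread-safe collection.
    public static List<int> CollectWithConcurrentBag(List<int> lstItems)
    {
        var bag = new ConcurrentBag<int>();
        lstItems.AsParallel().WithDegreeOfParallelism(8).ForAll(item => bag.Add(item));
        return bag.ToList();
    }
}
```

Both methods should return all 200 items for a 200-item input. Note that in the question's program, lstData.AddRange(lstRes) is also called from several threads by the outer Parallel.For, so it needs the same kind of protection (a lock or a thread-safe collection) for the final count to be reliable.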