Asynchronous programming for method loops

I am trying to learn about asynchronous programming and how I can benefit from it.

My hope is that I can use it to improve performance whenever I'm looping over a method that takes a significant amount of time to complete, like the following method.

string AddStrings()
{
    string result = "";

    for (int i = 0; i < 10000; i++)
    {
        result += "hi";
    }

    return result;
}

Obviously this method doesn't have much value, and I purposely made it inefficient in order to test the benefits of asynchronous programming. The test is done by looping over the method 100 times, first synchronously and then asynchronously.

Stopwatch watch = new Stopwatch();
watch.Start();

List<string> results = WorkSync();
//List<string> results = await WorkAsyncParallel();

watch.Stop();
Console.WriteLine(watch.ElapsedMilliseconds);

List<string> WorkSync()
{
    var stringList = new List<string>();

    for (int i = 0; i < 100; i++)
    {
        stringList.Add(AddStrings());
    }

    return stringList;
}

async Task<List<string>> WorkAsyncParallel()
{
    var taskList = new List<Task<string>>();

    for (int i = 0; i < 100; i++)
    {
        taskList.Add(Task.Run(() => AddStrings()));
    }

    var results = (await Task.WhenAll(taskList)).ToList();
    return results;
}

Super optimistically (naively), I was hoping that the asynchronous loop would be 100 times as fast as the synchronous loop, as all the tasks are running at the same time. While that didn't exactly turn out to be the case, the time of the loop was decreased by more than two thirds, from around 5000 milliseconds to 1500 milliseconds!

Now my questions are:

What makes the asynchronous loop faster than the synchronous loop, but not nearly 100 times faster? I'm guessing each of the 100 tasks is fighting for a limited amount of CPU?
Is this a valid method to improve performance when looping methods?

Thank you in advance.

CodePudding user response：

My hope is that I can use it to improve performance

Not really. Concurrency (parallel or asynchronous) can improve performance, but asynchronous code on its own is more about freeing up threads. Asynchronous code provides two main benefits:

Server-side apps get better scalability. By using fewer threads, asynchronous code can scale further and faster than synchronous code.
UI apps get better responsiveness. By freeing up the UI thread, asynchronous code provides a better user experience.

Neither of these has much to do with performance. E.g., when comparing a synchronous server-side request handler with its basic asynchronous counterpart, the asynchronous one is usually (slightly) slower, but the server as a whole scales better.

That said, asynchronous code can enable natural concurrency (e.g., Task.WhenAll). And if you do end up adding concurrency, then you can see some performance benefits - sometimes quite drastic ones.

The test is done by looping over the method 100 times, first synchronously and then asynchronously.

Technically, you're comparing single-threaded and parallel code, here.

As Jeroen pointed out in the comments, Parallel or PLINQ are the proper tools to use if you have a parallel problem. While you can use Task.Run, it's a very low-level tool for parallel programming. Task.Run is more commonly used for "shove this one thing to the thread pool", not "shove these 100 things to the thread pool".
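As a rough sketch of what that could look like for the test in the question, the 100 calls can be handed to Parallel.For or to a PLINQ query instead of 100 separate Task.Run calls. The names WorkParallelFor and WorkPlinq are made up for this illustration; AddStrings is the method from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Same deliberately slow method as in the question.
string AddStrings()
{
    string result = "";
    for (int i = 0; i < 10000; i++)
    {
        result += "hi";
    }
    return result;
}

// Parallel.For: each iteration writes to its own array slot,
// so no locking is needed.
List<string> WorkParallelFor()
{
    var results = new string[100];
    Parallel.For(0, results.Length, i => results[i] = AddStrings());
    return results.ToList();
}

// PLINQ: the same work expressed as a parallel query.
List<string> WorkPlinq()
{
    return Enumerable.Range(0, 100)
        .AsParallel()
        .Select(_ => AddStrings())
        .ToList();
}

List<string> viaParallelFor = WorkParallelFor();
List<string> viaPlinq = WorkPlinq();
Console.WriteLine($"{viaParallelFor.Count} and {viaPlinq.Count} results");
```

Both versions block the calling thread until all the work is done, which is fine for a console test like this one.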

What makes the asynchronous loop faster than the synchronous loop, but not nearly 100 times faster? I'm guessing each of the 100 tasks is fighting for a limited amount of CPU?

Yes. Most machines these days have multi-core CPUs, and each core can do one thing at a time. So if you have 4 or 8 cores, that's the limit of your parallelism.
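To get a feel for that limit on a given machine, something like the following could be used. It is only a back-of-the-envelope estimate: it takes the roughly 5000 ms synchronous time from the question and assumes the work splits perfectly across cores, which real workloads rarely do.

```csharp
using System;

// Number of logical processors visible to this process.
int cores = Environment.ProcessorCount;

// Synchronous time reported in the question (about 5000 ms).
double syncMs = 5000;

// Ideal lower bound if the work were split perfectly across all cores.
double bestCaseMs = syncMs / cores;

Console.WriteLine($"Logical processors: {cores}");
Console.WriteLine($"Best-case parallel time: about {bestCaseMs:F0} ms");
```

The measured drop from 5000 ms to 1500 ms is a speedup of about 3.3x, which would be consistent with a machine that has around four cores, though that is a guess about the hardware used.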

Is this a valid method to improve performance when looping methods?

If you have a lot of work to do in parallel, then parallel programming is acceptable. Again, I recommend using higher-level constructs like Parallel or PLINQ: their built-in partitioning strategies and other optimizations make them more efficient than throwing a bunch of tasks at the thread pool.

Parallel programming does have its caveats:

If your work is too fine-grained, then parallel work can end up being slower, as the overhead of partitioning, queueing, and scheduling erases the gains from concurrency.
You generally want to avoid parallelism in some situations such as handling a request on the server side. It's just usually not a good idea to allow one request to consume all the CPU resources of the whole server.
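For the first caveat, one common way to make the work coarser is to give each parallel call a whole range of items rather than a single item. The sketch below uses Partitioner.Create to do that; the array contents and its size are made up for the illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Illustrative data: one million small numbers.
int[] data = new int[1_000_000];
for (int i = 0; i < data.Length; i++)
{
    data[i] = i % 10;
}

long total = 0;

// Partitioner.Create(from, to) splits the index range into chunks, so each
// delegate call handles many elements instead of just one.
Parallel.ForEach(Partitioner.Create(0, data.Length), range =>
{
    long local = 0;
    for (int i = range.Item1; i < range.Item2; i++)
    {
        local += data[i];
    }

    // Combine once per chunk instead of once per element.
    Interlocked.Add(ref total, local);
});

Console.WriteLine(total);
```

Adding up a million integers is cheap enough that a plain loop may still win; the point is only the shape of the code, where per-item overhead is paid once per chunk.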

Now, if you want to test out asynchronous benefits, then I recommend using an operation that is inherently asynchronous (e.g., a client web request), such as code that hits 100 URLs. That would be something more naturally asynchronous that doesn't require thread pool threads.
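A minimal sketch of that kind of test is below. To keep it runnable without a network, it uses Task.Delay as a stand-in for the web request; FakeDownloadAsync is a hypothetical helper, and real code might call HttpClient.GetStringAsync with actual URLs instead.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a web request: waits 50 ms without blocking a thread.
async Task<string> FakeDownloadAsync(int id)
{
    await Task.Delay(50);
    return $"response {id}";
}

var watch = Stopwatch.StartNew();

// Start all 100 operations first, then await them together.
List<Task<string>> tasks = Enumerable.Range(0, 100)
    .Select(i => FakeDownloadAsync(i))
    .ToList();
string[] responses = await Task.WhenAll(tasks);

watch.Stop();
Console.WriteLine($"{responses.Length} responses in {watch.ElapsedMilliseconds} ms");
```

Awaiting the 100 operations one after another would take at least 100 x 50 ms = 5000 ms. Started together, they should finish in a small fraction of that, because no thread is tied up while each one waits.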