Home > Software engineering >  Run task collection concurrently and sequentially in C#10
Run task collection concurrently and sequentially in C#10

Time:07-05

I have a Task collection which must be run either concurrently or sequentially.

Concurrently:

public Task Foo(CancellationToken token)
{
  // ...
  IEnumerable<Task> tasks = GetTasks(token);
  // ...
  return Task.WhenAll(tasks);
}

But I'm unsure about sequentially:

public Task Bar(CancellationToken token)      // no `async` keyword by design
{
  // ...
  IEnumerable<Task> tasks = GetTasks(token);
  // ...

  // (1)
  return Task.Run(async () => {
    foreach (var task in tasks)
      await task.WaitAsync(token);
  }, token);

  // (2)
  return Task.Run(async () => {
    foreach (var task in tasks)
      await Task.Run(() => task, token);
  }, token);

  // (3)
  foreach (var task in tasks)
    yield return Task.Run(() => task, token);

  // (?)
}

Which is the safest and modern idiomatic way to do this in C#10: 1, 2 or 3? Or perhaps there's a better/safer way?

This code is run about every second, for about ten tasks - performance is not paramount, but nonetheless important. (This is run in an ASP.NET Core server application.)

CodePudding user response:

This depends heavily on how GetTasks is implemented.

Let's discuss two options:

// Option 1:
IEnumerable<Task> GetTasks(CancellationToken cancellationToken)
{
    foreach (var ... in ...)
       yield return Task.Run(<some sync method>);
       
    // alternatively
    foreach (var ... in ...)
       yield return <some real async method>();
}

// Option 2:
IEnumerable<Task> GetTasks(CancellationToken cancellationToken)
{
    return (from ... in ...
            select Task.Run(<some sync method>)).ToList();

    // alternatively
    return (from ... in ...
            select <some real async method>()).ToList();
}

In Option 1 the caller is able to decide whether the tasks should run concurrently or sequentially because each started task is yielded sequentially.

In Option 2 all tasks are currently running in parallel and it doesn't matter how you await them.

Let's assume Option 1 (or something similar) is used:
In that case you can run them in parallel by using:

public Task Bar(CancellationToken token)
{
    // ToList collects all Tasks without any await so all are running in parallel
    // Maybe ToList is not needed and Task.WhenAll does that for you - I'm not sure about that
    return Task.WhenAll(GetTasks(token).ToList());
}

Or you retrieve one task after another (sequential):

public async Task Bar(CancellationToken token)
{
    // This will await one task after another and so they're running sequential
    foreach (var item in GetTasks(token))
        await item;
}

I don't really understand why Bar is not async by design because async doesn't change the method signature and is completely transparent to the caller. But if needed an implementation might be:

public Task Bar(CancellationToken token)
{
    return Task.Run(Body);
    
    async Task Body()
    {
        // This will await one task after another and so they're running sequential
        foreach (var item in GetTasks(token))
            await item;
    }
}

The main difference between parallel-Bar and sequential-Bar is the exception handling. If an exception occurs in one of the task the parallel implementation will even run them all until they're finished. The sequential implementation will stop starting new tasks after the first exception occurs. This can be fixed using a try...catch and buffering all the tasks and raise an AggregateException after all tasks are finished.

According to your implementations: Please keep in mind that WaitAsync will only stop awaiting the task if the CancellationToken requests a stop. This will not stop the running task itself.
The same for Task.Run. If the CancellationToken is already set it will prevent the Task from beeing scheduled but the already running task in the context object will not be affected.
GetTask needs to be implemented in a way that the CancellationToken is respected.

There is no need to wrap an already running task with Task.Run.

EDIT

I searched for the source of Task.WhenAll and it looks that it does the parallel retrieving of all tasks for you. There is also a performance optimization if the parameter is already an array. So the shown implementation of the parallel-Bar should call ToArray instead of ToList or forward the IEnumerable<Task> without collecting all the items so Task.WhenAll will do this for you.

CodePudding user response:

// no ``async`` keyword by design is very bad smell; and options 1 and 2 won't compile without async.

Please note that at the point when you reach await the task may be well under way - GetTasks may have already started them all.


I think you're over-thinking this. In many situations when you want to run two tasks sequentially you write something like this:

await Method1Async();
await Method2Async();

If the tasks come as a collection and one way is wait for their completion to use the good old foreach.

public async Task BarAsync(CancellationToken token)     
{
  IEnumerable<Task> tasks = GetTasks(token); 

  // If GetTasks started the tasks then we are fooling ourselves 
  // that we're executing sequentially. GetTasks is in charge.
  foreach(var t in tasks)
  {
     await t[.ConfigureAwait(false));
  }
}
  • Related