I have a Task
collection which must be run either concurrently or sequentially.
Concurrently:
public Task Foo(CancellationToken token)
{
// ...
IEnumerable<Task> tasks = GetTasks(token);
// ...
return Task.WhenAll(tasks);
}
But I'm unsure about sequentially:
public Task Bar(CancellationToken token) // no `async` keyword by design
{
// ...
IEnumerable<Task> tasks = GetTasks(token);
// ...
// (1)
return Task.Run(async () => {
foreach (var task in tasks)
await task.WaitAsync(token);
}, token);
// (2)
return Task.Run(async () => {
foreach (var task in tasks)
await Task.Run(() => task, token);
}, token);
// (3)
foreach (var task in tasks)
yield return Task.Run(() => task, token);
// (?)
}
Which is the safest and modern idiomatic way to do this in C#10: 1, 2 or 3? Or perhaps there's a better/safer way?
This code is run about every second, for about ten tasks - performance is not paramount, but nonetheless important. (This is run in an ASP.NET Core server application.)
CodePudding user response:
This depends heavily on how GetTasks
is implemented.
Let's discuss two options:
// Option 1:
IEnumerable<Task> GetTasks(CancellationToken cancellationToken)
{
foreach (var ... in ...)
yield return Task.Run(<some sync method>);
// alternatively
foreach (var ... in ...)
yield return <some real async method>();
}
// Option 2:
IEnumerable<Task> GetTasks(CancellationToken cancellationToken)
{
return (from ... in ...
select Task.Run(<some sync method>)).ToList();
// alternatively
return (from ... in ...
select <some real async method>()).ToList();
}
In Option 1
the caller is able to decide whether the tasks should run concurrently or sequentially because each started task is yielded sequentially.
In Option 2
all tasks are currently running in parallel and it doesn't matter how you await them.
Let's assume Option 1
(or something similar) is used:
In that case you can run them in parallel by using:
public Task Bar(CancellationToken token)
{
// ToList collects all Tasks without any await so all are running in parallel
// Maybe ToList is not needed and Task.WhenAll does that for you - I'm not sure about that
return Task.WhenAll(GetTasks(token).ToList());
}
Or you retrieve one task after another (sequential):
public async Task Bar(CancellationToken token)
{
// This will await one task after another and so they're running sequential
foreach (var item in GetTasks(token))
await item;
}
I don't really understand why Bar
is not async
by design because async
doesn't change the method signature and is completely transparent to the caller. But if needed an implementation might be:
public Task Bar(CancellationToken token)
{
return Task.Run(Body);
async Task Body()
{
// This will await one task after another and so they're running sequential
foreach (var item in GetTasks(token))
await item;
}
}
The main difference between parallel-Bar
and sequential-Bar
is the exception handling. If an exception occurs in one of the task the parallel implementation will even run them all until they're finished. The sequential implementation will stop starting new tasks after the first exception occurs. This can be fixed using a try...catch
and buffering all the tasks and raise an AggregateException
after all tasks are finished.
According to your implementations: Please keep in mind that WaitAsync
will only stop awaiting the task if the CancellationToken requests a stop. This will not stop the running task itself.
The same for Task.Run
. If the CancellationToken is already set it will prevent the Task from beeing scheduled but the already running task in the context object will not be affected.
GetTask
needs to be implemented in a way that the CancellationToken is respected.
There is no need to wrap an already running task with Task.Run
.
EDIT
I searched for the source of Task.WhenAll
and it looks that it does the parallel retrieving of all tasks for you. There is also a performance optimization if the parameter is already an array. So the shown implementation of the parallel-Bar
should call ToArray
instead of ToList
or forward the IEnumerable<Task>
without collecting all the items so Task.WhenAll
will do this for you.
CodePudding user response:
// no ``async`` keyword by design
is very bad smell; and options 1 and 2 won't compile without async
.
Please note that at the point when you reach await
the task may be well under way - GetTasks
may have already started them all.
I think you're over-thinking this. In many situations when you want to run two tasks sequentially you write something like this:
await Method1Async();
await Method2Async();
If the tasks come as a collection and one way is wait for their completion to use the good old foreach
.
public async Task BarAsync(CancellationToken token)
{
IEnumerable<Task> tasks = GetTasks(token);
// If GetTasks started the tasks then we are fooling ourselves
// that we're executing sequentially. GetTasks is in charge.
foreach(var t in tasks)
{
await t[.ConfigureAwait(false));
}
}