I have a Task
collection which must be run either concurrently or sequentially.
Concurrently:
public Task Foo(CancellationToken token)
{
// ...
IEnumerable<Task> tasks = GetTasks(token);
// ...
return Task.WhenAll(tasks);
}
But I'm unsure about sequentially:
public Task Bar(CancellationToken token) // no `async` keyword by design
{
// ...
IEnumerable<Task> tasks = GetTasks(token);
// ...
// (1)
return Task.Run(async () => {
foreach (var task in tasks)
await task.WaitAsync(token);
}, token);
// (2)
return Task.Run(async () => {
foreach (var task in tasks)
await Task.Run(() => task, token);
}, token);
// (3)
foreach (var task in tasks)
yield return Task.Run(() => task, token);
// (?)
}
Which is the safest and modern idiomatic way to do this in C#10: 1, 2 or 3? Or perhaps there's a better/safer way?
This code is run about every second, for about ten tasks - performance is not paramount, but nonetheless important. (This is run in an ASP.NET Core server application.)
UPDATE
As mentioned in the comments, the implementation of GetTasks()
is important. Please indicate how you expect it should be implemented, so the calling code above can run the tasks sequentially.
CodePudding user response:
This depends heavily on how GetTasks
is implemented.
Let's discuss two options:
// Option 1:
IEnumerable<Task> GetTasks(CancellationToken cancellationToken)
{
foreach (var ... in ...)
yield return Task.Run(<some sync method>);
// alternatively
foreach (var ... in ...)
yield return <some real async method>();
}
// Option 2:
IEnumerable<Task> GetTasks(CancellationToken cancellationToken)
{
return (from ... in ...
select Task.Run(<some sync method>)).ToList();
// alternatively
return (from ... in ...
select <some real async method>()).ToList();
}
In Option 1
the caller is able to decide whether the tasks should run concurrently or sequentially because each started task is yielded sequentially.
In Option 2
all tasks are currently running in parallel and it doesn't matter how you await them.
Let's assume Option 1
(or something similar) is used:
In that case you can run them in parallel by using:
public Task Bar(CancellationToken token)
{
// ToList collects all Tasks without any await so all are running in parallel
// Maybe ToList is not needed and Task.WhenAll does that for you - I'm not sure about that
return Task.WhenAll(GetTasks(token).ToList());
}
Or you retrieve one task after another (sequential):
public async Task Bar(CancellationToken token)
{
// This will await one task after another and so they're running sequential
foreach (var item in GetTasks(token))
await item;
}
I don't really understand why Bar
is not async
by design because async
doesn't change the method signature and is completely transparent to the caller. But if needed an implementation might be:
public Task Bar(CancellationToken token)
{
return Task.Run(Body);
async Task Body()
{
// This will await one task after another and so they're running sequential
foreach (var item in GetTasks(token))
await item;
}
}
The main difference between parallel-Bar
and sequential-Bar
is the exception handling. If an exception occurs in one of the task the parallel implementation will even run them all until they're finished. The sequential implementation will stop starting new tasks after the first exception occurs. This can be fixed using a try...catch
and buffering all the tasks and raise an AggregateException
after all tasks are finished.
According to your implementations: Please keep in mind that WaitAsync
will only stop awaiting the task if the CancellationToken requests a stop. This will not stop the running task itself.
The same for Task.Run
. If the CancellationToken is already set it will prevent the Task from beeing scheduled but the already running task in the context object will not be affected.
GetTask
needs to be implemented in a way that the CancellationToken is respected.
There is no need to wrap an already running task with Task.Run
.
EDIT
I searched for the source of Task.WhenAll
and it looks that it does the parallel retrieving of all tasks for you. There is also a performance optimization if the parameter is already an array. So the shown implementation of the parallel-Bar
should call ToArray
instead of ToList
or forward the IEnumerable<Task>
without collecting all the items so Task.WhenAll
will do this for you.
CodePudding user response:
// no ``async`` keyword by design
is a very bad smell; and options 1 and 2 won't compile without async
.
Having
static async Task MethodAsync(int i) { WriteLine($"<{i}"); await Task.Delay(100); WriteLine($"{i}>"); }
static IEnumerable<Task> GetTasks()
{
for(int i=0 ; i<5;i )
{
yield return MethodAsync(i);
}
}
we can execute and await this tasks sequentially with a foreach
:
var tasks = GetTasks();
foreach (var t in tasks) // Please note it is the enumeration which starts the tasks one by one
{
await t;
}
This prints
<0
0>
<1
1>
<2
2>
<3
3>
<4
4>
To illustrate that it's the foreach - not await
- that starts the tasks let's use enumerators directly:
var tasks = GetTasks();
var enumerator = tasks.GetEnumerator();
var b = true;
while (b)
{
Console.WriteLine("Moving to next");
b = enumerator.MoveNext();
Console.WriteLine("Moved");
if(b)
{
await enumerator.Current;
Console.WriteLine("After await");
}
}
This prints:
Moving to next
<0
Moved
0>
After await
Moving to next
<1
Moved
1>
After await
Moving to next
<2
Moved
2>
After await
Moving to next
<3
Moved
3>
After await
Moving to next
<4
Moved
4>
After await
Moving to next
Moved
It's important to note that GetTasks
is in charge here.
With the following implementation the tasks will already be started (or scheduled to start if there are a lot of them) by the time GetTasks() returns.
static IEnumerable<Task> GetTasks()
{
List<Task> tasks = new ();
for(int i=0 ; i<5;i )
{
tasks.Add(MethodAsync(i));
}
return tasks;
}
In this case foreach will only wait sequentially (which in not that useful).
With
var tasks = GetTasks();
foreach (var t in tasks)
{
await t;
}
the result will something like this:
<0
<1
<2
<3
<4
1>
4>
3>
0>
2>