I thought I had a decent grasp of async/await in C#, but this code I am working on has me questioning what is happening behind the scenes.
I have a sync method that takes a bit of time to run, and I have a loop where I need to call this method anywhere from 500-2500 times. I basically want to call the method async so I can fire off all of the method calls and then pick up the results later.
This works as I would expect:
public async Task SomeMethod(List<Foo> fooList)
{
    var taskList = new List<Task<MyData>>();
    foreach (var foo in fooList)
    {
        // Queue the blocking call on the thread pool.
        taskList.Add(Task.Run(() => LongRunningMethod(foo)));
    }
    // do other stuff
    await Task.WhenAll(taskList);
}

public MyData LongRunningMethod(Foo foo)
{
    // Blocks the calling thread for 10 seconds.
    Thread.Sleep(10000);
    return new MyData();
}
What I see happen - all of the tasks are added to the taskList very quickly, then I see all of the LongRunningMethod calls executing on a variety of threads, taking 10 seconds, and completing. When the first method hits the await Task.WhenAll, it sits there until all of the LongRunningMethod calls are complete.
However, as I was messing around I tried this iteration:
public async Task SomeMethod(List<Foo> fooList)
{
    var taskList = new List<Task<MyData>>();
    foreach (var foo in fooList)
    {
        // No Task.Run here: the async method is called directly.
        taskList.Add(LongRunningMethod(foo));
    }
    // do other stuff
    await Task.WhenAll(taskList);
}

public async Task<MyData> LongRunningMethod(Foo foo)
{
    // Waits 10 seconds without blocking a thread.
    await Task.Delay(10000);
    return new MyData();
}
The behavior is very different. Here, I see all the tasks added very quickly, but they are all on the same thread, and there is only about a 10 second delay overall before the await Task.WhenAll completes. It seems like each individual task executes to completion instantly, where I would expect it to run like the first iteration, where each task takes 10 seconds to run.
What is the difference between how these two iterations work behind the scenes, and which is the proper way to accomplish my goal (firing off all of the LongRunningMethod calls, ideally have them all processing simultaneously to speed things up, and waiting for them all to finish before I move on)?
Edit to add details requested by Damien: This is a .NET 6 worker service (so there is no synchronization context, correct?). The goal is to cut down the amount of time this process takes to run by kicking off multiple LongRunningMethod calls at once instead of waiting for each one to complete before sending the next one.
CodePudding user response:
The difference you see is expected and does not depend on the framework or runtime version: Sleep imitates* the behavior of a CPU-intensive task, while Delay behaves as a truly asynchronous task.
Task.Delay(10000) means "I'll come back to you in 10 seconds"; Sleep(10000) means "I'll spend 10 seconds doing nothing, and only after that will I see what else needs to be done".
It does not matter how many Delay tasks you have: if they all start at the same time and only need to "come back in 10 seconds", they hold no thread while waiting, so all of them "call back" about 10 seconds after the start and everything is complete at that point.
The Sleep version actually needs to spend (10 seconds * number of tasks) of thread time waiting, so as long as you create more tasks than there are available threads in the thread pool, it will take longer than 10 seconds to finish. How much longer depends on the thread pool size: if you start just a few tasks, most likely all of them complete in 10 seconds, because each task is temporarily assigned a free thread from the pool. With hundreds of tasks, most have to queue and are assigned a thread only when one becomes free or when the pool gradually adds new threads.
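The timing difference described above can be observed with a small console program. This is only a sketch under assumptions: the task count (200) and the wait (500 ms instead of 10 seconds) are illustrative values chosen so the demo finishes quickly, and the measured times will vary with the machine's core count and the current thread pool size.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

const int taskCount = 200; // illustrative; smaller than the 500-2500 calls in the question
const int waitMs = 500;    // shortened from 10000 so the demo finishes quickly

// Asynchronous wait: each task only registers a timer; no thread is held while waiting.
var sw = Stopwatch.StartNew();
await Task.WhenAll(Enumerable.Range(0, taskCount).Select(_ => Task.Delay(waitMs)));
var delayElapsed = sw.Elapsed;

// Blocking wait: each task occupies a thread-pool thread for the whole duration,
// so tasks beyond the number of available threads have to queue.
sw.Restart();
await Task.WhenAll(Enumerable.Range(0, taskCount)
    .Select(_ => Task.Run(() => Thread.Sleep(waitMs))));
var sleepElapsed = sw.Elapsed;

Console.WriteLine($"Delay version: {delayElapsed.TotalMilliseconds:F0} ms");
Console.WriteLine($"Sleep version: {sleepElapsed.TotalMilliseconds:F0} ms");
Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
```

On a typical machine the Delay version should finish in roughly one wait period, while the Sleep version usually takes several, depending on how many pool threads are available.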
The kitchen explanation by Eric Lippert in "What is the difference between asynchronous programming and multithreading?" can be helpful. In this case:
- Delay version: set 100 kitchen timers to ring in 10 seconds. In 10 seconds your ears will bleed when all the timers go off, but you are done at that point.
- Sleep version: cook 100 eggs individually using 20 pans and a stove with 4 burners. You tie a timer to each egg and start the timer when you start cooking that particular egg. You prepare 20 pans with one egg each and cook them 4 at a time, with just 4 timers going off when each set of 4 is done. If all goes perfectly, you are done in 100/4 * 10 seconds = 250 seconds, plus some extra time spent in between to clean and organize pans. That may very well take more time than just cooking one egg at a time, similar to how scheduling overhead may dwarf the time of short operations.
*Note that while Sleep imitates a "long-running, CPU-intensive synchronous task", it is still not an exact match: sleeping threads don't take CPU time, so 100 sleeping threads can sleep in parallel, whereas only a few threads (about as many as there are cores) can run in parallel doing real work, such as computing digits of Pi. If you really need to model heavy CPU load, use a busy wait, for example a while(true) loop that checks the time on every iteration.
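A busy wait of that kind could look roughly like the following. BusyWait is a hypothetical helper name, not something from the original post, and the 200 ms duration is only an illustrative value.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: keeps one core busy for the given duration.
static void BusyWait(TimeSpan duration)
{
    var sw = Stopwatch.StartNew();
    while (sw.Elapsed < duration)
    {
        // Spin: the loop re-checks the clock on every iteration and never yields the thread.
    }
}

var total = Stopwatch.StartNew();
BusyWait(TimeSpan.FromMilliseconds(200));
Console.WriteLine($"Busy-waited for about {total.ElapsedMilliseconds} ms");
```

Unlike Thread.Sleep, this loop consumes CPU time for the whole duration, so running many of them through Task.Run is limited by the number of cores rather than only by the number of threads.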
CodePudding user response:
The second sample works as follows:
- Execution synchronously enters LongRunningMethod and starts the Delay.
- After the delay is started, execution synchronously returns to SomeMethod and the task is added to the list. Meanwhile, the delay is ticking. In fact, when the delay was started, the "finish time" was calculated, and somewhere there is a record that says "wake me up at this time or later".
- Because the synchronous activities mentioned above run very quickly, the wake-up time is almost the same for all tasks.
- While the Delay is in progress, the thread pool is free of work.
- When the wake-up time is reached, new MyData(); is executed, and await Task.WhenAll completes after that.
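The steps above can be observed with a small console sketch. It assumes a simplified stand-in for the question's method that takes and returns an int instead of Foo and MyData, and it uses a 1 second delay and 5 calls purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

// Simplified stand-in for the question's LongRunningMethod (int instead of Foo/MyData).
static async Task<int> LongRunningMethod(int id)
{
    Console.WriteLine($"start {id} on thread {Environment.CurrentManagedThreadId}");
    await Task.Delay(1000); // registers a timer and returns an incomplete task to the caller
    Console.WriteLine($"resume {id} on thread {Environment.CurrentManagedThreadId}");
    return id;
}

var sw = Stopwatch.StartNew();
var tasks = new List<Task<int>>();
for (var i = 0; i < 5; i++)
{
    tasks.Add(LongRunningMethod(i)); // returns as soon as the first await is reached
}

var loopMs = sw.ElapsedMilliseconds;
var pendingAfterLoop = tasks.FindAll(t => !t.IsCompleted).Count;
Console.WriteLine($"loop took {loopMs} ms, tasks still pending: {pendingAfterLoop}");

var results = await Task.WhenAll(tasks);
Console.WriteLine($"all {results.Length} tasks done after {sw.ElapsedMilliseconds} ms");
```

The "start" lines should all appear on the calling thread and the loop should finish almost immediately with every task still pending, which is why the tasks look as if they were created instantly on one thread. The "resume" lines run later, after the timers fire, typically on thread-pool threads.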
On the other hand, the first code sample uses Thread.Sleep instead of Task.Delay, and Sleep really freezes the thread for 10 seconds. So the thread pool threads are busy and can't start other tasks. The pool starts the first batch of tasks and all of those threads freeze. After 10 seconds the threads are freed, the second batch of tasks is started, and those tasks freeze the threads for another 10 seconds, and so on. (The pool also adds extra threads gradually while work is queued, so the batches are not perfectly uniform.)
So the first sample emulates the situation with CPU-bound tasks (calculations), and the second one is a good example of I/O-bound tasks, where we just need to wait for some data without doing any calculations (and can free the thread while waiting).
If you have a mixed task that fetches from a DB and then does further calculations, you can try the following to emulate that behaviour:
public async Task<MyData> LongRunningMethod(Foo foo)
{
    // Non-blocking wait for data from the DB.
    // During this wait the thread may be used for other work.
    await Task.Delay(5000);

    // CPU-bound calculations are emulated here:
    // the thread is busy and can't do any other work.
    Thread.Sleep(5000);

    return new MyData();
}