Say I'm using EF and I query the database for multiple objects:
var person = await Context.People.FindAsync(1);
person.Name = "foo";
var animal = await Context.Animals.FindAsync(2);
animal.Name = "bar";
var hybrid = await Context.Hybrids.FindAsync(3);
hybrid.Name = "godless abomination";
I can see two ways this could execute:
1. We get stuck at the first await, it gets a person and sets its name, then moves on to the next await (and so on).
2. The compiler knows that each awaited task can be executed at the same time because they don't depend on each other. So it tries to get person, animal, and hybrid at the same time and then sets the person, animal and hybrid names synchronously.
I assume number 2 is correct, but I'm not sure. Advice for working these sorts of questions out for myself is also very welcome :)
CodePudding user response:
async and await improve the responsiveness of code by freeing up threads while an operation is pending; they do not make code run in parallel. With a thread-safe implementation you can run multiple calls concurrently by starting them all and awaiting Task.WhenAll instead of awaiting each call in turn. However, the big caveat here is that the EF DbContext is not thread-safe and will throw when it detects a second operation starting before the previous one has completed. (It doesn't matter whether the tables referenced are related or not.)
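One way to work this sort of question out for yourself is a small timing experiment. The sketch below uses no EF at all: FakeQueryAsync is a hypothetical stand-in that uses Task.Delay to simulate a 200 ms database call, and the class and method names are made up for illustration. Awaiting each call in turn should take roughly the sum of the delays, while starting all three and awaiting Task.WhenAll should take roughly the longest single delay.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

public static class AwaitTimingDemo
{
    // Hypothetical stand-in for a database call: completes after delayMs and returns a label.
    public static async Task<string> FakeQueryAsync(string name, int delayMs)
    {
        await Task.Delay(delayMs);
        return name;
    }

    // Awaits each call before starting the next one (option 1 in the question).
    public static async Task<TimeSpan> RunSequentiallyAsync()
    {
        var stopwatch = Stopwatch.StartNew();
        await FakeQueryAsync("person", 200);
        await FakeQueryAsync("animal", 200);
        await FakeQueryAsync("hybrid", 200);
        return stopwatch.Elapsed;
    }

    // Starts all three calls first, then waits for all of them together.
    public static async Task<TimeSpan> RunConcurrentlyAsync()
    {
        var stopwatch = Stopwatch.StartNew();
        var person = FakeQueryAsync("person", 200);
        var animal = FakeQueryAsync("animal", 200);
        var hybrid = FakeQueryAsync("hybrid", 200);
        await Task.WhenAll(person, animal, hybrid);
        return stopwatch.Elapsed;
    }
}
```

Calling both methods and printing the elapsed times should show the sequential run taking noticeably longer than the concurrent one, though exact timings vary from machine to machine. This is safe here only because the fake queries share no state; it does not make a shared DbContext safe to use this way.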
As a general breakdown of what sort-of happens with awaited async code:
var person = await Context.People.FindAsync(1);
person.Name = "foo";
var animal = await Context.Animals.FindAsync(2);
animal.Name = "bar";
Context.People.FindAsync(1) starts the query and returns a task. Because the call is awaited, the rest of the method is registered as a continuation: a resumption point from which all following code resumes once the task completes.
So a request like this comes in on a web request worker thread, which reaches an awaited line, starts the operation, and hands over a resumption pointer to the remaining code. Since the call is awaited, the web request worker thread then knocks off back into the pool to serve more requests while the query runs in the background. For I/O such as a database call, typically no thread is blocked while it waits.
When the Context.People.FindAsync() call completes, a worker thread picks up the continuation and eventually hits the awaited Context.Animals.FindAsync(2) call, which does the same thing: the query is started, the resumption point is registered, and the thread knocks off back into the worker thread pool. Which thread resumes the code can vary depending on the configured synchronization context (which can be set up so that the resumption code returns to the original thread, for example the UI thread in a WinForms project).
The EF DbContext is fine with this since only one operation is running against it at any given time. The accidental alternative would be something like this:
var person = Context.People.FindAsync(1);
var animal = Context.Animals.FindAsync(2);
person.Result.Name = "foo";
animal.Result.Name = "bar";
Context.People.FindAsync(1) starts the query, but because it is not awaited, no resumption point is registered; the call just returns the Task<Person> handle that represents its execution. This means that the calling thread carries straight on to the Context.Animals.FindAsync(2) call, again starting a query and getting a Task<Animal> back. Both operations will now be in flight at the same time against the same DbContext, and EF is not going to like that at all. Each .Result reference blocks until its task completes, but async calls are often used where we don't care about the result, which leads to silent errors creeping in. For instance, sometimes there are multiple async calls with a bit of synchronous work in between that always seems to take long enough that an un-awaited async call never overlaps the next one... until one day it does in production.
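The "silent errors" point can be seen without EF. In this sketch, FailingSaveAsync is a hypothetical method that always throws after a short delay, and all names are illustrative only. When the task is not awaited, the caller carries on and never sees the exception; when it is awaited, the exception surfaces at the await.

```csharp
using System;
using System.Threading.Tasks;

public static class UnawaitedTaskDemo
{
    // Hypothetical async operation that always fails after a short delay.
    public static async Task FailingSaveAsync()
    {
        await Task.Delay(50);
        throw new InvalidOperationException("save failed");
    }

    // Not awaited: the method returns normally and the failure goes unnoticed.
    public static string FireAndForget()
    {
        _ = FailingSaveAsync(); // the exception ends up stored in the discarded task
        return "no error seen";
    }

    // Awaited: the exception is rethrown at the await and can be handled.
    public static async Task<string> AwaitedAsync()
    {
        try
        {
            await FailingSaveAsync();
            return "no error seen";
        }
        catch (InvalidOperationException ex)
        {
            return "caught: " + ex.Message;
        }
    }
}
```

FireAndForget returns "no error seen" even though the operation fails, which is exactly the kind of bug that stays hidden until it matters.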
Where you would want parallel execution you would opt instead for something like a WhenAll():
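// Each query gets its own context, because a DbContext does not support concurrent operations.
using (var contextOne = new AppDbContext())
using (var contextTwo = new AppDbContext())
{
    // In EF Core, FindAsync returns a ValueTask<T>; append .AsTask() to pass it to Task.WhenAll.
    var personTask = contextOne.People.FindAsync(1);
    var animalTask = contextTwo.Animals.FindAsync(2);
    await Task.WhenAll(personTask, animalTask);
    // Both tasks have completed here, so .Result does not block.
    personTask.Result.Name = "foo";
    animalTask.Result.Name = "bar";
    await contextOne.SaveChangesAsync();
    await contextTwo.SaveChangesAsync();
}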
The key difference here is that the queries must run against different DbContext instances. Something like this can be used to parallelize DB operations with other long-running tasks (file management, etc.). When it comes to handling exceptions, awaiting the WhenAll task bubbles up only the first exception (if any was raised) from the awaited tasks. If you want to access the exceptions from all parallel tasks (the AggregateException), that involves a bit of work to play nicely with async/await without blocking.
See: "Why doesn't await on Task.WhenAll throw an AggregateException?" if you want to dive deeper down that rabbit hole. :)
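As a rough illustration of that "bit of work", one common approach is to keep a reference to the task returned by Task.WhenAll and inspect its Exception property after the await has thrown. This is a minimal sketch, not the only way to do it; FailAsync and the class name are hypothetical.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllExceptionsDemo
{
    // Hypothetical operation that fails with the given message after a short delay.
    public static async Task FailAsync(string message)
    {
        await Task.Delay(10);
        throw new InvalidOperationException(message);
    }

    // Returns the messages of every exception raised by the combined tasks.
    public static async Task<string[]> CollectAllErrorsAsync()
    {
        // Keep a reference to the combined task so its Exception can be inspected later.
        Task combined = Task.WhenAll(FailAsync("first"), FailAsync("second"));
        try
        {
            await combined; // rethrows only one of the exceptions
            return new string[0];
        }
        catch (InvalidOperationException)
        {
            // combined.Exception is an AggregateException holding all the failures.
            return combined.Exception.InnerExceptions.Select(e => e.Message).ToArray();
        }
    }
}
```

The catch block sees a single InvalidOperationException, but the combined task still carries both failures, so nothing is lost and nothing blocks.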
CodePudding user response:
Answer
Your code is going to stop at each await until the asynchronous function it precedes completes. This means your code is going to run from top to bottom (i.e. number 1 is correct).
Example
await longAsyncFunc();
shortFunc();
Above, shortFunc() must wait until longAsyncFunc() completes because of the await keyword.
Below, longAsyncFunc() starts executing and returns a task as soon as it reaches its first pending await; shortFunc() then runs without waiting for longAsyncFunc() to finish its work.
longAsyncFunc();
shortFunc();
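The difference in ordering can be checked with a small sketch. LongAsyncFunc and ShortFunc here are hypothetical stand-ins that simply record when they finish; Task.Delay simulates the slow work.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

public static class AwaitOrderDemo
{
    // Hypothetical slow operation: records when it finishes.
    private static async Task LongAsyncFunc(List<string> log)
    {
        await Task.Delay(100);
        log.Add("long finished");
    }

    // Hypothetical quick synchronous operation.
    private static void ShortFunc(List<string> log)
    {
        log.Add("short finished");
    }

    // With await: ShortFunc runs only after LongAsyncFunc has completed.
    public static async Task<List<string>> WithAwaitAsync()
    {
        var log = new List<string>();
        await LongAsyncFunc(log);
        ShortFunc(log);
        return log;
    }

    // Without await: ShortFunc runs while LongAsyncFunc is still pending.
    public static async Task<List<string>> WithoutAwaitAsync()
    {
        var log = new List<string>();
        Task pending = LongAsyncFunc(log);
        ShortFunc(log);
        await pending; // wait at the end so the log is complete before returning
        return log;
    }
}
```

WithAwaitAsync records "long finished" before "short finished", while WithoutAwaitAsync records them the other way round.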
Suggestion
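If you would rather your code work like number 2 then I would wrap each pair of statements in an async function. Be aware that a DbContext does not support concurrent operations, so each function must work against its own context instance rather than a shared Context.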
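async Task<Person> FuncA(AppDbContext context)
{
    var person = await context.People.FindAsync(1);
    person.Name = "foo";
    return person;
}
async Task<Animal> FuncB(AppDbContext context)
{
    var animal = await context.Animals.FindAsync(2);
    animal.Name = "bar";
    return animal;
}
async Task<Hybrid> FuncC(AppDbContext context)
{
    var hybrid = await context.Hybrids.FindAsync(3);
    hybrid.Name = "godless abomination";
    return hybrid;
}
// contextOne, contextTwo and contextThree are assumed to be three separate AppDbContext instances.
var person = FuncA(contextOne);
var animal = FuncB(contextTwo);
var hybrid = FuncC(contextThree);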
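Notice that I didn't use await on the last three lines; awaiting each call there would change the behavior back to number 1. Once all three have been started, the returned tasks can be awaited together (for example with Task.WhenAll) to get the results.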