I had a developer challenge the use of asynchronous server-side code the other day. He asked why asynchronous code is superior to synchronous code on a server with no UI thread to block. I gave him the typical thread-exhaustion answer, but after thinking about it for a while I was no longer sure my answer was correct. After doing a little research I found that the upper limit on threads in an OS is governed by memory, not by some arbitrary number, and servers like Kestrel support unlimited threads. So in "theory" the number of requests (threads) a server can block on in parallel is governed by memory. Which is no different from async code in .NET: it lifts stack variables to the heap, but it's still memory bound.
I've always assumed that smarter people than me had thought this through and async code was the right way to handle IO bound code. But what are the measurable advantages of async .NET code when running in a dedicated server farm with no UI thread? Does a move to the cloud (AWS) change the answer?
CodePudding user response:
The purpose of server-side asynchronous code is completely different from that of asynchronous UI code.
Asynchronous UI code makes the UI more responsive (especially when multiple CPU cores are available): it allows multiple UI tasks to run in parallel, which improves the user experience.
The purpose of server-side asynchronous code, on the other hand, is to minimise the resources necessary to serve multiple clients simultaneously. In fact, it is beneficial even if there is only one CPU core or a single-threaded event loop, as in Node.js. It all boils down to one simple concept: asynchronous IO.
The difference between synchronous and asynchronous IO is that with the former, the thread that initiates an IO operation is paused until the IO operation is completed (e.g. until a DB request is executed or a file on disk is read). The same thread is then resumed once the IO operation is completed, to process its result. Note: even though the paused thread is most likely not using any CPU resources (it is probably put to sleep by the thread scheduler), its resources are still tied to this particular IO operation and are essentially wasted while the IO is executed by the hardware. Effectively, with synchronous IO you need at least one thread per client request currently being processed, even though most of those threads are asleep waiting for their IO operations to complete. In .NET each thread has at least 1 MB of stack allocated, so if the server is currently processing, say, 1000 requests, that is almost 1 GB of memory allocated simply for thread stacks, plus an additional burden on the thread scheduler and more CPU time spent on context switches: the more threads there are, the slower the overall performance of the system. More allocated memory also means less efficient use of CPU/memory caches.
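For illustration, a minimal sketch of a synchronous handler (the method name and file path here are hypothetical, but any blocking IO call such as a DB query or a synchronous file read behaves the same way):

```csharp
using System.IO;

class SyncIoSketch
{
    // Synchronous handler: the worker thread that calls this method is
    // blocked inside ReadAllText until the disk IO completes. With 1000
    // concurrent requests you need roughly 1000 such threads, each holding
    // at least 1 MB of stack while mostly sleeping.
    public static string HandleRequest(string path)
    {
        string contents = File.ReadAllText(path); // thread parked here until IO finishes

        // CPU-bound work (e.g. building the response) starts only after the IO is done.
        return contents.ToUpperInvariant();
    }
}
```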
Asynchronous IO is more efficient because a worker thread only initiates an IO operation; instead of waiting for it to complete, it immediately switches to another useful task (e.g. the continuation of another client's request), and when the IO operation is completed by the hardware, processing of the result is resumed on any available worker thread. As a result, depending on the ratio between the overall time spent waiting for the hardware to complete IO and the time spent doing CPU work (e.g. serialising the result of the IO operation into JSON), this approach can use fewer threads to serve the same number of simultaneous client requests: if, say, 90% of the time is spent in IO, we can potentially use only 100 threads to serve the same 1000 simultaneous requests. The more your server-side code is IO-bound versus CPU-bound, the more simultaneous client requests it can process with a given amount of resources: CPU and memory.
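And a sketch of the asynchronous equivalent of the same hypothetical handler (File.ReadAllTextAsync requires .NET Core 2.0 or later): the thread is released at the await, and the continuation runs on whichever pool thread is free when the IO completes.

```csharp
using System.IO;
using System.Threading.Tasks;

class AsyncIoSketch
{
    // Asynchronous handler: ReadAllTextAsync only starts the IO and returns
    // an incomplete Task; the worker thread goes back to the pool to serve
    // other requests instead of sleeping.
    public static async Task<string> HandleRequestAsync(string path)
    {
        string contents = await File.ReadAllTextAsync(path); // no thread is blocked here

        // When the IO completes, this continuation is scheduled on any
        // available thread-pool thread.
        return contents.ToUpperInvariant();
    }
}
```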
What is the drawback of asynchronous code? Mainly that it is generally harder to write than synchronous code. Asynchronous code uses callbacks to resume execution, so instead of simple linear code a programmer needs to pass a delegate (a continuation) to the IO method, which is later called by the system when the IO operation is completed (potentially on a different thread). However, modern C# with its async/await facilities makes this task less complicated and even makes asynchronous code look almost like synchronous code.

The only thing to remember: asynchronous code only works when it is asynchronous "all the way down". Even a single Task.Wait or Task.Result somewhere in the stack of calls, from the initial HTTP request processing down to the DB request, makes the entire code synchronous, forcing the current worker thread to wait for that Wait call to finish and defeating the purpose. Note: await in C# does not actually wait for the result of the call; it is converted by the compiler into a ContinueWith, i.e. a continuation callback. In practice it is a bit more complicated than that, but luckily the complexity is hidden from the programmer, so nowadays writing efficient asynchronous code is a relatively straightforward task.
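To make the "all the way down" point concrete, here is a small sketch (hypothetical method names, using HttpClient) contrasting blocking on a Task with awaiting it, plus a rough approximation of the continuation the compiler generates:

```csharp
using System.Net.Http;
using System.Threading.Tasks;

class AllTheWayDownSketch
{
    static readonly HttpClient Client = new HttpClient();

    // Anti-pattern: .Result blocks the current worker thread until the task
    // completes, so the IO is effectively synchronous again (and under some
    // synchronisation contexts it can even deadlock).
    public static string GetBlocking(string url) =>
        Client.GetStringAsync(url).Result;

    // Asynchronous version: the thread is freed at the await, and the rest
    // of the method runs as a continuation once the response arrives.
    public static async Task<string> GetAsync(string url) =>
        await Client.GetStringAsync(url);

    // Roughly what the compiler turns the await into (heavily simplified;
    // the real generated state machine is more involved).
    public static Task<string> GetWithContinuation(string url) =>
        Client.GetStringAsync(url).ContinueWith(t => t.Result);
}
```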