I saw that there are many ways in order to avoid race conditions when tackling with multithreading, but the most common are mutexes and queues. I cannot understand how a queue could make an application thread safe as long as the threads may populate the queue in a chaotic order.
CodePudding user response:
I cannot understand how a queue could make an application thread safe as long as the threads may populate the queue in a chaotic order.
First of all, let's get "thread safe" out of the way. "Thread safety" really only should be used to describe the components (functions, modules, classes, libraries, etc.) that can be used to build an application.
A component is thread-safe if it guarantees to behave in some useful and well documented way when it is concurrently used by more than one thread. Since you're talking about queues, here's what you might expect from a thread-safe queue:
- Every item that is
put()
into the queue eventually can betake()
en from the queue. - No item can ever be
take()
en that was notput()
. - *IF* the program puts two items into the queue in a certain order, then they will be
take()
en in the same order,
BUT,
- If the program allows concurrent
put()
calls to the queue, then the queue cannot guarantee the order in which those items actually will appear in the queue.
"Thread safety" doesn't really apply to applications because the user of an application cannot choose whether to use it in a multi-threaded way or in a single-threaded way. That choice is baked-in by the application author. An application either is correct (it works as expected) or else it isn't.
So, let's re-state your question:
I cannot understand how a queue could make an application correct as long as the threads may populate the queue in a chaotic order.
It can't. If the application depends on objects being put()
into a queue in some particular order, then it is the application programmer's responsibility to ensure that the put()
calls do not overlap. If the calls are allowed to overlap, there is no way that the queue could possibly know which order the application programmer intended for the items to go in to the queue.
CodePudding user response:
In a concurrent system, (thread priorities aside), the operating systems scheduler decides, which of the currently runnable threads it schedules next on the (multiple) CPU cores. It tries to be "fair" as to not starve some unfortunate thread, while favoring others. In realtime operating systems (e.g. QNX), of course, there is a lot more fineprint, adding to this simple concept.
As such, in such a system, it impossible to reason about "who came first", as from the point of view of the application, the scheduler decisions appear random (but fair).
So, if someone tried to make a "better" queue, they might think of using a priority queue instead and use a timestamp for the ordering of the queues contents. But - as each thread obtains a timestamp (assuming high enough resolution anyway), it is yet again just "random" from the point of view of the application, which of the threads will come first.
Which - kind of is the proof, that queue or not - this problem cannot be solved.
It is - by the way, the same with a cpu which has N interrupt sources, which change their logical level all at once. Either the hardware prioritizes or the order in which those interrupts get processed depends on the interrupt architecture (interrupt vectors for each interrupt or a single interrupt vector with software dispatch?), line noise, variability in the electrical characteristics of each input etc.
How to work around this? Instead of having strict ordering (x came before y), you need to have some ternary approach (x, [y,z,t], u)
, where - according to your applications requirements, you have a notion of "simultaneity". I.e. x came first, then y,z,t simultaneously (within a defined time interval), then u.
Also, having your threads arranged in form of an acyclic directed graph, like little steps in an assembly line instead of some events forming some feedback loop helps avoiding "higher order" race conditions.
Just like in the www, having idempotent requests to a worker thread also helps. (a,b) -> ((a,b),c)
Returning the question with the answer is another trick to avoid confusion.