I recently noticed that Python has two different Semaphore implementations in different packages: one in the threading package and another in the asyncio package. I am curious what the difference between these two implementations is. If I use the Semaphore from the threading package inside an async function, would that cause any problems?
The official Python documentation says:
asyncio primitives are not thread-safe, therefore they should not be used for OS thread synchronization (use threading for that)
But what does it mean that asyncio primitives are not thread-safe and should not be used for OS thread synchronization?
Thanks in advance
CodePudding user response:
The goal of a semaphore is to limit how many "pieces of code" can access something at the same time. A semaphore created with a count of 1 provides exclusive access: only one "piece of code" can hold it at any one time.
What I mean by "piece of code" in the previous statement depends on whether you're using multi-threading, multi-processing, or asyncio. And the means by which exclusive access is guaranteed depends on which one you're using.
Asyncio is the most restricted kind of concurrency. Everything runs within a single Python thread. The Python interpreter is only executing one thing at a time. Each "piece of code" runs until it voluntarily waits for something to happen. Then another "piece of code" is allowed to run. Eventually the original piece of code runs again when the thing it was waiting on happens.
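To make the "voluntarily waits" part concrete, here is a minimal sketch of an asyncio.Semaphore limiting how many coroutines are inside a section at once. The names worker, state and peak are made up for the illustration, and the limit of 2 is an arbitrary choice.

```python
import asyncio


async def worker(sem, state):
    # Only as many coroutines as the semaphore's count get past this line.
    async with sem:
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # voluntary wait: other tasks get to run here
        state["active"] -= 1


async def main():
    sem = asyncio.Semaphore(2)  # at most 2 holders at a time (illustrative value)
    state = {"active": 0, "peak": 0}
    await asyncio.gather(*(worker(sem, state) for _ in range(6)))
    return state["peak"]


peak = asyncio.run(main())
print(peak)  # highest number of coroutines inside the section at once
```

No lock is needed around the counters here, because a coroutine can only be interrupted at an await.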
With multithreading, multiple pieces of code are running within the Python interpreter. Only one piece of code runs at any time, but they are not politely waiting for each other. Python switches from "piece of code" to "piece of code" as it wants.
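For comparison, a rough sketch of the same idea with threading.Semaphore. Because threads can be switched at any point, the counters in this example are protected by an extra Lock. The names worker, guard and state are hypothetical, as is the limit of 2.

```python
import threading
import time

sem = threading.Semaphore(2)   # at most 2 threads inside the section (illustrative value)
guard = threading.Lock()       # protects the shared counters below
state = {"active": 0, "peak": 0}


def worker():
    with sem:
        with guard:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        time.sleep(0.05)  # simulated work; the thread may be preempted anywhere
        with guard:
            state["active"] -= 1


threads = [threading.Thread(target=worker) for _ in range(6)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(state["peak"])  # never more than the semaphore's count
```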
With multiprocessing, multiple Pythons are running simultaneously. There is no sharing between the pieces of code, other than what is provided by the operating system. Setting up a semaphore usually requires some support from the operating system to create a shared variable that all processes can access.
So: asyncio primitives are designed to be used within a single Python thread, with the pieces of code cooperating. They are not designed to work if multiple threads try to use them simultaneously.
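As for using a threading.Semaphore inside an async function: the main problem I would expect is that its acquire() blocks the whole thread, and therefore the event loop, instead of yielding. The sketch below tries to show that by counting how often a background task gets to run during each kind of wait. All function names are invented for the example, and the timeout is only there so the demo cannot hang; without it the blocking version would wait forever, since no other coroutine could run to release the semaphore.

```python
import asyncio
import threading


async def count_ticks(counter):
    # Background task: bumps counter[0] every 10 ms whenever the loop is free.
    while True:
        await asyncio.sleep(0.01)
        counter[0] += 1


async def ticks_while_waiting(wait):
    # Run the coroutine function `wait` and report how often the ticker ran meanwhile.
    counter = [0]
    ticker = asyncio.create_task(count_ticks(counter))
    await asyncio.sleep(0)  # give the ticker a chance to start
    await wait()
    ticker.cancel()
    return counter[0]


async def blocking_wait():
    # threading.Semaphore.acquire blocks the thread, and the event loop with it.
    threading.Semaphore(0).acquire(timeout=0.2)


async def cooperative_wait():
    # asyncio.Semaphore.acquire yields to the event loop while it waits.
    try:
        await asyncio.wait_for(asyncio.Semaphore(0).acquire(), timeout=0.2)
    except asyncio.TimeoutError:
        pass


async def main():
    blocked = await ticks_while_waiting(blocking_wait)
    cooperative = await ticks_while_waiting(cooperative_wait)
    return blocked, cooperative


blocked_ticks, cooperative_ticks = asyncio.run(main())
print(blocked_ticks, cooperative_ticks)
```

The first number should be 0, because nothing else runs while the thread is blocked; the second should be well above 0.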
I hope this helps.
CodePudding user response:
You don't have a choice of which queue or which semaphore to use. They are not compatible.
Even though they both offer concurrency, asyncio and multi-threading are two quite different worlds: they are based on different principles and APIs, have different typical use cases, and so on.
Developing multi-threaded programs is not easy, and a Python feature called the GIL (global interpreter lock) makes them less efficient than they could be. This makes asyncio the preferred choice unless you are forced to use multi-threading.
Having said that, both approaches can be used in a single program.
It is possible, but not common, to run async code in one of the threads of a multi-threaded application. The quoted notice from the docs reminds you that almost all async operations must be performed within that one thread. There are only a few specialized thread-safe low-level scheduling functions, such as loop.call_soon_threadsafe and asyncio.run_coroutine_threadsafe.
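A small sketch of that setup, assuming an event loop running in a background thread: the main thread hands work to the loop with asyncio.run_coroutine_threadsafe and stops it with loop.call_soon_threadsafe. The coroutine double is just a placeholder for real async work.

```python
import asyncio
import threading


async def double(x):
    # Placeholder coroutine; it runs inside the loop's thread.
    await asyncio.sleep(0.01)
    return 2 * x


# Start an event loop in a dedicated background thread.
loop = asyncio.new_event_loop()
thread = threading.Thread(target=loop.run_forever, daemon=True)
thread.start()

# From another thread, only the thread-safe entry points should touch the loop.
future = asyncio.run_coroutine_threadsafe(double(21), loop)
result = future.result(timeout=5)  # concurrent.futures.Future, safe to wait on here

# Ask the loop to stop from outside its thread, then clean up.
loop.call_soon_threadsafe(loop.stop)
thread.join(timeout=5)
loop.close()

print(result)
```

Anything else, for example releasing an asyncio.Semaphore directly from the main thread, would bypass the loop and is exactly what the docs warn against.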
For completeness, the asyncio library can use a thread pool internally, because so-called blocking operations should not be performed directly in async code.
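One way this looks in practice, as a sketch: loop.run_in_executor with None as the executor hands a blocking function to the loop's default thread pool, so the event loop itself stays free. The function blocking_read is a stand-in for real blocking I/O.

```python
import asyncio
import threading
import time


def blocking_read():
    # Stand-in for a blocking call (file I/O, a synchronous client library, ...).
    time.sleep(0.1)
    return threading.get_ident()  # report which thread actually did the work


async def main():
    loop = asyncio.get_running_loop()
    # None selects the loop's default ThreadPoolExecutor.
    return await loop.run_in_executor(None, blocking_read)


worker_thread_id = asyncio.run(main())
loop_thread_id = threading.get_ident()
print(worker_thread_id != loop_thread_id)  # the blocking call ran in a pool thread
```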