I have come across an interesting coroutines freeze that I have simplified into the following problem:
// running on main thread
runBlocking {
    lifecycleScope.launch {
        delay(1000)
    }.join()
}
This causes the main thread to freeze indefinitely. I assume it is because of the following sequence of events:
- Queue to launch
- Call to join, pass main thread to coroutine pool
- Call to launch
- Call to delay, pass main thread to coroutine pool
- Thread moves back to join and waits
- Delay never finishes because it does not have a thread available?
Correct me if I am misunderstanding the above logic. What is a reasonable pattern to avoid this from happening? I understand that calling runBlocking on the main thread is not a good idea, but it seems odd that code deeper in the call stack can accidentally freeze a single-threaded coroutine dispatcher in this manner.
CodePudding user response:
Here's an explanation about what exactly causes the deadlock.
Any code anywhere in your app that is running on the Main thread is actually running from a message that has been sent to the Main Looper's queue of messages to process on the main thread.
The way Dispatchers.Main works is that it essentially sends pieces of coroutines as Runnable messages to an Android Handler that is backed by the Main Looper. Messages sent to the Main Looper can only be processed one at a time.
Inside your runBlocking call, your join() call is suspending until its associated coroutine finishes. That coroutine has been submitted to the main Looper. The Looper cannot process any messages in its queue until the current message returns. The current message is whichever one ran the method on the main thread that you called runBlocking from.
runBlocking is waiting on join() to return. join() is waiting for its coroutine to get processed by the Looper. The Looper is waiting on runBlocking to return.
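The circular wait can be reproduced without Android at all. Below is a minimal sketch, assuming a single-thread executor as a stand-in for the main Looper and a CountDownLatch as a stand-in for runBlocking plus join(). The function name is hypothetical, and the timeout is only there so the demo terminates instead of hanging.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Hypothetical stand-in for the main Looper: one thread, one task at a time.
fun innerTaskRanWhileOuterBlocked(): Boolean {
    val looper = Executors.newSingleThreadExecutor()
    try {
        // The "current message": code that is already running on the looper thread.
        val outer = looper.submit(Callable {
            val innerDone = CountDownLatch(1)
            // Like launching on Dispatchers.Main: queue work for the same thread.
            looper.execute { innerDone.countDown() }
            // Like runBlocking { job.join() }: block this thread until the queued
            // work is done. It cannot run, because this task has not returned yet.
            innerDone.await(200, TimeUnit.MILLISECONDS)
        })
        // false means the queued task did not run while the outer task was blocking.
        return outer.get()
    } finally {
        looper.shutdownNow()
    }
}
```

Calling innerTaskRanWhileOuterBlocked() returns false: the queued task sits behind the task that is waiting for it. Without the timeout, the wait would never end.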
I saw you mentioned in a comment that it works with GlobalScope. This is because GlobalScope uses Dispatchers.Default and lifecycleScope uses Dispatchers.Main (Dispatchers.Main.immediate, to be precise), unless you modify the default context when launching the coroutine.
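To see why a different dispatcher avoids the freeze, here is a rough sketch in plain Kotlin, assuming a single-thread executor plays the role of the main Looper and a separate fixed pool plays the role of Dispatchers.Default. All names are hypothetical and this only models the threading, not the coroutine machinery.

```kotlin
import java.util.concurrent.Callable
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

fun innerTaskRanOnSeparatePool(): Boolean {
    // Hypothetical stand-ins: "looper" for the main thread, "background" for Dispatchers.Default.
    val looper = Executors.newSingleThreadExecutor()
    val background = Executors.newFixedThreadPool(2)
    try {
        val outer = looper.submit(Callable {
            val innerDone = CountDownLatch(1)
            // The work goes to other threads, so it does not need the blocked one.
            background.execute { innerDone.countDown() }
            // The looper thread still blocks here, but only until the background task finishes.
            innerDone.await(2, TimeUnit.SECONDS)
        })
        return outer.get()
    } finally {
        looper.shutdownNow()
        background.shutdownNow()
    }
}
```

Here innerTaskRanOnSeparatePool() returns true. The "main" thread is still blocked for the duration of the work, which is why this is an explanation of the GlobalScope behavior rather than a recommendation.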
CodePudding user response:
It is even simpler than you think. Because of runBlocking(), join() doesn't return the thread to the event loop, so the launch() block can never finish executing on the main thread: deadlock.
Actually... this is not entirely true. join() returns the thread to the pool, but not to the one we think about. runBlocking() starts its own event loop using the caller thread. From the outside of runBlocking() the thread seems to be constantly blocked, but on the inside it loops and can suspend. Anyway, from the perspective of lifecycleScope the main thread is blocked and it can't launch anything on it.
What is a reasonable pattern to avoid this from happening?
Do not call runBlocking() on the main thread. Coroutines are no exception here. We should not run blocking I/O or other kinds of blocking operations on the main thread, and that includes runBlocking().
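The non-blocking alternative can be sketched in plain Kotlin as well, assuming again that a single-thread executor stands in for the main Looper (the function name is hypothetical). Instead of waiting on its own thread, the handler queues the follow-up work and returns, which frees the thread to process it.

```kotlin
import java.util.concurrent.CountDownLatch
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

fun followUpRanAfterHandlerReturned(): Boolean {
    // Hypothetical stand-in for the main Looper: one thread, one task at a time.
    val looper = Executors.newSingleThreadExecutor()
    val followUpDone = CountDownLatch(1)
    try {
        looper.execute {
            // Queue the follow-up work and return immediately, without waiting for it.
            looper.execute { followUpDone.countDown() }
        }
        // This wait happens on the calling thread, not on the looper thread,
        // so the looper stays free to run the queued follow-up.
        return followUpDone.await(2, TimeUnit.SECONDS)
    } finally {
        looper.shutdownNow()
    }
}
```

followUpRanAfterHandlerReturned() returns true. In coroutine terms this corresponds to launching from the main thread and continuing inside the coroutine, rather than wrapping the launch in runBlocking and joining it.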
CodePudding user response:
It's because of calling runBlocking on the main thread (which defeats the idea of even using coroutines); the order of events might not matter when the top-most instruction already stalls the thread. It's always GlobalScope vs. CoroutineScope vs. lifecycleScope, where lifecycleScope.launch can be used with different dispatchers:
- lifecycleScope.launch(Dispatchers.IO): Launches a coroutine within the lifecycleScope provided by AndroidX. Coroutine gets cancelled as soon as lifecycle is invalidated (i.e. user navigates away from a fragment). Uses Dispatchers.IO as thread pool.
- lifecycleScope.launch: Same as above, but uses Dispatchers.Main if not specified.
Therefore I'd assume the behavior may also stem from dispatching with Dispatchers.Main.