I am trying to build a number of machine learning models on a single dataset. The outputs of all models are then used in further steps. I would like the models to train simultaneously to save time and manual labour.
I am completely new to asynchronous processing, and that has manifested itself in my code below not working. I get the error:
sys:1 RuntimeWarning: coroutine 'level1models' was never awaited
This appears to be a fairly common issue when await isn't used, but wherever I place this command the error persists, and the answers I find online do not seem to address functions that return values.
To provide a reproducible example I have altered my code while keeping the structure identical to the original.
from time import sleep

nrs_list = [1, 2, 3, 4, 5]

def subtract(n):
    return n - 1

async def subtract_nrs(nrs):
    # Train selected ML models
    numbers = {nr: subtract(nr) for nr in nrs}
    sleep(50)
    # Loop to check if all models are trained
    while True:
        print([i for i in numbers.values()])
        if [i for i in numbers.values()] != [None for _ in range(len(numbers))]:
            break
        sleep(5)
    return numbers

r = subtract_nrs(nrs_list)
print(r)
<coroutine object subtract_nrs at 0x000002A413A4C4C0>
sys:1: RuntimeWarning: coroutine 'subtract_nrs' was never awaited
CodePudding user response:
Any time you create a coroutine (here, when you call subtract_nrs) but don't await it, asyncio will emit the warning you received [0]. The way to avoid this is to await the coroutine, either via
await subtract_nrs(nrs_list)
or by using asyncio.gather [1], which itself must be awaited:
await asyncio.gather(subtract_nrs(nrs_list))
Note that here there's no value in using asyncio.gather. That would only come if you needed to wait for multiple coroutines at once.
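To illustrate the case where asyncio.gather does pay off, here is a minimal sketch that waits for several coroutines at once. The subtract_later helper and its delays are made up for this example, and asyncio.run is used only to start the event loop.

```python
import asyncio


async def subtract_later(n, delay):
    # Hypothetical helper: asyncio.sleep hands control back to the event
    # loop while waiting, whereas time.sleep would block the whole loop.
    await asyncio.sleep(delay)
    return n - 1


async def main():
    # gather runs the three coroutines concurrently and returns their
    # results in the order the coroutines were passed in.
    return await asyncio.gather(
        subtract_later(1, 0.2),
        subtract_later(2, 0.1),
        subtract_later(3, 0.3),
    )


if __name__ == "__main__":
    print(asyncio.run(main()))  # [0, 1, 2]
```

The total wait should be close to the longest single delay rather than the sum of all three, because the coroutines wait at the same time.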
Based on your code, you seem to be using subtract_nrs as the entry point to your program. await can't be used outside of an async def, so you need another way to wait for it. For that, you'll typically want to use asyncio.run [2]. This will handle creating, running, and closing the event loop along with waiting for your coroutine.
asyncio.run(subtract_nrs(nrs_list))
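Since you asked about functions that return values: asyncio.run returns whatever the coroutine returns, so you can assign its result directly. Below is a trimmed-down sketch of your example; the short asyncio.sleep is my stand-in for your sleep(50), and the polling loop is left out for brevity.

```python
import asyncio

nrs_list = [1, 2, 3, 4, 5]


def subtract(n):
    return n - 1


async def subtract_nrs(nrs):
    numbers = {nr: subtract(nr) for nr in nrs}
    # Stand-in for the original sleep(50); awaiting asyncio.sleep does not
    # block the event loop the way time.sleep does.
    await asyncio.sleep(0.1)
    return numbers


if __name__ == "__main__":
    # asyncio.run drives the coroutine to completion and hands back its
    # return value, so r is a dict here rather than a coroutine object.
    r = asyncio.run(subtract_nrs(nrs_list))
    print(r)  # {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
```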
Now that we've covered all that, asyncio won't actually help you achieve your goal of simultaneous execution. asyncio never does things simultaneously; it does things concurrently [3]. While one task is waiting for I/O to complete, asyncio's event loop allows another to execute. While you've stated that this is a simplified version of your actual code, the code you've provided isn't I/O-bound; it's CPU-bound. This kind of code doesn't work well with asyncio. To use your CPU-bound code and achieve something more akin to simultaneous execution, you should use processes, not asyncio. The best way to do this is with ProcessPoolExecutor from concurrent.futures [4].
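As a rough sketch of what that could look like for your example, assuming each model can be trained by an independent top-level function (train_model and train_all are hypothetical names of mine, and n - 1 stands in for the real training step):

```python
from concurrent.futures import ProcessPoolExecutor


def train_model(n):
    # Hypothetical stand-in for fitting one model. It has to be a
    # top-level function so it can be pickled and sent to a worker.
    return n - 1


def train_all(nrs):
    # Calls to train_model are distributed across a pool of worker
    # processes, so CPU-bound work can run on several cores.
    with ProcessPoolExecutor() as executor:
        # executor.map yields results in the same order as the inputs.
        results = executor.map(train_model, nrs)
        return dict(zip(nrs, results))


if __name__ == "__main__":
    # The guard is needed on platforms that start workers by spawning a
    # fresh interpreter, which re-imports this module.
    print(train_all([1, 2, 3, 4, 5]))  # {1: 0, 2: 1, 3: 2, 4: 3, 5: 4}
```

train_all only returns once every worker has finished, so the polling loop from your original code shouldn't be needed with this approach.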