Speed of loading files with asyncio


I'm writing a piece of code that needs to compare a Python set to many other sets and retain the names of the files whose sets have a minimum intersection length. I currently have a synchronous version but was wondering if it could benefit from async/await. I wanted to start by comparing the loading of the sets, so I wrote a simple script that writes a small set to disk and reads it back n times. I was surprised to see that the synchronous version of this was a lot faster. Is this to be expected? And if not, is there a flaw in the way I have coded it below?
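For context, the comparison I have in mind looks roughly like this (the file list and the minimum-overlap threshold are placeholders):

import pickle

MIN_OVERLAP = 100  # placeholder threshold for the required intersection size

def matching_files(query, paths):
    # Return the names of the files whose pickled sets share at least
    # MIN_OVERLAP elements with the query set.
    matches = []
    for path in paths:
        with open(path, 'rb') as f:
            other = pickle.loads(f.read())
        if len(query & other) >= MIN_OVERLAP:
            matches.append(path)
    return matches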

My code is the following:

Synchronous version:

import pickle
import time

# Write a small test set to disk once so count() can read it back repeatedly.
with open('set.pkl', 'wb') as f:
    pickle.dump(set(range(1000)), f)

def count():
    print("Started Loading")
    with open('set.pkl', mode='rb') as f:
        contents = pickle.loads(f.read())
    print("Finishd Loading")

def main():
    for _ in range(100):
        count()

if __name__ == "__main__":
    s = time.perf_counter()
    main()
    elapsed = time.perf_counter() - s
    print(f"{__file__} executed in {elapsed:0.3f} seconds.")

Asynchronous version:

import pickle
import asyncio
import time
import aiofiles

# Write the same small test set to disk for the async benchmark.
with open('set.pkl', 'wb') as f:
    pickle.dump(set(range(1000)), f)

async def count():
    print("Started Loading")
    async with aiofiles.open('set.pkl', mode='rb') as f:
        contents = pickle.loads(await f.read())
    print("Finishd Loading")

async def main():
    await asyncio.gather(*(count() for _ in range(100)))

if __name__ == "__main__":
    s = time.perf_counter()
    asyncio.run(main())
    elapsed = time.perf_counter() - s
    print(f"{__file__} executed in {elapsed:0.3f} seconds.")

Executing them led to:

async.py executed in 0.052 seconds.
sync.py executed in 0.011 seconds.

CodePudding user response:

aiofiles is implemented using threads, so each time you tell it to read a file, the read is handed off to a worker thread.
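To make that concrete, here is a minimal sketch of what an aiofiles-style read boils down to; it uses asyncio.to_thread (Python 3.9+) for illustration, while aiofiles itself manages its own thread pool:

import asyncio
import pickle

def blocking_load(path):
    # Plain blocking open/read/unpickle, executed off the event loop.
    with open(path, 'rb') as f:
        return pickle.loads(f.read())

async def load_set(path):
    # Hand the blocking call to a worker thread, roughly what
    # aiofiles.open()/read() does under the hood.
    return await asyncio.to_thread(blocking_load, path)

async def main():
    results = await asyncio.gather(*(load_set('set.pkl') for _ in range(100)))
    print(f"loaded {len(results)} sets")

if __name__ == "__main__":
    asyncio.run(main())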

The file being read is also very small: it fits in about 3 KB, which is less than one memory page and smaller than a core's L1 cache. Most of the time the computer isn't actually reading anything from disk; the data is just being moved between parts of your memory (the operating system's page cache).

In the async case the data is being moved from one core's cache to another's, which is slower than keeping everything within one core's cache. But for larger files that are actually read from disk, and with other tasks to attend to, such as reading from sockets, reading different files from disk, and doing some processing concurrently, you will find the async version faster: it uses threads under the hood, and some operations release the GIL, like reads from files and sockets, and some processing libraries.

You are still reading files at the same speed in both cases, since you are limited by your drive's read speed; async only reduces the "dead time" during which you are not reading files. Your example has no dead time, and it isn't even reading the file from disk.

An exception to the above is reading data from multiple HDDs and SSDs concurrently, where one thread can never pull the data fast enough; there the async version will be faster because it can read from several drives at the same time (assuming your CPU has the cores and I/O lanes for it).

CodePudding user response:

asyncio doesn't help in this case because your workload is basically disk-IO bound and CPU bound.

A CPU-bound workload cannot be sped up by asyncio.

A disk-IO-bound workload can benefit from async operation, but only if the disk operation is very slow and your program can do other things during that time. That may not be your situation.

So the slower asyncio performance is mainly due to the additional overhead it introduces.
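As a toy illustration of the case where async does pay off, here the "read" is genuinely slow (an asyncio.sleep stands in for slow I/O) and there is other work to overlap with it:

import asyncio
import time

async def slow_read(i):
    # Stand-in for a genuinely slow I/O operation (network, cold disk, ...).
    await asyncio.sleep(0.1)
    return i

async def other_work():
    # Something else the program can do while the reads are pending.
    await asyncio.sleep(0.1)

async def main():
    t = time.perf_counter()
    await asyncio.gather(other_work(), *(slow_read(i) for i in range(10)))
    # All the waits overlap, so this finishes in roughly 0.1 s instead of ~1.1 s.
    print(f"overlapped waits took {time.perf_counter() - t:0.3f} s")

asyncio.run(main())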
