Fast IO operations in a separate thread


I have a set of instruments my program communicates with and I'd like to put the communication in a separate thread.

The IO is pretty slow (~100 ms per item per instrument), and I need to record the resulting values in a shared array (of the last N values) and save them to a file, with repeated measurements taken as quickly as possible. Some instruments are slow to 'formulate' a response, so some readings could be done concurrently, but the readings need to be approximately synchronised (i.e. one timestamp per row of readings).

I'd like this all to be done in a separate thread (or threads) so the timing can't be interrupted by computation etc. happening in the main thread, but the main thread should still be able to access the array.

Ideally I should be able to run some daq.start() and have it get going without further interaction.

What's the 'pythonic' way to do this? I've been reading about asyncio, threading, and multiprocessing, and it's unclear to me which is appropriate.

In C I would start thread1, which would just record measurements sequentially into a cache array; thread2 would flush that cache into the main shared array whenever it could get a lock, and at the same time write it to the output file. By keeping track of the index ranges being read, lock conflicts would be rare (but, importantly, wouldn't interrupt the DAQ when they do happen).
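In Python terms, that two-thread design might be sketched roughly as below; read_all() is a hypothetical stand-in for the sequential instrument reads, and the index-range bookkeeping is simplified to a single short-lived lock.

    import threading
    import time

    cache = []               # thread1 appends rows here
    shared = []              # the main shared array, visible to the main thread
    lock = threading.Lock()  # held only briefly on either side

    def acquire_loop(read_all, running):
        # thread1: record measurements sequentially, as fast as possible
        while running.is_set():
            row = (time.time(), read_all())
            with lock:
                cache.append(row)

    def flush_loop(outfile, running):
        # thread2: drain the cache into the shared array and the output file
        with open(outfile, "w") as f:
            while running.is_set():
                with lock:
                    rows = cache[:]
                    cache.clear()
                for ts, values in rows:
                    shared.append((ts, values))
                    f.write(f"{ts}," + ",".join(map(str, values)) + "\n")
                time.sleep(0.05)  # flushing may lag; the DAQ loop is never blocked for long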

CodePudding user response:

The correct answer here is threads; this is not an opinion.

Communicating with hardware is done through drivers implemented in DLLs, and when Python calls into a DLL (for example via ctypes) it releases the GIL, so threads can execute in the background with as little overhead as possible on the Python interpreter itself.
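A quick way to see this: ctypes releases the GIL for the duration of a foreign call, so several threads can block inside C at the same time. This sketch assumes a Linux system with glibc, so that libc.so.6 and usleep are available.

    import ctypes
    import threading
    import time

    libc = ctypes.CDLL("libc.so.6")

    def blocking_call():
        libc.usleep(500_000)  # sleep 0.5 s inside C; the GIL is released here

    start = time.perf_counter()
    threads = [threading.Thread(target=blocking_call) for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print(f"elapsed: {time.perf_counter() - start:.2f} s")  # ~0.5 s, not ~2 s

The four calls overlap, so the total elapsed time is roughly one sleep rather than four; instrument drivers that release the GIL behave the same way.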

Proper synchronization is still needed, including thread locks if the threads themselves write to the file; but writing to files also releases the GIL, so the threads will still have little to no overhead on your Python interpreter.

Neither of the above is true for asyncio, which is designed for asynchronous networking, not hardware.

For the implementation, a thread pool is usually the most pythonic way to go about this: you just spawn as many workers as the number of instruments you connect to and let them do their work.

Since you are not using any asyncio features, you should be using multiprocessing.pool.ThreadPool with apply_async or imap_unordered, and the locks from the threading module for writing to disk; there is also threading.Barrier if you want to synchronize each frame across all threads.
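A minimal sketch putting those pieces together. The instrument objects and their read() method are hypothetical stand-ins for your driver calls; everything else is the standard library.

    import time
    import threading
    from multiprocessing.pool import ThreadPool

    class DAQ:
        def __init__(self, instruments, n_keep=1000, outfile="readings.csv"):
            self.instruments = instruments
            self.readings = [None] * len(instruments)  # current frame, one slot per instrument
            self.history = []                          # last n_keep frames, shared with the main thread
            self.n_keep = n_keep
            self.file = open(outfile, "w")
            self.lock = threading.Lock()               # guards history and the file
            self.running = False
            # the action runs in exactly one worker per frame, after every
            # worker has deposited its reading and before any is released
            self.barrier = threading.Barrier(len(instruments), action=self._commit_frame)
            self.pool = ThreadPool(len(instruments))   # one worker per instrument

        def _commit_frame(self):
            row = (time.time(), list(self.readings))   # one timestamp per row of readings
            with self.lock:
                self.history.append(row)
                del self.history[:-self.n_keep]        # keep only the last n_keep rows
                self.file.write(",".join(map(str, [row[0], *row[1]])) + "\n")

        def _worker(self, i):
            while self.running:
                self.readings[i] = self.instruments[i].read()  # slow IO; the driver releases the GIL
                try:
                    self.barrier.wait()                # synchronize the frame across instruments
                except threading.BrokenBarrierError:   # raised after stop() aborts the barrier
                    break

        def start(self):
            self.running = True
            for i in range(len(self.instruments)):
                self.pool.apply_async(self._worker, (i,))

        def stop(self):
            self.running = False
            self.barrier.abort()                       # release workers stuck at the barrier
            self.pool.close()
            self.pool.join()
            self.file.close()

With this, daq = DAQ([inst1, inst2]); daq.start() gets things going with no further interaction, and the main thread can read daq.history at any time (taking daq.lock if it needs a consistent snapshot). Note that the barrier makes every frame as slow as the slowest instrument, which is exactly the 'approximately synchronised' behaviour asked for.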
