How should I pass a function that runs 2 other functions from an external file to be able to use multiprocessing


I recently came across a module that lets me run my code concurrently, which happens to be exactly what I need. However, as I tried to implement it I ran into problems here and there and need some help with it.

So basically I need to run two functions from an external Python script, named genODE and runODE. genODE is my simulation step: it reads the model spec file and creates an output, which is then fed into runODE to generate another file. I have to run this exactly 30 times with different file names, and I want to run them concurrently so that each of the 30 runs creates its own unique output files.

Currently this is my code:

from sim import genODE, runODE
import multiprocessing

def sim(niteration):
    genODE(modelfile=f'./models/organism/Organism_{niteration}.modelspec', mtype='ASM', solver='RK4', timestep='1', endtime='21600', lowerbound='0;0', upperbound='1e-3;1e-3', odefile=f'organism_{niteration}.py')
    runODE(odefile=f'organism_{niteration}.py', sampling='500', resultfile=f'./models/simulation/simulation_{niteration}.csv')

if __name__ == "__main__":
    for niteration in range(1, 31):
        simulation = multiprocessing.Process(target=sim, args=(niteration,))
        simulation.start()
    simulation.join()

However, I have been getting errors and have been unable to get it to work. Currently I'm getting the error: Can't get attribute 'sim' on <module '__main__'>.

CodePudding user response:

Here's a framework suggestion for introducing some concurrency to this problem:

from concurrent.futures import ProcessPoolExecutor

def genODE():
    pass
def runODE():
    pass
def sim(n):
    print(n)
    genODE()
    runODE()

def main():
    with ProcessPoolExecutor() as executor:
        executor.map(sim, range(1, 31))

if __name__ == '__main__':
    main()

Leaving it up to OP to "fill in the gaps".

Depending on the OS and/or potential memory constraints, consider specifying max_workers=N in the ProcessPoolExecutor constructor.
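
For illustration, here is one way those gaps might be filled in, wiring the question's genODE/runODE calls into the pool. The import, file paths and keyword arguments are simply copied from the question, and max_workers=4 is only an example value:

from concurrent.futures import ProcessPoolExecutor
from sim import genODE, runODE

def sim(niteration):
    # generate the ODE file for this iteration, then run it
    genODE(modelfile=f'./models/organism/Organism_{niteration}.modelspec',
           mtype='ASM', solver='RK4', timestep='1', endtime='21600',
           lowerbound='0;0', upperbound='1e-3;1e-3',
           odefile=f'organism_{niteration}.py')
    runODE(odefile=f'organism_{niteration}.py', sampling='500',
           resultfile=f'./models/simulation/simulation_{niteration}.csv')

def main():
    # cap the number of worker processes to limit memory use
    with ProcessPoolExecutor(max_workers=4) as executor:
        # results (and any worker exceptions) surface only when the
        # returned iterator is consumed, so force it with list()
        list(executor.map(sim, range(1, 31)))

if __name__ == '__main__':
    main()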

CodePudding user response:

This is a full working example of a solution using the producer-consumer pattern with a queue and multiple threads.

from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import as_completed
from queue import Queue
import numpy as np
import pandas as pd


def genODE(itmun):
    dates = pd.date_range('20130101', periods=6)
    pdf = pd.DataFrame(np.random.randn(6, 4), index=dates, columns=list('ABCD'))
    pdf['itmun'] = itmun
    return pdf

def runODE(pdf):
    itnum = pdf['itmun'].iloc[0]
    pdf['itmun'] = pdf['itmun'] + 1
    return pdf

def producer(queue, itmun):
    # apply your function to create the data
    data = genODE(itmun)
    # put the data in the queue
    queue.put(data) 

def consumer(queue):
    while not queue.empty(): 
        data = queue.get() 
        # do somework on the data by running your function
        result_data = runODE(data)
        itnum = result_data['itmun'].iloc[0]
        result_data.to_csv(f'{itnum}test.csv')
        queue.task_done()
        return result_data 


def main():
    q = Queue() 
    futures = []
    with ThreadPoolExecutor(max_workers=15) as executor:
        for p in range(30):
            producer_future = executor.submit(producer, q, p)
            futures.append(producer_future) 
        for c in range(30):
            consumer_future = executor.submit(consumer, q)
            futures.append(consumer_future)

        for future in as_completed(futures):
            # do something with the result as soon as it is available, such as saving or printing, or nothing if you have already done what you need in runODE()
            print(future.result()) 

if __name__ == "__main__":
    main()
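
To map this pattern onto the question's workflow, one possible sketch (assuming the import, file names and keyword arguments from the original post) is to have each producer run genODE and put the generated ODE file on the queue, with each consumer picking one up and running runODE on it. queue.get() is used in its blocking form here, which is safe because exactly as many consumers as producers are submitted:

from concurrent.futures import ThreadPoolExecutor, as_completed
from queue import Queue
from sim import genODE, runODE

def producer(queue, niteration):
    # stage 1: generate the ODE file for this iteration
    odefile = f'organism_{niteration}.py'
    genODE(modelfile=f'./models/organism/Organism_{niteration}.modelspec',
           mtype='ASM', solver='RK4', timestep='1', endtime='21600',
           lowerbound='0;0', upperbound='1e-3;1e-3', odefile=odefile)
    queue.put((niteration, odefile))

def consumer(queue):
    # stage 2: run one ODE file produced by some producer
    niteration, odefile = queue.get()
    runODE(odefile=odefile, sampling='500',
           resultfile=f'./models/simulation/simulation_{niteration}.csv')
    queue.task_done()
    return niteration

def main():
    q = Queue()
    futures = []
    with ThreadPoolExecutor(max_workers=15) as executor:
        for n in range(1, 31):
            futures.append(executor.submit(producer, q, n))
        for _ in range(1, 31):
            futures.append(executor.submit(consumer, q))
        for future in as_completed(futures):
            future.result()  # re-raise any exception from a worker

if __name__ == "__main__":
    main()

Note that queue.Queue is only shared between threads; if genODE/runODE are CPU-bound and you switch to ProcessPoolExecutor, a process-safe queue such as multiprocessing.Manager().Queue() would be needed instead.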