Python execute a function in parallel in loop

Time:01-05

I am trying to improve the execution time of a script that imports data from CSV files into a Graphite/Go-Carbon time-series database.

This is the loop that goes through all the zip files and passes each one to the function execute_run. I tried this code, but I got an error:

    for idx4, Lst_f in enumerate(full_csvfile_paths):
       if lst_metrics in Lst_f:
          zip_file = Lst_f
          with zipfile.ZipFile(zip_file) as zipobj:
             print("Using ZipFile:",zipobj.filename)
             #execute_run(zipobj.filename, confcsv_path, storage_type, serial)
             output = subprocess.run(execute_run(zipobj.filename, confcsv_path, storage_type, serial),stdout=subprocess.PIPE)
             print ("Return code: %i" % output.returncode)
             print ("Output data: %s" % output.stdout)

Error:

Traceback (most recent call last):
  File "./02-pickle-client.py", line 451, in <module>
    main()
  File "./02-pickle-client.py", line 361, in main
    output = subprocess.run(execute_run(zipobj.filename, confcsv_path, storage_type, serial),stdout=subprocess.PIPE)
  File "/usr/lib64/python3.6/subprocess.py", line 423, in run
    with Popen(*popenargs, **kwargs) as process:
  File "/usr/lib64/python3.6/subprocess.py", line 729, in __init__
    restore_signals, start_new_session)
  File "/usr/lib64/python3.6/subprocess.py", line 1240, in _execute_child
    args = list(args)
TypeError: 'NoneType' object is not iterable

Is there a way to run the function execute_run X times in parallel and check that each run completed correctly?

Many thanks for your help.

CodePudding user response:

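The error comes from how subprocess.run is called: execute_run(...) is executed first, in the current process, and its return value (None here) is what gets passed to subprocess.run, which expects a command to launch as its argument list. subprocess.run starts external programs, so it cannot run a Python function in parallel. Instead, I would recommend using multiprocessing.Pool with its map or starmap method, as described in the multiprocessing docs.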

This could look something like this:

    import multiprocessing as mp

    # Step 1: Create a multiprocessing.Pool and specify the number of worker processes (here 4).
    pool = mp.Pool(4)

    # Step 2: Use pool.starmap, which takes an iterable of argument tuples
    # and calls My_Function(*args) once per tuple (here once per item in data).
    results = pool.starmap(My_Function, [(item, variable2, variable3) for item in data])

    # Step 3: Don't forget to close the pool and wait for the workers to finish.
    pool.close()
    pool.join()
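
Applied to the loop from the question, this could be sketched roughly as follows. The body of execute_run is not shown in the question, so the version below is only a stand-in. The names safe_execute_run, run_all and pool_cls are hypothetical helpers added for illustration. The wrapper catches exceptions so the parent process gets one (path, ok, error) tuple per zip file, which is one possible way to check that each run completed.

```python
import multiprocessing as mp
from multiprocessing.pool import ThreadPool  # same API as mp.Pool, backed by threads


def execute_run(zip_path, confcsv_path, storage_type, serial):
    """Stand-in for the question's execute_run; the real body is not shown there."""
    if not zip_path.endswith(".zip"):
        raise ValueError(f"not a zip file: {zip_path}")
    return None  # the traceback suggests the real function returns None


def safe_execute_run(zip_path, confcsv_path, storage_type, serial):
    """Run one import and report success or failure instead of raising."""
    try:
        execute_run(zip_path, confcsv_path, storage_type, serial)
        return (zip_path, True, "")
    except Exception as exc:  # hand any failure back to the parent as data
        return (zip_path, False, repr(exc))


def run_all(zip_paths, confcsv_path, storage_type, serial,
            workers=4, pool_cls=mp.Pool):
    """Call safe_execute_run once per zip file using `workers` parallel workers.

    pool_cls is a hypothetical hook: pass ThreadPool to use threads
    instead of processes (for example when the work is mostly I/O).
    """
    args = [(path, confcsv_path, storage_type, serial) for path in zip_paths]
    with pool_cls(workers) as pool:
        # starmap unpacks each tuple into positional arguments, blocks until
        # every call has finished, and returns results in input order.
        return pool.starmap(safe_execute_run, args)
```

In the original script, run_all would receive the entries of full_csvfile_paths that match lst_metrics, and the returned list can then be scanned for entries whose second field is False. With the default process pool, the call should sit under an `if __name__ == "__main__":` guard, which multiprocessing needs on platforms that start workers by spawning a fresh interpreter.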