multiprocessing multiple files in loop and store results in dictionary (python 3)

Time:03-30

I want to execute a function in parallel on multiple files and store the results in a dictionary with the file name as the key.

But what I get from it is only <multiprocessing.pool.ApplyResult at 0x7f37065fac40> for each entry of the dictionary.

How can I get the results in each dictionary entry directly?

Additionally, I would like to monitor progress on the overall task (how many of the files have been processed, for example a print saying file i/total).

I tried the following:

from multiprocessing import Pool
import os

def process(file):
    # processings ...
    return results


pool = Pool()
result_dict = {}
for file in os.listdir("<DIRPATH>"):
    result_dict[file] = pool.apply_async(process, file)
pool.close()
pool.join()

CodePudding user response:

The multiprocessing.pool.Pool.apply_async method returns a multiprocessing.pool.AsyncResult instance that represents a "future" result. That is, to get the actual result you have to call the get method on this instance, which blocks until the actual result is available and then returns it. Note also that the arguments for the worker function must be passed as a tuple, e.g. args=(file,); passing the bare string causes it to be unpacked character by character. So, you need to modify your code as follows:

from multiprocessing import Pool
import os

def process(file):
    # processings ...
    return results


pool = Pool()
result_dict = {}
for file in os.listdir("<DIRPATH>"):
    # args must be a tuple, otherwise the string would be unpacked character by character
    result_dict[file] = pool.apply_async(process, args=(file,))
for k, v in result_dict.items():
    # Replace each AsyncResult with the actual return value (get blocks until ready):
    result_dict[k] = v.get()
pool.close()
pool.join()

As for your second question, about showing progress: please post that as a new question (a single post should not combine multiple unrelated questions). Once you have done so, you may add a comment to this answer with a link to the new question, and if I am able I will take a look at it.
