How can I multiprocess a single function thousands of times?

Time:03-23

I'm using OpenCV to compare thousands of images to one reference image. The process is very lengthy and I'm considering multiprocessing as a way to accelerate it.

How can I run the "cv.matchTemplate(...)" function on each image in parallel, without any image being processed more than once?

def myFunction():
    values_for_each_image = []

    for image in thousands_of_images:
        result = cv.matchTemplate(reference_image, image, cv.TM_CCOEFF_NORMED)
        values_for_each_image.append(result[1])

    return values_for_each_image

Theoretically, I know that I could do something like this (but it's unrealistic for thousands of images):

def do_image1():
    return cv.matchTemplate(reference_image, image1, cv.TM_CCOEFF_NORMED)

def do_image2():
    return cv.matchTemplate(reference_image, image2, cv.TM_CCOEFF_NORMED)

p1 = multiprocessing.Process(target=do_image1)
p2 = multiprocessing.Process(target=do_image2)

if __name__ == '__main__':
    p1.start()
    p2.start()
...

CodePudding user response:

This is how I would solve it using concurrent.futures:

import cv2 as cv
from concurrent.futures import ThreadPoolExecutor, as_completed

def do_image(reference_image, image):
    return cv.matchTemplate(reference_image, image, cv.TM_CCOEFF_NORMED)

def myFunction():
    values_for_each_image = []
    with ThreadPoolExecutor(20) as executor:
        # Submit one task per image, so each image is processed exactly once.
        futures = {executor.submit(do_image, reference_image, image) for image in thousands_of_images}
        # as_completed yields futures as they finish, so results are NOT in input order.
        for future in as_completed(futures):
            values_for_each_image.append(future.result())
    return values_for_each_image
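
If each result needs to line up with its input image, as_completed is awkward because it yields futures in completion order. A minimal sketch of an order-preserving variant using executor.map follows. The names match_all and fake_match are hypothetical, and fake_match is only a pure-Python stand-in for cv.matchTemplate so that the sketch runs without OpenCV installed:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def match_all(match_fn, reference_image, images, max_workers=8):
    """Apply match_fn(reference_image, image) to every image, in input order."""
    # partial() pins the reference image so each task receives a single image.
    task = partial(match_fn, reference_image)
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map yields results in the order of `images`,
        # regardless of which task finishes first.
        return list(executor.map(task, images))


def fake_match(reference, image):
    """Hypothetical stand-in for cv.matchTemplate (assumption, not OpenCV's API)."""
    return sum(a * b for a, b in zip(reference, image))


if __name__ == "__main__":
    reference = [1, 2, 3]
    images = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(match_all(fake_match, reference, images))  # prints [1, 2, 3]
```

With OpenCV, match_fn could be a small function that calls cv.matchTemplate(reference_image, image, cv.TM_CCOEFF_NORMED). Threads only speed this up if the underlying call releases the GIL, so it is worth timing against the plain loop before settling on a worker count.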

CodePudding user response:

You could use Pool from Python's multiprocessing library instead of managing individual processes. The pool launches the worker processes and hands tasks to them as needed.

Since your function does not take any parameters, I would say that you could get the best results by using the pool's apply or apply_async methods.
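
A rough sketch of the apply_async approach, assuming the work is first split into a per-image function. The names match_all and fake_match are hypothetical, and fake_match is a pure-Python stand-in for the cv.matchTemplate call so the sketch runs without OpenCV:

```python
import multiprocessing


def fake_match(reference, image):
    """Hypothetical stand-in for cv.matchTemplate(reference, image, cv.TM_CCOEFF_NORMED)."""
    return sum(a * b for a, b in zip(reference, image))


def match_all(reference_image, images, processes=None):
    """Score every image against the reference using a pool of worker processes."""
    with multiprocessing.Pool(processes=processes) as pool:
        # Queue one task per image. The worker must be a module-level
        # function so that it can be pickled and sent to the child processes.
        pending = [
            pool.apply_async(fake_match, (reference_image, image))
            for image in images
        ]
        # get() blocks until that task is done; iterating over `pending`
        # in submission order keeps the results aligned with `images`.
        return [task.get() for task in pending]


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing this module.
    reference = [1, 2, 3]
    images = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    print(match_all(reference, images, processes=2))  # prints [1, 2, 3]
```

Each task pickles its arguments, so the reference image is sent to a worker once per call. For large images that overhead may matter. pool.map with a chunksize is a possible alternative when every task has the same shape.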
