Multiprocessing a member function of a class stored in an array in Python


I program mostly in C and C++ and recently converted a project over to Python, except I haven't been able to convert the multiprocessing as easily.

In the example I have an array filled with instances of a ball class, which has a member function named update that takes 3 arguments.

It's shown below; the instances are stored in an array called balls. I've gone through plenty of posts, documentation, and videos and haven't found anything covering this; a few get close but don't show how to deal with the arguments being passed in.

Ideally I would create a process pool and let it split the work up. I need to retrieve the objects and update the ones in the original process's address space.

I'm not sure, but it looks like it may be easier to have update return a tuple with all the changed data and then write another function that updates the class from it.

Feedback on the best way to do this in Python is appreciated. Also, I value performance over ease of implementation; that's the point of doing this, after all. Thanks in advance.

class Ball:

    def __init__(self,x,y,vx,vy,c):
        self.x=x
        self.y=y
        self.vx=vx
        self.vy=vy
        self.color=c
        return
    @classmethod
    def update(self,w,h,t):
        time = float(t)/float(1000000)
        #print(time)
        xp = float(self.vx)*float(time)
        yp = float(self.vy)*float(time)
        self.x += xp
        self.y += yp
        #print (str(xp) + "," + str(yp))
        if self.x<32:
            self.vx = 0 - self.vx
            self.x += (32-self.x)
        if self.y<32:
            self.vy = 0 - self.vy
            self.y += (32-self.y)
        if self.x+32>w:
            self.vx = 0 - self.vx
            self.x -= (self.x+32)-w
        if self.y+32>h:
            self.vy = 0 - self.vy
            self.y -= (self.y+32)-h
        return

The balls are updated via the following method:

def play_u(self):
    t = self.gt.elapsed_time()
    self.gt.set_timer()
    for i in self.balls:
        i.update(self.width,self.height,t)
    return

CodePudding user response:

Here's an idea of how you might call update against multiple Ball objects with the same arguments in parallel. Here I am using the multiprocessing.pool.Pool class.

Because Python serializes/deserializes the Ball object when it is sent from the main process to the pool process that executes the task, any modifications to the object will not be reflected back in the copy that "lives" in the main process (as you found out). But that does not prevent update from returning a list (or tuple) of the attributes it has modified, which the main process can then use to update its own copy of the object.
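To make the copy semantics concrete, here is a minimal sketch (the Counter class and bump_in_place function are made up for illustration) showing that a worker only mutates its own pickled copy, so the new state has to travel back as a return value:

from multiprocessing import Pool

class Counter:
    def __init__(self):
        self.n = 0

def bump_in_place(c):
    c.n += 1    # mutates the worker's pickled copy only
    return c.n  # the updated state must be returned explicitly

if __name__ == '__main__':
    counter = Counter()
    with Pool(1) as pool:
        result = pool.apply(bump_in_place, (counter,))
    print(counter.n)  # 0 -- the main process's copy is untouched
    print(result)     # 1 -- the new value comes back as the return value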

class Ball:
    # If this is a class constant, then it can and should stay here:
    radius = 32

    def __init__(self, x, y, vx, vy, c):
        self.x = x
        self.y = y
        self.vx = vx
        self.vy = vy
        self.color = c
        return

    def update(self, w, h, t):
        time = float(t) / 1000000.0
        #print(time)
        xp = float(self.vx) * float(time)
        yp = float(self.vy) * float(time)
        self.x += xp
        self.y += yp
        #print(str(xp) + "," + str(yp))
        if self.x < 32:
            self.vx = 0 - self.vx
            self.x += (32 - self.x)
        if self.y < 32:
            self.vy = 0 - self.vy
            self.y += (32 - self.y)
        if self.x + 32 > w:
            self.vx = 0 - self.vx
            self.x -= (self.x + 32) - w
        if self.y + 32 > h:
            self.vy = 0 - self.vy
            self.y -= (self.y + 32) - h
        # Return tuple of attributes that have changed
        # (Not used by serial benchmark)
        return (self.x, self.y, self.vx, self.vy)

    def __repr__(self):
        """
        Return internal dictionary of attributes as a string
        """
        return str(self.__dict__)

def prepare_benchmark():
    balls = [Ball(1, 2, 3, 4, 5) for _ in range(1000)]
    arg_list = (3.0, 4.0, 1.0)
    return balls, arg_list

def serial(balls, arg_list):
    for ball in balls:
        ball.update(*arg_list)

def parallel_updater(arg_list, ball):
    return ball.update(*arg_list)

def parallel(pool, balls, arg_list):
    from functools import partial

    worker = partial(parallel_updater, arg_list)
    results = pool.map(worker, balls)
    for idx, result in enumerate(results):
        ball = balls[idx]
        # unpack:
        ball.x, ball.y, ball.vx, ball.vy = result

def parallel2(pool, balls, arg_list):
    results = [pool.apply_async(ball.update, args=arg_list) for ball in balls]
    for idx, result in enumerate(results):
        ball = balls[idx]
        # unpack:
        ball.x, ball.y, ball.vx, ball.vy = result.get()

def main():
    import time

    # Serial performance:
    balls, arg_list = prepare_benchmark()
    t = time.perf_counter()
    serial(balls, arg_list)
    elapsed = time.perf_counter() - t
    print(balls[0])
    print('Serial elapsed time:', elapsed)

    print()
    print('-'*80)
    print()

    # Parallel performance using map
    # We won't even include the time it takes to create the pool
    from multiprocessing import Pool
    pool = Pool() # pool size is 8 on my desktop
    balls, arg_list = prepare_benchmark()
    t = time.perf_counter()
    parallel(pool, balls, arg_list)
    elapsed = time.perf_counter() - t
    print(balls[0])
    print('Parallel elapsed time:', elapsed)

    print()
    print('-'*80)
    print()

    # Parallel performance using apply_async
    balls, arg_list = prepare_benchmark()
    t = time.perf_counter()
    parallel2(pool, balls, arg_list)
    elapsed = time.perf_counter() - t
    print(balls[0])
    print('Parallel2 elapsed time:', elapsed)


    pool.close()
    pool.join()


# Required for windows
if __name__ == '__main__':
    main()

Prints:

{'x': -29.0, 'y': -28.0, 'vx': 3, 'vy': 4, 'color': 5}
Serial elapsed time: 0.0018328999999999984

--------------------------------------------------------------------------------

{'x': -29.0, 'y': -28.0, 'vx': 3, 'vy': 4, 'color': 5}
Parallel elapsed time: 0.236945

--------------------------------------------------------------------------------

{'x': -29.0, 'y': -28.0, 'vx': 3, 'vy': 4, 'color': 5}
Parallel2 elapsed time: 0.1460790000000000

I used nonsense arguments for everything, but you can see that the overhead of the serialization/deserialization and of updating the main process's objects cannot be compensated for by processing the 1,000 calls in parallel when the worker function is as trivial as update.
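One way to claw back some of that overhead, assuming the same balls list and parallel_updater worker from above, is Pool.map's chunksize argument, which batches tasks so fewer pickling round trips cross the process boundary; the value 125 below is just an arbitrary guess (1,000 balls over 8 pool processes):

def parallel_chunked(pool, balls, arg_list):
    from functools import partial

    worker = partial(parallel_updater, arg_list)
    # Ship one pickled batch of 125 balls per round trip instead of
    # letting map pick its (usually smaller) default chunk size:
    results = pool.map(worker, balls, chunksize=125)
    for ball, result in zip(balls, results):
        ball.x, ball.y, ball.vx, ball.vy = result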

Note that benchmark Parallel2, which uses method apply_async, is actually more performant in this case than benchmark Parallel, which uses method map, which is a bit surprising. My guess is that this is due in part to having to use functools.partial to convey the additional, non-changing w, h, and t arguments (as arg_list) to worker function parallel_updater: the partial wrapper is one extra function call and parallel_updater itself is another. So that's a total of two more function calls that benchmark Parallel has to make for each ball update.
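If those two extra calls matter, one alternative worth sketching (not benchmarked here; parallel3, init_worker, and update_ball are names I'm making up) is to ship the constant w, h, and t values to each pool process once via the pool's initializer, so map can dispatch to a bare one-argument function:

from multiprocessing import Pool

_worker_args = None  # set once in each pool process by init_worker

def init_worker(arg_list):
    global _worker_args
    _worker_args = arg_list

def update_ball(ball):
    # Calls Ball.update directly; no partial wrapper needed
    return ball.update(*_worker_args)

def parallel3(balls, arg_list):
    with Pool(initializer=init_worker, initargs=(arg_list,)) as pool:
        results = pool.map(update_ball, balls)
    for ball, result in zip(balls, results):
        ball.x, ball.y, ball.vx, ball.vy = result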
