Using "map function" in python to reduce time of processing


I am trying to run a loop 100,000 times. I have used the map function, as shown below, to divide the work between processors and reduce the processing time.

However, I have to pass the variable as an argument to the map function, which makes it slower than when I define the variable inside the function itself. The problem with defining the variable inside the function is that it is generated by a random function, so every time a different process calls the function it produces a new random Gaussian sample, which is not what I want.

Hence, as a solution, I defined the random Gaussian variable outside the main function and passed it as an argument. But now the map takes even more time to process. Can anyone please help me reduce the time of the map processing, or suggest where to define the random Gaussian variable so that it is calculated once and picked up by the different processes?

Defining the random Gaussian variable to pass as an argument to the map function:

Code

import numpy as np
from multiprocessing import Pool

def E_and_P(Velocity_s, Position_s, tb):
    # ... (setup omitted by the asker)
    for index in range(0, 4000):
        ...  # loop body omitted by the asker
    return X_position, Y_position, Z_position, VX_Vel, VY_Vel

if __name__ == "__main__":        
    Velocity_mu = 0
    Velocity_sigma = 1*1e8  # mean and standard deviation
    Velocity_s = np.random.normal(Velocity_mu, Velocity_sigma, 100000)
    print("Velocity_s =", Velocity_s)
    #print("Velocity_s=", Velocity_s)
    
    Position_mu = 0
    Position_sigma = 1*1e-9  # mean and standard deviation
    Position_s = np.random.normal(Position_mu, Position_sigma, 100000)  
    #print("Position_s =",  Position_s)
    
    tb = range(100000)
    #print("tb=",tb)
    
    items = [(Velocity_s, Position_s,tb) for tb in range(100000)]
    p = Pool(processes=4)
    result = p.starmap(E_and_P, items)
    p.close()
    p.join()

Please help or suggest some new ways.

CodePudding user response:

Based on your last comment, you could change this line:

items = [(Velocity_s, Position_s,tb) for tb in range(100000)]

to:

items = [(Velocity_s[tb], Position_s[tb],tb) for tb in range(100000)]

Each element of items is now a simple 3-number tuple: one velocity, one position, and one index.

You will also have to change E_and_P, since its arguments are now 3 scalars: one velocity, one position, and one index.

def E_and_P(vel, pos, tb):
    ...  # body omitted, now working on single values instead of whole arrays
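
Putting it together, a minimal end-to-end sketch of the suggested change could look like this (the body of E_and_P is a placeholder, since the real computation isn't shown in the question):

import numpy as np
from multiprocessing import Pool

def E_and_P(vel, pos, tb):
    # Placeholder computation: the real function would build its trajectory
    # from vel, pos and tb; only the argument handling matters here.
    return vel * 1e-9, pos + vel * 1e-9, tb

if __name__ == "__main__":
    Velocity_s = np.random.normal(0, 1e8, 100000)   # generated once, in the parent
    Position_s = np.random.normal(0, 1e-9, 100000)  # generated once, in the parent

    # One small tuple per task: a single velocity, a single position, an index.
    items = [(Velocity_s[tb], Position_s[tb], tb) for tb in range(100000)]

    with Pool(processes=4) as p:
        result = p.starmap(E_and_P, items)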

This should dramatically improve the performance. When using multiprocessing, keep in mind that different processes do not share an address space. All the data that gets exchanged between processes must be serialized (pickled) on one end and rebuilt as Python objects on the other end. As I indicated in my comment, your original implementation resulted in about 20 billion values being serialized. This approach still has 100,000 steps, but each step needs to serialize only 3 numbers. So roughly 300,000 serialized values instead of 20,000,000,000.
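
One quick way to see the size difference is to pickle a single element of items in each form and compare the byte counts; a rough sketch:

import pickle
import numpy as np

Velocity_s = np.random.normal(0, 1e8, 100000)
Position_s = np.random.normal(0, 1e-9, 100000)

# Original form: each task carries both full 100,000-element arrays.
big_item = (Velocity_s, Position_s, 0)
# Revised form: each task carries just three numbers.
small_item = (Velocity_s[0], Position_s[0], 0)

print(len(pickle.dumps(big_item)))    # on the order of 1.6 MB per task
print(len(pickle.dumps(small_item)))  # well under a kilobyte per task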
