Python multiprocessing parallel computing on a small supercomputer and a server, and what are the s

Time:09-16

I need to process Gaia satellite data. The data set is large, so I had to learn parallel processing, and I have run into some problems.
Part of my approach, expressed as code, is as follows:
# Input data: data1 and data2. data1 is fixed and is cut into n pieces so that n processes can work on it in parallel; data2 can be swapped for other data sets to repeat the same calculation.
from multiprocessing import Process, Queue, cpu_count
from numpy import arange, ix_, reshape


# Function run by each process (kept at module level so it can be pickled when processes are spawned rather than forked):
def run_process(que, xydata1, xydata2, MAx, MAy):
    match = {}  # result dictionary
    # ... calculation omitted here ... result:
    match[key, Mwhere, Gwhere] = mind
    # return the result of the current process:
    que.put(match)


class Data1And2():

    def __init__(self, data1, data2):
        # '// 2' because each core runs two threads; 1 is not subtracted because this ThinkPad has only 2 CPUs
        self.ncpu = max(1, cpu_count() // 2)
        self.data1 = data1
        self.data2 = data2

    # Main routine (contains the parallel computation):
    def def0(self):

        # cut data1 into n pieces, one per process:
        chunks = [{} for _ in range(self.ncpu)]
        data1size = len(self.data1) // self.ncpu + 1
        for idx, (key, xydata1) in enumerate(self.data1.items()):
            chunks[idx // data1size][key] = xydata1

        # to minimize repeated operations, store the shared data in higher-dimensional arrays:
        d0 = arange(600, 800, 2)
        A0 = arange(10, 80, 2)
        d, A = ix_(d0, A0)
        num_d = d.size
        num_a = A.size
        MAx = reshape(A + d - d, (num_d * num_a, 1, 1))
        MAy = reshape(A + d, (num_a * num_d, 1, 1))

        que = Queue()
        # start each process:
        processes = []
        for idata in range(self.ncpu):
            pro = Process(target=run_process, args=(que, chunks[idata], self.data2, MAx, MAy))
            pro.start()
            processes.append(pro)

        # merge the results of each process (read the queue before joining, so that no worker stays blocked on a full pipe):
        match = {}
        for idata in range(self.ncpu):
            match.update(que.get(True))
        for pro in processes:
            pro.join()

        # return all results:
        return match
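
The split, compute, merge pattern above can also be written with `multiprocessing.Pool`, which handles starting the workers and collecting their return values. The following is only a minimal sketch under stated assumptions: the names `split_dict`, `match_chunk` and `parallel_match` are hypothetical, data1 is assumed to be a dict of key to (x, y) and data2 a list of (x, y), and the nearest-point distance is a placeholder for the calculation that was omitted from the post, not the actual method.

```python
from multiprocessing import Pool


def split_dict(data, n_chunks):
    """Split a dict into at most n_chunks smaller dicts of near-equal size."""
    items = list(data.items())
    # Ceiling division; at least 1 so that an empty dict does not give a zero step.
    size = max(1, -(-len(items) // n_chunks))
    return [dict(items[i:i + size]) for i in range(0, len(items), size)]


def match_chunk(chunk, data2):
    """Placeholder worker (assumption): squared distance from each data1
    point to its nearest data2 point. The real calculation is not shown
    in the post."""
    result = {}
    for key, (x1, y1) in chunk.items():
        result[key] = min((x1 - x2) ** 2 + (y1 - y2) ** 2 for x2, y2 in data2)
    return result


def parallel_match(data1, data2, n_workers=2):
    """Run match_chunk on each piece of data1 in a pool and merge the results."""
    chunks = split_dict(data1, n_workers)
    with Pool(processes=n_workers) as pool:
        # starmap unpacks each (chunk, data2) tuple into the worker's arguments.
        partial_results = pool.starmap(match_chunk, [(c, data2) for c in chunks])
    merged = {}
    for part in partial_results:
        merged.update(part)
    return merged
```

When trying this, `parallel_match(data1, data2)` should be called from under an `if __name__ == "__main__":` guard, because on platforms that spawn worker processes the main module is imported again in each worker. Note that data2 is pickled and sent to every worker, so for a very large data2 the transfer cost may offset part of the speed-up.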


Speed tests on different platforms:
One machine is a Mac (macOS, i7 2.8 GHz, 4 cores / 8 threads, 16 GB RAM); the ThinkPad runs Linux CentOS 7 (i5 2.5 GHz, 2 cores / 4 threads, 6 GB RAM); another is a Dell desktop running Linux Ubuntu (i7 3.9 GHz, 4 cores / 8 threads, 16 GB RAM);
There is also a server with 5 nodes and 100 cores at 2.5 GHz, memory unknown (no priority rules in use, few users);
And a small supercomputer with 25 nodes and 500 cores at 2.5 GHz, memory unknown (the rules allow only ten submitted tasks, and there are usually no other users; I know very little about servers and hardware, only what the teacher who administers the machine told me).
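
When comparing these machines, it may help to check how many CPUs a single Python process can actually use, since `multiprocessing` starts its workers on the machine (node) where the program runs, so the per-node core count matters rather than the cluster total. A minimal sketch follows; the helper name `usable_cpus` is hypothetical, and whether a job scheduler restricts the affinity mask depends on how the cluster is configured.

```python
import os


def usable_cpus():
    """Return the number of CPUs this process is allowed to run on."""
    try:
        # Linux: respects CPU affinity masks (set by e.g. taskset or a scheduler).
        return len(os.sched_getaffinity(0))
    except AttributeError:
        # Platforms without sched_getaffinity (e.g. macOS): fall back to the
        # logical CPU count, which may be None if it cannot be determined.
        return os.cpu_count() or 1
```

The result of `usable_cpus()` could replace `cpu_count()` when choosing the number of processes, so that a job does not start more workers than the cores it was given.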

(1) When the total size of the input data1 is 72: