So my goal is to use mpi4py to send the right column of matrix A to another process, where it should be written into the left column of matrix B. For example, we start with two numpy ndarrays of the following form:
[[1,2,3]   [[0,0,0]
 [4,5,6]    [0,0,0]
 [7,8,9]],  [0,0,0]]
And after sending, I want to have them as follows:
[[1,2,3]   [[3,0,0]
 [4,5,6]    [6,0,0]
 [7,8,9]],  [9,0,0]]
One way to do that is using structs in mpi4py. I don't want to receive the column into a separate buffer and then copy it into the matrix.
I tried to use MPI.INT.Create_vector to do that, but I don't seem to get the right struct, whatever I try. I have a test script, which I start with mpirun -n 2 python3 mpi_type_tester.py:
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
world_size = comm.Get_size()
rank = comm.Get_rank()
# Size of my send and receive matrix
height = 3
width = 3
# Variables used to define the struct
count = 3
blocklength = 1
stride = 3
# Int seemingly used to define how many of the structs are being sent?
sending_int = 1
# Here I define the struct with Create_vector:
column_type = MPI.INT.Create_vector(count = count,blocklength = blocklength,stride = stride)
column_type.Commit()
if rank == 0:
    send_array = np.arange(width*height).reshape(height,width)
    send_array += 1
    comm.Send([send_array,sending_int, column_type], dest = 1, tag = 0)
    print(send_array)
if rank == 1:
    rec_array = np.zeros(width*height, dtype = int).reshape(height, width)
    comm.Recv([rec_array,sending_int,column_type], source = 0, tag = 0)
    print(rec_array)
When I now vary count, blocklength, stride or sending_int, it just sends seemingly random things. Can someone help me understand this, or point me to some resources so I might understand Create_vector?
CodePudding user response:
You need to pay close attention that your data has the same datatype and size on both sides, and that you are telling MPI to send data of that size. For example, numpy's default integer type may be int32 or int64 depending on the platform, while MPI.INT is usually 32 bits wide, so you have to be explicit about your data types.
import numpy as np
from mpi4py import MPI
comm = MPI.COMM_WORLD
world_size = comm.Get_size()
rank = comm.Get_rank()
height = 3
width = 3
count = 3
blocklength = 1
stride = 3
sending_int = 1
# make sure you are sending exactly 64 bits per item.
column_type = MPI.INT64_T.Create_vector(count = count,blocklength = blocklength,stride = stride)
column_type.Commit()
if rank == 0:
    # specify dtype
    send_array = np.arange(width*height,dtype=np.int64).reshape(height,width)
    send_array += 1
    comm.Send([send_array,sending_int, column_type], dest = 1, tag = 0)
    print(send_array)
if rank == 1:
    # specify dtype
    rec_array = np.zeros(width*height, dtype = np.int64).reshape(height, width)
    comm.Recv([rec_array,sending_int,column_type], source = 0, tag = 0)
    print(rec_array)
I have stripped your comments and added comments where the datatypes are specified.
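To see why a size mismatch produces "seemingly random" values, here is a small sketch that needs neither MPI nor numpy. It assumes the array holds 64-bit integers in little-endian order and that the MPI type is 32 bits wide; both are typical but not guaranteed on every platform. The same bytes are reinterpreted as 32-bit integers, and the elements a vector type with count=3, blocklength=1, stride=3 would pick are read out.

```python
import struct

# The 3x3 matrix 1..9 stored as nine little-endian 64-bit integers
# (assumption: this mirrors an np.int64 array in memory).
raw = struct.pack("<9q", *range(1, 10))

# Reinterpret the same bytes as 32-bit integers, which is how a type
# built on a 32-bit MPI.INT would step through the buffer (assumption).
as_int32 = struct.unpack("<18i", raw)

# count=3, blocklength=1, stride=3 selects elements 0, 3 and 6.
picked = [as_int32[i] for i in (0, 3, 6)]
print(picked)  # [1, 0, 4]
```

Instead of the column 1, 4, 7, the 32-bit view lands on the low half of 1, the high half of 2 and the low half of 4, which is why matching the MPI type to the numpy dtype matters.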
This only sends the first column. To send the third one you can use Create_subarray instead, as it accepts the start to be [0,2].
For the vector type, see the documentation of MPI_Type_vector. It is useful when dealing with GPU buffers, because you can specify a block size separate from the stride, in case you are indexing into a buffer with other "datatypes".
# total size of matrix is 3x3, and you are sending a 3x1 array starting at [0,2]
column_type = MPI.INT64_T.Create_subarray([3,3], [3,1], [0,2],)
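As a rough way to compare the two type constructors, the sketch below computes which element offsets each one covers in a flat, row-major 3x3 buffer. The helper names vector_offsets and subarray_offsets are made up for this illustration and are not part of mpi4py; the subarray helper only handles the 2D C-order case, which is the default order of Create_subarray.

```python
def vector_offsets(count, blocklength, stride):
    """Element offsets covered by a vector type, relative to the buffer start."""
    return [b * stride + k for b in range(count) for k in range(blocklength)]


def subarray_offsets(sizes, subsizes, starts):
    """Element offsets of a 2D subarray inside a row-major (C order) array."""
    n_cols = sizes[1]
    return [
        (starts[0] + r) * n_cols + (starts[1] + c)
        for r in range(subsizes[0])
        for c in range(subsizes[1])
    ]


# The vector type always begins at offset 0, i.e. the first column.
print(vector_offsets(3, 1, 3))                   # [0, 3, 6]
# The subarray type carries its start [0, 2] with it, i.e. the third column.
print(subarray_offsets([3, 3], [3, 1], [0, 2]))  # [2, 5, 8]
```

With blocklength=2 the vector helper would return [0, 1, 3, 4, 6, 7], which may help when experimenting with the parameters.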
To send a different column than the one you receive into, create a different type inside the sending and the receiving block:
if rank == 0:
    column_type = MPI.INT64_T.Create_subarray([3, 3], [3, 1], [0, 2], )
    column_type.Commit()
    ...
if rank == 1:
    column_type = MPI.INT64_T.Create_subarray([3, 3], [3, 1], [0, 0], )
    column_type.Commit()
    ...
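The effect of using two different types can be imitated in plain Python, without MPI. This is only a sketch of the idea: the sender gathers the elements its type describes, and the receiver scatters the same number of elements into the positions its own type describes. The helper column_offsets is hypothetical and stands in for the two subarray types above.

```python
def column_offsets(n_rows, n_cols, col):
    """Flat row-major offsets of one column (stand-in for a subarray type)."""
    return [r * n_cols + col for r in range(n_rows)]


a = list(range(1, 10))  # sender's matrix A, flattened: 1..9
b = [0] * 9             # receiver's matrix B, flattened: all zeros

# "Send": gather the right column of A (start [0, 2]).
message = [a[i] for i in column_offsets(3, 3, 2)]

# "Recv": scatter the message into the left column of B (start [0, 0]).
for i, value in zip(column_offsets(3, 3, 0), message):
    b[i] = value

rows = [b[r * 3:(r + 1) * 3] for r in range(3)]
print(rows)  # [[3, 0, 0], [6, 0, 0], [9, 0, 0]]
```

This works because both types describe three 64-bit integers; only the layout in memory differs between the two sides.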