MPI provides point-to-point and collective communication capabilities. Point-to-point communication includes synchronous and asynchronous send/receive functions. Collective communication functions like broadcast, reduce, scatter, and gather efficiently distribute data among processes. MPI also supports irregular data packaging using packing/unpacking functions and derived datatypes.
• Introduction to MPI communication types: point-to-point, collective communication, data packaging.
• Details of point-to-point communication, including send and receive protocols: synchronous, buffered, and asynchronous operations, and implementation dependencies.
• Asynchronous send and receive using MPI_Isend and MPI_Irecv; detection of message completion with MPI_Wait and MPI_Test.
• Types of collective communication: broadcast, scatter, gather, and reduce, emphasizing synchronized communication across multiple nodes.
• Data packaging techniques in MPI for non-contiguous data, including MPI_Pack for sending and MPI_Unpack for receiving.
Point-to-Point Communication
Send and Receive
• MPI_Send/MPI_Recv provide point-to-point communication
– the synchronization protocol is not fully specified
• what are the possibilities?
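A minimal blocking exchange as a reference point for the protocols discussed next (a sketch; rank is assumed to come from MPI_Comm_rank, and the tag 0 and variable names are illustrative):

/* Sketch: blocking point-to-point exchange between ranks 0 and 1. */
int msg;
MPI_Status status;
if (rank == 0) {
    msg = 42;
    MPI_Send(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);           /* returns once msg is reusable */
} else if (rank == 1) {
    MPI_Recv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &status);  /* returns once msg is filled */
}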
Send and Receive Synchronization
• Fully Synchronized (Rendezvous)
– Send and Receive complete simultaneously
• whichever code reaches the Send/Receive first waits
– provides a synchronization point (up to network delays)
• Buffered
– Receive must wait until message is received
– Send completes when message is moved to the buffer, freeing the message memory for reuse
Send and Receive Synchronization
• Asynchronous
– Sending process may proceed immediately
• does not need to wait until message is copied to buffer
• must check for completion before using message memory
– Receiving process may proceed immediately
• will not have message to use until it is received
• must check for completion before using message
MPI Send and Receive
• MPI_Send/MPI_Recv are blocking, but buffering is unspecified
– MPI_Recv suspends until message is received
– MPI_Send may be fully synchronous or may be buffered
• implementation dependent
• Variants (MPI_Ssend, MPI_Bsend) allow synchronous or buffered sends to be specified explicitly, as sketched below
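A sketch of the explicit-mode variants; the argument lists match MPI_Send, and msg and the destination rank 1 are illustrative:

/* Synchronous mode: completes only when the matching receive has started. */
MPI_Ssend(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
/* Buffered mode: completes once msg is copied to a user-supplied buffer
   (a buffer must first be registered with MPI_Buffer_attach). */
MPI_Bsend(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);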
Asynchronous Send and Receive
• MPI_Isend() / MPI_Irecv() are non-blocking; control returns to the program after the call is made.
• Syntax is the same as for Send and Recv, except an MPI_Request* parameter is added to Isend and replaces the MPI_Status* for receive.
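A self-contained sketch of the non-blocking calls (the message value, tag 0, and two-rank layout are illustrative choices):

#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[]) {
    int rank, msg = 0;
    MPI_Request request;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        msg = 42;
        MPI_Isend(&msg, 1, MPI_INT, 1, 0, MPI_COMM_WORLD, &request);
        /* control returns immediately; msg must not be reused yet */
        MPI_Wait(&request, &status);    /* now msg is safe to reuse */
    } else if (rank == 1) {
        MPI_Irecv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
        /* control returns immediately; msg is not yet valid */
        MPI_Wait(&request, &status);    /* now msg holds the data */
        printf("rank 1 received %d\n", msg);
    }

    MPI_Finalize();
    return 0;
}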
Detecting Completion
• MPI_Wait(&request, &status)
– request matches the request on Isend or Irecv
– status returns status equivalent to the status for Recv when complete
– blocks for a send until the message is buffered or sent, so the message variable is free
– blocks for a receive until the message is received and ready
Detecting Completion
• MPI_Test(&request, &flag, &status)
– request, status as for MPI_Wait
– does not block
– flag indicates whether message is sent/received
– enables code that can repeatedly check for communication completion
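A sketch of a polling loop built on MPI_Test; do_other_work() is a hypothetical placeholder for computation overlapped with the transfer, and request/msg are as in the earlier Irecv sketch:

int flag = 0;
MPI_Irecv(&msg, 1, MPI_INT, 0, 0, MPI_COMM_WORLD, &request);
do {
    do_other_work();                    /* useful work while the message is in flight */
    MPI_Test(&request, &flag, &status); /* never blocks; sets flag nonzero on completion */
} while (!flag);
/* receive has completed; msg is now safe to use */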
Collective Communications
• One to Many (Broadcast, Scatter)
• Many to One (Reduce, Gather)
• Many to Many (Allreduce, Allgather)
Broadcast
• A selected processor sends to all other processors in the communicator
• Any type of message can be sent
• Size of message should be known by all (it could be broadcast first)
• Can be optimized within the system for any given architecture
MPI_Bcast() Syntax
MPI_Bcast(mess, count, MPI_INT, root, MPI_COMM_WORLD);

mess              pointer to message buffer
count             number of items sent
MPI_INT           type of item sent
root              sending processor
MPI_COMM_WORLD    communicator within which broadcast takes place

Note: count and type should be the same on all processors
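Putting the call in context, a minimal sketch (the array size 4 and its contents are arbitrary illustrative choices):

int mess[4];
int root = 0, rank;
MPI_Comm_rank(MPI_COMM_WORLD, &rank);
if (rank == root) { mess[0] = 1; mess[1] = 2; mess[2] = 3; mess[3] = 4; }
MPI_Bcast(mess, 4, MPI_INT, root, MPI_COMM_WORLD);
/* every rank, including root, now holds the same four values */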
Reduce
• All processors send to a single processor, the reverse of broadcast
• Information must be combined at the receiver
• Several combining functions available
– MAX, MIN, SUM, PROD, LAND, BAND, LOR, BOR, LXOR, BXOR, MAXLOC, MINLOC
MPI_Reduce() Syntax
MPI_Reduce(&dataIn, &result, count, MPI_DOUBLE, MPI_SUM, root, MPI_COMM_WORLD);

dataIn            data sent from each processor
result            stores result of combining operation
count             number of items in each of dataIn, result
MPI_DOUBLE        data type for dataIn, result
MPI_SUM           combining operation
root              rank of processor receiving data
MPI_COMM_WORLD    communicator
MPI_Reduce()
• Data and result may be arrays; the combining operation is applied element by element
• Illegal to alias dataIn and result
– avoids large overhead in the function definition
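A short sketch of an element-wise sum onto rank 0 (per-rank values are illustrative; rank is assumed set via MPI_Comm_rank):

double dataIn[2] = { rank * 1.0, rank * 2.0 };  /* distinct values on each rank */
double result[2];                               /* must not alias dataIn */
MPI_Reduce(dataIn, result, 2, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);
/* result holds the element-wise sums, on rank 0 only */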
MPI_Scatter()
• Spreads an array to all processors
• Source is an array on the sending processor
• Each receiver, including the sender, gets a piece of the array corresponding to its rank in the communicator
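A sketch, assuming rank and size come from MPI_Comm_rank/MPI_Comm_size and <stdlib.h> is included for malloc:

int *sendbuf = NULL;
int piece;
if (rank == 0) {                      /* the source array matters only at the root */
    sendbuf = malloc(size * sizeof(int));
    for (int i = 0; i < size; i++) sendbuf[i] = i * i;
}
MPI_Scatter(sendbuf, 1, MPI_INT, &piece, 1, MPI_INT, 0, MPI_COMM_WORLD);
/* each rank now holds sendbuf[rank] in piece, i.e. rank*rank */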
MPI_Gather()
• Opposite of Scatter
• Values on all processors (in the communicator) are collected into an array on the receiver
• Array locations correspond to ranks of processors
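The mirror-image sketch, under the same assumptions as the scatter example above:

int myval = rank * rank;              /* one value contributed per rank */
int *all = NULL;
if (rank == 0) all = malloc(size * sizeof(int));
MPI_Gather(&myval, 1, MPI_INT, all, 1, MPI_INT, 0, MPI_COMM_WORLD);
/* on rank 0, all[i] holds the value contributed by rank i */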
Many to Many Communications
• MPI_Allreduce
– Syntax like reduce, except no root parameter
– All nodes get result
• MPI_Allgather
– Syntax like gather, except no root parameter
– All nodes get resulting array
• Implemented underneath as a virtual butterfly network
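A sketch of the root-free form; every rank ends up with the global sum (the local value is illustrative):

double local = rank + 1.0, total;
MPI_Allreduce(&local, &total, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
/* total == 1 + 2 + ... + size on every rank, not just at a root */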
Data Packaging
• Needed to combine irregular, non-contiguous data into a single message
• Pack/unpack: explicitly pack data into a buffer, send, then unpack data from the buffer
• Derived datatypes: MPI heterogeneous data types which can be sent as a message
MPI_Pack() Syntax
MPI_Pack(Aptr, count, MPI_DOUBLE, buffer, size, &pos, MPI_COMM_WORLD);

Aptr              pointer to data to pack
count             number of items to pack
MPI_DOUBLE        type of items
buffer            buffer being packed
size              size of buffer (in bytes)
pos               position in buffer (in bytes), updated
MPI_COMM_WORLD    communicator
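A round-trip sketch: rank 0 packs an int and a double into one buffer and sends it; rank 1 unpacks in the same order (the 100-byte buffer size and values are arbitrary illustrative choices):

char buffer[100];
int pos = 0, n;
double x;
if (rank == 0) {
    n = 5; x = 3.14;
    MPI_Pack(&n, 1, MPI_INT, buffer, 100, &pos, MPI_COMM_WORLD);
    MPI_Pack(&x, 1, MPI_DOUBLE, buffer, 100, &pos, MPI_COMM_WORLD);
    MPI_Send(buffer, pos, MPI_PACKED, 1, 0, MPI_COMM_WORLD);   /* send only the pos bytes used */
} else if (rank == 1) {
    MPI_Status status;
    MPI_Recv(buffer, 100, MPI_PACKED, 0, 0, MPI_COMM_WORLD, &status);
    pos = 0;                                                   /* unpack from the start */
    MPI_Unpack(buffer, 100, &pos, &n, 1, MPI_INT, MPI_COMM_WORLD);
    MPI_Unpack(buffer, 100, &pos, &x, 1, MPI_DOUBLE, MPI_COMM_WORLD);
}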