Distributions can be viewed as abstract coordinate systems that allow us to align different collections with respect to each other. If two elements from two different collections are mapped to the same Distribution point, they will be allocated in the same processor memory. Consequently, if there is a data communication between two collections, it is best to align them so that costly interprocessor communication is minimized.
Unlike Templates in HPFF Fortran, Distributions in pC++ are first class objects. A Distribution is characterized by its number of dimensions, the size in each dimension and the function by which the Distribution is mapped to processors.
Current distribution functions allowed in pC++ include BLOCK, CYCLIC, and WHOLE. To understand these, consider the mapping of a one dimensional Distribution to a set of processor objects. The constructor for a one dimensional Distribution takes the form
Distribution distribution-object-name(size, &(a processors object), map-function);
For example, to make a one dimensional Distribution Q of size n over a processors object P with a BLOCK mapping, one would write
Distribution Q(n, &P, BLOCK);
The mapping function BLOCK means that the n positions in the distribution are mapped to processor objects by associating contiguous blocks of positions with each processor object so that the load is even. For example, if P has three threads and n is 12, positions 0 through 3 will go to thread 0, positions 4 through 7 to thread 1, and positions 8 through 11 to thread 2. In the case that the number of threads does not divide the size of the distribution, we use the rules given by the HPF Forum.
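The BLOCK rule can be sketched in plain C++. This is an illustrative model, not pC++ library code: the helper names blockOwner and cyclicOwner are hypothetical, and the sketch assumes that the HPF rule for an uneven division is a block size of ceiling(n/p), so that the last thread receives the shorter block. A round-robin model of CYCLIC is included for comparison.

```cpp
#include <cassert>

// Hypothetical helpers for illustration; they are not part of the pC++ API.

// BLOCK: contiguous blocks of size ceiling(n / p), assumed to follow the HPF rule.
int blockOwner(int i, int n, int p) {
    assert(p > 0 && 0 <= i && i < n);
    int blockSize = (n + p - 1) / p;  // ceiling of n / p
    return i / blockSize;
}

// CYCLIC: positions are dealt out to threads round-robin.
int cyclicOwner(int i, int n, int p) {
    assert(p > 0 && 0 <= i && i < n);
    return i % p;
}
```

Under this model, 12 positions over three threads give blocks of four positions each, and 10 positions over three threads give blocks of sizes 4, 4, and 2.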
Collections are sets of objects that are distributed over processors object threads by a two stage mapping process. The first stage is defined by specifying an alignment object, and the second stage is defined by a distribution. An alignment object defines the size and shape of a collection and the way in which it is mapped to a distribution.
The syntax for an Align constructor is given by
Align object-name( dimensions-of-collection, mapping-string);
For example, a one dimensional collection of size 100 mapped to a one dimensional Distribution of the same size would be defined by
Align OneD( 100, "[ALIGN( dummy[i], myDistribution[i])]");
The mapping string is designed to be similar to the HPF ALIGN directive. It is merely a notation to describe a linear affine imbedding of one lattice into another. In other words, we use array notation to say how one array is to align with another. In the example above it is the identity mapping. The mapping
ALIGN( dummy[i], myDistribution[2*i])
would map the source array to the even positions of the target. The mapping
ALIGN( dummy[i], myDistribution[2*i+1])
would map the source array to the odd positions of the target. However the mapping
ALIGN( dummy[2*i], myDistribution[i]) //<<< error
is in error because it does not define the odd members of the source. But the mapping
ALIGN( dummy[i], myDistribution[i/2])
defines a two to one mapping of the source to the target and is acceptable.
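The effect of these mapping strings can be illustrated by listing the target position assigned to each source index. The following is a minimal sketch in plain C++; it does not parse pC++ mapping strings, and the function names affineAlign and halfAlign are hypothetical.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: target positions for ALIGN(dummy[i], target[a*i + b]),
// listed for source indices i = 0 .. n-1.
std::vector<int> affineAlign(int n, int a, int b) {
    assert(n >= 0);
    std::vector<int> target(n);
    for (int i = 0; i < n; ++i) {
        target[i] = a * i + b;
    }
    return target;
}

// Hypothetical helper: target positions for ALIGN(dummy[i], target[i/2]),
// the two to one mapping (integer division).
std::vector<int> halfAlign(int n) {
    assert(n >= 0);
    std::vector<int> target(n);
    for (int i = 0; i < n; ++i) {
        target[i] = i / 2;
    }
    return target;
}
```

For a source of size 4, affineAlign(4, 2, 0) yields positions 0, 2, 4, 6, affineAlign(4, 2, 1) yields 1, 3, 5, 7, and halfAlign(4) yields 0, 0, 1, 1. Every source index receives a target in each case, which is what the erroneous form dummy[2*i] fails to provide.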
In version 1.0 of pC++, one can only define array shaped collections and Align is always dimensioned like an array. In future versions, more general Align constructors will be made available.
The mapping function is described in terms of a text string which corresponds to the HPFF Fortran alignment directive. It defines a mapping from the domain to a Distribution using dummy domain and dummy index names.
To map a two-dimensional matrix A to a set of processors, one can define a two-dimensional Distribution, align the matrix with the Distribution, and then map the Distribution to the processors. Suppose we have a 7 by 7 Distribution and a matrix of size 5 by 5, and suppose the Distribution will be mapped over the processor object threads so that an entire row of the Distribution is mapped to an individual processor, with row i mapped to processor i mod P.numProcs(). This mapping scheme corresponds to a CYCLIC map in the Distribution row dimension and a WHOLE map in the Distribution column dimension. The processors object and the Distribution can be defined as follows.
Processors P;
Distribution myDistribution(7, 7, &P, CYCLIC, WHOLE);
The alignment of the matrix to the Distribution is constructed by the declaration
Align myAlign(5, 5, "[ALIGN( dummy[i][j], myDistribution[i][j])]" );
DistributedMatrix<Complex> A(&myDistribution, &myAlign);
Notice that the alignment object myAlign defines a two-dimensional domain of size 5 by 5 and a mapping function.
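The composition of the two stages for this matrix can be modeled in a few lines of plain C++. This is only a sketch of the placement arithmetic under the CYCLIC and WHOLE maps described above; the type DistPoint and the functions alignIdentity, cyclicWholeOwner, and matrixRowsOnProc are hypothetical names and not part of pC++.

```cpp
#include <cassert>

// Hypothetical type: a point in the two-dimensional Distribution.
struct DistPoint {
    int row;
    int col;
};

// Stage 1 (alignment): the identity map ALIGN(dummy[i][j], myDistribution[i][j]).
DistPoint alignIdentity(int i, int j) {
    return DistPoint{i, j};
}

// Stage 2 (distribution): CYCLIC over rows, WHOLE over columns, so only the
// row index decides which processor thread owns a point.
int cyclicWholeOwner(DistPoint d, int numProcs) {
    assert(numProcs > 0 && d.row >= 0 && d.col >= 0);
    return d.row % numProcs;
}

// Number of matrix rows that land on a given processor thread.
int matrixRowsOnProc(int matrixRows, int numProcs, int proc) {
    int count = 0;
    for (int i = 0; i < matrixRows; ++i) {
        if (cyclicWholeOwner(alignIdentity(i, 0), numProcs) == proc) {
            ++count;
        }
    }
    return count;
}
```

Assuming three processor threads, the five matrix rows are split so that threads 0 and 1 each hold two rows and thread 2 holds one.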
We may now align the vectors to the same Distribution. The choice of the alignment is best determined by the way the collections are used. For example, suppose we wish to invoke the library function for matrix vector multiply as follows.
Y = A*X;
While the meaning and computational behavior of this expression is independent of alignment and distribution, we would achieve best performance if we aligned X with the first row of the matrix A and Y with the first column. (This is because the matrix vector algorithm, which is described in more detail later, broadcasts the operand vector along the columns of the array and then performs a reduction along rows.) The declarations take the form
Align XAlign(5, "[ALIGN( X[i], myDistribution[0][i])]");
Align YAlign(5, "[ALIGN( Y[i], myDistribution[i][0])]");
DistributedVector<Complex> X(&myDistribution, &XAlign);
DistributedVector<Complex> Y(&myDistribution, &YAlign);
The alignment and the Distribution form a two stage mapping, as illustrated in Figure 2. The array is mapped into the Distribution by the alignment object, and the Distribution definition defines the mapping to the processor set.
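To see why this alignment suits the matrix vector product, one can model where each element lands. The sketch below is plain C++ rather than pC++, and it assumes that X is aligned with row 0 of the Distribution, that Y is aligned with column 0, and that the Distribution is CYCLIC in rows and WHOLE in columns. The function names are hypothetical.

```cpp
#include <cassert>

// Hypothetical model, not pC++ API: owner of Distribution point (row, col)
// under CYCLIC rows and WHOLE columns.
int pointOwner(int row, int col, int numProcs) {
    assert(numProcs > 0 && row >= 0 && col >= 0);
    (void)col;  // WHOLE: the column index does not affect placement
    return row % numProcs;
}

// X aligned with the first row: X[i] maps to point (0, i).
int ownerOfX(int i, int numProcs) { return pointOwner(0, i, numProcs); }

// Y aligned with the first column: Y[i] maps to point (i, 0).
int ownerOfY(int i, int numProcs) { return pointOwner(i, 0, numProcs); }

// A aligned by the identity map: A[i][j] maps to point (i, j).
int ownerOfA(int i, int j, int numProcs) { return pointOwner(i, j, numProcs); }
```

In this model Y[i] is always placed on the thread that holds row i of A, so the result of the reduction along row i is stored where it is computed.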
Because all of the standard matrix-vector operators are overloaded with their mathematical meanings, pC++ permits expressions like
Y = A*X;
Y = Y + X;
even though X and Y are not aligned together. We emphasize that the meaning of the computation is independent of the alignment and distribution; the correct data movements will be generated so that the result will be the same.