Collection parallelism is based on the concurrent application of a class member function to every object in a large set. However, we do not always want every object to carry out the action; often the function should be applied only to a subset of the objects. pC++ provides a mechanism to select a vector subset of elements in a parallel action. We illustrate the use of this operator below.
To implement a parallel tree reduction, the collection declares the += operator as virtual so that it can be used in the MethodOfElement section.
Collection C: SuperKernel{
  public:
    C(Distribution *D, Align *A);
    ElementType reduce1();
  MethodOfElement:
    virtual ElementType &operator +=(ElementType &);
    void sumFrom(int toRight);
};
The element function sumFrom() accesses the element a given distance to
the right of itself and adds that element to its own value. To access an
element at a given offset from the current element (that is, this), one
must have access to the collection structure. This is accomplished through
the pointer ThisCollection, which the SuperKernel adds to each element.
There is a bug in version 1.0 of pC++: ThisCollection cannot be used in
an inlined MethodOfElement function defined in the body of the collection,
so we must define sumFrom outside the collection C as follows.
void C::sumFrom(int toRight){
    ElementType *ptr_neighbor;
    ptr_neighbor = (*ThisCollection)(Ident + toRight);
    (*this) += *ptr_neighbor;
}
There is another way to access an element that can be found by an offset
from the current position. The SuperKernel function Self can be
used to achieve the same result by replacing the statement
ptr_neighbor = (*ThisCollection)(Ident + toRight);
with the equivalent statement
ptr_neighbor = Self(toRight);
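
The relationship between the two forms can be pictured with a plain C++ analogy. This is only a sketch and not pC++ itself: MockCollection and MockElement are hypothetical stand-ins, the element value is assumed to be a double, and the real SuperKernel may handle remote elements quite differently.

```cpp
#include <cassert>
#include <vector>

// Plain C++ analogy, NOT pC++: hypothetical stand-ins for a
// one-dimensional collection and its elements.
struct MockCollection;  // forward declaration

struct MockElement {
    MockCollection *ThisCollection = nullptr;  // back pointer to the owner
    int Ident = 0;                             // index of this element
    double value = 0.0;                        // assumed element payload

    // Mirrors Self(offset): the element `offset` positions to the right.
    MockElement *Self(int offset);

    MockElement &operator+=(const MockElement &rhs) {
        value += rhs.value;
        return *this;
    }

    // Same shape as C::sumFrom in the text.
    void sumFrom(int toRight) { *this += *Self(toRight); }
};

struct MockCollection {
    std::vector<MockElement> elems;

    explicit MockCollection(int n) : elems(n) {
        for (int i = 0; i < n; ++i) {
            elems[i].ThisCollection = this;
            elems[i].Ident = i;
            elems[i].value = i + 1;  // arbitrary sample data: 1, 2, ..., n
        }
    }
    // Copying would leave the back pointers aimed at the old object.
    MockCollection(const MockCollection &) = delete;
    MockCollection &operator=(const MockCollection &) = delete;

    // Mirrors (*ThisCollection)(index): returns a pointer to the element.
    MockElement *operator()(int index) {
        assert(index >= 0 && index < static_cast<int>(elems.size()));
        return &elems[index];
    }
};

// Self(offset) is shorthand for indexing the collection at Ident + offset.
inline MockElement *MockElement::Self(int offset) {
    return (*ThisCollection)(Ident + offset);
}
```

With the sample values 1, 2, 3, 4, calling sumFrom(1) on element 0 of this mock leaves 3 in element 0.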
The reduction function runs in each processor's object thread and invokes parallel actions on the elements. In the first step it asks all the even-indexed elements to fetch the values of the odd-indexed elements to their right and add them to their own values. Next it has all the elements whose indices are multiples of 4 add the sums from the remaining even-indexed elements, and so on. The result leaves the total in element 0.
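ElementType C::reduce1(){
    int i;
    int n = dim1size;
    ElementType E;
    // Step i: every element whose index is a multiple of 2*i adds in
    // the partial sum held i positions to its right.
    for(i = 1; i < n; i = 2*i)
        this[0:n-1:2*i].sumFrom(i);
    // Assumption: the collection's operator() returns a pointer to the
    // element, as it does for ThisCollection in sumFrom.
    E = *((*this)(0));
    return E;
}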
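
The stepping pattern of the reduction can be checked with a sequential stand-in in plain C++. This is only a sketch under stated assumptions: treeReduce is a hypothetical name, the elements are taken to be long integers, and the bounds guard is an addition that the pC++ code above does not show. Without such a guard the loop appears to rely on n being a power of two, since otherwise Ident + toRight can point past the last element.

```cpp
#include <cassert>
#include <vector>

// Sequential stand-in for reduce1() (hypothetical, not pC++).
// Each pass of the inner loop corresponds to one parallel step
//     this[0 : n-1 : 2*i].sumFrom(i)
long treeReduce(std::vector<long> values) {
    const int n = static_cast<int>(values.size());
    assert(n > 0);
    for (int i = 1; i < n; i *= 2) {
        for (int k = 0; k <= n - 1; k += 2 * i) {
            // Guard added for this sketch: skip a neighbor that would lie
            // past the end, so sizes that are not powers of two also work.
            if (k + i < n) {
                values[k] += values[k + i];
            }
        }
    }
    return values[0];  // the total ends up in element 0
}
```

For eight elements the outer loop takes i = 1, 2, 4, so the total is formed in three steps rather than seven sequential additions.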
Note that we have used a parallel subset operation to refer to the elements
of the collection whose indices are multiples of 2*i. This operator uses a
Fortran 90 vector syntax
[ base : limit : stride ]
to define the range of elements in the subset. Note also that this
operator is applied to a POINTER to the collection.
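
To see which elements a given triplet selects, the following plain C++ helper lists the indices. It is a sketch only: tripletIndices is a hypothetical name, and it assumes Fortran 90 conventions, namely an inclusive limit and a positive stride.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of pC++): lists the indices chosen by
// [ base : limit : stride ], assuming the limit is inclusive and the
// stride is positive, as in Fortran 90.
std::vector<int> tripletIndices(int base, int limit, int stride) {
    assert(stride > 0);
    std::vector<int> indices;
    for (int k = base; k <= limit; k += stride) {
        indices.push_back(k);
    }
    return indices;
}
```

With n = 8, the subsets used in reduce1 are [0:7:2], which selects 0, 2, 4, 6; [0:7:4], which selects 0 and 4; and [0:7:8], which selects only element 0.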