Next: Quick Sorting
Up: Tree Traversals
Previous: Straightforward Parallelization
As alluded to in section 3.2.2,
the overhead of passing an additional parameter
during serial execution in the bottom levels
of the method invocation tree can be eliminated
at the expense of some code duplication
by using an unaltered copy of the original
method at this stage. For example,
if we denote this copy by compLevel2_ser_b()
and all recursive method invocations in this copy
are renamed accordingly, then compLevel2_par_b()
can be expressed, for instance, as follows:
Figure 21: Improved Unbalanced Tree Traversal (instance method)
In figure 21, we
present the parallel execution time for cut-depths c=1
and c=3. The execution time for cut-depth c=1 reveals
that after this improvement, the overhead
of the parallelization method is almost
negligible. As expected, speedup is obtained
again for cut-depth c=3. In the next section we will
see that starting some additional threads
can also be useful to overcome less trivial
load imbalancing.
ajcbik@extreme.indiana.edu