Consider a linked-list of integers that is implemented using the following class List:
Under the assumption that each list is terminated with a special node SENTINEL that points to itself and with a data item larger than all elements in the list, merge-sorting can be implemented as follows (see e.g. [33]):
Obviously, listMergesort() is a parallel 2-way recursive class method and can be parallelized using the method of this paper. In this case, however, the performance can depend substantially on the implementation of the method merge(). This method must merge two sorted linked lists into one completely sorted linked list, as illustrated in figure 27. Consider, for example, the following simple implementation of merge(), in which an auxiliary node l is explicitly allocated to obtain a hook to the new list:
Figure 27: Merging two Linked Lists
Figure 28: Linked List Merge Sorting (memory allocation)
In figure 28, we show the execution time of serial and parallel merge-sorting on integer lists varying in length up to N=100,000. Rather surprisingly, parallelizing the previous implementation of merge-sorting dramatically decreases the performance. This performance decrease is due to the explicit memory allocation in merge(). If we implement merge() without the need for an auxiliary node, an example of which is shown below, then the execution time shown in figure 29 result:
Figure 29: Linked List Merge Sorting (no memory allocation)
This experiment clearly reveals that the parallelization method of this paper should only be applied to parallel multi-way recursive methods in which no explicit memory allocation is performed.