Finally, we discuss some possible future extensions of our method by means of the following simple implementation of a min-max search algorithm for the game tic-tac-toe. In this implementation, we assume that the method evalBoard() yields one of the values -1, 0, or +1 if `O' wins, the game is a draw, or `X' wins, respectively, or the value UNDECIDED otherwise:
The method minMax() can be called with an arbitrary board position and a boolean value that indicates whose turn it is (if b holds, then player `X' must place the next symbol). Clearly, this method can be easily modified to yield the best move as side-effect of the evaluation. For example, given the board positions shown in figure 30, this algorithm evaluates to the value -1 and 0, respectively, yielding the moves shown in the same figure.
Although searching different part of the state space can clearly be performed in parallel, for several reasons our method is not directly applicable. First, the different recursive method invocations are controlled by a for-loop, rather than having some fixed recursive method invocations that appear statically in the program text. Second, although the different method invocations could operate on different copies of the board, for efficiency purposes, the state space searching is implemented by doing and undoing moves on a single board. Finally, to avoid the need for additional storage, the best evaluation seen so far is updated immediately after each recursive method invocations, rather than computing the best evaluation after all recursive method invocations are done.
Fortunately, these problems can be easily dealt with. The first problem can be solved using an array of appropriate tree-workers, rather than a fixed number of scalar tree-worker. In such cases, it is more convenient to start a thread for each method invocations that can be done in parallel, rather than treating the last invocation differently. Furthermore, the second problem can be resolved by implementing by-value parameter passing for board. However, because in the previous section we have seen that explicit memory allocation may have a dramatic impact on the performance, this copying is only done in the top levels of the method invocation tree. The last problem is solved by simply distributing the for-loops around the recursive method invocations and the computation of the best evaluation, which is possibly because the intermediate result of each method invocation is stored in field result_x of the corresponding thread.
The class TreeWorker_x for the parallel version has the following form:
The previous observations give rise to the following method minMax_par_x(), where minMax_ser_x() is an unaltered copy of the original method, and copyBoard() is an auxiliary method that yields a new copy of a board:
For an empty board, for example, the serial and parallel version using cut-depth c=0 evaluate to a draw in about 3.32 and 0.93 seconds respectively, yielding a speedup of about 3.6.