Table 1: IBM

| Configuration | n | Mflops | Time (s) | Norm Res |
|---|---|---|---|---|
| JDK1.0.2 | 500 | 3.3 | 25.8 | 5.28 |
| JDK1.0.2 | 1000 | 3.2 | 210.9 | 9.61 |
| JDK1.0.2 + native Level 1 BLAS | 500 | 8.1 | 10.3 | 4.50 |
| JDK1.0.2 + native Level 1 BLAS | 1000 | 7.6 | 88.5 | 11.13 |
| JDK1.0.2 + native Level 1 BLAS (parallel factorize) | 500 | 9.7 | 8.6 | 4.50 |
| JDK1.0.2 + native Level 1 BLAS (parallel factorize) | 1000 | 15.6 | 43.0 | 11.13 |

Tables 1-3 show the results on the IBM, Sun, and SGI, respectively, for a pure Java implementation of this benchmark and for an implementation that uses native implementations of the Level 1 BLAS primitives DDOT, DAXPY, DSCAL, and IDAMAX. For the IBM, we also present the performance of a version in which loop parallelization has been applied to the elimination step of the factorization.
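Again, providing native Level 1 BLAS clearly improves performance substantially: the Mflops rate rises by a factor of roughly 2 to 2.5 on two of the platforms and by more than 20 on the third. Note, however, that the mapping between Java and C data types caused a slight change in the precision of the computed result, visible in the differing Norm Res values.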
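
For reference, the four primitives named above and the parallelized elimination step can be sketched in plain Java as follows. This is a minimal illustration, not the code that was benchmarked: the class name `Blas1Sketch`, the unit-stride signatures, the column-major `double[][]` layout, and the use of `IntStream.parallel()` (which is far newer than JDK1.0.2) are all assumptions made for the example. The `eliminate` method further assumes that pivoting and row interchanges for step `k` have already been done and that column `k` holds the negated multipliers below the diagonal.

```java
import java.util.stream.IntStream;

/** Hypothetical pure-Java versions of four Level 1 BLAS primitives (unit stride only). */
final class Blas1Sketch {
    private Blas1Sketch() {}

    /** DDOT: returns the sum of x[xOff+i] * y[yOff+i] for i in [0, n). */
    static double ddot(int n, double[] x, int xOff, double[] y, int yOff) {
        double sum = 0.0;
        for (int i = 0; i < n; i++) {
            sum += x[xOff + i] * y[yOff + i];
        }
        return sum;
    }

    /** DAXPY: y := alpha * x + y over n elements. */
    static void daxpy(int n, double alpha, double[] x, int xOff, double[] y, int yOff) {
        if (alpha == 0.0) {
            return; // nothing to add
        }
        for (int i = 0; i < n; i++) {
            y[yOff + i] += alpha * x[xOff + i];
        }
    }

    /** DSCAL: x := alpha * x over n elements. */
    static void dscal(int n, double alpha, double[] x, int xOff) {
        for (int i = 0; i < n; i++) {
            x[xOff + i] *= alpha;
        }
    }

    /** IDAMAX: 0-based index (relative to xOff) of the first element of largest magnitude, or -1 if n < 1. */
    static int idamax(int n, double[] x, int xOff) {
        if (n < 1) {
            return -1;
        }
        int best = 0;
        double max = Math.abs(x[xOff]);
        for (int i = 1; i < n; i++) {
            double v = Math.abs(x[xOff + i]);
            if (v > max) {
                max = v;
                best = i;
            }
        }
        return best;
    }

    /**
     * Elimination step k on an n-by-n matrix stored by columns (a[j] is column j).
     * Assumes a[k][k+1..n-1] already holds the negated multipliers.
     * Each column j > k is updated independently, so the loop over j can run in parallel.
     */
    static void eliminate(double[][] a, int n, int k) {
        IntStream.range(k + 1, n).parallel().forEach(j -> {
            double t = a[j][k];
            daxpy(n - k - 1, t, a[k], k + 1, a[j], k + 1);
        });
    }
}
```

The column updates in `eliminate` only read column `k` and write their own column, which is why this loop is a natural candidate for parallelization.
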

Table 2: Sun

| Configuration | n | Mflops | Time (s) | Norm Res |
|---|---|---|---|---|
| JDK1.0.2 | 500 | 8.5 | 9.9 | 5.17 |
| JDK1.0.2 | 1000 | 8.5 | 78.6 | 10.10 |
| JDK1.0.2 + native Level 1 BLAS | 500 | 18.4 | 4.6 | 5.77 |
| JDK1.0.2 + native Level 1 BLAS | 1000 | 18.6 | 35.9 | 10.44 |

Table 3: SGI

| Configuration | n | Mflops | Time (s) | Norm Res |
|---|---|---|---|---|
| JDK1.0.2 | 500 | 0.3 | 265.4 | 5.17 |
| JDK1.0.2 | 1000 | 0.3 | 2119.5 | 10.10 |
| JDK1.0.2 + native Level 1 BLAS | 500 | 7.3 | 11.6 | 5.77 |
| JDK1.0.2 + native Level 1 BLAS | 1000 | 7.0 | 95.9 | 10.44 |
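
The text does not say how the Mflops column is derived from the timings. Assuming the conventional Linpack operation count of 2n^3/3 + 2n^2 floating-point operations for solving an n-by-n system, the rate can be sketched as below. The class name `LinpackRate` and its methods are hypothetical. Under this assumption the reported rates are approximately reproduced: for example, n = 1000 with a time of 210.9 s gives about 3.17 Mflops, against a reported value of 3.2.

```java
/** Hypothetical helper: Mflops rate under the conventional Linpack operation count. */
final class LinpackRate {
    private LinpackRate() {}

    /** Nominal operation count 2n^3/3 + 2n^2 (an assumption; not stated in the text). */
    static double flops(int n) {
        double d = n;
        return 2.0 * d * d * d / 3.0 + 2.0 * d * d;
    }

    /** Millions of floating-point operations per second for a run of the given duration. */
    static double mflops(int n, double seconds) {
        return flops(n) / (seconds * 1.0e6);
    }

    public static void main(String[] args) {
        // Timings taken from the pure-Java rows of the first table.
        System.out.printf("n=500:  %.2f Mflops%n", mflops(500, 25.8));
        System.out.printf("n=1000: %.2f Mflops%n", mflops(1000, 210.9));
    }
}
```

Small differences from the tabulated values are expected, since the times in the tables are rounded to one decimal place.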