Observations
Figure 4 compares
throughput for Sun RMI, Nexus RMI and SOAP RMI. The performance was
also compared to a transfer of serialized array and linked list data
over a raw socket connection. In general SOAP RMI is approximately
ten times slower than Sun's implementation, not surprising considering
the relative sizes of data that must be sent for the same object.
Surprisingly, SOAP RMI outperforms Nexus RMI for small data sizes
(Figures 11 and 12). For
the linked list example, the cross-over point was around 10 nodes;
for double arrays this was around 100 elements. Since SOAP RMI
is consistently slower in the corresponding serialize/deserialize tests,
this implies that the effect comes from buffering implementations and
possibly from the cost of conversion to network representations.
It also implies that using SOAP is not only acceptable for small
messages such as control signals or small parameter exchanges
between remote components, but that SOAP is actually preferable
in some cases.
Sun's native Java serialization and deserialization is closely tied to
Java and, not surprisingly, turns out to be the most efficient of the
four. Nexus's serialization and deserialization protocols are also
binary, like Sun's, but the serialization is designed to be
interoperable with C/C++, which adds some overhead to its performance
relative to Sun's native Java serialization. Apache SOAP uses DOM
[10] for deserialization. It provides the ability to plug in
different encoding styles: SOAP v1.1 encoding, literal XML and XMI (XML
Metadata Interchange [21]). The nanoSOAP implementation is a
fast, simple implementation of serialization and deserialization of
Java objects into SOAP v1.0 encodings, and it uses SAX [16] for deserialization.
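To illustrate the event-driven style of SAX deserialization mentioned above, the following is a minimal sketch using the standard Java SAX API. It is not taken from nanoSOAP or Apache SOAP; the class name SaxDoubleReader and the <array> wrapper element are assumptions made for this example only. The handler collects character data and converts it to a double at each closing </double> tag, without building a document tree as a DOM parser would.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: reads every <double>...</double> element from an XML string.
final class SaxDoubleReader {
    static List<Double> parse(String xml) {
        List<Double> out = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String local, String qName, Attributes atts) {
                text.setLength(0); // start collecting text for the new element
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                text.append(ch, start, length); // may be called several times per element
            }

            @Override
            public void endElement(String uri, String local, String qName) {
                if ("double".equals(qName)) {
                    out.add(Double.parseDouble(text.toString().trim()));
                }
            }
        };
        try {
            SAXParserFactory.newInstance().newSAXParser().parse(
                new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parse("<array><double>1.5</double><double>2.0</double></array>"));
    }
}
```

Because the values are delivered as parser events, the only state kept is the current text buffer and the result list, which is the property that makes SAX attractive for a lightweight deserializer.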
Figure 6: Serialization-deserialization sizes for linked list (top) and array (bottom) on E10K.
Figure 6 shows that the size of serialized data
types in SOAP is approximately ten times larger than in Sun native
serialization. This increase in size comes from the translation of
binary data into text. For example,
in Java, each double takes 8 bytes. The string
representation in XML of a double with 16 digits of precision takes at
least 16 characters (and so 16 bytes in UTF-8) in addition to the 17
bytes for the tags <double> and
</double>. Thus, each double serialized into XML could take at
least 33 bytes. If the data is transferred using Unicode, that estimate
doubles. In any case, the overhead is
at least a factor of four larger in the XML representation of
a double array. Since SOAP uses XML for data representation, this
overhead is intrinsic to the SOAP protocol and cannot be removed by
choosing a better implementation.
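The arithmetic above can be checked with a short sketch. It compares the raw 8-byte binary form of each double with a <double>...</double> text form encoded in UTF-8. The class and method names are hypothetical, the binary side is a plain byte buffer rather than Java object serialization (which adds stream headers), and Double.toString is used for the text form, so the exact digit count may differ from the encoders measured in the paper.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical size comparison for an array of doubles.
final class DoubleSizeSketch {
    // Raw binary size: 8 bytes per double.
    static int binarySize(double[] values) {
        ByteBuffer buf = ByteBuffer.allocate(Double.BYTES * values.length);
        for (double v : values) {
            buf.putDouble(v);
        }
        return buf.position();
    }

    // Text size: 17 bytes of tags per value plus the decimal digits, in UTF-8.
    static int xmlSize(double[] values) {
        StringBuilder sb = new StringBuilder();
        for (double v : values) {
            sb.append("<double>").append(Double.toString(v)).append("</double>");
        }
        return sb.toString().getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        double[] values = {Math.PI, Math.E, 1.0 / 3.0};
        int bin = binarySize(values);
        int xml = xmlSize(values);
        System.out.println("binary bytes: " + bin);
        System.out.println("XML bytes:    " + xml);
        System.out.println("ratio:        " + (double) xml / bin);
    }
}
```

For values that need a full 16 or 17 significant digits, the ratio stays above four, matching the lower bound derived in the text; the tags alone account for more than twice the binary size.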
Serializing Java objects into SOAP-encoded XML data takes
approximately ten times more memory than the binary representation.
Figure 5 compares serialization and
deserialization throughputs for E10K. Sun's native
serialization-deserialization is the fastest. Nexus's performance is
comparable to Sun's. The SOAP-based implementations (Apache SOAP and
nanoSOAP) serialize and deserialize approximately 100 times more
slowly, and their throughputs are correspondingly about 100
times lower.
Some additional observations can be made from the experiments:
- The most significant defect of using SOAP for RMI is
performance; just sending the 8-byte double in XML, <double>
3.141592653589793E+000 </double>, requires 40 bytes of data.
Determining the precise performance penalty is important for
deciding when SOAP is appropriate. Figure 6
shows that SOAP's data representation size in general is about 10
times the size of binary representations.
- The asymptotic behavior of raw sockets and Sun RMI (especially
for the local-remote test) is similar
(Figure 11). This indicates that Sun RMI
imposes only a minimal overhead for marshalling and unmarshalling of
data.
- The nonmonotone behavior of Nexus RMI (see
Figures 11 or 12)
appears consistently and is statistically significant. Since it
does not appear in the corresponding serialization tests, this
implies it is an artifact of how Nexus RMI handles buffers or its
on-wire data representation.
- There is a clear difference (a factor of 2-10) in performance
between JIT-compiled and interpreted code for Linux
(Figure 9). Serialization in particular has
small tight loops that are readily amenable to compilation
optimizations.
The Gantt diagrams from Test C in Figure 7 show that the
costs of actual communication and data copying are considerably
smaller than time spent on serialization and deserialization of XML
encoded messages. Serialization and deserialization need to be the
primary targets for improving performance of SOAP-based protocols.
The graph data suggests at least a thirty percent improvement is
possible if serialization and deserialization were pipelined.
In more detail, the time taken for the roundtrip linked list example
may be divided into the following segments:
- Serialization converts an object into its persistent state; SOAP
uses XML as its serialization format. Java RMI serialization is
approximately 100 times faster than nanoSOAP.
- Deserialization converts objects from their persistent state to
their representation in memory. Deserialization in SOAP involves
parsing the XML representation of an object and instantiating the
object using reflection. In Java RMI deserialization the class
structure of the object being deserialized is already known. On the
other hand, in SOAP deserialization the class structure is learned
as the XML is parsed. This, coupled with the already large size of
the XML representation of the serialized object, makes SOAP
deserialization considerably less efficient. The figures in
Appendices C and D show this dichotomy consistently.
- Buffer copying: Ideally, any distributed object protocol should
ensure that there is no copying of buffers from the network layer to
the runtime system (i.e., zero-copy protocol). However, for
transporting SOAP over HTTP, the serializer output needs to be
copied into a buffer before sending it on the wire if the length of
the stream is to be sent as an HTTP header. The cost of copying is
small and could be eliminated if the content-length header is not
sent.
- Network: RMI involves sending serialized representations of
objects over a network. The time taken for this is directly
proportional to the size of the serialized representation for the
low-latency networks and large objects used in the testing.
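The buffer-copying step described above can be sketched as follows. This is an illustration only, not code from any of the measured implementations: the class name, the /rpc request path and the header set are assumptions. The point is that the complete body must exist in memory before the Content-Length value can be written, which forces one extra copy of the serialized message.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical framing of a serialized SOAP body as an HTTP request.
final class ContentLengthSketch {
    static byte[] frame(String soapBody) {
        // The serializer output is first materialized in full (the extra copy),
        // because its byte length is needed for the header below.
        byte[] body = soapBody.getBytes(StandardCharsets.UTF_8);

        String headers =
            "POST /rpc HTTP/1.0\r\n"                      // assumed request path
            + "Content-Type: text/xml; charset=utf-8\r\n"
            + "Content-Length: " + body.length + "\r\n"
            + "\r\n";
        byte[] head = headers.getBytes(StandardCharsets.US_ASCII);

        // Headers go first, then the buffered body.
        ByteArrayOutputStream wire = new ByteArrayOutputStream(head.length + body.length);
        wire.write(head, 0, head.length);
        wire.write(body, 0, body.length);
        return wire.toByteArray();
    }

    public static void main(String[] args) {
        byte[] message = frame("<double>3.141592653589793</double>");
        System.out.print(new String(message, StandardCharsets.UTF_8));
    }
}
```

If the length header is omitted, the serializer could write directly to the socket stream and the intermediate body buffer would no longer be needed.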
SoapTeam: Extreme Computing Lab
2000-08-17