
Introduction

Distributed component systems [1,23,26,27] for scientific and engineering computing can potentially provide the same benefits that components do for business and financial computing: seamless access to remote resources, plug-and-play software composition without recompilation, access to specialized non-compute resources like data warehousing and visualization, and improved software reuse. Component systems can also provide a natural milieu for large multidisciplinary research teams with pools of expertise distributed across the country. However, scientific computing presents challenges not typically found in commercial applications. In particular, the components encapsulate parallel programs that exchange large, complex, and rapidly changing data objects.

Because of this, the underlying communications substrate plays a more critical role than it does for commercial computing. The communication system must be capable of high-performance messaging: efficiently using specialized high-speed networks like Abilene, MREN, or ESNET when possible, supporting modern research protocol features like quality of service and source routing, and handling parallel-to-parallel component communications. Modern communication systems are based on remote method invocation (RMI) protocols that allow an object in one address space to invoke the methods of an object in another address space. The desiderata for an effective RMI system include reliability, robustness, human accessibility, readily usable APIs in modern computer languages, interoperability across languages, platforms, and component frameworks, and integration with the emerging Grid infrastructure [18] and with standards like COM, the OMG CORBA Component Model, and the DOE Common Component Architecture [1] specification.

Unfortunately, no single RMI system provides all of those features, and designing a "universal" RMI is difficult. The format of the data and the protocol used to exchange it largely determine the degree of interoperability among applications. The lack of a reliable, universally understood data-exchange format has long limited effective communication between heterogeneous systems. In recent months, XML [11] has emerged as a standard for representing data in a platform-independent way. XML is essentially a tree-oriented data representation language that is simple to generate and parse. HTTP, likewise, has emerged as a simple, universally supported protocol for exchanging data over the Internet. HTTP requests and replies pass readily through firewalls and can be handled securely, unlike the notoriously unsafe execution of arbitrary remote procedure call (RPC) or RMI code. Thus, moving XML data via HTTP is an attractive way for distributed applications to communicate with each other. SOAP does precisely that. Because it expresses RPCs independently of platform, it opens the possibility of implementing other architecture-specific protocols in SOAP. In particular, this makes SOAP attractive as an intermediary protocol into which other protocols can be easily translated, which is one reason why Microsoft has targeted SOAP as an entry mechanism to the COM world [3].
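To make the idea of an RPC expressed as XML concrete, the following is a minimal sketch of how a call could be wrapped in a SOAP 1.1 envelope and unpacked again on the receiving side. It is an illustration only, not the implementation evaluated in this paper: the method name `getResidual`, the service namespace `urn:example:solver`, and the argument names are hypothetical, and no HTTP transfer is performed.

```python
import xml.etree.ElementTree as ET

# Namespace of the SOAP 1.1 envelope.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_call(method, service_ns, params):
    """Encode an RPC as XML: the method and each argument become elements."""
    ET.register_namespace("SOAP-ENV", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{service_ns}}}{method}")
    for name, value in params.items():
        arg = ET.SubElement(call, name)
        arg.text = str(value)  # every value travels as characters
    return ET.tostring(envelope, encoding="unicode")


def parse_soap_call(xml_text):
    """Recover the method name and string-valued arguments from an envelope."""
    root = ET.fromstring(xml_text)
    body = root.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]  # first child of Body is the call element
    method = call.tag.split("}", 1)[1]  # strip the "{namespace}" prefix
    args = {child.tag: child.text for child in call}
    return method, args


# Hypothetical call: names and namespace are assumptions for illustration.
request_xml = build_soap_call(
    "getResidual", "urn:example:solver", {"step": 12, "tolerance": 1e-6}
)
method, args = parse_soap_call(request_xml)
print(request_xml)
print(method, args)
```

In an actual exchange the envelope would be the body of an HTTP POST with a `text/xml` content type. Note that the receiver gets every argument back as a string and must convert it to a native type itself.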

The same feature that makes SOAP attractive causes a potential performance problem: tagged data is sent as characters. By contrast, Nexus [17] is designed for high-performance communications and, when layered with HPC++ [19], presents to the user a complete RMI system that can interoperate with Java [24]. However, Nexus has robustness problems, and the failure of a global pointer can lead to complete deadlock of the distributed computation. Java RMI tends to be more robust, partly because of Java's exception and error-handling system. Java RMI is, however, a single-language protocol, and scientific computing relies on languages like Fortran 90, C/C++, and Matlab as well.
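The cost of sending tagged data as characters can be illustrated with a small size comparison. The sketch below is not a measurement from this paper: it encodes an arbitrary array of 1000 doubles once as tagged text and once as packed 8-byte binary values. The tag name `item` and the test data are assumptions chosen for illustration.

```python
import struct


def encode_tagged(values, tag="item"):
    """Character encoding: each double is written as text between tags."""
    return "".join(f"<{tag}>{v!r}</{tag}>" for v in values).encode("ascii")


def encode_binary(values):
    """Binary encoding: each double occupies exactly 8 bytes (little-endian)."""
    return struct.pack(f"<{len(values)}d", *values)


values = [i / 7.0 for i in range(1000)]  # arbitrary test data
text_bytes = encode_tagged(values)
binary_bytes = encode_binary(values)

print("tagged text:", len(text_bytes), "bytes")
print("binary:     ", len(binary_bytes), "bytes")
print("ratio:      ", len(text_bytes) / len(binary_bytes))
```

Besides the larger message, the text form also pays for converting every number to and from characters, a cost the binary form avoids.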

This paper addresses the following questions:

This paper addresses these questions and shows that SOAP can be used to build a reliable, multi-protocol RMI system that can access desktop component technology like Microsoft COM and other non-Java software components. However, when additional performance is needed, a multi-protocol approach allows a faster, more specialized protocol to be dynamically inserted to move data. Several efforts have been started to add security and higher performance to SOAP [2,7], but at the cost of reducing its simplicity and universality. Many modern scientific computing systems now involve multiple languages, using the strengths of each where appropriate: Java for GUIs, Fortran for fast complex arithmetic, C/C++ for operating system interactions, Matlab for rapid prototyping, etc. Analogously, our work suggests that rather than trying to extend a single communications subsystem to handle the wide range of scientific computing requirements, the more effective solution is to use multiple communication protocols.
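One way to picture the multi-protocol idea is as a dispatcher that defaults to a portable text protocol and switches to a faster binary one only when the peer supports it and the payload is large. The sketch below is purely illustrative and does not describe the system presented in this paper: the protocol names, the size threshold, and the `peer_supports` set are all hypothetical.

```python
import struct


def encode_text(values):
    """Portable fallback: values sent as tagged characters."""
    return "".join(f"<v>{v!r}</v>" for v in values).encode("ascii")


def encode_binary(values):
    """Specialized fast path: packed 8-byte doubles."""
    return struct.pack(f"<{len(values)}d", *values)


# Hypothetical protocol table; a real system would register transports here.
PROTOCOLS = {"text": encode_text, "binary": encode_binary}


def choose_protocol(n_values, peer_supports, threshold=1024):
    """Use the binary protocol only if the peer offers it and the data is large.

    The threshold is an assumed tuning parameter, not a value from the paper.
    """
    if "binary" in peer_supports and n_values >= threshold:
        return "binary"
    return "text"


def send(values, peer_supports):
    """Return the chosen protocol name and the encoded message."""
    name = choose_protocol(len(values), peer_supports)
    return name, PROTOCOLS[name](values)


small = send([1.0, 2.0], {"text", "binary"})
large = send([0.0] * 2000, {"text", "binary"})
legacy = send([0.0] * 2000, {"text"})
print(small[0], large[0], legacy[0])
```

Here small control messages and peers that only understand the universal protocol stay on the text path, while bulk data moves over the specialized one.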


SoapTeam: Extreme Computing Lab
2000-08-17