

Survey of existing systems

There are a number of event systems available. We briefly describe some of them and highlight the features that are necessary and those that are lacking in the existing systems.

CORBA Events

A CORBA event channel is an intervening object that allows multiple suppliers and consumers to communicate asynchronously. An event channel is both a consumer and a supplier of events. Event channels are standard CORBA objects, and communication with an event channel is accomplished using standard CORBA requests. References to the event channel may be obtained from a Naming Service or through a task-specific object protocol. CORBA supports a Naming Service to locate listeners and offers both a pull and a push model. Event types are described using an Interface Definition Language (IDL), and the OMG specification describes mechanisms for load balancing and recovery. The quality of service depends on the implementation of the event channel, and the reliability provided may vary from "at most once" delivery to a "best effort" policy. The implementation also determines the order in which the events are received.
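The push and pull models can be illustrated with a small in-process sketch. This is not the CORBA API: the class `EventChannel`, its method names, and the helper `try_pull` are hypothetical names chosen for illustration, and a real channel would be a remote CORBA object reached through standard requests.

```python
from collections import deque


class EventChannel:
    """Toy in-process stand-in for an event channel (not the CORBA API)."""

    def __init__(self):
        self._push_consumers = []   # callbacks invoked for every event
        self._pull_queues = []      # one buffer per pull consumer

    def connect_push_consumer(self, callback):
        self._push_consumers.append(callback)

    def connect_pull_consumer(self):
        buffer = deque()
        self._pull_queues.append(buffer)
        return buffer

    def push(self, event):
        """Called by a supplier: the channel consumes the event, then supplies it."""
        for callback in self._push_consumers:
            callback(event)          # push model: the channel calls the consumer
        for buffer in self._pull_queues:
            buffer.append(event)     # pull model: held until the consumer asks


def try_pull(buffer):
    """Non-blocking pull: return (event, True), or (None, False) if empty."""
    if buffer:
        return buffer.popleft(), True
    return None, False


channel = EventChannel()
received = []
channel.connect_push_consumer(received.append)
pending = channel.connect_pull_consumer()
channel.push("job-started")
channel.push("job-finished")
```

Here the push consumer has already seen both events in `received`, while the pull consumer fetches them at its own pace by calling `try_pull(pending)`.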

Jini Distributed Events

Distributed events in the Jini framework allow an object in one virtual machine (VM) to register interest in the occurrence of an event in another object, possibly running in another VM, and to receive notification when such an event happens. The Jini Event System uses the Jini Lookup Service for naming, which can optionally be used with the Java Naming and Directory Interface (JNDI). An event is a Java object that can be subtyped for extensibility. The listener interface is simple and aids in the use of a flexible publisher model. Jini supports leasing and uses Java-RMI as the communication substrate. It leverages the built-in security of Java. However, Jini events are designed to work only in the Java environment and are not equipped to work with firewalls and Network Address Translation (NAT). Third-party objects can handle the distribution of events using store-and-forward mechanisms, notification filters and notification mailboxes. The protocol does not specify the reliability and timeliness of notifications; these are left to the implementation of the various objects.

In designing a distributed event system, it is desirable to have a solution that is independent of the programming language, since the Grid contains heterogeneous applications written in various languages. This means that the representation of the data on the wire also needs to maximize portability. It is widely accepted that XML is an ideal choice for a platform- and language-independent representation of data and, by extension, that SOAP is preferred for invoking remote procedure calls (RPCs). The Web services community has also adopted this approach.

Java Message Service

The Java Message Service (JMS) is a Java API that allows applications to create, send, receive and read messages. It defines a common set of interfaces and associated semantics that allow Java programs to communicate with other messaging implementations. JMS provides a loosely coupled architecture that supports asynchronous communication and guarantees reliable delivery of messages. The specification provides for both point-to-point messaging using queues and the publisher/subscriber approach using topics as intermediaries. Messages can be consumed both synchronously ("pull") and asynchronously ("push"). It also has message filtering capabilities in the form of message selectors based on a subset of the SQL92 conditional expression syntax.

However, JMS does not specify how events are represented on the wire. As a result, interoperability between different JMS providers is not ensured and the primary intention is to provide a standard messaging API.
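The difference between the two JMS messaging models, and the role of a selector, can be sketched as follows. This is a Python illustration rather than the JMS API: `Topic` and `PointToPointQueue` are hypothetical classes, and the selector is an ordinary Python predicate over message headers standing in for an SQL92-style selector string.

```python
from collections import deque


class Topic:
    """Publish/subscribe: every matching subscriber receives each message."""

    def __init__(self):
        self._subscribers = []  # (callback, selector) pairs

    def subscribe(self, callback, selector=lambda headers: True):
        self._subscribers.append((callback, selector))

    def publish(self, headers, body):
        for callback, selector in self._subscribers:
            if selector(headers):      # filter on headers, not on the body
                callback(body)         # asynchronous-style ("push") delivery


class PointToPointQueue:
    """Point-to-point: each message is consumed by exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, body):
        self._messages.append(body)

    def receive(self):
        """Synchronous ("pull") receive; returns None when the queue is empty."""
        return self._messages.popleft() if self._messages else None


topic = Topic()
all_messages, urgent = [], []
topic.subscribe(all_messages.append)
topic.subscribe(urgent.append, selector=lambda h: h.get("priority", 0) > 5)
topic.publish({"priority": 9}, "disk full")
topic.publish({"priority": 1}, "heartbeat")

work = PointToPointQueue()
work.send("task-1")
first, second = work.receive(), work.receive()
```

The second subscriber sees only the high-priority message, and the queue hands "task-1" to the first receiver only.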

ECho Event Delivery System

ECho is an event delivery middleware system. It is designed as an anonymous group communication mechanism. It has an efficient event propagation middleware, the event channel, which achieves good performance characteristics due to the use of a binary protocol for event transmission. ECho supports the publish/subscribe model of communication and can interoperate with CORBA and Java-based components.

In spite of these features, interoperability is lacking due to the use of a custom binary protocol which other messaging systems will have to use in order to be compatible with it.

Grid Monitoring Service Architecture

The Grid Monitoring Architecture (GMA) is a working group of the Global Grid Forum, and it is focused on producing a high-level architecture of the components and interfaces needed to promote interoperability between heterogeneous monitoring systems on the Grid. Events are defined to be timestamped sets of information with a type. The modes of interaction between producers and consumers under consideration are "publish/subscribe", "request/response" and "notification". Neither the wire protocol used to transfer events nor any delivery guarantee is specified. A Grid Information Service (GIS) is used to maintain an event dictionary. In addition, producers register themselves with the GIS, and consumers can search for producers registered with it. The Grid Forum is in the process of adopting LDAP as its standard for information services.

Nonetheless, the GMA is interested in events for performance analysis and problem monitoring, and its requirements differ from those of application-level events.

SoapRMI Events

SoapRMI Events is our previous version of an application-level messaging system for distributed environments. SOAP is used as the format for specifying the events and invoking the RPCs. An event is uniquely identified by its namespace and type. A source field indicates the originating point of the event. A timestamp is also present, along with a message field for additional information. A handback field can be provided by the listener to the publisher, and it is set when events are returned to that listener.
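The event fields listed above might be modeled and serialized along the following lines. This is a sketch under assumptions, not the SoapRMI wire format: the class `Event`, the function `to_soap`, the child element names and the sample values are all hypothetical; only the SOAP 1.1 envelope namespace is standard.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


@dataclass
class Event:
    namespace: str       # together with event_type, identifies the event
    event_type: str
    source: str          # originating point of the event
    timestamp: str
    message: str = ""    # additional free-form information
    handback: str = ""   # supplied by the listener, echoed back on delivery


def to_soap(event: Event) -> str:
    """Serialize an event into a SOAP envelope (hypothetical element layout)."""
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The event's namespace and type become the qualified element name.
    node = ET.SubElement(body, f"{{{event.namespace}}}{event.event_type}")
    for name in ("source", "timestamp", "message", "handback"):
        ET.SubElement(node, name).text = getattr(event, name)
    return ET.tostring(envelope, encoding="unicode")


sample = Event("urn:example:events", "JobStatus", "host-a",
               "2002-09-20T12:00:00Z", "job 42 finished", "token-7")
xml_text = to_soap(sample)
```

Because the result is an ordinary string, a lightweight publisher could precompute such text and write it directly to a socket.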

An event publisher and an event listener, which produce and consume events respectively, are defined. An event publisher can generate events and publish them simply by writing a preformatted SOAP string to the socket layer. This gives it a light footprint and makes it useful for embedded systems and resource monitors, in addition to regular grid applications.

One of the problems with the publisher is the handling of exceptions when an event cannot be sent because of a network glitch or because the listener is temporarily unreachable. An exception is thrown by the publisher and the application has to handle it. This places an additional burden on the application programmer to handle failure cases. In addition, the event send is a blocking call, so that failures can be reported synchronously, and this overhead becomes significant when many events are sent. We have overcome these limitations by using a non-blocking send in a separate thread of execution, coupled with a logging mechanism that allows events to be resent in case of an error. We use an intermediary known as a publisher agent for this purpose.
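A minimal sketch of the publisher agent idea follows. It assumes a caller-supplied `send` function that raises `OSError` on failure, and it keeps undelivered events in an in-memory list where the real system uses a log; the class name `PublisherAgent`, its methods and the `max_attempts` parameter are hypothetical.

```python
import queue
import threading


class PublisherAgent:
    """Queues events and delivers them from a worker thread, retrying failures."""

    def __init__(self, send, max_attempts=3):
        self._send = send                   # callable that may raise OSError
        self._max_attempts = max_attempts
        self._outbox = queue.Queue()
        self.failed = []                    # events given up on, kept for a later resend
        threading.Thread(target=self._run, daemon=True).start()

    def publish(self, event):
        """Return immediately; delivery happens on the worker thread."""
        self._outbox.put(event)

    def _run(self):
        while True:
            event = self._outbox.get()
            for _ in range(self._max_attempts):
                try:
                    self._send(event)
                    break                   # delivered, stop retrying
                except OSError:
                    continue                # transient failure, try again
            else:
                self.failed.append(event)   # every attempt failed
            self._outbox.task_done()

    def flush(self):
        """Block until each queued event has been sent or recorded as failed."""
        self._outbox.join()


delivered = []
calls = {"n": 0}


def flaky_send(event):
    calls["n"] += 1
    if calls["n"] == 1:                     # simulate one network glitch
        raise OSError("listener temporarily unreachable")
    delivered.append(event)


agent = PublisherAgent(flaky_send)
agent.publish("event-1")
agent.publish("event-2")
agent.flush()
```

The application never sees the simulated glitch: `publish` returns at once and the agent retries the first event on its own thread.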

Event retrieval in SoapRMI Events is based on leasing. An event listener subscribes to a publisher with its remote reference, the lease duration and the type of events it is interested in. When the publisher has an event of a particular type to publish, it uses the remote references of the listeners registered for that type to send the event to each of them. Leasing allows the publisher to reclaim resources if a listener fails to unsubscribe. An automatic lease renewal system, which keeps the lease alive for as long as the application exists, is also available.
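The bookkeeping behind leasing can be sketched as a table of expiry times. The class `LeaseTable`, its method names and the injectable `clock` parameter are assumptions made for illustration; the listener URLs are placeholders.

```python
import time


class LeaseTable:
    """Tracks listener subscriptions that expire unless renewed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._leases = {}   # (listener_ref, event_type) -> expiry time

    def subscribe(self, listener_ref, event_type, duration):
        self._leases[(listener_ref, event_type)] = self._clock() + duration

    renew = subscribe       # renewing simply sets a new expiry time

    def unsubscribe(self, listener_ref, event_type):
        self._leases.pop((listener_ref, event_type), None)

    def listeners_for(self, event_type):
        """Drop expired leases, then return the live listeners for a type."""
        now = self._clock()
        self._leases = {k: t for k, t in self._leases.items() if t > now}
        return sorted(ref for ref, etype in self._leases if etype == event_type)


now = [0.0]                                   # fake clock for the example
table = LeaseTable(clock=lambda: now[0])
table.subscribe("http://host-a/listener", "JobStatus", 10)
table.subscribe("http://host-b/listener", "JobStatus", 60)
now[0] = 30.0                                 # host-a never renewed its lease
live = table.listeners_for("JobStatus")
```

After thirty seconds only the listener on host-b remains, so the resources held for the silent listener are reclaimed without an explicit unsubscribe.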

This approach is elegant and works well on a small LAN, but it does not scale effectively to a large-scale Internet environment. The problems are due to two factors: (1) not all clients have static or globally accessible IP addresses and (2) network and server outages are fairly common in a large network.

Every listener's remote reference is a URL that includes the IP address of the host on which it is running. These are typically dynamic IP addresses and, in combination with firewalls, are inaccessible from outside the local network. A listener with such a remote reference therefore cannot have events pushed to it when it subscribes to a publisher outside the local network. This is overcome by allowing an alternative way of retrieving events: "pulling" them from the publisher. That is, the listener initiates the retrieval by contacting the publisher, instead of the publisher having to use the remote reference of the listener.

The latter problem relates to robustness in the event of a network failure. If a listener's lease expires during a network outage, the publisher stops retaining events for that listener, since the lease has not been renewed. When the resubscription finally gets through, the listener has lost the events that occurred between the loss of the connection and the resubscription. One way to manage this is for the publisher to save the events in a persistent store so that they are retained. In this case, the listener can use the timestamps of the events to select the ones to retrieve. However, it is still possible to retrieve duplicate events or to miss some of them because of the nature of the timestamp. This problem is overcome by providing a mechanism for unlimited, or persistent, subscriptions. The listener is granted the persistent lease by an agent, while the agent takes care of maintaining a valid lease, of limited duration, with the channel. In addition, each event retrieval is assigned a permanent ID that encapsulates the state information, so that using that ID always returns the same set of events from the channel, along with the ID to be used for the next set of events. So, even if a lease is lost, retrieval can resume from where it left off.
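Retrieval by permanent ID can be sketched as follows. The assumptions here are that the ID is simply the offset of the first event in a batch and that the channel fixes each batch boundary the first time the ID is used; the class `EventLog`, its methods and `batch_size` are hypothetical, and the real system may encode its state differently.

```python
class EventLog:
    """Append-only event store read through stable retrieval IDs."""

    def __init__(self, batch_size=2):
        self._events = []
        self._batch_size = batch_size
        self._ends = {}   # retrieval ID (start offset) -> fixed end offset

    def append(self, event):
        self._events.append(event)

    def retrieve(self, retrieval_id=0):
        """Return (events, next_id); repeating an ID repeats the same batch."""
        if retrieval_id not in self._ends:
            end = min(retrieval_id + self._batch_size, len(self._events))
            if end == retrieval_id:
                return [], retrieval_id     # nothing new yet; do not fix an empty batch
            self._ends[retrieval_id] = end  # freeze the batch boundary
        end = self._ends[retrieval_id]
        return self._events[retrieval_id:end], end


log = EventLog()
log.append("a")
first_batch, next_id = log.retrieve(0)      # only "a" exists so far
log.append("b")
log.append("c")
replayed, _ = log.retrieve(0)               # same ID, same events as before
second_batch, after = log.retrieve(next_id)
```

Unlike timestamp-based selection, a listener that reconnects with its last ID neither misses events nor receives duplicates, because each ID names a fixed batch and its successor.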

Aleksander Andrzej Slominski 2002-09-20