LDAP can be configured to use strong security mechanisms.
2.2) A Schema for Software Components
The construction of an LDAP database involves two major design decisions. First,
the object classes from which entries are created must be defined; this is sometimes
referred to as the schema of the database. Second, guidelines
regarding the layout of entries within the directory tree must be established.
Note that the Information Server is designed to supplement the
Globus Metacomputing Directory Service (MDS). It therefore
makes use of the MDS's collection of object classes that pertain to machines,
networks, and other physical resources.
Given the application-centric nature of the CAT, a set of similar
object classes for software components is also desirable.
Information about software components can be divided into two broad categories:
environment-dependent information and environment-independent information.
The former category includes information that could vary between different
installations of the same component. The installation path, for example, could
differ between hosts.
The latter category includes information that does not vary between different hosts.
This includes such attributes as the author's name and the component version.
Data for the two categories are stored in separate subtrees of the
directory tree. This separation is advantageous for several reasons.
First, it greatly reduces the amount of redundant information in the database;
it is clearly not necessary to record the author's name 100 times if her
component is installed on
100 different machines.
Second, by describing the core functionality of a component with a single,
environment-independent entry, the user is assured that all instances
of that component will behave the same, regardless of host or architecture.
The user is therefore able to select a group of components before
she chooses a set of machines. Indeed, the machines could
be chosen automatically by a resource manager.
This approach comports with the problem-centric philosophy of the CAT.
Note also that the effect of having one large entry
that contains both environment-independent and environment-dependent data
is achieved by inserting distinguished name links between the subtrees. For example, each
component maintains a list of distinguished names that correspond to
the machines on which it is installed. The delay incurred by the
extra query needed to follow such a link is typically imperceptible to the end user.
Figure 3: LDAP Directory Structure
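The distinguished name links described above can be illustrated with a minimal sketch. It models the directory as an in-memory Python dictionary rather than a real LDAP server. The DN layout (ou=components, ou=machines, o=cat), the hn attribute, and all values are assumptions made for illustration; only the un, machine, author, and version attribute names come from the schema in this section.

```python
# Toy directory: maps a distinguished name (DN) to its attributes.
# The DN layout, the "hn" attribute and all values are hypothetical.
DIRECTORY = {
    # Environment-independent subtree: one entry per component.
    "un=MatrixReader, ou=components, o=cat": {
        "un": ["MatrixReader"],
        "author": ["A. Developer"],
        "version": ["1.0"],
        # DN links into the machine subtree, one per installation.
        "machine": [
            "hn=alpha, ou=machines, o=cat",
            "hn=beta, ou=machines, o=cat",
        ],
    },
    # Machine subtree: one entry per host.
    "hn=alpha, ou=machines, o=cat": {"hn": ["alpha"]},
    "hn=beta, ou=machines, o=cat": {"hn": ["beta"]},
}


def lookup(dn):
    """Stand-in for a single base-object LDAP query."""
    return DIRECTORY[dn]


def machines_for(component_dn):
    """Read the component entry, then follow each DN link with an extra query."""
    component = lookup(component_dn)
    return [lookup(link)["hn"][0] for link in component.get("machine", [])]


hosts = machines_for("un=MatrixReader, ou=components, o=cat")
```

The author's name is stored once, in the component entry, while each installation is reached through one additional lookup per link.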
2.2.1) Environment-Independent Information
This category contains information that is constant for every installation
of the component. As such,
every component has one (and only one) environment-independent entry.
In general, a component entry will contain the following attributes:
- un: A name that uniquely identifies the component
- machine: A list of machines on which the component is installed
- format: A generic description of the executable type: native binary,
java bytecode, perl script, etc.
- platform: The operating system or environment in which the component executes:
Solaris2.6, Java1.1.x, Perl5.001, etc.
- description: A human-readable description of what the component does. For example,
"This component reads in a matrix from a file system and places it into memory."
- readme: A human-readable description of any other component-specific information
such as bugs and hardware requirements.
- version: The version number of the component.
- author: The author's name; alternatively, a distinguished name that points
to an entry that describes that author.
- inputtype: A list of input types that this component accepts. In the CAT model,
these correspond to input ports [see below].
- outputtype: A list of output types that this component produces. In the CAT model,
these correspond to output ports.
The CAT component model delegates I/O to objects called ports. Ports are
intimately related to component compatibility, so it is crucial that port
entries be included in the database.
Port entries are located immediately beneath their associated component entry in the
directory tree. They contain the following attributes:
- pn: The name of the port. Port names must be unique for a single component,
but need not be unique between components. Thus, every component could have a
port called Input_Port1. Note that the name "Input_Port1" is serving as an
example, and component
designers would generally give their ports more descriptive names.
- dataFlowDirection: A string that describes whether the port handles input
or output or both.
- tn: The data type associated with the port.
A port entry can convey further information about its data type by using one of
two optional attributes:
- javaByteCode: Java byte-code for the port data type. Using Java's dynamic class
loading mechanism and reflection capabilities, the CAT can check for potential
incompatibilities between ports.
- idlDefinition: The definition of the data type in the CORBA Interface Definition
Language.
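As a rough illustration of how port entries relate to component compatibility, the following sketch models a port with the three required attributes (pn, dataFlowDirection, tn). It is a simplification under stated assumptions: the CAT is described as checking compatibility through Java class loading and reflection, whereas this sketch substitutes a plain comparison of type names. The direction strings "input", "output", and "both", the DN layout, and the example port and type names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    pn: str                 # port name, unique within one component
    dataFlowDirection: str  # assumed values: "input", "output" or "both"
    tn: str                 # name of the data type carried by the port


def port_dn(component_dn, port):
    """Port entries sit immediately beneath their component entry."""
    return f"pn={port.pn}, {component_dn}"


def can_connect(source, sink):
    """Simplified check: directions must agree and type names must match."""
    produces = source.dataFlowDirection in ("output", "both")
    accepts = sink.dataFlowDirection in ("input", "both")
    return produces and accepts and source.tn == sink.tn


matrix_out = Port("Matrix_Out", "output", "DenseMatrix")
matrix_in = Port("Matrix_In", "input", "DenseMatrix")
vector_in = Port("Vector_In", "input", "Vector")
```

Because the port name is only the leading part of the DN, two components may each own a port with the same pn without any collision in the directory tree.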
2.2.2) Environment-Dependent Information
This information can vary between hosts, and each component has an
environment-dependent entry for every installation. These "component installation"
entries are stored immediately beneath their corresponding machine entry in the
directory tree, and they contain the following attributes:
- un: The unique name of the component. This is the same unique name that is
stored in the environment-independent subtree.
- protocol: A list of procreator protocols that can be used to instantiate this
component: gram, rsh, etc.
- readme: Other human-readable, machine-specific information.
The CAT model allows for component instantiation by any number of procreation
protocols. Currently, the CAT supports remote shell (rsh) procreation and Globus
GRAM procreation. This information is described in "component protocol" entries,
which are located beneath their corresponding component installation entry
in the directory tree. They contain the following attributes:
- protocol: A protocol by which the given component can be instantiated on the
given machine. Each component protocol entry describes only one component protocol.
- url: A list of procreator URLs that understand this protocol.
- path: The executable path for the component that corresponds to this protocol.
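The nesting of machine, component installation, and component protocol entries can be sketched as follows. This is an in-memory illustration, not the Information Server's actual code: the DN layout, the hn attribute, the procreator URLs, and the executable paths are all hypothetical, while the un, protocol, url, and path attribute names are taken from the lists above.

```python
MACHINE_DN = "hn=alpha, ou=machines, o=cat"  # hypothetical machine entry

ENTRIES = {
    # Component installation entry, immediately beneath the machine entry.
    f"un=MatrixReader, {MACHINE_DN}": {
        "un": ["MatrixReader"],
        "protocol": ["gram", "rsh"],
    },
    # One component protocol entry per supported protocol.
    f"protocol=gram, un=MatrixReader, {MACHINE_DN}": {
        "protocol": ["gram"],
        "url": ["gram://alpha.example.org/procreator"],
        "path": ["/opt/cat/bin/matrix_reader"],
    },
    f"protocol=rsh, un=MatrixReader, {MACHINE_DN}": {
        "protocol": ["rsh"],
        "url": ["rsh://alpha.example.org/procreator"],
        "path": ["/usr/local/cat/matrix_reader"],
    },
}


def launch_info(machine_dn, un, preferred_protocols):
    """Return (protocol, url, path) for the first usable protocol, else None."""
    install_dn = f"un={un}, {machine_dn}"
    installation = ENTRIES[install_dn]
    for protocol in preferred_protocols:
        if protocol in installation["protocol"]:
            entry = ENTRIES[f"protocol={protocol}, {install_dn}"]
            return protocol, entry["url"][0], entry["path"][0]
    return None


choice = launch_info(MACHINE_DN, "MatrixReader", ["ssh", "rsh"])
```

Keeping the path inside the protocol entry allows the same installation to be started from a different executable location depending on the protocol used.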
2.3) Information Server Implementation
Although the Information Server should be viewed schematically as one large
LDAP database, it is actually
the synthesis of two smaller LDAP databases. Information about machines, networks,
and hardware devices is stored in the Globus MDS, while information about software
components is stored in a local LDAP database. This unnatural separation was
necessary because of the difficulty of grafting the aforementioned software
component schema onto the pre-existing MDS schema for hardware entries.
Fortunately, the distributed nature of the IS is made transparent to database clients
by way of the LDAP referral mechanism, which allows an entry to contain a reference
to another entry, possibly in a completely separate database. Referral-aware clients
may query the distributed LDAP database as if it were a single database.
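The behaviour of a referral-aware client can be sketched with two in-memory servers. This is an illustrative model only: the host names, DNs, and attribute values are hypothetical, and the use of a ref attribute holding an LDAP URL is an assumption about how the reference is stored, not a description of the actual IS configuration.

```python
from urllib.parse import urlsplit, unquote

# Two toy servers, each mapping a DN to its attributes (all values hypothetical).
SERVERS = {
    "components.example.org": {
        "un=MatrixReader,ou=components,o=cat": {"un": ["MatrixReader"]},
        # Placeholder entry that refers the client to the MDS.
        "hn=alpha,ou=machines,o=cat": {
            "ref": ["ldap://mds.example.org/hn=alpha,o=Globus"],
        },
    },
    "mds.example.org": {
        "hn=alpha,o=Globus": {"hn": ["alpha"], "cpucount": ["4"]},
    },
}


def read_entry(host, dn, max_hops=5):
    """Read an entry, chasing referrals until a real entry is found."""
    for _ in range(max_hops):
        entry = SERVERS[host][dn]
        if "ref" not in entry:
            return entry
        # The referral names another server and another DN; query it next.
        target = urlsplit(entry["ref"][0])
        host = target.hostname
        dn = unquote(target.path.lstrip("/"))
    raise RuntimeError("too many referrals")


machine = read_entry("components.example.org", "hn=alpha,ou=machines,o=cat")
```

The caller addresses only the local server; the hop to the second database happens inside read_entry, which is the sense in which the distribution is transparent. The hop limit guards against referral loops.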
3) Information Browser
The Information Browser (IB) is a graphical tool that gives the user
access to the information stored in the Information Server.
The IB can operate as a standalone module, or it can be associated with an
instance of a CAT. In the latter case, the user can directly instantiate
components on a CAT workspace without leaving the IB. This process is described
below.
The fundamental task of the IB is to formulate LDAP queries based on user input.
For this purpose it uses a two-phase query process. In the first phase,
the user obtains a list of machines or components that match certain
criteria. In the second phase, the user obtains a detailed description of
one of those machines or components.
The first query can be made in one of two modes.
In browse mode, the user can descend the directory hierarchy without
formulating a specific query. This allows the user to get an overview of
the software and hardware resources available on the system.
In search mode, the user can submit a specific query to find information
about a particular set of components or machines.
Figure 4: Infobrowser screen 1
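The two-phase query process can be sketched as two kinds of LDAP search request. The sketch below only builds the request parameters (base, scope, filter, requested attributes) and does not contact a server. The function names, the base DN, and the example criteria are hypothetical; the filter escaping follows the standard LDAP string representation of search filters.

```python
def escape_filter_value(value):
    """Escape characters that are special inside an LDAP filter value."""
    replacements = {
        "\\": r"\5c",
        "*": r"\2a",
        "(": r"\28",
        ")": r"\29",
        "\0": r"\00",
    }
    return "".join(replacements.get(ch, ch) for ch in value)


def build_filter(criteria):
    """AND together one equality test per user-supplied criterion."""
    parts = [f"({attr}={escape_filter_value(val)})" for attr, val in criteria.items()]
    if not parts:
        return "(objectClass=*)"  # browse mode: match every entry
    return parts[0] if len(parts) == 1 else "(&" + "".join(parts) + ")"


def list_request(base_dn, criteria):
    """Phase one: find matching entries, fetching only their names."""
    return {
        "base": base_dn,
        "scope": "subtree",
        "filter": build_filter(criteria),
        "attributes": ["un"],
    }


def detail_request(entry_dn):
    """Phase two: read every attribute of one selected entry."""
    return {
        "base": entry_dn,
        "scope": "base",
        "filter": "(objectClass=*)",
        "attributes": ["*"],
    }


search = list_request("ou=components, o=cat", {"platform": "Solaris2.6", "format": "native binary"})
```

Requesting only the naming attribute in the first phase keeps the result list small; the full entry is transferred only for the item the user selects.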
After submitting a search or browse request, the user is presented
with a list of components or machines that satisfied the query.
Double-clicking on a machine or component name brings up a separate
dialog window that gives a detailed description of that object.
The top half of the dialog window presents a table of attribute/value pairs, which
correspond to a subset of the pairs stored in the LDAP database.
Figure 5: Infobrowser screen 2
If the IB is associated with a CAT instance, the user can create an instance
of a component directly on the CAT workspace.
The bottom half of the dialog window contains either a list of machines
(if the user initially chose a component entry) or a list of components
(if the user initially chose a machine entry). The user must
select a machine or component from the bottom half of the dialog window,
along with a procreator to process the request. If an RMI connection
exists between the IB and the CAT, the new component will be instantiated,
and an icon will be placed on the CAT workspace.
An IB can become associated with a CAT instance in two ways. First, if the
IB was forked by a CAT, then that IB is automatically associated with its
parent CAT process. Second, a user can connect an autonomous IB to a CAT
by entering the CAT's RMI URL on the first screen of the IB.
3.1) Information Browser Implementation
Given the highly distributed nature of the CAT, it was necessary to program
the Information Browser in an architecture-independent language. Java
was a natural choice to minimize the complications of porting the IB
to new platforms.
The IB makes heavy use of the Java Naming and Directory Interface
(JNDI) for accessing the LDAP database. JNDI is a general Java API that
provides access to numerous naming and directory services including
LDAP, NDS, DNS, NIS, and the CORBA COS Naming Service. In keeping
with the Java philosophy, JNDI processes directory queries through
a series of object-oriented abstractions. As a result, the
underlying directory service could be replaced with few changes in the client-side code.
The IB also makes use of the new Java Swing classes.
The Swing classes, which are part of the Java Foundation Classes,
provide several of the graphical widgets used in the IB such
as the tree and the table. Swing is written in pure Java, and will
be a standard package in the Java 1.2 release.
The IB and the CAT can reside in different JVMs, so they must use
RMI to communicate. Specifically, the communication is handled by
Fabian Breg's NexusRMI library. NexusRMI is syntactically identical
to Java RMI, but affords better performance in a scientific computing
environment.
3.2) Information Publisher
The Information Publisher is a utility that automates
the task of publishing new components into the
Information Server. The component developer enters the pertinent
component information into a series of three forms.
The first form requests general component information such as
author, version, and description. The second form requests
information about the data types associated with the component.
The third form requests a list of machines on which the component
is installed. After all of the information has been supplied,
the Information Publisher updates the appropriate entries
in the Information Server.
Figure 6: Component Information Publisher
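The mapping from the three forms to directory entries can be sketched as follows. This is a hypothetical outline rather than the Information Publisher's actual code: the function name, the base DN, and the example values are assumptions, and the sketch returns the entries that would be written instead of contacting a server.

```python
def publish(general, ports, machine_dns, components_base="ou=components, o=cat"):
    """Turn the three forms into a list of (dn, attributes) pairs."""
    un = general["un"]
    component_dn = f"un={un}, {components_base}"

    # Form 1 (general information) and form 3 (machines) fill the
    # environment-independent component entry.
    entries = [(component_dn, {**general, "machine": list(machine_dns)})]

    # Form 2 (data types): one port entry beneath the component entry.
    for port in ports:
        entries.append((f"pn={port['pn']}, {component_dn}", dict(port)))

    # Form 3 again: one installation entry beneath each machine entry.
    for machine_dn in machine_dns:
        entries.append((f"un={un}, {machine_dn}", {"un": un}))
    return entries


new_entries = publish(
    {"un": "MatrixReader", "author": "A. Developer", "version": "1.0"},
    [{"pn": "Matrix_Out", "dataFlowDirection": "output", "tn": "DenseMatrix"}],
    ["hn=alpha, ou=machines, o=cat", "hn=beta, ou=machines, o=cat"],
)
```

A single submission therefore touches both subtrees: one component entry with its ports, plus one installation entry per machine.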
The Information Publisher also allows the user to modify an
entry already stored in the Information Server. After the
user selects an entry to modify,
the form attributes are filled in with their previous values.
The user may override any of the old values and upload the modified
entry to the Information Server.
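One way to upload only what the user changed is to compare the pre-filled values with the edited form and derive a list of modifications. The sketch below is an assumption about how this could be done, not a description of the Information Publisher itself; it uses the three modification types that an LDAP modify operation supports (add, delete, replace), and the example values are hypothetical.

```python
def modifications(old, new):
    """Compare two attribute maps and list the changes needed to go from old to new."""
    ops = []
    for attr in sorted(set(old) | set(new)):
        before, after = old.get(attr), new.get(attr)
        if before == after:
            continue  # untouched field: nothing to upload
        if after is None:
            ops.append(("delete", attr, None))
        elif before is None:
            ops.append(("add", attr, after))
        else:
            ops.append(("replace", attr, after))
    return ops


previous = {"author": ["A. Developer"], "version": ["1.0"], "readme": ["none"]}
edited = {"author": ["A. Developer"], "version": ["1.1"], "description": ["Reads a matrix."]}
changes = modifications(previous, edited)
```

Fields that the user leaves at their previous values produce no modification at all.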
Because CAT components in general do not have a reflection mechanism,
the Information Publisher is unable to check the validity of
component information. The onus is on the component developer,
therefore, to ensure the validity of the information that she uploads
to the Information Server.
4) Comparison to Related Technologies
Several technologies and standards provide services similar to the RIS in
different settings. For various reasons, however, they fail to meet all
of the needs of the CAT.
- Globus Metacomputing Directory Service
- The MDS stores information about the state of the Globus metacomputing environment.
The CAT relies on Globus to handle low-level hardware interaction. As such,
the RIS uses the MDS to store computer and network information. The RIS extends
the MDS by including support for software components.
- Agora
- Agora is a prototype system being developed by the Software Engineering
Institute at Carnegie Mellon University whose aim is to develop an indexed
database of software components. The Agora graphical interface employs
a two-step query process similar to the RIS Information Browser.
However, the Agora system does not maintain a database of component
information. Instead, Agora clients perform introspection on
components to learn about their characteristics. As such, the Agora
system is limited to Java Beans and CORBA components. The CAT must
accommodate a more general type of software component that may or
may not provide an introspection mechanism.
- Jini Lookup Attribute Schema Specification
- The Jini Lookup Service allows Jini services to advertise
themselves to potential clients. A Jini service represents some sort
of autonomous computing entity, and is therefore akin to a CAT
component. Jini services are described by entries, which contain
any number of attributes. Clients locate services by querying
the Lookup Service with JNDI. All Jini services and clients
must operate within a Java Virtual Machine. Therefore, only Java-based
components can exist within a Jini framework.
5) References
[1] CAT DEVELOPMENT TEAM, SuperComputing '98 HPC Challenge,
11 November 1998.
http://www.extreme.indiana.edu/sc98/sc98.html
[2] D. GANNON, R. BRAMLEY, T. STUCKEY, J. VILLACIS, J. BALASUBRAMANIAN,
E. AKMAN, F. BREG, S. DIWAN, AND M. GOVINDARAJU,
Component architectures for distributed scientific problem solving,
Special Issue on Languages for Computational Science and Engineering.
[3] R. BRAMLEY, D. GANNON, J. VILLACIS, A. WHITAKER,
Using the grid to support software component systems,
Presented at 1999 SIAM Conference.
[4] I. FOSTER, C. KESSELMAN, ET AL.,
Globus Project, 1999.
http://www.globus.org
[5] SUN MICROSYSTEMS,
Java Naming and Directory Interface, 1999.
http://java.sun.com/products/jndi
[6] SUN MICROSYSTEMS,
Java Foundation Classes, 1999.
http://java.sun.com/products/jfc
awhitake@extreme.indiana.edu
Last modified: Thu Feb 18 10:08:38 EST 1999