MDS Component Software Format

A Proposal by the Extreme! Computing Lab, IUB



Below are some ideas on the format of the MDS component software database schema/format.
We are basing this format on the following assumptions:

  • complete information on where a component is AND how to start it up should be contained in the database; (potential problem: this information will need to be updated by the client/end user if any of it changes; it would be nicer/better if the server on which the component runs could "automatically" publish any changes to the MDS)

  • components live in a heterogeneous world, and hence platform-specific information should be included in the database

  • the process creation mechanism may (or may not) use the component information in order to create the component process; however, it should be included to cover all possible implementations of process creation (e.g., GRAM)
  • NOTE:  the "official" response from ANL is here, and this was their answer to our reply.

    Common Attributes

    The following are common information attributes for components.  We split these according to whether or not the attributes depend on the underlying implementation/architecture/hardware (aka the "environment").

    Environment Independent Attributes

    This section contains component information that is independent of the platform on which it will run.
     
  • component name
  • example: "ReadMatrix"
  • purpose: to signify a generic name for the component
  • rationale: we will use this as a generic identifier for the component;  for example, in the CAT, an end user selects a component (represented by its identifier) from a list;  all identifiers should be _unique_ within this list (i.e., within the MDS database component namespace); the CAT uses a selected identifier to generate a unique component instance ID; for example, if the component name is "ReadMatrix", then when the CAT creates an instance of "ReadMatrix", it assigns it a unique name, say "ReadMatrix-1"
  • component format  [ OPTIONAL ]
  • example: "native binary", "java bytecode", "perl script", "C++ source"
  • purpose: a generic description of the file type of the executable; could also use MIME descriptions
  • rationale: this information might be useful to a user if he wishes to know the MIME type of a component's "executable"; presently unused in the CAT
  • component platform  [ OPTIONAL ]
  • example: "Solaris2.6", "Java1.1.3", "Perl5.001"
  • purpose: the type of OS/environment that the component will execute in
  • rationale: this information might be useful to a user if he wishes to know the software environment in which the component will run;  this, together with the component format, allows a user to determine whether the component he wishes to use is "out of date" with respect to the component's version (e.g., suppose the "ReadMatrix" component is "Java bytecode" which only runs under "JDK1.0.2"; the end user may not want to run it since he knows the numerics in "JDK1.0.2" are poor in comparison to those in later releases such as "JDK1.2"); presently unused in the CAT
  • component description
  • example: "This component reads in a matrix from a file system and places it into memory."
  • purpose: human-readable description of what the component does
  • rationale: this is a human-readable description of what the component does; the description level can be as vague or detailed as the component developer chooses; this information is static; the CAT presents this information to the user before an instance of the component gets created; more dynamic information on a component can be obtained via a method call on the component (e.g., componentInstance.getDescription()); the component description could be split into "standard information" (e.g., component version number, author, date of last modification) and "semantic information" (e.g., isA matrix_solver, requires filename input, outputs a matrix data structure in Harwell-Boeing format)
  • component interface
  • scenario 1: inputs/outputs only
  • example: "number of inputs=2 ; number of outputs=3; input[1] of type integer; ..."
  • purpose: to describe the inputs and outputs of the component; this should be in both textual form (e.g., "input[1] is used for specifying matrix dimension") as well as some kind of generic type specification/instance
  • if java, use class bytecode to represent an input or output
  • if not java, use an IDL (textual) description
  • scenario 2: functions only
  • example: "number of functions=3, function 1: char* ReadMatrixFile(int i), ..."
  • purpose: to describe the function prototypes of the component
  • use IDL (textual) description for all function prototypes
  • provide human-readable description of functions which explain what the functions do
  • scenario 3: combination of functions, their arguments, their return types, and descriptions of each ...
  • NOTE:  All components are assumed to have two sets of interfaces:  system and user-defined.  The system interfaces are required by the component framework, and MUST BE PROVIDED since they define membership in the component framework.  The user-defined interfaces are optional.  So, a list of each of these types of interfaces should be provided in the database.
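
    The environment-independent attributes above can be pictured as a single record per component. The following is only a minimal sketch: the class and field names (ComponentInfo, InstanceNamer, system_interfaces, and so on) are hypothetical and not part of the proposal, and the counter-based naming is just one way to produce instance IDs such as "ReadMatrix-1".

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, Iterator, List, Optional


@dataclass
class ComponentInfo:
    """Environment-independent attributes of one component (hypothetical field names)."""
    name: str                       # unique within the component namespace
    description: str                # static, human-readable
    system_interfaces: List[str]    # required by the component framework
    user_interfaces: List[str] = field(default_factory=list)  # optional
    format: Optional[str] = None    # OPTIONAL, e.g. "java bytecode"
    platform: Optional[str] = None  # OPTIONAL, e.g. "Java1.1.3"


class InstanceNamer:
    """Hands out instance IDs of the form "<component name>-<n>" (assumed scheme)."""

    def __init__(self) -> None:
        self._counters: Dict[str, Iterator[int]] = {}

    def next_id(self, component_name: str) -> str:
        # One independent counter per component name, starting at 1.
        counter = self._counters.setdefault(component_name, count(1))
        return f"{component_name}-{next(counter)}"
```

    A tool such as the CAT could keep one InstanceNamer per session and call next_id("ReadMatrix") each time the user creates an instance of that component.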
     

    Environment Dependent Attributes

    This section contains component information that is DEPENDENT on the environment in which it will run.
     
  • component executable
  • example: "/usr/local/bin/ReadMatrix"
  • purpose: to indicate the complete path and name of the executable
  • component estimated resource requirements  [ OPTIONAL ]
  • example: "10MB RAM, 32MB Disk, 0.25 CPU LOAD"
  • purpose: the estimated minimum/average/maximum resource requirements during execution
  • component resource path   [ OPTIONAL ]
  • example: "/usr/local/lib/readmatrix"
  • purpose: the directory containing resources used by the executable
  • component startup directory   [ OPTIONAL ]
  • example: "/tmp/readmatrix"
  • purpose: the directory from which to start the component
  • component work directory  [ OPTIONAL ]
  • example: "/tmp/user/foo"
  • purpose: the working directory of the user to start the component in (is this necessary?)

  • And any more we can think of ...
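
    The environment-dependent attributes could likewise be held in one record per machine on which the component is installed. This is a sketch under stated assumptions: the class and field names are hypothetical, paths are assumed to be POSIX-style as in the examples above, and the absolute-path check is simply one reading of "complete path".

```python
import posixpath
from dataclasses import dataclass
from typing import Optional


@dataclass
class ComponentDeployment:
    """Environment-dependent attributes of one installed component (hypothetical names)."""
    executable: str                              # complete path, e.g. "/usr/local/bin/ReadMatrix"
    resource_requirements: Optional[str] = None  # OPTIONAL, e.g. "10MB RAM, 32MB Disk, 0.25 CPU LOAD"
    resource_path: Optional[str] = None          # OPTIONAL, e.g. "/usr/local/lib/readmatrix"
    startup_dir: Optional[str] = None            # OPTIONAL, e.g. "/tmp/readmatrix"
    work_dir: Optional[str] = None               # OPTIONAL, e.g. "/tmp/user/foo"

    def __post_init__(self) -> None:
        # Assumption: "complete path" means an absolute POSIX path.
        if not posixpath.isabs(self.executable):
            raise ValueError(f"executable must be a complete path: {self.executable!r}")
```

    Only the executable is mandatory here, mirroring the [ OPTIONAL ] markers in the list above.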

    CAT-specific Attributes

    The following are attributes specific to the CAT model of components
     
  • component port types

  • component port type descriptions

    LDAP Implementation

    We describe our implementation of the above schema in LDAP, and suggest where these attributes might be placed in the MDS format.

    Environment-specific component attributes should reside under the machine category in the current MDS format.

    Environment-independent information resides in a separate branch of the tree, but is linked to environment-specific information with distinguished name links.

    Interface objects reside below their corresponding component entry in the tree. There may or may not be "standard" interfaces that do standard things.  Also, a component can have any number of interfaces and method prototypes.
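
    The tree layout described above can be illustrated by building the distinguished names and rendering entries as LDIF text. Everything specific in this sketch is an assumption made for illustration: the suffix "o=Grid", the "ou=components" branch, the "hn=" host naming attribute, and the attribute names "executable" and "componentRef" are hypothetical and are not taken from the actual MDS schema.

```python
from typing import Dict, List

BASE = "o=Grid"  # hypothetical directory suffix


def component_dn(name: str) -> str:
    """DN of the environment-independent entry, in its own branch of the tree."""
    return f"cn={name},ou=components,{BASE}"


def deployment_dn(name: str, host: str) -> str:
    """DN of the environment-specific entry, placed under a machine entry."""
    return f"cn={name},hn={host},{BASE}"


def interface_dn(interface: str, name: str) -> str:
    """DN of an interface object, placed below its component entry."""
    return f"cn={interface},{component_dn(name)}"


def to_ldif(dn: str, attrs: Dict[str, List[str]]) -> str:
    """Render one entry as LDIF; assumes plain ASCII values that need no escaping."""
    lines = [f"dn: {dn}"]
    for key, values in attrs.items():
        for value in values:
            lines.append(f"{key}: {value}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # The machine-side entry links back to the component entry by DN.
    print(to_ldif(
        deployment_dn("ReadMatrix", "host.example.org"),
        {
            "executable": ["/usr/local/bin/ReadMatrix"],
            "componentRef": [component_dn("ReadMatrix")],
        },
    ))
```

    Storing the link as a DN-valued attribute on the machine-side entry keeps the environment-independent entry free of per-host data; the reverse direction would work equally well and the proposal does not say which is intended.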