In e-science, most tools are available as command-line applications in various forms (e.g. shell scripts, Perl, Python, or FORTRAN). However, with the availability of the Internet and technologies such as Grid computing, collaboration among scientists is becoming commonplace, and in that setting, creating a Web service that exposes a command-line application is a common requirement. This may be motivated by various reasons.
The Generic Service Toolkit lets users create Web Services from command-line applications. The resulting Service (a.k.a. Application Service) has a WSDL interface that matches the behavior of the application and provides the following capabilities.
Gfac includes two types of services: persistent and temporary. The former are services that are supposed to be available 24/7; the latter are created on demand and shut themselves down if they are not used frequently. A Gfac installation includes two persistent services, the factory service and the registry service: the former creates temporary application services on demand, and the latter keeps the deployment (state) configuration.
Gfac is part of the LEAD environment discovery project, which provides a workflow system based on meteorology application services. In the next sections we explore the Gfac architecture in detail. However, if you are a user who needs to use Gfac, please see the Gfac User Guide.
The following diagram shows how Gfac fits in with the other services of LEAD.
The system is fronted by a portal that allows users to create workflows from services that wrap command-line applications, execute them, and monitor them.
Gfac consists of two services: the Registry Service, which acts as the repository for Gfac deployment descriptors, and the Factory Service, which accepts the XML deployment descriptors that describe an application and creates an Application Service. The figure also shows a few more services that help illustrate the different functions of Gfac; we introduce them while walking through a workflow invocation.
To create a new workflow, the user uses the Gfac portlet to define XML deployment files that describe Hosts, Applications, and the Service mappings. Workflows can then be composed from these Services using the XBaya workflow composer. Those workflows are saved in the user's account.
When a user invokes a workflow through the portal, the following steps take place. The numbers below match the numbers in the figure.
During execution, each application service publishes events about state changes in the execution of the workflow service. Those events are published to the notification bus, and the XBaya workflow composer gives users a visualization of the workflow invocation.
Each new Application Service may be placed on a different host from the Factory Service, and each Application Service may submit jobs to a remote application Host, which is usually a computing cluster. In the discussion that follows we use the following terms.
To create an application service, the user needs to describe the application, the service host, and the service-to-application mapping using XML descriptors. However, we provide a portlet Web interface that generates those XML documents from a Web form. The same Web interface can register those documents with the Gfac registry, so the process is largely transparent to the end user.
An Application Service is a stateless Web Service that accepts requests, sets up the input parameters for the underlying Application, invokes the application, and builds and sends a response using the outputs of the Application execution. In this section we examine this process in detail.
The Application Service has a well-defined information model, and every other component of the system is stateless. The information model consists of four parts.
The Gfac top-level configuration. This contains information such as the user's grid credentials and the broker URL. Only one instance exists per Gfac instance.
Information about a service. This contains items such as the Service Map document, the Application and Host descriptions, and the WSDL of the service.
Represents a single request to a created service. A new object is created for each request.
Represents a SOAP message.
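
The four parts above can be pictured as a small hierarchy of context objects. The following Java sketch is only illustrative: the class and field names (GlobalContext, ServiceContext, InvocationContext, MessageContext, brokerUrl, soapXml) are assumptions made for this example and are not taken from the Gfac code base.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// All names below are hypothetical; they only mirror the four parts described above.

// Top-level configuration: one instance per Gfac instance.
final class GlobalContext {
    final String brokerUrl;
    GlobalContext(String brokerUrl) { this.brokerUrl = brokerUrl; }
}

// Information about one created service.
final class ServiceContext {
    final GlobalContext global;
    final String serviceName;
    ServiceContext(GlobalContext global, String serviceName) {
        this.global = global;
        this.serviceName = serviceName;
    }
}

// Wraps a SOAP message (kept as a plain string in this sketch).
final class MessageContext {
    final String soapXml;
    MessageContext(String soapXml) { this.soapXml = soapXml; }
}

// One request to a created service; a new object is created per request.
final class InvocationContext {
    final ServiceContext service;
    final MessageContext request;
    final Map<String, String> inputs = new LinkedHashMap<>();
    final Map<String, String> outputs = new LinkedHashMap<>();
    InvocationContext(ServiceContext service, MessageContext request) {
        this.service = service;
        this.request = request;
    }
}
```

In this sketch, two requests to the same service share one ServiceContext (and through it one GlobalContext) but never share an InvocationContext, which is one way to keep per-request state out of the otherwise stateless components.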
The architecture builds on a minimal kernel, which provides a processing
pipeline. An application service is built by adding a set of extensions via two
sets of extension points. In our experience, different users have varying
requirements and expectations of an application service; by introducing a
lightweight core, we expect to insulate the architecture from the details of
the implementation.
Extensibility is achieved at both levels by adding extension points to the architecture. An extension is a Java class that is allowed to intercept the processing of the message.
The following figure shows the components of the Gfac kernel.
Upon receiving a request for a new service, the Factory Service creates a minimal Service. At startup, the Service is run through a set of extensions called "static extensions". Static extensions are so named because they are executed only once per service. They perform additional configuration of the service. A good example of such an extension is adding security to the created service. In the LEAD use case, to enable security, the Security and Capability Handlers are added to XSUL and capability tokens are registered with the Capman service. In this way security can be added to Gfac transparently to the core architecture. Similarly, we believe other WS-Extensions and scenarios can be added via similar extensions.
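
As a rough illustration of the "once per service" behavior, the Java sketch below runs a list of static extensions when a service starts and ignores later start calls. The names StaticExtension, MinimalService, and SketchSecurityExtension are hypothetical, and the security extension merely records handler names; it does not touch XSUL or Capman.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical interface: configures a newly created service.
interface StaticExtension {
    void configure(MinimalService service);
}

// Hypothetical minimal service that static extensions decorate at startup.
final class MinimalService {
    final String name;
    final List<String> handlers = new ArrayList<>();
    private boolean started = false;

    MinimalService(String name) { this.name = name; }

    // Runs every static extension exactly once; repeated calls do nothing.
    void start(List<StaticExtension> extensions) {
        if (started) {
            return;
        }
        for (StaticExtension extension : extensions) {
            extension.configure(this);
        }
        started = true;
    }
}

// Stand-in for the security example: it only records which handlers would be added.
final class SketchSecurityExtension implements StaticExtension {
    public void configure(MinimalService service) {
        service.handlers.add("security");
        service.handlers.add("capability");
    }
}
```

Because the core only sees the StaticExtension interface in this sketch, a deployment that does not need security would simply leave that extension out of the list.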
When a started service receives a request, the request is accepted and
processed by the Gfac core. The core reads the message and populates the dynamic
portions of the information hierarchy (e.g. the input and output
parameters). The information hierarchy is then run through the preProcess()
method of a set of dynamic extensions. These extensions are called dynamic
because they are invoked for each request. The processing done by these
extensions is usually use-case specific. The main advantage we expect from
extensions is that commonly used, use-case-specific functionality is mapped
onto a set of components. On the one hand, this keeps the architecture
independent of the specific details; on the other hand, users and
administrators may set up Gfac with a subset of the functionality by picking
the components that interest them.
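
The per-request flow described above can be sketched in Java as follows. Only the method name preProcess() comes from the text; its signature, and the names RequestInfo, DynamicExtension, and SketchKernel, are assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical per-request state; the real information hierarchy is richer.
final class RequestInfo {
    final Map<String, String> inputs = new HashMap<>();
    final List<String> trace = new ArrayList<>();
}

// preProcess() is named in the text; this signature is an assumption.
interface DynamicExtension {
    void preProcess(RequestInfo info);
}

// Hypothetical kernel: populates the request state, then runs each extension in order.
final class SketchKernel {
    private final List<DynamicExtension> extensions;

    SketchKernel(List<DynamicExtension> extensions) {
        this.extensions = extensions;
    }

    // Called once for every incoming request.
    RequestInfo handle(Map<String, String> requestParameters) {
        RequestInfo info = new RequestInfo();
        info.inputs.putAll(requestParameters);
        for (DynamicExtension extension : extensions) {
            extension.preProcess(info);
        }
        return info;
    }
}
```

An administrator choosing a subset of functionality would, in this sketch, correspond to constructing SketchKernel with a shorter list of extensions.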
After running through the extensions, processing is handed over to a Provider, which is an abstract representation of the execution of the application. Examples of concrete providers are LocalProvider, RemoteProvider, PBSProvider, SLURMProvider, etc. Again, this keeps the architecture independent of how services are actually executed and clearly marks the extension points, so that adding new application execution methods is simplified.
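
To make the Provider idea concrete, here is a minimal Java sketch. The text names the concept and some concrete providers but not their methods, so the execute() signature and the classes SketchLocalProvider, SketchBatchProvider, and ProviderCore are hypothetical. Neither provider runs anything; each only returns a description of what it would do.

```java
import java.util.List;

// Hypothetical interface for "how the application is executed".
interface Provider {
    String execute(String host, List<String> commandLine);
}

// Stand-in for local execution: only reports what it would run.
final class SketchLocalProvider implements Provider {
    public String execute(String host, List<String> commandLine) {
        return "run locally: " + String.join(" ", commandLine);
    }
}

// Stand-in for a batch-queue provider: only reports what it would submit.
final class SketchBatchProvider implements Provider {
    public String execute(String host, List<String> commandLine) {
        return "submit to " + host + ": " + String.join(" ", commandLine);
    }
}

// The core depends only on the Provider interface, not on a concrete provider.
final class ProviderCore {
    static String invoke(Provider provider, String host, List<String> commandLine) {
        return provider.execute(host, commandLine);
    }
}
```

Supporting a new execution method would, under these assumptions, mean adding one more Provider implementation without changing ProviderCore.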
As a recap, each extension point and its intended use is listed here:
The following is a list of known dynamic extensions implemented in Gfac.
An Application Service is described using three deployment descriptors. They represent the Hosts available to the system, the applications installed on each host, and how an application should be mapped to a Web Service. Schema files for the Deployment Descriptors can be found here.
The Service Description Document refers to an application document, which in turn refers to a Host description document. Before an application service is created, all three documents must be present in the registry service. The user requests the Factory Service to create an Application Service by providing a Service Name. The Factory Service searches the Registry for a host on which to create the service and transfers the Service Name, together with other configuration, to the Service Host. Once the configuration is set up, the new Service is created on the Service Host using that configuration. When the Service receives a request, it searches the registry for a Host where the given application is deployed and invokes the application on that host.
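
The chain of references (service description to application description to host description) can be sketched as a lookup over an in-memory registry. This is a simplified Java illustration: SketchRegistry and its methods are hypothetical, descriptors are reduced to plain names, and each application is assumed to be deployed on a single host.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for the registry service.
final class SketchRegistry {
    private final Set<String> hosts = new HashSet<>();
    private final Map<String, String> applicationToHost = new HashMap<>();
    private final Map<String, String> serviceToApplication = new HashMap<>();

    void registerHost(String host) {
        hosts.add(host);
    }

    void registerApplication(String application, String host) {
        applicationToHost.put(application, host);
    }

    void registerService(String service, String application) {
        serviceToApplication.put(service, application);
    }

    // Follows service -> application -> host and fails if any document is missing.
    String resolveHost(String service) {
        String application = serviceToApplication.get(service);
        if (application == null) {
            throw new IllegalStateException("no service description: " + service);
        }
        String host = applicationToHost.get(application);
        if (host == null) {
            throw new IllegalStateException("no application description: " + application);
        }
        if (!hosts.contains(host)) {
            throw new IllegalStateException("no host description: " + host);
        }
        return host;
    }
}
```

The explicit failures mirror the requirement that all three documents be present in the registry before a service can be created.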
To learn more about Deployment Descriptors, please refer to the Deployment Descriptor Guide. We also provide a portal interface for creating Deployment Descriptors; please refer to the User Guide for more information.
The following figure shows the steps an Application Service goes through while processing a request. Unlike the previous section, here we discuss the process with all the extensions in place.
Service requests may use WS-Addressing to receive the result of the invocation asynchronously.
Gfac supports the following providers, which use different mechanisms for job submission.
To perform file transfers, Gfac supports the following mechanisms.
While processing a request, an Application Service can produce events marking the different states reached during the invocation. These notifications may be published using the WS-Eventing or WS-Notification specifications. The event formats are defined by the workflow tracking project. The events are sent to the event sink defined in the LEAD context header.
The following events are generated by an Application Service.
Unless all job submissions and file transfers are local, the Application Service and the factory require credentials to perform those operations on behalf of the user. Credentials may be provided in one of two forms.
Globus credentials - loaded from the file system.
MyProxy credentials - loaded from a MyProxy server. If these are provided, the application service automatically renews the credentials when they are about to expire.
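
The automatic renewal described for MyProxy credentials can be illustrated with a small Java sketch. Everything here is an assumption made for the example: the class RenewableCredential, the idea of a fixed renewal margin, and the renewer function, which stands in for a real call to a MyProxy server and simply returns a new expiry time.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.function.Function;

// Hypothetical credential holder; no real MyProxy client is used.
final class RenewableCredential {
    private Instant expiresAt;
    private final Duration renewalMargin;
    // Given the current time, returns the expiry time of a freshly fetched credential.
    private final Function<Instant, Instant> renewer;

    RenewableCredential(Instant expiresAt, Duration renewalMargin,
                        Function<Instant, Instant> renewer) {
        this.expiresAt = expiresAt;
        this.renewalMargin = renewalMargin;
        this.renewer = renewer;
    }

    // Renews when the remaining lifetime is at or below the margin; returns true if renewed.
    boolean renewIfNeeded(Instant now) {
        Duration remaining = Duration.between(now, expiresAt);
        if (remaining.compareTo(renewalMargin) <= 0) {
            expiresAt = renewer.apply(now);
            return true;
        }
        return false;
    }

    Instant expiresAt() {
        return expiresAt;
    }
}
```

A service could call renewIfNeeded before each job submission or file transfer; credentials loaded from the file system would have no renewer and so would not fit this class.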