
This section describes the operational dependencies between the semantic infrastructure and terminology on one side and the platform on the other. It includes the following:

Dependencies between Semantic Infrastructure 2.0 and caGrid 2.0 August 30, 2010

Refer to the following sections of 6 - Dependencies Between Semantic Infrastructure 2.0 and caGrid 2.0 August 30, 2010 in the caGrid 2.0 Roadmap August 30, 2010:

Semantic Infrastructure Overview

In an effort similar to developing this roadmap for the future platform, security and tools, a team is defining the future direction for CBIIT Semantic Infrastructure. While the Semantic Infrastructure Roadmap effort is just beginning, it is expected to be deeply harmonious with the future direction of caGrid. With the adoption of SAIF and ECCF and the introduction of behavioral semantics, the infrastructure of the grid must provide increasingly sophisticated support to leverage and enforce these behavioral specifications.

The notion of "computable semantic interoperability" (CSI) applies semantics not only to the static data passed between machines, but also to the behavioral and functional operations exposed for coordination of behaviors during the interaction. Likewise, the semantic infrastructure itself (that is, its tools and applications) is being transformed to fully participate in the new services-aware environment. Thus, the Semantic Infrastructure will depend on the grid platform as at least one of potentially many delivery platforms for its information.

The Semantic Infrastructure 2.0 is expected to provide management services and tooling comparable to those that exist today (including vocabulary services, model management and annotation, curation tools, and others as needed), albeit in potentially new formats and standards. However, its scope also expands to include greater flexibility in accommodating granular levels of conformance and participant sophistication.

For example, the assumption of adherence to a centrally curated authoritative source of information models and terminology is no longer true; the infrastructure must gracefully accommodate local terminologies or localizations. It must enable a path towards "as much interoperability as is possible" between any two parties, rather than enforcing full "compatibility" of all participants. Additionally, the Semantic Infrastructure 2.0 is charged with making the wealth of knowledge contained within the numerous SAIF artifacts available and consumable (in a programmatic fashion) to all the grid participants. Having runtime support for leveraging this information to inform and drive service interactions is a key value proposition for the future platform and semantic infrastructure.

Semantic Infrastructure Registry

As stated above, the key components of the Semantic Infrastructure 2.0 are still being identified and scoped. However, one such component has been identified and is expected to be critical to the grid: the Semantic Infrastructure Registry. While it may ultimately be manifested as numerous types of registries and services (and potentially numerous instances of each), the notional Registry will act as a governance-scoped authoritative repository of ECCF artifacts. As shown in the diagram below, the ECCF Registry will be populated by the iterative process of service design and specification.

Figure 6.1 ECCF Registry

Specifying a CBIIT enterprise service requires the development of three separate artifacts:

  • The CIM (computation independent model) specification
  • The PIM (platform independent model) specification
  • The PSM (platform specific model) specification

The CIM, PIM, and PSM are each, in turn, a collection of artifacts (models). The ECCF matrix is a placeholder for these artifacts, organized by RM-ODP viewpoints and MDA perspectives.
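
As an illustration only, the matrix can be pictured as a grid with one cell per MDA perspective and RM-ODP viewpoint. The Python sketch below assumes the five standard RM-ODP viewpoints; the class name `EccfMatrix`, its methods, and the artifact file name used in the usage note are hypothetical and do not represent a format defined by the Semantic Infrastructure effort.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class Perspective(Enum):
    """MDA perspectives (rows of the matrix)."""
    CIM = "computation independent model"
    PIM = "platform independent model"
    PSM = "platform specific model"


class Viewpoint(Enum):
    """RM-ODP viewpoints (columns of the matrix); assumed set."""
    ENTERPRISE = "enterprise"
    INFORMATION = "information"
    COMPUTATIONAL = "computational"
    ENGINEERING = "engineering"
    TECHNOLOGY = "technology"


@dataclass
class EccfMatrix:
    """Hypothetical container: a list of artifact names per matrix cell."""
    service_name: str
    cells: Dict[Tuple[Perspective, Viewpoint], List[str]] = field(default_factory=dict)

    def add_artifact(self, perspective: Perspective, viewpoint: Viewpoint, artifact: str) -> None:
        self.cells.setdefault((perspective, viewpoint), []).append(artifact)

    def artifacts(self, perspective: Perspective, viewpoint: Viewpoint) -> List[str]:
        return list(self.cells.get((perspective, viewpoint), []))

    def empty_cells(self) -> List[Tuple[Perspective, Viewpoint]]:
        """Cells with no artifact yet, useful for spotting specification gaps."""
        return [(p, v) for p in Perspective for v in Viewpoint
                if not self.cells.get((p, v))]
```

For example, calling `add_artifact(Perspective.PSM, Viewpoint.COMPUTATIONAL, "specimen.wsdl")` on a new matrix fills one of the fifteen cells, and `empty_cells()` then reports the remaining fourteen.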

Currently a Microsoft Word document acts as a template or placeholder for describing the CIM, PIM and PSM (along with the artifacts of each viewpoint), while the future semantic infrastructure will define computable representation formats for this information.

Part of this computable representation is expected to be a SOA ontology for describing the various entities involved in the numerous conformance assertions (with examples including but not limited to services, operations, data types, faults, and actors). This ontology will provide the backbone for reasoning to be performed by the platform and tools at both runtime and design time (as illustrated later in this section).

Registry Reliance on Platform

While the Semantic Infrastructure Registry services will be specified in ECCF, and potentially manifested on multiple platforms, one such platform will be the grid, and the registry will therefore use the platform as scoped in this document.

The registry will require numerous capabilities described in this document, including the security layer for items such as authentication, authorization, auditing, and data assertions and integrity. The infrastructure may also require support for a rules engine capable of consuming and enforcing rules, stored in the registry, for both static semantics (for example, data constraints in the information model) and behavioral semantics (for example, pre-conditions and post-conditions of operation invocation).
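
To make the idea of enforcing behavioral rules more concrete, the following Python sketch wraps an operation invocation with pre-condition and post-condition checks. It is a minimal illustration under stated assumptions: `OperationContract`, `invoke_with_contract`, the `insertSpecimen` contract, and the `volume_ml` field are hypothetical names, and no particular rules engine or rule format is implied.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


class RuleViolation(Exception):
    """Raised when a pre-condition or post-condition does not hold."""


@dataclass
class OperationContract:
    """Hypothetical behavioral contract as it might be read from a registry."""
    name: str
    preconditions: List[Callable[[Dict[str, Any]], bool]]
    postconditions: List[Callable[[Dict[str, Any], Any], bool]]


def invoke_with_contract(contract: OperationContract,
                         operation: Callable[[Dict[str, Any]], Any],
                         request: Dict[str, Any]) -> Any:
    # Check every pre-condition before the operation runs.
    for i, pre in enumerate(contract.preconditions):
        if not pre(request):
            raise RuleViolation(f"{contract.name}: pre-condition {i} failed")
    result = operation(request)
    # Check every post-condition against the request and the result.
    for i, post in enumerate(contract.postconditions):
        if not post(request, result):
            raise RuleViolation(f"{contract.name}: post-condition {i} failed")
    return result


# Illustrative contract: a static data constraint plus an expected outcome.
contract = OperationContract(
    name="insertSpecimen",
    preconditions=[lambda req: req.get("volume_ml", 0) > 0],
    postconditions=[lambda req, res: res.get("status") == "stored"],
)


def insert_specimen(request: Dict[str, Any]) -> Dict[str, Any]:
    # Stand-in for a real service operation.
    return {"status": "stored"}
```

In this sketch, a request with a positive `volume_ml` passes through unchanged, while a request with `volume_ml` of zero raises `RuleViolation` before the operation is invoked.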

Similarly, the platform will inform the content and format of various platform specific artifacts to be stored in the registry (including but not limited to XSD and WSDL). The platform will provide the capability to enforce or test conformance to those profiles by, for example, checking service interfaces against published PSMs and validating data against published information models. Finally, the platform will act as the service implementation technology for the grid services of the registry (that is, the management interfaces or consumer-facing services).
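
Checking a service interface against a published PSM could, in its simplest form, amount to comparing operation signatures. The Python sketch below assumes that both the published specification and the deployed interface have already been reduced to a mapping from operation name to an (input type, output type) pair; that representation, the function name `check_interface`, and the example operation names are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Signature = Tuple[str, str]  # (input type name, output type name)


def check_interface(published: Dict[str, Signature],
                    deployed: Dict[str, Signature]) -> List[str]:
    """Return human-readable findings; an empty list means conformant."""
    findings = []
    for op, expected in published.items():
        actual = deployed.get(op)
        if actual is None:
            findings.append(f"missing operation: {op}")
        elif actual != expected:
            findings.append(
                f"signature mismatch for {op}: expected {expected}, found {actual}"
            )
    return findings


# Hypothetical published PSM and a deployed interface that drifts from it.
PUBLISHED = {
    "getSpecimen": ("SpecimenId", "Specimen"),
    "insertSpecimen": ("Specimen", "SpecimenId"),
}
DEPLOYED = {
    "getSpecimen": ("SpecimenId", "Specimen"),
    "insertSpecimen": ("Specimen", "string"),
}
```

Operations that the deployed service offers beyond the published set are deliberately ignored here, on the assumption that extending a specification is permitted; a stricter profile could report them as well.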

Platform Reliance on Registry

The platform itself will require and leverage numerous capabilities of the Semantic Infrastructure 2.0, most importantly, access to the information contained in the ECCF registry. The registry will facilitate nearly all parts of the service development and consumption life cycle.

At design time, the registry will provide a wealth of information to the designer, including available relevant service specifications to adopt and extend, information models and terminologies to leverage for new operations, and formal specifications of the expected behavior of existing services that the new service may consume in its implementation. For example, service templates (shelled-out implementation artifacts) could automatically be constructed based on platform specific specifications.
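
A service template of this kind might be generated as follows. The Python sketch assumes the platform specific specification has been reduced to a mapping from operation name to parameter names, and it emits a class skeleton whose methods are left unimplemented; `render_service_stub` and the example names are hypothetical.

```python
from typing import Dict, List


def render_service_stub(service_name: str, operations: Dict[str, List[str]]) -> str:
    """Render Python source for a shelled-out service class."""
    lines = [f"class {service_name}:"]
    if not operations:
        lines.append("    pass")
    for op, params in operations.items():
        args = ", ".join(["self", *params])
        lines.append(f"    def {op}({args}):")
        lines.append(f'        raise NotImplementedError("{op} is not implemented yet")')
        lines.append("")
    # Normalize trailing blank lines to a single newline.
    return "\n".join(lines).rstrip() + "\n"


stub_source = render_service_stub(
    "SpecimenService",
    {"insert_specimen": ["specimen"], "get_specimen": ["specimen_id"]},
)
```

The generated text is ordinary source code that a developer would then fill in, so the specification remains the starting point for the implementation rather than an afterthought.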

Extending beyond the basic query and retrieval of these artifacts, tools can be built to actually understand the semantics of this information and aid the service developer. For example, potentially relevant information models may be found by entering simple terms like "tissue sample" into a tool, which binds those terms to concepts in identified terminologies and locates models containing information bound to those concepts. Similarly, simple search terms like "data insert" may be used to discover behavioral contracts through the terminology bound to their function; these contracts can then act as models or examples for new service operations.
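
The two-step lookup described above (search term to concept, then concept to model) can be sketched in Python as follows. The in-memory dictionaries stand in for terminology and registry services, and every concept code and model name shown is invented for illustration; none is taken from a real terminology.

```python
from typing import Dict, List, Set

# Hypothetical terminology index: lowercase search term -> concept codes.
TERM_TO_CONCEPTS: Dict[str, Set[str]] = {
    "tissue sample": {"EX:0001"},
    "specimen": {"EX:0001"},
    "data insert": {"EX:0100"},
}

# Hypothetical registry content: model name -> concept codes it is bound to.
MODEL_BINDINGS: Dict[str, Set[str]] = {
    "BiospecimenModel": {"EX:0001", "EX:0002"},
    "ImagingModel": {"EX:0003"},
}


def find_models(term: str) -> List[str]:
    """Bind a free-text term to concepts, then find models bound to them."""
    concepts = TERM_TO_CONCEPTS.get(term.strip().lower(), set())
    return sorted(name for name, bound in MODEL_BINDINGS.items()
                  if bound & concepts)
```

Because "tissue sample" and "specimen" resolve to the same concept in this sketch, either term locates the same model, which is the benefit of searching by concept rather than by string.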

Further, such "understanding" of behavioral and static semantics can provide a powerful feature in a tool for workflow or service composition. Such a tool could leverage this information to make suggestions on "next steps" in a workflow, even suggesting specific services to use. It could also provide powerful integrity checking of the data flow and functional effect, by validating that the invocations are consistent with published conformance assertions and rules.
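
A very reduced version of both capabilities, suggesting next steps and checking data flow, is sketched below in Python. It assumes each service is described only by the data type it consumes and the data type it produces, which is far simpler than the behavioral semantics envisioned here; `ServiceSpec`, the function names, and the catalog entries are hypothetical.

```python
from typing import List, NamedTuple


class ServiceSpec(NamedTuple):
    name: str
    consumes: str  # data type the operation accepts
    produces: str  # data type the operation returns


def suggest_next(current: ServiceSpec, catalog: List[ServiceSpec]) -> List[str]:
    """Suggest services whose input type matches the current output type."""
    return [s.name for s in catalog
            if s.consumes == current.produces and s.name != current.name]


def validate_workflow(steps: List[ServiceSpec]) -> List[str]:
    """Report every adjacent pair of steps whose data types do not line up."""
    errors = []
    for upstream, downstream in zip(steps, steps[1:]):
        if upstream.produces != downstream.consumes:
            errors.append(
                f"{upstream.name} produces {upstream.produces}, "
                f"but {downstream.name} consumes {downstream.consumes}"
            )
    return errors


CATALOG = [
    ServiceSpec("findSpecimens", "Query", "SpecimenList"),
    ServiceSpec("annotateSpecimens", "SpecimenList", "AnnotatedSpecimenList"),
    ServiceSpec("renderReport", "AnnotatedSpecimenList", "Report"),
]
```

With this catalog, the only suggestion after `findSpecimens` is `annotateSpecimens`, and a workflow that skips straight from `findSpecimens` to `renderReport` is reported as inconsistent.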

At deploy time, the platform (or deployment tools built upon it) can automate the generation and execution of a test suite to check conformance assertions published in the registry, relevant to the service being deployed.
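
One way such automation might look is sketched below using Python's standard `unittest` module. The sketch assumes each conformance assertion has been reduced to an operation name, input arguments, and an expected result; that record layout, the `UnitConverter` stand-in service, and the assertion identifiers are illustrative assumptions only.

```python
import io
import unittest
from typing import Any, Dict, List


def build_suite(service: Any, assertions: List[Dict[str, Any]]) -> unittest.TestSuite:
    """Turn each conformance assertion into one generated test case."""
    suite = unittest.TestSuite()
    for a in assertions:
        def check(a=a):  # bind the current assertion to this test function
            actual = getattr(service, a["operation"])(**a["input"])
            assert actual == a["expected"], (
                f'{a["id"]}: expected {a["expected"]!r}, got {actual!r}'
            )
        suite.addTest(unittest.FunctionTestCase(check, description=a["id"]))
    return suite


def run_conformance(service: Any, assertions: List[Dict[str, Any]]) -> unittest.TestResult:
    """Execute the generated suite quietly and return the result object."""
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    return runner.run(build_suite(service, assertions))


class UnitConverter:
    """Stand-in for the service being deployed."""
    def ml_to_l(self, value: float) -> float:
        return value / 1000


ASSERTIONS = [
    {"id": "CA-1", "operation": "ml_to_l", "input": {"value": 500}, "expected": 0.5},
    {"id": "CA-2", "operation": "ml_to_l", "input": {"value": 2000}, "expected": 2.0},
]
```

A deployment tool could call `run_conformance(UnitConverter(), ASSERTIONS)` and refuse to proceed unless `wasSuccessful()` returns true on the result.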

At run time, the service can provide powerful self-descriptive metadata by referencing profiles, policies, conformance assertions, and specifications in the ECCF registry. This metadata will provide significant details about the nature and behavior of the service, and can be used to discover it, to ascertain programmatically how to consume it correctly, and to validate that it is functioning correctly. The platform may also be able to automatically flag non-conforming service instances (for example, services sending incorrect data, or running outside of published performance metrics) by monitoring runtime behavior.
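
As a simple illustration of flagging services that run outside published performance metrics, the Python sketch below compares the mean observed response time of each service with a published limit. The choice of mean response time as the metric, the function name `flag_nonconforming`, and the service names and figures in the usage note are assumptions made for the example.

```python
from statistics import mean
from typing import Dict, List


def flag_nonconforming(observations: Dict[str, List[float]],
                       published_max_ms: Dict[str, float]) -> List[str]:
    """Return the services whose mean response time exceeds the published limit.

    observations: service name -> observed response times in milliseconds.
    published_max_ms: service name -> published maximum mean response time.
    """
    flagged = []
    for service, samples in sorted(observations.items()):
        limit = published_max_ms.get(service)
        # Skip services with no published metric or no observations yet.
        if limit is None or not samples:
            continue
        if mean(samples) > limit:
            flagged.append(service)
    return flagged
```

For instance, with a published limit of 200 ms for both `specimenQuery` and `imageFetch`, observations of [120, 150, 180] and [300, 260, 410] respectively would flag only `imageFetch`.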

Metamodel and Information Model

The Semantic Infrastructure 2.0 effort is still deciding on the format and structure to be used. This decision will be important to caGrid 2.0, as it will inform how things like the publishing of service metadata work, and how higher layer semantics (for example, operation preconditions) are built upon static descriptions (for example, WSDL). It is expected, however, that a transition from ISO 11179 metadata to RIM-derived semantics will be important in the future infrastructure. As further information becomes available from both roadmap efforts, this section will evolve to discuss the impact on the platform.
