5.2.2.2 - Search and Access Sept. 6, 2010

This group of capabilities focuses on enabling developers of composite services and applications to discover, compose, and invoke services. This includes the discovery of published services based on service metadata and the generation of client APIs in multiple languages to provide cross-platform access to existing services.

The platform will use the semantic infrastructure service metadata to address all the service discovery requirements. The semantic infrastructure relies on metadata about services and artifacts.
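As an illustration only, metadata-driven discovery can be sketched as a filter over registry entries. The `ServiceRecord` type, the metadata keys, and the example endpoints below are all hypothetical; they are not part of the caGRID 2.0 platform or the semantic infrastructure.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceRecord:
    """Hypothetical registry entry; the field names are illustrative only."""
    name: str
    endpoint: str
    metadata: dict = field(default_factory=dict)


def discover(registry, **criteria):
    """Return the records whose metadata matches every criterion exactly."""
    return [
        record for record in registry
        if all(record.metadata.get(key) == value for key, value in criteria.items())
    ]


# Made-up registry contents, used only to exercise the filter.
REGISTRY = [
    ServiceRecord("array-normalizer", "https://example.org/svc/normalize",
                  {"domain": "micro-array", "kind": "analysis"}),
    ServiceRecord("glioblastoma-data", "https://example.org/svc/gbm",
                  {"domain": "glioblastoma", "kind": "data"}),
    ServiceRecord("array-clustering", "https://example.org/svc/cluster",
                  {"domain": "micro-array", "kind": "analysis"}),
]

matches = discover(REGISTRY, domain="micro-array", kind="analysis")
print([m.name for m in matches])
```

A real registry would likely support richer matching (ontology terms, ranges, free text); exact-match filtering is used here only to keep the sketch short.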

Link to use case satisfied from caGRID 2.0 Roadmap: As institutions share de-identified glioblastoma data sets, they are available to others via data discovery. The treatment recommendation service used by the oncologist is able to discover these new data sets and their corresponding information models, and include that data for subsequent use in recommendation of treatment.

Link to use case satisfied from caGRID 2.0 Roadmap: all of the data management and access services in the use case are utilized by application developers to build the user interfaces that the clinicians use during the course of patient care.

Service orchestration and choreography allow both application developers and non-developers to discover service "building blocks" that can be composed dynamically to provide business capabilities. Special cases include the orchestration of multiple services for a distributed query or for a transactional workflow. Service orchestration and choreography will leverage static and behavioral semantics from the Semantic Infrastructure 2.0.

The Semantic Infrastructure provides the behavioral semantics required for dynamic composability of services or generation of distributed queries. This includes runtime contract discovery and negotiation to determine composability of services based on service capabilities and constraints.
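One possible reading of a composability check based on capabilities and constraints is sketched below. The `Contract` structure and its field names are assumptions made for this example; the source does not define a contract format.

```python
from dataclasses import dataclass


@dataclass
class Contract:
    """Hypothetical service contract; not a format defined by the platform."""
    name: str
    consumes: frozenset                # input types the service needs
    produces: frozenset                # output types the service emits
    requires: frozenset = frozenset()  # constraints placed on upstream services
    offers: frozenset = frozenset()    # capabilities the service guarantees


def composable(upstream, downstream):
    """True if upstream can feed downstream.

    Upstream must produce every type downstream consumes and offer every
    capability downstream requires.
    """
    return (downstream.consumes <= upstream.produces
            and downstream.requires <= upstream.offers)


# Made-up contracts loosely modeled on the roadmap use case.
QUERY = Contract(
    "data-query",
    consumes=frozenset({"cohort-criteria"}),
    produces=frozenset({"de-identified-records"}),
    offers=frozenset({"de-identification"}),
)
RECOMMEND = Contract(
    "treatment-recommendation",
    consumes=frozenset({"de-identified-records"}),
    produces=frozenset({"treatment-plan"}),
    requires=frozenset({"de-identification"}),
)

print(composable(QUERY, RECOMMEND))
print(composable(RECOMMEND, QUERY))
```

Negotiation in the full sense would involve more than a subset test, for example choosing among alternative bindings; this sketch covers only the yes/no compatibility decision.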

Another use case is dynamic retrieval and enforcement of the policies that are in effect for a service interaction in the areas of logging, validations, data transformation, or routing. This information can be used either during the design of the orchestration or during the execution of the defined flow.
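A minimal sketch of policy retrieval for a service interaction follows. It assumes a hypothetical in-memory policy store keyed by service name and policy area, with `"*"` standing for a default that applies to all services; none of these names come from the platform.

```python
# The four policy areas named in the text above.
AREAS = ("logging", "validation", "data_transformation", "routing")


def policies_in_effect(store, service):
    """Resolve the policy for each area.

    A service-specific entry overrides the "*" default. Areas with no
    applicable policy are left out of the result.
    """
    resolved = {}
    for area in AREAS:
        policy = store.get((service, area), store.get(("*", area)))
        if policy is not None:
            resolved[area] = policy
    return resolved


# Made-up store contents, used only to exercise the lookup.
STORE = {
    ("*", "logging"): {"level": "audit"},
    ("*", "routing"): {"route": "default"},
    ("tcga-query", "routing"): {"route": "federated"},
    ("tcga-query", "validation"): {"schema": "strict"},
}

print(policies_in_effect(STORE, "tcga-query"))
```

The same lookup could be called at design time, to show an orchestration author which policies will apply, or at execution time, immediately before each step of the flow.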

Link to use case satisfied from caGRID 2.0 Roadmap: Federated query over the TCGA data and other data sets is performed using a service orchestration.

Policy and Rules Management allows non-developer secondary users to create policies and rules and apply them to services. The scope of policies includes, but is not limited to, definition and configuration of business processing policy and related rules, compliance policies, quality of service policies, and security policies. Key functional requirements for managing policies include capabilities to author, store, approve, and validate policies, and to execute them at runtime.

The Semantic Infrastructure will provide a mechanism to specify policies, including business processing policies and related rules, compliance policies, and quality of service policies. Tools and services for creating security specific policies will be provided by the caGRID 2.0 platform and will be used by the semantic infrastructure. All other policies specified in the Semantic Infrastructure will be enforced by the platform at runtime.

Link to use case satisfied from caGRID 2.0 Roadmap: Each institution has different data sharing needs, access control needs, and business rules for processing that are defined and customized. For example, policy at the pathologist's institution may state that the patient is scheduled for a visit when the review is complete.

The wealth of data must be accessible, resulting in the need for exploration of available datasets. This includes the ability to view seamlessly across independent data sets, allowing a secondary user to integrate data from multiple sources. In addition, the query capability must support sophisticated queries such as temporal queries and spatial queries.

The semantic infrastructure will provide metadata for discovery of these datasets. Complex temporal and spatial queries will be informed by the metadata but will be formulated and executed by the platform.
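To make the idea of a temporal query concrete, the sketch below selects one patient's observations inside a date window and reports the change between consecutive observations. The record layout, the `tumor_mm` field, and the data values are invented for illustration; they do not describe an actual data set or platform query interface.

```python
from datetime import date

# Made-up observations; the field names are assumptions for this example.
OBSERVATIONS = [
    {"patient": "P1", "on": date(2010, 1, 5), "tumor_mm": 31},
    {"patient": "P1", "on": date(2010, 3, 9), "tumor_mm": 27},
    {"patient": "P2", "on": date(2010, 2, 1), "tumor_mm": 40},
    {"patient": "P1", "on": date(2010, 6, 2), "tumor_mm": 29},
]


def history(rows, patient, start, end):
    """Observations for one patient within [start, end], oldest first."""
    selected = [
        row for row in rows
        if row["patient"] == patient and start <= row["on"] <= end
    ]
    return sorted(selected, key=lambda row: row["on"])


def changes(rows, field):
    """Difference in `field` between each observation and the previous one."""
    return [
        (later["on"], later[field] - earlier[field])
        for earlier, later in zip(rows, rows[1:])
    ]


P1_HISTORY = history(OBSERVATIONS, "P1", date(2010, 1, 1), date(2010, 12, 31))
P1_CHANGES = changes(P1_HISTORY, "tumor_mm")
print(P1_CHANGES)
```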

Link to use case satisfied from caGRID 2.0 Roadmap: The oncologist must be able to quickly find glioblastoma data sets, indicating the fields that he is interested in comparing from his clinical data in order to find similar disease conditions and associated treatment plans. Temporal queries allow clinicians to identify changes in patient condition and treatment over time.

Data management includes linking of disparate data sets and updates of data across the ecosystem. A data update may span multiple data sources, which requires transaction support.

Linkages between disparate data sets will be managed by the semantic infrastructure. Data updates that trigger transactions are captured by the platform and are propagated upstream to the semantic infrastructure. An example would be the platform monitoring events to identify changes to data.

Link to use case satisfied from caGRID 2.0 Roadmap: the patient has an electronic medical record that spans multiple institutions. The clinical workup data (for example, genomics and proteomics data) is linked to the clinical care record; similarly pathology and radiology findings must be attached to the patient's electronic medical record.

There are numerous data repositories on the web today. These data repositories contain essential information that must be accessible to services in the ecosystem. As a result, caGrid 2.0 must provide capabilities to integrate these external repositories into the grid with the assumption that the remote service cannot be changed.

The semantic infrastructure will support integration with other metadata repositories, allowing the platform to leverage the semantic infrastructure for federated metadata discovery and analysis. The federated data query capabilities will be implemented by the platform.
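Because the remote repositories cannot be changed, one common approach is to wrap each in a thin adapter that maps its native records onto a shared shape. The sketch below assumes two hypothetical repositories with different field names; the repository names, fields, and identifiers are invented and do not describe any real metadata repository.

```python
# Two made-up repositories whose native record layouts differ.
REPO_A = [
    {"longName": "Tumor Grade", "publicId": "A-1"},
    {"longName": "Patient Age", "publicId": "A-2"},
]
REPO_B = [
    {"label": "tumor grade", "uri": "urn:b:7"},
]

# Each adapter maps a native record to the common {"id", "name"} shape,
# so the repositories themselves stay untouched.
ADAPTERS = {
    "repo_a": (REPO_A, lambda e: {"id": e["publicId"], "name": e["longName"]}),
    "repo_b": (REPO_B, lambda e: {"id": e["uri"], "name": e["label"]}),
}


def federated_search(term):
    """Case-insensitive name search across all repositories.

    Each hit is tagged with the repository it came from, so callers can
    tell which source holds the record.
    """
    hits = []
    for source, (entries, adapt) in ADAPTERS.items():
        for entry in entries:
            record = adapt(entry)
            if term.lower() in record["name"].lower():
                hits.append({**record, "source": source})
    return hits


print(federated_search("tumor"))
```

Tagging each hit with its source is a small step toward tracking where the authoritative copy of a record lives; deciding which source is authoritative is left out of this sketch.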

Link to use case satisfied from caGRID 2.0 Roadmap: The oncologist searches both TCGA glioblastoma data and de-identified data that has been added by care providers around the country. The additional data sets are external data repositories.

Functional Profile

5.2.2.2.1 - Business Processing Policies and Related Rules Sept. 6, 2010 Policy and Rules Management allows non-developer secondary users to create policies and rules and apply them to services. The scope of policies includes, but is not limited to, definition and configuration of business processing policy and related rules. Key functional requirements for managing policies include capabilities to author, store, approve, and validate policies, and to execute them at runtime.
5.2.2.2.2 - Compliance Policies Sept. 6, 2010 Policy and Rules Management allows non-developer secondary users to create policies and rules and apply them to services. The scope of policies includes, but is not limited to, compliance policies. Key functional requirements for managing policies include capabilities to author, store, approve, and validate policies, and to execute them at runtime.
5.2.2.2.3 - Contract Discovery Sept. 6, 2010 The Semantic Infrastructure provides the behavioral semantics required for dynamic composability of services or generation of distributed queries. This includes runtime contract discovery.
5.2.2.2.4 - Data Discovery Sept. 6, 2010 Data discovery enables secondary users to find the types of data available in the ecosystem as well as summary-level information about available data sets.
5.2.2.2.5 - Data Transformation Policy Sept. 6, 2010 Dynamic retrieval and enforcement of the policies that are in effect for a service interaction in the areas of data transformation. This information can be used either during the design of the orchestration or during the execution of the defined flow.
5.2.2.2.6 - Disparate Data Set Linkage Sept. 6, 2010 Data management includes linking of disparate data sets and updates of data across the ecosystem. Data updates may include updates to multiple data sources.
5.2.2.2.7 - Federated Metadata Discovery and Analysis Sept. 6, 2010 There are numerous data repositories on the web today. These data repositories contain essential information that must be accessible to services in the ecosystem. As a result, caGrid 2.0 must provide capabilities to integrate these external repositories into the grid with the assumption that the remote service cannot be changed.
5.2.2.2.8 - Federated Repositories Sept. 6, 2010 The KR has to support Federated Repositories. The structure of each repository and the information models each contains may be different. At the M2 layer a range of meta-models has to be supported. For example, BRIDG models will be based on an HL7 meta-model, caDSR currently uses an ISO 11179 meta-model, and CDISC uses the XML Schema information model. Requirements include:
1. Federate the metadata infrastructure such that different organizations, departments, labs, software applications, etc. can maintain their own metadata while standards flow both top-down and bottom-up. The current semantic infrastructure at NCI is not amenable to this type of federated model.
2. Support querying across the Grid.
3. Support the ability to have data spread over the grid/internet while knowing which data is the original source of truth. Concerns about data duplication and authorization make it necessary to know where the authoritative data/information resides.
4. Support federated sharing of data sources.
5.2.2.2.9 - Logging Policy Sept. 6, 2010 Dynamic retrieval and enforcement of the policies that are in effect for a service interaction in the areas of logging. This information can be used either during the design of the orchestration or during the execution of the defined flow.
5.2.2.2.10 - Policy Discovery Sept. 6, 2010 Policy discovery allows application developers to find and retrieve policies on services.
5.2.2.2.11 - Quality of Service Policies Sept. 6, 2010 Policy and Rules Management allows non-developer secondary users to create policies and rules and apply them to services. The scope of policies includes, but is not limited to, definition and configuration of quality of service policies. Key functional requirements for managing policies include capabilities to author, store, approve, and validate policies, and to execute them at runtime.
5.2.2.2.12 - Routing Policy Sept. 6, 2010 Dynamic retrieval and enforcement of the policies that are in effect for a service interaction in the areas of routing. This information can be used either during the design of the orchestration or during the execution of the defined flow.
5.2.2.2.13 - Service Composition Sept. 6, 2010 Service orchestration and choreography allow both application developers and non-developers to discover service "building blocks" that can be composed dynamically to provide business capabilities. Special cases include the orchestration of multiple services for a distributed query or for a transactional workflow. Service orchestration and choreography will leverage static and behavioral semantics from the Semantic Infrastructure 2.0.
5.2.2.2.14 - Service Discovery Sept. 6, 2010 Service discovery allows primary users as well as secondary users to locate a service specification and instances based on attributes in the service metadata (for example, via a search for specific micro-array analysis services).
5.2.2.2.15 - Spatial Queries Sept. 6, 2010 The query capability must support sophisticated queries such as spatial queries.
5.2.2.2.16 - Temporal Queries Sept. 6, 2010 The query capability must support sophisticated queries such as temporal queries.
5.2.2.2.17 - Validation Policy Sept. 6, 2010 Dynamic retrieval and enforcement of the policies that are in effect for a service interaction in the areas of validation. This information can be used either during the design of the orchestration or during the execution of the defined flow.
