NIH | National Cancer Institute | NCI Wiki  

...

A service specification is made up of service metadata, artifacts, and the metadata supporting these artifacts. Artifact management enables creating a service specification and helps to accomplish the following:

Improve visibility through publication. When the management service can be integrated into the development, testing, and production cycle, artifacts become available for review and discussion, and as a reference for supporting development. This helps ensure proper understanding of the applications and services being developed, and provides a standard and controlled method of access.

Annotate artifacts to expand understanding. To further improve the understanding of artifacts, the management service provides the ability to add annotations both to the parts of an artifact (depending on artifact type) and to the artifact as a whole. Adding semantic definitions to an artifact allows elements to be searched for and located across artifact types, and makes clear the intent of a given artifact.

Support for governance. When the management service allows for artifact versioning, along with state representation, artifact elements that require governance can be located and interacted with. This functional aspect of artifact management provides a change history as well as links to external change control systems.
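
As a rough illustration of versioning with state representation and a change history, the following Python sketch models an artifact record. The state names, the `Artifact` class, and the `change_ref` field (standing in for a link to an external change control system) are assumptions made for this example and are not defined by the semantic infrastructure.

```python
from dataclasses import dataclass, field

# Hypothetical lifecycle states and the transitions allowed between them.
ALLOWED_TRANSITIONS = {
    "draft": {"review"},
    "review": {"draft", "approved"},
    "approved": {"retired"},
    "retired": set(),
}


@dataclass
class Artifact:
    name: str
    version: int = 1
    state: str = "draft"
    history: list = field(default_factory=list)

    def transition(self, new_state, change_ref=None):
        """Move to new_state and record the change; reject illegal moves."""
        if new_state not in ALLOWED_TRANSITIONS[self.state]:
            raise ValueError(f"cannot move from {self.state} to {new_state}")
        self.history.append((self.version, self.state, new_state, change_ref))
        self.state = new_state

    def revise(self, change_ref=None):
        """Start a new draft version of an approved artifact."""
        if self.state != "approved":
            raise ValueError("only approved artifacts can be revised")
        self.history.append((self.version, self.state, "draft", change_ref))
        self.version += 1
        self.state = "draft"


model = Artifact("patient-information-model")
model.transition("review", change_ref="CR-1")    # CR-1 is a made-up ticket id
model.transition("approved", change_ref="CR-1")
model.revise(change_ref="CR-2")
```

Each history entry keeps the version, the old and new state, and the change reference, so a governance tool could in principle locate every artifact in a given state and trace how it got there.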

...

Service discovery and governance allow service developers to specify rich metadata about services. This enables better discovery and governance of services. Service discovery and governance help to accomplish the following:

Promote service reuse: The use of well-defined service metadata promotes better discovery and reuse of services at design time and at run time. Service metadata includes information about service interactions and dependencies. It also includes a classification scheme for organizing services based on business objectives, domain, and usage. It links services to all the supporting artifacts in the specification and provides a placeholder for conformance statements. This enables better reuse across the enterprise and eliminates redundancy.

...

Clinical data forms definition and modeling help to accomplish the following:

Define data entry forms using robust data representation. Ultimately the data that is captured on a form is used in many ways, but that data must provide a high level of meaningful use to ensure the consumer knows how the data was captured and what context it represents. In this way even a simple question on a form may result in a much more complex representation in the data. As an example, a Yes or No question on a form may result in a codified representation of an observation.

Reuse of contextual representation. Since a given form may collect data for a context that might be common to many forms, being able to reuse these elements in a way that ensures contextual consistency is a must. Forms created with the form definition tool must retrieve from well-defined metadata sources that provide common contexts, default values, and coded representations including value set binding.

Reuse of form elements. When defining a form element which is bound to a specific contextual representation, it should be easy to reuse that element with minimal reconfiguration.

...

The functions of clinical data forms include the ability to:

  • Define model objects for reuse
  • Define form templates
  • Bind value set to data element
  • Provide default form delivery
  • Provide form data transformation
  • Manage lifecycle, governance and version of forms and document schemas
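
To make value set binding and element reuse more concrete, here is a minimal Python sketch of a Yes or No question that is captured as a coded observation. The `Code`, `ValueSet`, and `FormElement` classes, the `example-system` identifier, and the two questions are hypothetical names chosen for illustration; they do not come from the form definition tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Code:
    system: str   # hypothetical code system identifier
    value: str
    display: str


@dataclass
class ValueSet:
    name: str
    codes: dict   # maps the answer shown on the form to a Code

    def encode(self, answer):
        """Return the Code bound to a form answer, or fail if it is not allowed."""
        if answer not in self.codes:
            raise ValueError(f"{answer!r} is not in value set {self.name!r}")
        return self.codes[answer]


@dataclass
class FormElement:
    question: str
    value_set: ValueSet

    def capture(self, answer):
        """Turn a simple form answer into a coded observation record."""
        code = self.value_set.encode(answer)
        return {
            "question": self.question,
            "system": code.system,
            "code": code.value,
            "display": code.display,
        }


# One value set, defined once and reused by two form elements.
YES_NO = ValueSet("yes-no", {
    "Yes": Code("example-system", "Y", "Yes"),
    "No": Code("example-system", "N", "No"),
})
smoker = FormElement("Has the patient ever smoked?", YES_NO)
biopsy = FormElement("Was a biopsy performed?", YES_NO)

observation = smoker.capture("Yes")
```

Because both elements share the same `ValueSet` object, a change to the binding is picked up by every form that reuses it, which is the kind of contextual consistency the requirements ask for.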

...

One of the primary reasons for having structured data is to provide the ability to automate decision support and reasoning across information models, data types, and the terminology associated with the attributes of each data type. For the ECCF registry to provide maximal value to end users, it is necessary to support common decision support functions across the enterprise and to extend that through services to the end users. In effect, the semantic infrastructure must provide the tools to support decision support solutions:

Identify sources of valued information. Using the semantic metadata as a source, reasoning systems need to be able to identify the sources of information which are key to a given decision support solution. The services, models, and annotations provide definitions which can identify candidate sources for integration.

...

Support for classification. The system provides for data classification, discovering new knowledge about key elements. This classification process is based on description logic and business rules which process the semantic structures of artifacts. Classification information should be added to the pool of knowledge about given structures and related information.

Support for expert system rule processing and choreography. Using systems such as the OWL classifiers (Pellet, FaCT++, HermiT), rule-based expert systems (Jess, Drools), and RDF (Resource Description Framework) choreography languages (SPIN), the decision support system should be able to be applied in a choreographed, layered fashion. Key to this process is a choreography engine which matches data with rules and a reasoning environment. Because of the complexity of the reasoning requirements, the OWL 2 specification is required in order to support the Semantic Infrastructure 2.0 requirements.
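
The idea of matching data with rules can be illustrated with a very small forward-chaining loop in plain Python. This is only a sketch of the general technique: it does not use Pellet, Jess, Drools, or SPIN, and the fact names and rules below are invented for the example.

```python
def run_rules(facts, rules):
    """Apply rules repeatedly until no new facts are derived.

    Each rule is a (premises, conclusion) pair: when every premise is a
    known fact, the conclusion is added to the known facts.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rules about an artifact; the second builds on the first.
RULES = [
    (frozenset({"is_service", "has_conformance_statement"}), "testable"),
    (frozenset({"testable", "bound_to_value_set"}), "decision_support_candidate"),
]

derived = run_rules(
    {"is_service", "has_conformance_statement", "bound_to_value_set"},
    RULES,
)
```

The second rule fires only after the first has added `testable`, which is a simple form of the layered application described above. A production system would delegate this work to a dedicated reasoner or rule engine.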

Integration with service registries. Since the artifact metadata provides definitions of data, the service registry provides the data access needed to process information. If a given artifact is a service, the decision support system determines the necessary definitions to integrate a service into decision support for the gathering of data.

Decision Support Functions

Decision support functions include the ability to:

  • Query artifact metadata to locate useful artifacts for decision support.
  • Query service metadata to locate services matching artifacts and metadata definitions
  • Create a decision support definition
  • Create a decision support session
  • Provide scheduling and access information to choreographer
  • Select rules and rule system environment
  • Execute reasoning systems against gathered data providing classification and additional data

...

Service specifications developed by NCI and the community have to be testable to ensure that the implementation conforms to the specification. Conformance testing leverages the artifact and service registries along with predefined reasoning systems to validate that an implementation adequately addresses the requirements stated in the service specification. An example of a service requirement is the ability to specify a response time in the specification (design time) and validate that this response time is adequate for an implementation of the service. Additional test points include but are not limited to binding to specific terminologies and domain models.
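
The response time example can be sketched as a simple check in Python. The function name `check_response_time`, the shape of the result, and the thresholds are assumptions for illustration; an actual conformance test would read the limit from the service specification and call the deployed service.

```python
import time


def check_response_time(call, max_seconds):
    """Time one invocation of call and compare it with the specified limit."""
    start = time.perf_counter()
    call()
    elapsed = time.perf_counter() - start
    return {
        "elapsed_seconds": elapsed,
        "max_seconds": max_seconds,
        "conforms": elapsed <= max_seconds,
    }


# Stand-ins for a service operation; a real test would invoke the service.
fast_result = check_response_time(lambda: None, max_seconds=1.0)
slow_result = check_response_time(lambda: time.sleep(0.05), max_seconds=0.01)
```

A single measurement is rarely enough in practice; repeating the call and comparing a percentile against the limit would give a more reliable result.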

Conformance testing allows both CBIIT and other HL7 SAIF adopters to validate specifications as follows:

Analyze a given artifact for its stated ECCF purpose. Determine if a given artifact satisfies the requirements of the ECCF artifact that it declares itself to be. This analysis should look at such things as datatypes matching the appropriate level (abstract data types in a Platform Specific Model (PSM)).

Analyze a given artifact to verify traceability. Determine if a given artifact provides correct traceability from level to level. The analysis should look at naming conventions and stereotypes to determine correctness along with promotion of data types from different levels of abstraction.

Analysis of accessibility and interoperability. Used to determine if a given service matches its proposed service specification, and if an artifact or specification is complete as it relates to data binding and value set binding.

Conformance testing functions include the ability to:

  • Analyze an artifact for ECCF conformance and traceability
  • Produce a conformance statement
  • Interact with governance systems

The requirements listed above are derived from the following use cases:

CBIIT's adoption of ECCF: ECCF requires all specification developers to make conformance statements; the conformance testing framework leverages these conformance statements to generate validation tests.

Other national initiatives: Other national organizations like NIST are adopting a similar approach to conformance testing.

...

caGrid 2.0 Platform and Terminology Integration

The semantic infrastructure has to support seamless integration with the caGrid 2.0 platform. The following are some high-level platform and terminology requirements that are either supported or addressed by the semantic infrastructure.

Service Generation

Service generation is the ability to generate services from user-defined service metadata. The semantic infrastructure provides this metadata and the platform uses it for service generation. The constraints and policies specified in the semantic infrastructure are inherited by the platform and are enforced as runtime policies.

...

Discovery includes service discovery, data discovery, and policy discovery. Service discovery allows primary users as well as secondary users to locate a service specification and instances based on attributes in the service metadata (for example, via a search for specific microarray analysis services). Data discovery enables secondary users to find the types of data available in the ecosystem as well as summary-level information about available data sets. Policy discovery allows application developers to find and retrieve policies on services.
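
Attribute-based service discovery can be pictured as a filter over registry entries. In the Python sketch below, the registry contents, the attribute names (`kind`, `domain`, `state`), and the `discover` function are all invented for illustration and do not reflect an actual caGrid or ECCF registry interface.

```python
# Hypothetical service metadata entries.
REGISTRY = [
    {"name": "ArrayNormalizer", "kind": "analysis", "domain": "microarray", "state": "production"},
    {"name": "ArrayClusterer", "kind": "analysis", "domain": "microarray", "state": "testing"},
    {"name": "ImageStore", "kind": "data", "domain": "radiology", "state": "production"},
]


def discover(registry, **criteria):
    """Return the names of services whose metadata matches every criterion."""
    return [
        entry["name"]
        for entry in registry
        if all(entry.get(key) == value for key, value in criteria.items())
    ]


microarray_services = discover(REGISTRY, kind="analysis", domain="microarray")
production_only = discover(REGISTRY, kind="analysis", domain="microarray", state="production")
```

Adding a criterion narrows the result, so the same call pattern serves both broad browsing and a specific search such as the microarray analysis example.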

...

Link to use case satisfied from caGrid 2.0 Roadmap: As institutions share de-identified glioblastoma data sets, they are available to others via data discovery. The treatment recommendation service used by the oncologist is able to discover these new data sets and their corresponding information models, and include that data for subsequent use in recommendation of treatment.

Link to use case satisfied from caGrid 2.0 Roadmap: All of the data management and access services in the use case are utilized by application developers to build the user interfaces that the clinicians use during the course of patient care.

...

Service orchestration and choreography allow both application developers and non-developers to discover service "building blocks" that can be composed dynamically to provide business capabilities. Special cases include the orchestration of multiple services for a distributed query, or for a transactional workflow. Service orchestration and choreography will leverage static and behavioral semantics from the Semantic Infrastructure 2.0.

The semantic infrastructure provides the behavioral semantics required for dynamic composability of services or generation of distributed queries. This includes runtime contract discovery and negotiation to determine composability of services based on service capabilities and constraints.

...

Link to use case satisfied from caGrid 2.0 Roadmap: Federated query over The Cancer Genome Atlas (TCGA) data and other data sets is performed using a service orchestration.

...

Policy and rules management allows non-developer secondary users to create policies and rules and apply them to services. The scope of policies includes, but is not limited to, definition and configuration of business processing policy and related rules, compliance policies, quality of service policies, and security policies. Key functional requirements for managing policies include capabilities to author, store, approve, and validate policies, and to execute them at runtime.

The semantic infrastructure will provide a mechanism to specify policies, including business processing policies and related rules, compliance policies, and quality of service policies. Tools and services for creating security-specific policies will be provided by the caGrid 2.0 platform and will be used by the semantic infrastructure. All other policies specified in the semantic infrastructure will be enforced by the platform at runtime.

Link to use case satisfied from caGrid 2.0 Roadmap: Each institution has different data sharing needs, access control needs, and business rules for processing that are defined and customized. For example, policy at the pathologist's institution may state that the patient is scheduled for a visit when the review is complete.

...

Link to use case satisfied from caGrid 2.0 Roadmap: As patient care proceeds, the system notifies the designated clinicians that data (for example, images) are ready for review. Similarly, when notifications are received, event processing logic allows the appropriate parties to assign clinicians for care. In order to facilitate better treatment (a learning healthcare system), as new de-identified glioblastoma data is made available, notifications are sent that could indicate a recommended change in the treatment plan.

...

This set of requirements includes providing an application developer with the ability to define application-specific attributes (for example, defined using ISO 21090 healthcare datatypes) and an information model that defines the relationships between these attributes and other attributes in the broader ecosystem. In particular, the last requirement suggests linked datasets, where application developers can connect data in disparate repositories as if the repositories are part of a larger federated data ecosystem. Additional requirements include the ability to publish and discover information models. Support is needed for forms data and common clinical document standards, such as HL7 CDA. To support the use of binary data throughout the system, the binary data must be typed and semantically annotated.

All information models, their representation, and their binding to datatypes and terminologies will be managed by the semantic infrastructure. The ability to publish and discover information models will be supported by the semantic infrastructure, and the platform will leverage these capabilities.

Link to use case satisfied from caGrid 2.0 Roadmap: The pathology, radiology, and other data have various data formats which must be described, and the information model for the patient record must link between these various datatypes. The complete information model includes semantic links between datasets to build a comprehensive electronic medical record. Annotations on data are defined and included in the information model.

...

Link to use case satisfied from caGrid 2.0 Roadmap: The patient has an electronic medical record that spans multiple institutions. The clinical workup data (for example, genomics and proteomics data) is linked to the clinical care record; similarly, pathology and radiology findings must be attached to the patient's electronic medical record.

...

In order to also discover dataset contents exposed on the grid, the ECCF registry must have linkages from dataset metadata to the metadata about the data they contain. This is distinct from the metadata about the dataset (the owner, creation time, table structure of fields and attributes) and instead describes the type of data contents of the dataset so that a user can retrieve portions of a dataset of some type.
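
The distinction between metadata about a dataset and metadata about its contents can be sketched in Python as two separate parts of a registry entry. The dataset names, owners, field names, and content types below are invented, and `find_fields` is a hypothetical helper rather than an ECCF registry operation.

```python
# Illustrative registry entries; every name and value here is invented.
DATASETS = {
    "glioblastoma-set-a": {
        # Metadata about the dataset itself.
        "about": {"owner": "Institution A", "fields": ["age", "tumor_grade", "mri_scan"]},
        # Metadata about the type of data each field contains.
        "contents": {"age": "demographic", "tumor_grade": "observation", "mri_scan": "image"},
    },
    "imaging-set-b": {
        "about": {"owner": "Institution B", "fields": ["ct_scan"]},
        "contents": {"ct_scan": "image"},
    },
}


def find_fields(datasets, content_type):
    """Map each dataset name to the fields whose content is of content_type."""
    matches = {}
    for name, entry in datasets.items():
        fields = sorted(f for f, t in entry["contents"].items() if t == content_type)
        if fields:
            matches[name] = fields
    return matches


image_fields = find_fields(DATASETS, "image")
```

Searching on the `contents` part lets a user ask for only the image portions of each dataset without needing to know who owns it or how its tables are laid out.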

Link to use case satisfied from caGrid 2.0 Roadmap: The oncologist must be able to quickly find glioblastoma data sets, indicating the fields that he is interested in comparing from his clinical data in order to find similar disease conditions and associated treatment plans. Temporal queries allow clinicians to identify changes in patient condition and treatment over time.

...

Link to use case satisfied from caGrid 2.0 Roadmap: The origin of data is tied to the data creator, allowing the oncologist performing the match against TCGA data and other datasets to include and exclude data sets based on their origin.

...

In a diverse information environment, semantics must be used to clearly indicate the meaning of data. This requirement is expected to be addressed by the semantic infrastructure, although there will be a touchpoint between caGrid 2.0 and Semantic Infrastructure 2.0 to annotate data with semantics. Integration with the semantic infrastructure will enable reasoning, semantic query, data mediation (for example, ad hoc data transformation), and other powerful capabilities.

Data semantics are captured in the semantic infrastructure, and the platform will leverage the semantic infrastructure interfaces for reasoning and analysis.

Link to use case satisfied from caGrid 2.0 Roadmap: The oncologist accesses the TCGA database to search for de-identified glioblastoma tumor data that is similar to the patient data exported from the hospital medical record. During this search, the semantics of the data fields are leveraged to indicate matches between TCGA data fields and the hospital medical record data fields.

...

Link to use case satisfied from caGrid 2.0 Roadmap: The oncologist searches both TCGA glioblastoma data and de-identified data that has been added by care providers around the country. The additional data sets are external data repositories.

...