NIH | National Cancer Institute | NCI Wiki  

Summary
Description of the profile

Create representation and views of the information, realized through the appropriate transforms.

This category includes the Compare and Merge service profiles. It has been used to aggregate requirements for the alignment and interoperability of models and data elements. An alignment can be an equivalence relationship, some other relationship, or a statement that no relationship exists. Requirements for model and data element transformation were implied in requirements for interoperability across both metadata models and vocabularies. Alignment should include semantic convergence, datatype and unit-of-measure convergence, and context and scope convergence. Transformations should provide the source data element to the user in the format used by the target data model.
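
As an illustration of the three alignment outcomes named above, the following minimal sketch records an alignment between two data elements. The class names, field names, and element identifiers are assumptions made for this example and are not part of the profile.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Relation(Enum):
    EQUIVALENT = "equivalent"
    OTHER = "other"          # some other relationship; its kind goes in `detail`
    NO_RELATION = "none"     # explicit statement that no relationship exists


@dataclass(frozen=True)
class Alignment:
    """Hypothetical record of how one data element relates to another."""
    source_element: str
    target_element: str
    relation: Relation
    detail: str = ""


def find_alignment(alignments: List[Alignment], source: str, target: str) -> Optional[Alignment]:
    """Return the recorded alignment, or None if the pair was never assessed."""
    for alignment in alignments:
        if alignment.source_element == source and alignment.target_element == target:
            return alignment
    return None  # "not assessed" is different from NO_RELATION


# Made-up element names, for illustration only.
alignments = [
    Alignment("trial.subjectWeight", "registry.bodyWeight", Relation.EQUIVALENT),
    Alignment("trial.subjectWeight", "registry.bodyMassIndex", Relation.OTHER, "input to"),
    Alignment("trial.subjectWeight", "registry.visitDate", Relation.NO_RELATION),
]
found = find_alignment(alignments, "trial.subjectWeight", "registry.visitDate")
missing = find_alignment(alignments, "trial.subjectWeight", "registry.height")
```

Recording "no relationship" as its own value keeps it distinguishable from a pair of elements that nobody has compared yet.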

NB: THIS IS NOT AN EXHAUSTIVE ENUMERATION OF TRANSFORMATIONS

Within the Semantic Infrastructure, the notion of transformation is defined in terms of mediation: the resolution of incongruities occurring across heterogeneous data sources, where the data source may be any kind of artifact or model instance. The architectural implications of mediation are reflected in the set of capabilities provided by the Transform functional profile.

Transform specializes capabilities architecturally implied by its associated concepts, DataMediation and Mediation. The implied architectural capabilities are described in the following paragraphs.

DataMediation

The most common type of mismatch in the Semantic Web arises when entities that need to interchange information use different terminologies. Within ontology-based environments like the Semantic Web, this results from the use of heterogeneous ontologies as the terminological basis for resource or information descriptions. A main merit of ontologies is that such mismatches can be handled on a semantic level by so-called ontology integration techniques. Regarding representation formats and transfer protocols, a suitable way of resolving such heterogeneities is to lift the data from the syntactic to a semantic level on the basis of ontologies, and then resolve the mismatches on this level.

The Data Mediator is invoked in two situations: during the discovery phase and during the communication phase. In either phase, data mediation is needed when the ontology of the goal differs from that of the candidate or selected web service. To handle data-level heterogeneity, it uses ontology mapping techniques to resolve the mismatches that can appear between two given ontologies. The mappings between ontologies are created semi-automatically at design time and kept in persistent storage. At run time, these mappings are retrieved and applied to the incoming data (i.e. ontology instances) to transform it from the terms of one ontology into the terms of another (this process is known as instance transformation). The same mappings can also be used to determine which concepts from the mapped ontologies are semantically related, and how. The former functionality is required to enable process-level mediation (it solves the data heterogeneity for the communication stage), while the latter is required to enable functional-level mediation (it solves the data heterogeneity that appears in the functional descriptions).
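
The design-time and run-time split described above can be sketched roughly as follows. This is a minimal illustration, not the Data Mediator's actual interface: the JSON file standing in for persistent storage, the dictionary layout of a mapping, and the ontology, concept, and attribute names are all assumptions made for the example.

```python
import json
import tempfile
from pathlib import Path


def store_mappings(path, mappings):
    """Design time: persist the (expert-validated) mappings."""
    Path(path).write_text(json.dumps(mappings, indent=2), encoding="utf-8")


def load_mappings(path):
    """Run time: retrieve the stored mappings."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def transform_instance(instance, mappings):
    """Instance transformation: re-express a source instance in target terms."""
    concept_map = mappings["concepts"]
    attribute_map = mappings["attributes"].get(instance["concept"], {})
    return {
        "concept": concept_map[instance["concept"]],
        "attributes": {
            attribute_map[name]: value
            for name, value in instance["attributes"].items()
            if name in attribute_map  # unmapped attributes are dropped in this sketch
        },
    }


def related_concept(concept, mappings):
    """Second use of the same mappings: the related target concept, if any."""
    return mappings["concepts"].get(concept)


# Hypothetical mappings between two made-up ontologies.
mappings = {
    "source": "trialOntology",
    "target": "registryOntology",
    "concepts": {"Subject": "Patient"},
    "attributes": {"Subject": {"surname": "familyName", "dob": "birthDate"}},
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "mappings.json"
    store_mappings(path, mappings)   # design time
    loaded = load_mappings(path)     # run time

source_instance = {"concept": "Subject", "attributes": {"surname": "Doe", "dob": "1970-01-01"}}
target_instance = transform_instance(source_instance, loaded)
print(target_instance)
```

Here `transform_instance` corresponds to the use needed for process-level mediation, while `related_concept` corresponds to the lookup needed for functional-level mediation.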

Mediation

Strategies and methodologies for mediation.

Mediation includes the following capabilities:

  • the creation of mappings, based on model artifacts.
  • the creation of appropriate mapping rules, based on model artifacts in conjunction with references to instances. Since the execution environment includes the Semantic Infrastructure, all mediated models conform to the SI meta-meta-model.
  • the execution of the mapping rules, which takes source instances as input and produces the mediated target instances as output.
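
The three capabilities above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the `Mapping` class, the `CONVERSIONS` table, and the attribute names are hypothetical and do not come from the profile or from any existing mediation library.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Mapping:
    """Step 1: a correspondence between model elements (no instance data involved)."""
    source_attr: str
    target_attr: str
    conversion: str = "identity"  # name of a value conversion (hypothetical)


# Hypothetical value conversions that rules can refer to.
CONVERSIONS: Dict[str, Callable] = {
    "identity": lambda value: value,
    "family_name_only": lambda full_name: full_name.split()[-1],
}

Rule = Callable[[dict, dict], None]


def build_rules(mappings: List[Mapping]) -> List[Rule]:
    """Step 2: turn each mapping into an executable rule."""
    def make_rule(mapping: Mapping) -> Rule:
        convert = CONVERSIONS[mapping.conversion]

        def rule(source: dict, target: dict) -> None:
            if mapping.source_attr in source:
                target[mapping.target_attr] = convert(source[mapping.source_attr])

        return rule

    return [make_rule(mapping) for mapping in mappings]


def execute(rules: List[Rule], source_instances: List[dict]) -> List[dict]:
    """Step 3: run the rules over source instances to produce target instances."""
    results = []
    for source in source_instances:
        target: dict = {}
        for rule in rules:
            rule(source, target)
        results.append(target)
    return results


mappings = [
    Mapping("name", "familyName", conversion="family_name_only"),
    Mapping("id", "identifier"),
]
rules = build_rules(mappings)
mediated = execute(rules, [{"name": "Jane Doe", "id": "S-001"}])
print(mediated)
```

Keeping mappings declarative and deriving the rules from them means a domain expert can review the correspondences without reading the executable form.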
Capabilities
Requirements traceability

Requirement

Source

Capability

Provide an EASY traversal from UML<>Ontology<>Metadata<>XSD<>API, depending on one's point of view and expertise.

Gap Analysis::Interface::028 - Easy Model View Traversal

multiMetamodelProjection

Support importing data coded to one data element format and have the data transformed to another data element format.

Gap Analysis::Transform::090 - Data element format transformation

mapDataElementFormat

Provide mapping / transformation support for ISO21090 data types

Gap Analysis::Transform::091 - ISO21090 data types

mapISO21090DataTypes

Extend terminology development to new populations and missions, and do so in a collaborative fashion. For example, NICHD has launched an effort to begin standardizing terminology for the examination of the newborn using NCI’s tools and resources.

Gap Analysis::Transform::143 - Collaborative Terminology Development

collaborativeTerminologyDevelopment

Support pre-coordinated and post-coordinated terminology.

Gap Analysis::Transform::162 - Support pre-coordinated and post-coordinated terminology

coordinatedTerminology

Artifact lifecycle management and metadata requirements include the ability to:
  • Manage lifecycle, governance and versioning of the models, content and forms
  • Establish relationships and dependencies between models, content and forms
  • Determine provenance, jurisdiction, authority and intellectual property
  • Create representation and views of the information, realized through the appropriate transforms
  • Provide access control and other security constraints
  • Create annotations for better discovery and searching of artifacts
  • Develop usage scenarios and context for the information
  • Provide terminology and value set binding
The artifacts are bound to the services via the service metadata. The service metadata combined with the artifacts and supporting metadata provide a comprehensive service specification. The artifact management requirements listed above are derived from the following use cases:
  • caEHR: The caEHR project has adopted ECCF for specifications and CDA documents for interoperability. The caEHR project requirements include the need for an infrastructure for managing all the artifacts generated during the specification process, including HL7 models and documents. The caEHR project also intends to publish these artifacts for the community and vendors. The infrastructure needs to support better discovery, making all the relevant information available in the right context.
  • ONC and other external EHR adopters: ONC has adopted CCD and CCR for meaningful use. All national EHR implementations are expected to support forms, and the semantics of these forms play a critical role in interoperability. The semantic infrastructure must provide a mechanism to create, store and manage these forms.
  • Clinical Trials: Clinical trials use forms to capture clinical information, and the semantics captured by these forms are critical for interoperability and reporting. The semantic infrastructure must provide a mechanism to manage the lifecycle of these forms.

Semantic Infrastructure Requirements::Artifact Management::Artifact Lifecycle Management

mapDataElementFormat
mapISO21090DataTypes
collaborativeTerminologyDevelopment
coordinatedTerminology
controlTerminologies
multiMetamodelProjection

Groups with existing data sets often want to know how they can map their existing data models (e.g. clinical trial schema) to data element definitions that have been published in caDSR.  They do not wish to transform their existing (potentially large) datasets into caDSR data element definitions. Their existing data models usually represent a significant investment, their data element definitions are sponsor driven, and researchers cannot easily change data elements that evolved from the (very expensive) research process.  Thus, end users (e.g., Cancer Researchers) want a way to describe the mappings between their trial specific information model(s) and caDSR data elements - both semantically and syntactically.  This relates to caDSR-6 in that harmonization involves mapping or aligning of information models.

Gap Analysis::caDSR::caDSR-1 - Map caDSR data elements to existing metadata models

multiMetamodelProjection

Provide a service interface for all query and retrieval functions for clinical systems such as:
  • Laboratory Information Systems (LIS)
  • Radiology/Imaging Systems (PACS)
  • Admit Discharge Transfer (ADT)
  • Radiation Oncology dosing systems
  • Pharmacy
  • Order Entry
  • Clinical Care
  • Patient History
  • Clinical Notes
These services provide the core information for basic functionality of caEHR. caEHR does not provide these functional behaviors directly, but expects these systems to maintain standard interfaces based on HL7 V3 functional descriptions. Metadata concerning the interface and the interface objects should be accessible via the KM system. In addition, provide the business service interfaces for the following:
  • Referrals
  • Document Exchange
  • Outcomes
  • Order Tracking

Gap Analysis::caEHR::caEHR 1 - Provide service interfaces for clinical systems

controlTerminologies

The Web Service Execution Environment (WSMX) is an environment designed to allow dynamic mediation, selection and invocation of web services. For the purposes of the Semantic Infrastructure roadmap, the WSMX specification has been abstracted to be applicable to any SOA environment, and to mediation of any artifact.
A range of different models or ontologies describing the same or related problem domains could be created by different entities throughout the world. This implies that more and more systems and applications require mediation in order to integrate and use heterogeneous data sources. Mapping between models is required in several classes of application, such as Information Integration and Semantic Web, Data Migration or Ontology Merging.
Unfortunately, there is always a trade-off between the accuracy of these mappings and the degree of automation that can be offered. Some approaches use machine learning techniques to produce these mappings (also known as alignments) between different schemas or ontologies automatically, but only with limited accuracy; to rule out false results, a domain expert has to check and validate the mappings or the alignment at the end of the process. Another type of approach involves human intervention from the beginning, proposing an interactive mapping process in which tool suggestions and human validations alternate until the final result is achieved.
The mediation solution presented in this roadmap follows the second approach: we propose well-defined strategies and methodologies for the mapping process in order to guarantee the most correct and complete mappings possible, together with a set of algorithms and strategies meant to make the mapping task much easier (reducing it to simple validations and choices).
We adopted this approach because we believe that, in the context of SOA services and business transactions, the transformations on data must be 100% accurate. In addition, we consider an interactive approach to mapping creation much more appropriate for medium and large ontologies, and also when the intention is to shield the domain expert (by means of a graphical interface) from the underlying logical formalism used to represent the mappings.
There are four types of heterogeneity that can occur within the SOA. Each type requires a specific technique for mismatch resolution; these techniques are referred to as levels of mediation:
  • Terminology: services or other resources use different terminologies; e.g. one entity understands name to be the full name of a person, and another defines name to denote only the family name. This can hamper successful interoperation on the semantic level, i.e. concerning the meaning of information.
  • Representation Format and Transfer Protocol: resources that interact use different formats or languages for information representation (e.g. HTML, XML, RDF, OWL), or different protocols for information transfer (e.g. HTTP, RPC); incompatibilities on this level can hamper successful information interchange.
  • Functionality: specific to services, this refers to functionalities of a provider and a requester that do not match exactly. Detecting services usable for a given request then requires complex and thus expensive reasoning procedures; the need for such expensive operations can be reduced by gaining and using knowledge of the functional heterogeneities.
  • Business Process: also specific to services, this denotes mismatches in the supported interaction behavior of services and clients. This can hamper successful interaction on a behavioral level for consumption or interaction of services.
The process of mediation generally consists of three main steps:
  • the creation of mappings, based on model artifacts.
  • the creation of appropriate mapping rules, based on model artifacts in conjunction with references to instances. Since the execution environment includes the Semantic Infrastructure, all mediated models conform to the SI meta-meta-model.
  • the execution of the mapping rules, which takes source instances as input and produces the mediated target instances as output.
Service message exchanges are represented in terms of the sender's model, and each of the business partners (e.g. enterprises) understands only messages expressed in terms of its own model. One of the roles of the execution environment (by means of mediation) is to transform, if necessary, the received message from the terms of the sender's model into the terms of the receiver's model before sending it further. From the perspective of the models, each message contains instances of the source model that have to be transformed into instances of the target model.
WSMX distinguishes four different types of mediators:
  • mediators that link two goals. This link represents the refinement of the source goal into the target goal.
  • data mediators that import models and resolve possible representation mismatches between models.
  • mediators that link web services to goals, meaning that the web service (totally or partially) fulfils the goal to which it is linked. The mediators may explicitly state the difference between the two entities and map different vocabularies (through the use of data mediators).
  • mediators linking two web services.

Semantic Profile::OASIS Semantic SOA::Mediation

mappingDefinition from inherited abstract profile Mediation
mappingRules from inherited abstract profile Mediation
mappingExecution from inherited abstract profile Mediation

Static models include a variety of models with different representations.

Semantic Infrastructure Requirements::Artifact Management::Static Models

qvtModel

collaborativeTerminologyDevelopment
Description

Extend terminology development to new populations and missions, and do so in a collaborative fashion. For example, NICHD has launched an effort to begin standardizing terminology for the examination of the newborn using NCI’s tools and resources.

Requirements addressed
Overview of possible operations
controlTerminologies
Description

Provide controlled terminologies

Requirements addressed
Overview of possible operations
coordinatedTerminology
Description

Support pre-coordinated and post-coordinated terminology.
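
To make the distinction concrete, the sketch below treats a pre-coordinated term as a single code that carries a composite meaning, and a post-coordinated term as an expression composed from a focus concept plus refinements. The codes, concept names, and the `normalize` helper are hypothetical; they are not drawn from any real terminology or from this profile.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Expression:
    """A post-coordinated expression: a focus concept plus attribute refinements."""
    focus: str
    refinements: Tuple[Tuple[str, str], ...] = ()


# Hypothetical pre-coordinated codes and their definitions; not a real terminology.
PRECOORDINATED: Dict[str, Expression] = {
    "X100": Expression("fracture-of-femur", (("laterality", "left"),)),
}


def normalize(term) -> Expression:
    """Reduce either form to a post-coordinated expression so the two can be compared."""
    if isinstance(term, Expression):
        return Expression(term.focus, tuple(sorted(term.refinements)))
    definition = PRECOORDINATED[term]
    return Expression(definition.focus, tuple(sorted(definition.refinements)))


pre = "X100"  # one pre-coordinated code carrying the whole meaning
post = Expression("fracture-of-femur", (("laterality", "left"),))  # composed at data entry
same_meaning = normalize(pre) == normalize(post)
```

Supporting both forms amounts to being able to decide, as `normalize` does here in a very simplified way, when a single code and a composed expression denote the same thing.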

Requirements addressed
Overview of possible operations
mapDataElementFormat
Description

Support importing data coded to one data element format and have the data transformed to another data element format.
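
A minimal sketch of such a transformation is shown below, assuming a data element format can be described by a datatype and a unit of measure. The `DataElementFormat` class, the element names, and the choice of weight as the example are assumptions for illustration only; the pound-to-kilogram factor is the standard exact definition.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementFormat:
    """Hypothetical description of how a data element's value is coded."""
    name: str
    datatype: type
    unit: str


# Conversion factors to a common base unit (kilograms); only the units needed here.
TO_KG = {"kg": 1.0, "lb": 0.45359237}


def transform_value(raw, source: DataElementFormat, target: DataElementFormat):
    """Convert a value coded in the source format into the target format."""
    value_in_kg = float(raw) * TO_KG[source.unit]
    converted = value_in_kg / TO_KG[target.unit]
    return target.datatype(round(converted, 2))


# Source data is a string in pounds; the target expects a float in kilograms.
source_format = DataElementFormat("subjectWeight", str, "lb")
target_format = DataElementFormat("bodyWeight", float, "kg")
result = transform_value("150", source_format, target_format)
print(result)
```

The example covers both datatype convergence (string to float) and unit-of-measure convergence (pounds to kilograms) in one step.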

Requirements addressed
Overview of possible operations
mapISO21090DataTypes
Description

Provide mapping / transformation support for ISO21090 data types

Requirements addressed
Overview of possible operations
mappingDefinition
Description

The creation, destruction, editing, and management of mappings, based on model artifacts.

Requirements addressed
Overview of possible operations
mappingExecution
Description

The execution of the mapping rules, which acts on incoming source instances and provides mediated target instances.

Requirements addressed
Overview of possible operations
mappingRules
Description

The creation, destruction, editing, and management of appropriate mapping rules, based on model artifacts in conjunction with references to instances. Since the execution environment includes the Semantic Infrastructure, all mediated models conform to the SI meta-meta-model.

Requirements addressed
Overview of possible operations
multiMetamodelProjection
Description

Provide an EASY traversal from UML<>Ontology<>Metadata<>XSD<>API, depending on one's point of view and expertise.

Map KR data elements to other existing metadata models.

Requirements addressed
Overview of possible operations
qvtModel
Description

QVT Model maintenance

Requirements addressed
Overview of possible operations