NIH | National Cancer Institute | NCI Wiki  


This section provides an overview of the proposed architecture, which includes a set of core services and tools. Section 5.2 - Overview of Core Semantic Infrastructure Capabilities and Services Profile summarizes the profile of the solution with mapping to appropriate requirements and use cases. Section 5.3 - Tools for Semantic Infrastructure 2.0 provides an end user's view of the tools. Section 5.4 - User Workflows in Semantic Infrastructure 2.0 describes workflows, and Section 5.5 - Tie-in with Terminology and Platform describes integration with the platform and terminology.

The image below gives an overall view of the components required for Semantic Infrastructure 2.0.

Component Name

Description

Registry Components

Box Name : Artifact Reference and Store

This registry component is a store or registry that contains references to the various artifacts.  Each artifact should have a URL that can be used to physically access the file.  Each artifact reference is accompanied by a checksum or some other means of verifying the accessed object.

Box Name : Artifact Access

This registry component provides programmatic access to artifacts in the Artifact Reference and Store.

Box Name : Artifact Registry and Retrieve

This registry service provides a programmatic interface for interacting with the artifact reference registry.

Box Name : Governance Integration

This registry service provides state mechanisms about known artifacts that can be accessed and reviewed through governance activities.

Box Name : Validation and Conformance Suite

This registry service integrates with the reasoning system to validate the conformance of specific artifacts (ECCF models).

Semantic Components

Box Name : Semantic Knowledge Store

This semantic component provides a physical representation of semantics that have been derived either through artifact analysis or through manual annotation.  This store could be represented by an RDF triple store.

Box Name : Transformation

This semantic component provides a function that takes some artifact as input and outputs it in alternative representations.  This might include a class model in UML being transformed to an OWL ontology.

Box Name : Automated Semantics Discovery Suite

This semantic service takes artifacts or artifact transformations as input and extracts as many semantic representations as possible.  The detail and amount of the semantics will depend on artifact type, representation, and completeness.  The results are then stored in the Semantic Knowledge Store.

Box Name : Annotation

This semantic service provides functionality that allows additional semantics to be added about an artifact reference in the Semantic Knowledge Store, and is used to augment the semantic representations that were automatically discovered.

Box Name : Orchestration

The orchestration service manages the internal flow of operations which can be performed.  This includes automating the transformation and semantic discovery and the utilization of various rule systems or classification systems.

Data Transformation Service

This semantic service provides a set of functions designed to transform data.  This may include transforming data graphs into CSV, result sets into XML, or other reasonable transformations.  These functions may use semantics stored about artifacts to aid in the transformation.

SI Framework Components

Box Name : Governance Integration

To support governance, this service provides state mechanisms about known artifacts that can be accessed and reviewed through governance activities.

Box Name : Access Service Directory

This framework component provides the set of services that are available within an SI implementation and are designed to manage artifacts and their semantic representations.  This will allow for the coordination of stores and services across the grid.

Box Name : Reasoning Framework Service Directory

This framework component provides the set of services that are available within an SI implementation that provide reasoning functionality to analyze artifacts and instance representations of associated data.

Box Name : Rule System Interface

This framework component provides integration of one or more rule systems that support the SI in expressing business rules and behaviors.

Box Name : Classification System Interface

This framework component provides integration to one or more classification tools.  These tools are systems that process semantic information and dependent information to determine relationships and associations of classes and individuals, which may be expressed in an artifact, its annotated information, or instance representations of associated data.

Box Name : Expert System Interface

This framework component provides integration to one or more expert systems.  These systems utilize a set of known facts and domain expert definitions to determine additional semantics and functional definitions within the artifact semantic information and instance representations of associated data.

Box Name : SI Services Framework

This framework component provides interface support to semantic and reasoning services.

Box Name : caGrid 2.0

This is the connectivity and secure transmission hub for communications with institutions utilized by NCI and its associated cancer centers, research centers, and affiliated organizations.

Box Name : Grid Application Toolbox

This is a collection of tools and libraries which are designed to make integration to the caGrid easier and more efficient.

Box Name : caGrid Enabled Applications

Any application that utilizes the grid for communications.  These applications may utilize the Grid Application Toolbox or provide their own interface to the caGrid.  Examples of these applications include the caBIG Clinical Information Suite and caTissue.  This also includes infrastructure tools such as form definition tools, query tools, and code generation systems.

Orchestration

This framework service manages the internal flow of operations that can be performed.  This includes automating the transformation and semantic discovery and the utilization of various rule systems or classification systems.

Integrations and applications

Arrow : Grid Integration

The grid integration represents the interaction of SI services with the caGrid.

Box Name : Grid Application Toolkit

This SI tool provides libraries and functions that ease the creation of new caGrid enabled applications.  This toolkit will also provide a method for integrating caGrid 1.0 applications, to ease those applications into the caGrid 2.0 environment.

Box Name : caGrid Enabled Applications

caGrid enabled applications include any application written to the caGrid specification.

Box Name : Semantic Annotation Application

This application is a caGrid enabled application that provides users with the ability to annotate artifacts in an SI framework implementation.  It is likely a web-based application that may be part of the caGrid Portal.

Box Name : caGrid Portal

The caGrid Portal is an application for accessing aspects of the caGrid at a partner site, and it supports the integration of grid components.  From the portal, services and data are identified so that this information can be exposed to the other users of the grid.

Box Name : Clinical Data

This represents clinical information that may be exposed to the grid.  Using the portal, an authorized user may expose data or services onto the grid.  This might include outcome markers, treatment plans, or other relevant information.

Box Name : Clinical Research Data

This represents clinical research data that might be exposed to the grid.  Using the portal, an authorized user may expose data or services onto the grid.  This might include trial cohort qualifications, raw data, or publishable results.

Box Name : Life Sciences Data

This represents life sciences data that might be exposed to the grid.  Using the portal, an authorized user may expose data or services onto the grid.  This might include gene array studies, algorithms, methodologies, and data sets.

Box Name : SI Portal

This application provides a user interface for implementations of the SI framework components.  Users would use this tool to access the functionality of the SI components exposed on the grid.  It is probably a part of the caGrid Portal.

Box Name : Service Discovery

This tool and portal component provides a user with the ability to enter keywords, tags, or semantic queries to help determine the locations of artifacts and communication endpoints.

Box Name : Semantic Annotation Application

This tool and portal component provides a user with the ability to annotate artifacts and communication endpoints to help users perform queries.

Box Name : Data Endpoint Service Generator

This tool allows a user to quickly create a data endpoint and make it available on the caGrid, merging the data source with a SPARQL endpoint and structuring it for access.

Box Name : Artifact Publication

This tool allows a user to take an artifact, provide a reference to it to the registry components of the SI framework, and provide basic annotations.

Arrow : Other Platforms Integration

This integration represents the interaction of SI services with applications and platforms that might need to utilize functions of the SI.

Box Name : SI Application Toolkit

This SI tool provides libraries and functions that ease the creation of new SI Framework enabled applications.

Box Name : Forms and Object Modeler

This SI tool is used to create forms models, message models and other core object models from defined structures.  This tool works with information in the SI to access meta-models and model definitions to construct representations of objects which can be used for data collection and information exchange.

Box Name : Artifact Publication

This SI tool is the non-portal version of the artifact publication tool found in the SI portal.  This component is different because it will provide greater access to various components, enhanced governance support, and manipulation of Knowledge Store objects requiring enhanced behaviors.

Box Name : SI Enabled Applications

This represents any number of applications that might need access to SI functionality and would utilize the SI application toolkit.  This may include NCI applications such as caCIS and caTissue.
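As an illustration of the Artifact Reference and Store entry above, the following is a minimal sketch of an artifact reference that pairs a URL with a checksum so that a retrieved artifact can be verified. The class name, function names, example URL, and the choice of SHA-256 are assumptions made for this example; the architecture only calls for "a checksum or some other method".

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactReference:
    """Hypothetical registry entry: where the artifact lives and how to verify it."""
    url: str
    sha256: str  # hex digest recorded at registration time (assumed algorithm)


def make_reference(url: str, content: bytes) -> ArtifactReference:
    """Record the checksum of the artifact content when it is registered."""
    return ArtifactReference(url=url, sha256=hashlib.sha256(content).hexdigest())


def verify(reference: ArtifactReference, content: bytes) -> bool:
    """Return True if the retrieved bytes match the registered checksum."""
    return hashlib.sha256(content).hexdigest() == reference.sha256


# Illustrative artifact content and URL (both made up for the example).
model = b"<uml:Model name='Specimen'/>"
ref = make_reference("https://example.org/artifacts/specimen.xmi", model)

print(verify(ref, model))                # content unchanged since registration
print(verify(ref, model + b"tampered"))  # content modified after registration
```

A consumer would fetch the bytes from the reference URL and call `verify` before trusting the artifact.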



An example problem:

 
A researcher wishes to collaborate with another researcher to more precisely define a treatment plan for some individuals.  He believes the best way to do that is to expose some data that he is collecting to the other researcher.  This information is changing and expanding, and so merely sending a dataset is insufficient. 
 
How the architecture supports the solution:
 
If the user has followed an expressive methodology (such as ECCF) to design his dataset :
 
Using the caGrid portal, the user logs in and indicates that he wishes to share a dataset with another researcher.  This dataset is accessed via a database, so he must connect the database to the caGrid.  The user will have at his disposal various artifacts that describe his data and its representation.  If he has not done so before, he will register the appropriate artifacts (models, specifications, etc.) using the caGrid Portal.
 
For each artifact, the Semantic Infrastructure services are used to perform an orchestrated flow to learn about the artifact.  First, the system registers the artifact in the SI.  The SI then accesses the artifact and performs any transformations necessary to process its contents most effectively.  The set of rules, classification definitions, and expert systems is used to extract semantics such as the set of problem domains the information represents, the mood of the information, and specific elements such as standards of coding schemes and value sets that are used by the system.  All this information is stored in the knowledge store using common semantics used to define datasets.
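The orchestrated flow described above (register, transform, discover, store) can be sketched roughly as follows. This is an illustration only: the function names, the predicate names, the example artifact, and the use of plain Python dictionaries and sets in place of real registry and knowledge store services are all assumptions.

```python
from typing import Callable, Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Rule = Callable[[str, dict], List[Triple]]


def transform(artifact: dict) -> dict:
    """Normalize key names so that the discovery rules see a uniform shape."""
    return {key.strip().lower().replace(" ", "_"): value
            for key, value in artifact.items()}


def domain_rule(artifact_id: str, artifact: dict) -> List[Triple]:
    """Hypothetical rule: record the problem domain if one is declared."""
    domain = artifact.get("domain")
    return [(artifact_id, "hasProblemDomain", domain)] if domain else []


def coding_system_rule(artifact_id: str, artifact: dict) -> List[Triple]:
    """Hypothetical rule: record every coding system the artifact names."""
    return [(artifact_id, "usesCodingSystem", system)
            for system in artifact.get("coding_systems", [])]


def orchestrate(artifact_id: str, artifact: dict, registry: Dict[str, dict],
                store: Set[Triple], rules: List[Rule]) -> None:
    """Register the artifact, transform it, run the rules, store the results."""
    registry[artifact_id] = artifact      # 1. register
    normalized = transform(artifact)      # 2. transform
    for rule in rules:                    # 3. discover semantics
        store.update(rule(artifact_id, normalized))  # 4. store


registry: Dict[str, dict] = {}
store: Set[Triple] = set()
orchestrate(
    "specimen-model",
    {"Domain": "oncology", "Coding Systems": ["SNOMED CT", "LOINC"]},
    registry,
    store,
    [domain_rule, coding_system_rule],
)
for triple in sorted(store):
    print(triple)
```

In a real implementation each step would be a separate service call, and the rules would come from the rule, classification, and expert system interfaces rather than from hard-coded functions.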
 
Or, if the user doesn’t have those artifacts available :
 
If the user does not have artifacts, he isn’t out of luck.  When given access to the data resource, the system will attempt to determine aspects of the data by looking at the representation in the data itself.  Since many systems are not self-describing, the user may need to provide more information during the annotation process.
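A rough sketch of what "looking at the representation in the data itself" could mean is shown below: column names and value types are inferred from sample rows. The function name and the sample rows are invented for the example, and a real implementation would more likely read database metadata than inspect values.

```python
from typing import Any, Dict, List, Set


def infer_columns(rows: List[Dict[str, Any]]) -> Dict[str, str]:
    """Map each column name to the value types observed, ignoring missing values."""
    observed: Dict[str, Set[str]] = {}
    for row in rows:
        for column, value in row.items():
            types = observed.setdefault(column, set())
            if value is not None:
                types.add(type(value).__name__)
    # A column with no observed values is reported as "unknown".
    return {column: "|".join(sorted(types)) or "unknown"
            for column, types in observed.items()}


# Made-up sample rows standing in for the researcher's data source.
sample = [
    {"patient_id": 1, "dx_code": "C50.9", "dose_mg": 12.5, "notes": None},
    {"patient_id": 2, "dx_code": None, "dose_mg": 10, "notes": None},
]
print(infer_columns(sample))
```

The sketch recovers structure but not meaning: nothing in the rows says which code system `dx_code` uses, which is the kind of gap the annotation step is meant to fill.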
 
Providing a dataset link :
 
The user is now ready to provide a link to the grid using the caGrid portal.  The user provides the parameters that are required to connect to and access the dataset.  These parameters differ depending on the type and presentation of the dataset, but generally the user will provide a URL or database connectivity information.  If there are no artifacts, the system will generate artifact representations of the dataset so that the user can annotate the data.
 
 
Annotating the artifacts :
 
In some cases, the user may have to annotate aspects of the artifacts that represent his data.  This is done in situations where there are no supporting design models.  The user will provide general metadata that describes the problem domain and the specific data representations used.  This may include code system and value set use, and other elements that effectively document the dataset.  Generally, the more the user annotates, the better the consumers of his data will be able to access and query it.
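To make the link between annotation and querying concrete, here is a small sketch in which user annotations are added as subject/predicate/value statements next to an automatically discovered one, and a consumer then filters by pattern. The identifiers, predicate names, and in-memory set are assumptions for illustration; an actual knowledge store might be an RDF triple store queried with SPARQL.

```python
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]


def annotate(store: Set[Triple], subject: str, predicate: str, value: str) -> None:
    """Add one user-supplied statement about an artifact or one of its elements."""
    store.add((subject, predicate, value))


def match(store: Set[Triple], subject: Optional[str] = None,
          predicate: Optional[str] = None,
          value: Optional[str] = None) -> List[Triple]:
    """Return the statements matching the given pattern; None matches anything."""
    return sorted(
        t for t in store
        if (subject is None or t[0] == subject)
        and (predicate is None or t[1] == predicate)
        and (value is None or t[2] == value)
    )


# One statement assumed to have been discovered automatically.
store: Set[Triple] = {("dataset:trial-42", "hasColumn", "dx_code")}

# Annotations supplied by the user (hypothetical identifiers and predicates).
annotate(store, "dataset:trial-42", "hasProblemDomain", "breast cancer")
annotate(store, "dataset:trial-42/dx_code", "usesCodeSystem", "ICD-10")

# A consumer looks for anything annotated with a code system.
print(match(store, predicate="usesCodeSystem"))
```

Each additional annotation gives consumers another pattern they can match on, which is why more annotation generally leads to better access and querying.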
 
Done.
 
The user has placed his data set on the grid in a way that can be accessed by authorized users.
 
