NIH | National Cancer Institute | NCI Wiki  

Error rendering macro 'rw-search'

null

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Scrollbar
iconsfalse

Page info
title
title

Panel
titleContents of this Page
Table of Contents
minLevel2

...

Overview

LexBIG software architecture and implementation is designed to facilitate flexibility and future expansion. The following diagrams are intended to aid the understanding of LexBIG service integration in context of the larger caBIG® universe and specific deployment scenarios:

...

The LexBIG API is designed with web and grid-level enablement in mind. This diagram depicts deployments that wrap the current API to allow the runtime to be accessed through web or grid services.

What is LexGrid?

LexGrid is an initiative of the Mayo Clinic Division of Biomedical Informatics that focuses on the representation, storage, and dissemination of vocabularies. This effort centers on, but is not limited to, the domain of medical vocabularies and nomenclatures. Focal points of the LexGrid project include the development and promotion of standards, tools, and content that:

...

Additional information for LexGrid is available at http://informatics.mayo.edu .

What is LexBIG?

LexBIG is a more specific project that applies LexGrid vision and technologies to requirements of the caBIG® community. The goal of the project is to build a vocabulary server accessed through a well-structured application programming interface (API) capable of accessing and distributing vocabularies as commodity resources. The server is to be built using standards-based and commodity technologies. Primary objectives for the project include:

...

The goal for the initial year of development was to achieve the Bronze level of compatibility with regard to the caBIG® requirements. Silver-level compatibility is being pursued.

LexGrid Model

The LexGrid Model is Mayo Clinic's proposal for standard storage of controlled vocabularies and ontologies. The LexGrid Model defines how vocabularies should be formatted and represented programmatically, and is intended to be flexible enough to accurately represent a wide variety of vocabularies and other lexically-based resources. The model also defines several different server storage mechanisms and an XML format. This model provides the core representation for all data managed and retrieved through the LexBIG system, and is now rich enough to represent vocabularies provided in numerous source formats such as OWL (NCI Thesaurus) and RRF (NCI MetaThesaurus).

...

Following are some of the higher-level objects incorporated into the model definition:

Code Systems

Each service defined to the LexGrid model can encapsulate the definition of one or more vocabularies. Each vocabulary is modeled as an individual code system, known as a codingScheme. Each scheme tracks information used to uniquely identify the code system, along with relevant metadata. The collection of all code systems defined to a service is encapsulated by a single codingSchemes container.

Concepts

A code system may define zero or more coded concepts, encapsulated within a single container. A concept represents a coded entity (identified in the model as a concept) within a particular domain of discourse. Each concept is unique within the code system that defines it. To be valid, a concept must be qualified by at least one designation, represented in the model as a property. Each property is an attribute, facet, or some other characteristic that may represent or help define the intended meaning of the encapsulating concept. A concept may be the source for and/or the target of zero or more relationships. Relationships are described in more detail in a following section.

Relations

Each code system may define one or more containers to encapsulate relationships between concepts. Each named relationship (e.g. "hasSubtype" or "hasPart") is represented as an association within the LexGrid model. Each relations container must define one or more association. The association definition may also further define the nature of the relationship in terms of transitivity, symmetry, reflexivity, forward and inverse names, etc. Multiple instances of each association can be defined, each of which provide a directed relationship between one source and one or more target concepts.

Source and target concepts may be contained in the same code system as the association or another if explicitly identified. By default, all source and target concepts are resolved from the code system defining the association. The code system can be overridden by each specific association, relation source (associationInstance), or relation target (associationTarget).

LexBIG Extensions

The LexBIG vocabulary model extends the LexGrid model to provide unique constructs or granularity required by caBIG® that are not present in the core model. While many extensions exist, this document will focus on some of direct relevance to the high-level architecture.

Concept Resolution

LexBIG allows the service runtime to provide managed resolution of code-based objects that are referenced through LexBIG-specific lists and iterators (mechanism that allow streaming of list content). These lists and iterators are typically returned when requesting sets or graphs of vocabulary terms through the LexBIG API (described in LexBIG APIs). Some model components involved in the resolution process include:

...

Note: Formal representation of the LexGrid and LexBIG models are discussed in VKC:Information Models.

Information Models

Overview

The information below is provided for introductory purposes. A full description of all available model components is also available in the javadoc distributed with the LexEVS installation package (see file breakdown in the LexEVS 5.0 Installation Guide). Since the javadoc is automatically generated and synchronized during the build process, it is recommended as the primary reference for use by LexEVS developers.

LexGrid Model

The LexGrid model is mastered in XML schema. The LexBIG project currently builds on the 2008 version of the LexGrid schema. A formal representation, showing portions of this structure that are of primary interest to the LexBIG project, is presented below. A complete version of the model is available at http://informatics.mayo.edu?page=lgm .

CodingSchemes

The CodingSchemes branch of the model defines high level containers for concepts and relations. Each CodingScheme represents a unique code system or version in the LexBIG service. Components of interest include:

codingScheme

<u>codingSchemes</u>

A collection of one or more coding schemes.

...

A collection of properties.

codingSchemes

Concepts

Each concept represents a unique entity within the code system, which can be further described by properties and related to other concepts through relations.

conceptsAndInstances

<u>codingScheme</u>

A resource that makes assertions about a collection of terminological entities.

...

A binary relation from a set of entities to a set of entities and/or data. The entityType for the class concept must be "association".

conceptsAndInstances

entities

<u>codingScheme</u>

A resource that makes assertions about a collection of terminological entities.

...

A binary relation from a set of entities to a set of entities and/or data. The entityType for the class concept must be "association".

entities

entity

<u>entity</u>

A set of lexical assertions about the intended meaning of a particular entity code.

...

A link between two properties for an entity.. Examples include acronymFor, abbreviationOf, spellingVariantOf, etc. Must be in supportedPropertyLink.

entity

Relations

Relations are used to define and qualify associations between concepts.

association

<u>codingScheme</u>

A resource that makes assertions about a collection of terminological entities.

...

An entity that occurs in one or more instances of a relation on the "from" (or left hand) side of a particular relation.

association

associationInstance

<u>association</u>

A binary relation from a set of entities to a set of entities and/or data. The entityType for the class concept must be "association".

...

A modifier that further qualifies the particular association instance.

associationInstance

Naming

These elements are primarily used to define metadata for a coding scheme, mapping locally used names to global references.

naming

<u>URIMap</u>

A local identifier that is used in a specific context (e.g. language, property name, data type, etc) and an optional URI that can be used to find the exact definition and meaning of the local id. Note: the string portion of this entry can be used to provide additional documentation or information, especially when a URI is not supplied.

...

A source role and athe URI of the defining resource

naming

LexBIG Model

The following extensions to the LexGrid model were introduced in support of caBIG® requirements. As with the LexGrid model, this document provides a summary of the most significant elements for consideration by LexBIG programmers. The complete and current version of the model is available online at http://informatics.mayo.edu?page=lexex .

Core

LexBIG core elements provide enhanced referencing and controlled resolution of LexGrid model objects.

...

References a service in the Globus environment, this will be a global service handle (GSH).

InterfaceElements

Defines metadata related to model objects required by the runtime.

...

The combination of a system release and all of the entityVersions
that accompanied that release.

NCIHistory

Maintains a record of modifications made to a code system.

...

A change event as documented in ftp://ftp1.nci.nih.gov/pub/cacore/EVS/ReadMe_history.txt. Note that date and time of the change event is recorded in the containing version. All change events for the same/date and time a recorded in the same version.

Architecture

LexBIG

LexBIG Services

This section describes architectural detail for services provided by the LexBIG system. These services are geared toward the administration, management, and serving of vocabularies defined to the LexGrid/LexBIG information model. A system overview is provided, followed by a description of key subsystems and components. Each subsystem is described in terms of its overall structure, formal model, and specification of key public interfaces.

...

<tt>LexBIGServiceManager</tt> - The service manager provides a centralized access point for administrative functions, including write and update access for a service's content. For example, the service manager allows new coding schemes to be validated and loaded, existing coding schemes to be retired and removed, and the status of various coding schemes to be updated and changed.

caGRID Hosting

The LexBIG architecture provides the underpinnings LexBIG services to be made accessible through the caGRID environment in the future, where LexBIG services might optionally be deployed in a caGRID Globus container. caGrid provides a Globus service for service registration and discovery. LexBIG services deployed to the grid would be registered in the NCICB registry and be searchable through the NCICB index service.

Specification

Additional specifications related to the registration and discovery of LexBIG services in the caGRID environment will be included later phases of work in concordance with caGRID 1.0. This is will be coordinated with caBIG® Architecture workspace designees.

Service Management Subsystem

This subsystem provides administrative access to functions related to management and publication of LexBIG vocabularies. These functions are generally considered to be reserved for LexBIG administrators, with detailed instructions on how to secure and carry out related tasks described by the LexBIG Administrator's Guide.

...

    • Lexical Match - for example, "begins-with" and "contains"
    • Phonetic - allows for the ability to query based on "sounds-like" entry of search criteria.
    • Stemming - allows for the ability to find lexical variations of search terms.
      Index creation is typically bundled into the load process. Architecturally speaking, however, this capability is decoupled and extensible.
      *Loaders
      Vocabularies may be imported to the system from a variety of accepted formats, including but not limited to:
    • LexGrid XML (LexBIG canonical format)
    • NCI Thesaurus, provided in Web Ontology Language format (OWL)
    • UMLS Rich Release format (RRF)
    • Open Biomedical Ontologies format (OBO)
      As with indexers, the load mechanism is designed to be extensible from an architectural standpoint. Additional loaders can be supported by the introduction of pluggable modules. Each module is implemented in the Java programming language according to a LexBIG-provided interface, and registered to the loader runtime environment.

Metadata and Discovery Subsystem

This subsystem provides information about accessible vocabularies, related licensing/copyright information, and registration/discovery of LexBIG services.

...

Finally, the LexBIG architecture provides the underpinnings for LexBIG services to be made accessible through the caGRID environment in the future, where vocabulary services might be deployed and discovered within a caGRID Globus container. However, this portion of the API is preliminary and awaits coordination with caBIG® Architecture WS designees to determine exact recommendations and nature of LexBIG services on the grid.

Query Subsystem

This subsystem provides the functionality required to fulfill caCORE/EVS and other vocabulary requests. The Query Service is comprised of Lexical Operations, Graph Operations, Metadata, and History Operations.

...

History provides vocabulary-specific information about concept insertions, modifications, splits, merges, and retirements when supplied by the content provider.

LexEVS API/Grid Service Interaction

(DESIGN DOC IMPORT START)

Revision History

Content changes to this document from the previous to the current level are indicated by revision bars (|) unless a complete rewrite is indicated.

Date

Version

Description

Author

07/29/2008

1.0

Initial document

Kevin Peterson

8/30/2008

1.1

Revised for Security and Exception Handling

Kevin Peterson

Note: If this document has been inspected, please indicate the inspection date that each version is based on in the "Change Description and Explanation" area. Entries in this log must be maintained for at least 3 years.

Document Purpose

This document provides the detailed design and implementation of LexBIG Enterprise Vocabulary Service (LexEVS) caGrid Service. It should be noted that the LexEVS Grid Service is no longer part of the caGrid 1.1 infrastructure and will be deployed as a separate unit. This is a change from the previous release of the LexEVS Grid Service.

The LexEVS caGrid service will allow programs to utilize the caGrid 1.2 infrastructure to access LexEVS information that is currently being produced by NCICB.

Implementation Overview

Team Members

Table 1 - Team Members

Role

Name

Development Lead

Kevin Peterson

Documentation Lead

Kevin Peterson

Project Manager

Tom Johnson

Description

The LexEVS grid service will be used to obtain data accessible via the LexEVS service, specifically, the Distributed LexEVS services. Please refer to the LexEVS Programmer's Guide for more information.

For more Documentation, Build/Deployment instructions and examples, visit the project documentation home at: http://gforge.nci.nih.gov/docman/index.php?group_id=491&amp;selected_doc_group_id=3749&amp;language_id=1

Scope

The LexEVS Grid service will provide programmatic access to the LexBIG domain objects that are available via the LexBIG information model.

...

Context

caBIG

Classification Scheme

LexBIG

Version

LexBIG_v2_3_rv1

Architecture

The LexEVS Grid Service is implemented to expose the API and Model of LexBIG 2.3. For more information on LexBIG, see http://informatics.mayo.edu

...

Below is the LexEVS Grid Service Architecture, viewed from inside of the Web Service Container. For more information on how Service Contexts and Resources are used, see the "Service Contexts and State" section below.

LexEVS Grid Service Class Diagram

The LexEVS Grid Service is built on the LexGrid/LexBIG model and implementation.
For more information about this model, visit (LexBIG) https://gforge.nci.nih.gov/plugins/scmsvn/viewcvs.php/LexBIG_Core_Services/LexBIG-2.3/lexbig/lbModel/?root=lexevs
and (LexGrid) https://gforge.nci.nih.gov/plugins/scmsvn/viewcvs.php/LexBIG_Core_Services/LexBIG-2.3/lgModel/?root=lexevs

...

For information specific to the LexEVS Grid Service, visit: https://gforge.nci.nih.gov/plugins/scmsvn/viewcvs.php/LexBIG_Core_Services/LexBIG-2.3/lexbig/lbModel.cagrid/?root=lexevs
This link contains Class Diagrams and descriptions for input/output parameters, as well as other information concerning the Silver Level Compliance submission package.

LexEVS Grid Service Sequence Diagram

The sequence diagram for the operation "getSupportedCodingSchemes" is described below:

...

General Call Sequence Example:

Assumptions

  • The LexEVS service will be based on the latest LexEVS 5.0 release.
  • The LexEVS Grid Service will not have any method level security. All security requirements will be handled by the actual deployment of the underlying LexEVS 5.0 service. Please see the "Security" section below for more information on how the LexEVS Grid Service utilizes this security.
  • The LexEVS Grid Service will not be deployed as a "core" service by caGrid at NCICB as was previously done, but rather will now be deployed as a standalone service.
  • The LexEVS Grid Service release schedule will no longer be coupled to the caGrid deployment schedule as previously done.
  • Multiple version of LexEVS Grid Service may be active at the same instance in time depending solely on the availability of the underlining EVSAPI service.

Dependencies

  • LexEVS 5.0 service needs to be available and running correctly.
  • The LexEVS service and operations will use the Introduce toolkit to generate the appropriate structure for registering the service into caDSR.

Third Party Tools

  • Introduce Toolkit
  • Globus Toolkit (4.0.3) or appropriate version supported by caGrid 1.2
  • caGrid 1.2 core infrastructure

Server

The LexEVS Grid Service will be deployed as a "stand alone" grid service at NCICB.

APIs

The main Service API exposed by the LexEVS Grid service will be the http://informatics.mayo.edu/LexGrid/downloads/javadocGrid/org/LexGrid/LexBIG/cagrid/interfaces/LexBIGServiceGrid.html Interface. All other APIs will not be directly exposed, but will be made available through Service Contexts.

In General, API calls will follow this sequence:

API Examples

For an example clients, service calls, and SOAP messages, see http://gforge.nci.nih.gov/docman/index.php?group_id=491&amp;selected_doc_group_id=3880&amp;language_id=1

...

Searching for concepts in NCI Thesaurus containing the string "Gene"
:SearchingForConcepts_Snippet

Service Contexts and State

Along with the Main Service (described above), the Server will also host the following Service Contexts. These Service Contexts are not meant to be called directly as Grid Services. The main function of these Service Contexts is to provide additional functionality to the Main Service.

...

6. These API calls are separate calls but statefully maintained on the server via the Resource.

Error Handling

Error Connecting to LexEVS Grid Service

...

Service Context Services are not meant to be called directly. If the client attempts to do so, an org.LexGrid.LexBIG.cagrid.LexEVSGridService.CodedNodeSet.stubs.types.InvalidServiceContextAccess Exception will be thrown. This indicates a call was made to a Service Context without obtaining a Service Context Reference via the Main Service (see the above section Service Contexts and State for more information).

Client

The Introduce toolkit generates a "client" class that will be provided to the users.

Security Issues

Security in the LexEVS Grid Service is implemented in the Distributed LexBIG layer. The information in this section explains how the LexEVS Grid Services utilize this security implementation. For more information about the Distributed LexBIG Security Implementation, see this documentation: http://gforge.nci.nih.gov/tracker/download.php/366/1462/10884/4060/Distributed_LexBIG_%20AccessTo_Licensed_Vocabulary_implemenation.doc

...

All non-secured information accessed by the LexEVS Grid Service is publicly available from NCICB and users are expected to follow the licensing requirements currently in place for accessing and using NCI EVS information.

Performance

The LexEVS service will take advantage of all improvements made to the LexEVS API services with the exception of lazy loading. LexEVS grid service, being in nature a web service is currently not taking advantage of lazy loading since objects are transferred as fully populated objects. However, future releases of LexEVS Grid Service may refractor the interface in such as way as to take advantage of some of the benefits brought about by the inclusion of lazy loading in to LexEVS API service.

LexEVS Grid Services utilize the performance enhancements of the LexBIG API.
For more information about LexBIG performance (which LexEVS Grid Services are dependent on), see http://informatics.mayo.edu

Installation / Packaging

The service will be installed and deployed as a "stand alone" service at NCICB.

Migration

Both the current version of LexEVS grid service may be "in service" simultaneously if the corresponding underlying EVSAPI service is also "in service" to manage migration of clients.

System Testing

See LexEVS Grid Service Testing Documentation at: http://gforge.nci.nih.gov/docman/index.php?group_id=491&amp;selected_doc_group_id=3879&amp;language_id=1

...

TITLE

NAME

Technical Writer

Panel

LexEVS Loader Source Mapping

This section is now moved to separate pages to provide details on how different formats are loaded into the LexEVS model.

...