NIH | National Cancer Institute | NCI Wiki  

caBIG LexEVS Architecture Overview

The LexEVS software architecture and implementation are designed to facilitate flexibility and future expansion in the caBIG community. The purpose of LexEVS is to enable individual Cancer Centers to use the provided caCORE EVS services and, if desired, to install local instances of vocabularies.

...

  • caBIG Nodes
  • Partial Online Replica
  • Local Replica
  • NCI

This graphic shows the caBIG Grid as described above.

History and Definition

What is LexGrid?

LexGrid is the model used to store terminologies. The LexGrid Model is Mayo Clinic's proposal for standard storage of controlled vocabularies and ontologies. The LexGrid Model defines how vocabularies should be formatted and represented programmatically, and is intended to be flexible enough to accurately represent a wide variety of vocabularies and other lexically-based resources. The model also defines several different server storage mechanisms (e.g., relational database, LDAP) and an XML format. This model provides the core representation for all data managed and retrieved through the LexBIG system, and is now rich enough to represent vocabularies provided in numerous source formats including:

...

For more information see the LexGrid Background Information.

What is LexBIG?

LexBIG is the set of services that EVS adapters use to store/retrieve terminology metadata. LexBIG is a more specific project that applies LexGrid vision and technologies to requirements of the caBIG® community. The goal of the project is to build a vocabulary server accessed through a well-structured application programming interface (API) capable of accessing and distributing vocabularies as commodity resources. The server is to be built using standards-based and commodity technologies. Primary objectives for the project include:

  • Provide a robust and scalable open source implementation of EVS-compliant vocabulary services. The API specification will be based on but not limited to fulfillment of the caCORE EVS API. The specification will be further refined to accommodate changes and requirements based on prioritized needs of the caBIG® community.
  • Provide a flexible implementation for vocabulary storage and persistence, allowing for alternative mechanisms without impacting client applications or end users. Initial development will focus on delivery of open source freely available solutions, though this does not preclude the ability to introduce commercial solutions (e.g. Oracle).
  • Provide standard tooling for load and distribution of vocabulary content. This includes but is not limited to support of standardized representations such as UMLS Rich Release Format (RRF), the OWL web ontology language, and Open Biomedical Ontologies (OBO).

What is LexEVS?

LexEVS combines LexBIG and the EVS adapters into one set of services. LexEVS is a collection of programmable interfaces that provides developers with the ability to access any installation of the LexEVS terminology server. The controlled terminologies hosted by the NCI EVS Project are published via the Open-Source LexEVS Terminology Server. It is a caCORE Software Development Kit (SDK) generated system. The caCORE SDK is a set of tools that can be used by an intermediate Java developer to create a caCORE-like system.

...

LexEVS has a number of API mechanisms for use with various technologies. In addition, LexEVS provides developers GUIs for administration and testing of the terminology server. These GUIs are intended only for developers.

LexEVS 6.x Architecture Overview

The LexEVS 6.x infrastructure exhibits an n-tiered architecture with client interfaces, server components, domain objects, data sources, and back-end systems (Figure 1.1). This n-tiered system divides tasks or requests among different servers and data stores, which isolates the client from the details of where and how data is retrieved.

...

  • Application Service Layer - accepts incoming requests from all public interfaces and translates them, as required, to Java calls in terms of the native LexEVS API. Non-SDK queries are invoked against the Distributed LexEVS API, which handles client authentication and acts as proxy to invoke the equivalent function against the LexEVS core Java API. The caGrid and SDK-generated services are optionally run in an application server separate from the Distributed LexEVS API.
    The LexEVS caCORE SDK services work directly against the database, via Hibernate bindings, to resolve stored objects without intermediate translation of calls in terms of the LexEVS API. However, the LexEVS SDK services do still require access to metadata and security information stored by the Distributed and Core LexEVS API environment to resolve the specific database location for requested objects and to verify access to protected resources, respectively.
    From the client perspective, the LexEVS services will function as "ports" accessible through the caGrid 1.3 service architectural model. LexEVS services will follow the caGrid architecture for analytical and data services. See the caGrid 1.3 documentation for architectural details.
  • Core API Layer - underpins all LexEVS API requests. Search of pre-populated Lucene index files is used to evaluate query results before incurring cost of database access. Access to the LexGrid database is performed as required to populate returned objects using pooled connections.
  • Data Source Layer - is responsible for storage and access to all data required to represent the objects returned through API invocation.
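The Core API Layer's index-first strategy can be illustrated with a minimal sketch. It assumes two in-memory maps standing in for the Lucene index files and the LexGrid database; the class, its methods, and the concept codes are hypothetical and are not part of the LexEVS API. The point is only that the query is evaluated against the index, and the (simulated) database is read solely for the entries that matched.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the LexEVS API.
final class IndexFirstLookupSketch {
    // Stand-in for the Lucene index: concept code -> lower-cased searchable text.
    private final Map<String, String> index = new LinkedHashMap<>();
    // Stand-in for the LexGrid database: concept code -> full record.
    private final Map<String, String> database = new LinkedHashMap<>();
    // Counts simulated database reads so the saving is visible.
    private int databaseReads = 0;

    void load(String code, String text, String record) {
        index.put(code, text.toLowerCase(Locale.ROOT));
        database.put(code, record);
    }

    // Step 1: evaluate the query against the index only.
    List<String> matchCodes(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> codes = new ArrayList<>();
        for (Map.Entry<String, String> entry : index.entrySet()) {
            if (entry.getValue().contains(needle)) {
                codes.add(entry.getKey());
            }
        }
        return codes;
    }

    // Step 2: read the database only for the codes the index matched.
    List<String> resolve(String term) {
        List<String> records = new ArrayList<>();
        for (String code : matchCodes(term)) {
            databaseReads++;
            records.add(database.get(code));
        }
        return records;
    }

    int databaseReads() {
        return databaseReads;
    }

    public static void main(String[] args) {
        IndexFirstLookupSketch sketch = new IndexFirstLookupSketch();
        sketch.load("C1", "Lung carcinoma", "record-C1");
        sketch.load("C2", "Breast carcinoma", "record-C2");
        sketch.load("C3", "Melanoma", "record-C3");
        System.out.println(sketch.resolve("carcinoma"));
        System.out.println("database reads: " + sketch.databaseReads());
    }
}
```

In this toy setup a query that matches nothing in the index never reaches the database at all, which is the cost saving the layer description refers to.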

LexEVS 6.x High-level Design Diagram

The figure below shows the following components of the Architecture Diagram:

  • Clients: Local, Distributed, and caGrid clients, and SOAP, REST, QBE Clients.
  • Application Service: CTS2 (LexEVS Local Runtime), Distributed LexEVS, LexEVS caCORE APIs.
  • Core API: LexEVS Model Objects Extended for CTS 2 => DAO
  • Indexes and Data Source: Lucene Index Files and LexGrid DB

This graphic shows the LexEVS architecture as described above.

LexBIG Architecture

LexBIG Services

This section describes architectural detail for services provided by the LexBIG system. These services are geared toward the administration, management, and serving of vocabularies defined to the LexGrid/LexBIG information model. A system overview is provided, followed by a description of key subsystems and components. Each subsystem is described in terms of its overall structure, formal model, and specification of key public interfaces.

...

LexBIGServiceManager - The service manager provides a centralized access point for administrative functions, including write and update access for a service's content. For example, the service manager allows new coding schemes to be validated and loaded, existing coding schemes to be retired and removed, and the status of various coding schemes to be updated and changed.

caGRID Hosting

The following figure shows the caGrid Hosting Environment. The Hosting Environment contains the LexBIG Service, which in turn comprises the Service Metadata, Query Service, Service Manager, and Extensions. A Service Discovery component points to the Service Metadata component.

...

Additional specifications related to the registration and discovery of LexBIG services in the caGRID environment will be included in later phases of work in concordance with caGRID 1.0. This will be coordinated with caBIG® Architecture workspace designees.

Service Management Subsystem

The following figure shows a diagram of the Service Management Subsystem.

...

  • Indexers
    Vocabularies may be indexed to provide enhanced performance or query capabilities. Types of indexes incorporated into the LexBIG system include but are not limited to the following:
    • Lexical Match - for example, "begins-with" and "contains"
    • Phonetic - allows for the ability to query based on "sounds-like" entry of search criteria.
    • Stemming - allows for the ability to find lexical variations of search terms.
      Index creation is typically bundled into the load process. Architecturally speaking, however, this capability is decoupled and extensible.
  • Loaders
    Vocabularies may be imported to the system from a variety of accepted formats, including but not limited to:
    • LexGrid XML (LexBIG canonical format)
    • NCI Thesaurus, provided in Web Ontology Language format (OWL)
    • UMLS Rich Release format (RRF)
    • Open Biomedical Ontologies format (OBO)
      As with indexers, the load mechanism is designed to be extensible from an architectural standpoint. Additional loaders can be supported by the introduction of pluggable modules. Each module is implemented in the Java programming language according to a LexBIG-provided interface, and registered to the loader runtime environment.
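The pluggable loader design described above can be sketched as a small registry keyed by format name. This is an illustration under stated assumptions: `VocabularyLoaderSketch` and `LoaderRegistrySketch` are hypothetical names, the actual LexBIG-provided loader interface is not reproduced here, and the toy loader simply counts non-blank lines instead of parsing a real vocabulary format.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for a loader interface; not the LexBIG-provided one.
interface VocabularyLoaderSketch {
    String formatName();

    // Returns the number of entries loaded from the given source text.
    int load(String source);
}

// Hypothetical registry playing the role of the "loader runtime environment".
final class LoaderRegistrySketch {
    private final Map<String, VocabularyLoaderSketch> loaders = new LinkedHashMap<>();

    // Registers a module under its format name (case-insensitive); duplicates are rejected.
    void register(VocabularyLoaderSketch loader) {
        String key = loader.formatName().toUpperCase(Locale.ROOT);
        if (loaders.putIfAbsent(key, loader) != null) {
            throw new IllegalArgumentException("Loader already registered for " + key);
        }
    }

    // Dispatches to whichever module was registered for the requested format.
    int load(String format, String source) {
        VocabularyLoaderSketch loader = loaders.get(format.toUpperCase(Locale.ROOT));
        if (loader == null) {
            throw new IllegalArgumentException("No loader registered for " + format);
        }
        return loader.load(source);
    }

    Set<String> supportedFormats() {
        return loaders.keySet();
    }

    // Toy loader for demonstration: treats every non-blank line as one entry.
    static VocabularyLoaderSketch lineCountingLoader(String format) {
        return new VocabularyLoaderSketch() {
            @Override
            public String formatName() {
                return format;
            }

            @Override
            public int load(String source) {
                int entries = 0;
                for (String line : source.split("\n")) {
                    if (!line.trim().isEmpty()) {
                        entries++;
                    }
                }
                return entries;
            }
        };
    }

    public static void main(String[] args) {
        LoaderRegistrySketch registry = new LoaderRegistrySketch();
        registry.register(lineCountingLoader("OBO"));
        registry.register(lineCountingLoader("RRF"));
        System.out.println(registry.supportedFormats());
        System.out.println(registry.load("obo", "term one\n\nterm two"));
    }
}
```

Adding support for a further format in this sketch means registering one more implementation; the dispatching code and its callers stay unchanged, which is the decoupling the text describes.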

Metadata and Discovery Subsystem

The following figure shows the Metadata and Discovery Subsystem diagram.

...

Finally, the LexBIG architecture provides the underpinnings for LexBIG services to be made accessible through the caGRID environment in the future, where vocabulary services might be deployed and discovered within a caGRID Globus container. However, this portion of the API is preliminary and awaits coordination with caBIG® Architecture WS designees to determine exact recommendations and nature of LexBIG services on the grid.

Query Subsystem

The following figure shows the Query Subsystem.

...

This subsystem provides the functionality required to fulfill caCORE/EVS and other vocabulary requests. The Query Service is comprised of Lexical Operations, Graph Operations, Metadata, and History Operations.

Lexical Set Operations

Lexical Set Operations provide methods to return lists or iterators of coded entries. Supported query criteria include the application of match/filter algorithms, sorting algorithms, and property restrictions. Support is also provided to resolve the union, intersection, or difference of two node sets.
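As a minimal sketch of the union, intersection, and difference semantics mentioned above, the following example models a node set as a plain `Set<String>` of concept codes. The class name and the codes are hypothetical; the LexBIG API works with its own node set types and resolves results lazily, which this sketch does not attempt to reproduce.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical illustration; a node set is modeled here as a set of concept codes.
final class NodeSetOpsSketch {
    // Codes present in either set.
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> result = new LinkedHashSet<>(a);
        result.addAll(b);
        return result;
    }

    // Codes present in both sets.
    static Set<String> intersection(Set<String> a, Set<String> b) {
        Set<String> result = new LinkedHashSet<>(a);
        result.retainAll(b);
        return result;
    }

    // Codes present in a but not in b.
    static Set<String> difference(Set<String> a, Set<String> b) {
        Set<String> result = new LinkedHashSet<>(a);
        result.removeAll(b);
        return result;
    }

    public static void main(String[] args) {
        Set<String> first = new LinkedHashSet<>(Arrays.asList("C1", "C2", "C3"));
        Set<String> second = new LinkedHashSet<>(Arrays.asList("C2", "C3", "C4"));
        System.out.println("union:        " + union(first, second));
        System.out.println("intersection: " + intersection(first, second));
        System.out.println("difference:   " + difference(first, second));
    }
}
```

Each operation copies its first argument, so the input sets are left unmodified and can be reused in further combinations.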

Graph Set Operations

Graph Operations support the subsetting of concepts according to relationship and distance, identification of relation source and target concepts, and graph traversal. Additional operations include enumeration and traversal of concepts by relation, walking of directed acyclic graphs (DAGs), enumeration of source and target concepts for a relation, and enumeration of relations for a concept.
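Subsetting concepts "according to relationship and distance" can be pictured as a breadth-first walk that follows only one relation and stops after a fixed number of hops. The sketch below assumes a simple in-memory adjacency structure; `GraphWalkSketch`, its methods, and the relation and concept names are hypothetical and do not come from the LexBIG API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of relation- and distance-restricted traversal.
final class GraphWalkSketch {
    // relation name -> (source concept -> target concepts)
    private final Map<String, Map<String, List<String>>> edges = new HashMap<>();

    void addEdge(String source, String relation, String target) {
        edges.computeIfAbsent(relation, r -> new HashMap<>())
             .computeIfAbsent(source, s -> new ArrayList<>())
             .add(target);
    }

    // Returns the concepts reachable from start over the given relation within
    // maxDistance hops, mapped to their hop distance. The start concept is excluded.
    Map<String, Integer> within(String start, String relation, int maxDistance) {
        Map<String, List<String>> bySource =
                edges.getOrDefault(relation, Collections.emptyMap());
        Map<String, Integer> distance = new LinkedHashMap<>();
        Deque<String> queue = new ArrayDeque<>();
        distance.put(start, 0);
        queue.add(start);
        while (!queue.isEmpty()) {
            String current = queue.remove();
            int hops = distance.get(current);
            if (hops == maxDistance) {
                continue; // do not expand beyond the distance limit
            }
            for (String target : bySource.getOrDefault(current, Collections.emptyList())) {
                if (!distance.containsKey(target)) {
                    distance.put(target, hops + 1);
                    queue.add(target);
                }
            }
        }
        distance.remove(start);
        return distance;
    }

    public static void main(String[] args) {
        GraphWalkSketch graph = new GraphWalkSketch();
        graph.addEdge("A", "isA", "B");
        graph.addEdge("B", "isA", "C");
        graph.addEdge("C", "isA", "D");
        graph.addEdge("A", "partOf", "E");
        System.out.println(graph.within("A", "isA", 2));
    }
}
```

Because visited concepts are recorded before they are queued, the walk also terminates on graphs that contain cycles, not only on directed acyclic graphs.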

Metadata Operations

Metadata Operations allow for the query and resolution of registered code system metadata according to specified coding scheme references, property names, or values.
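A query restricted by coding scheme reference, property name, or value can be sketched as a filter over registered metadata entries. Everything in this example is an assumption made for illustration: the class and method names, the convention that a null criterion means "no restriction", the substring match on values, and the sample scheme and property names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of restricting registered metadata by three criteria.
final class MetadataQuerySketch {
    // One registered metadata entry for a coding scheme.
    static final class Entry {
        final String scheme;
        final String property;
        final String value;

        Entry(String scheme, String property, String value) {
            this.scheme = scheme;
            this.property = property;
            this.value = value;
        }

        @Override
        public String toString() {
            return scheme + ":" + property + "=" + value;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    void register(String scheme, String property, String value) {
        entries.add(new Entry(scheme, property, value));
    }

    // Each criterion is optional; passing null leaves that dimension unrestricted.
    List<Entry> query(String scheme, String property, String valueContains) {
        List<Entry> matches = new ArrayList<>();
        for (Entry entry : entries) {
            boolean schemeOk = scheme == null || entry.scheme.equals(scheme);
            boolean propertyOk = property == null || entry.property.equals(property);
            boolean valueOk = valueContains == null || entry.value.contains(valueContains);
            if (schemeOk && propertyOk && valueOk) {
                matches.add(entry);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        MetadataQuerySketch metadata = new MetadataQuerySketch();
        metadata.register("SchemeA", "version", "1.0");
        metadata.register("SchemeA", "publisher", "ExampleOrg");
        metadata.register("SchemeB", "version", "2.1");
        System.out.println(metadata.query("SchemeA", null, null));
        System.out.println(metadata.query(null, "version", null));
    }
}
```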

History Operations

History provides vocabulary-specific information about concept insertions, modifications, splits, merges, and retirements when supplied by the content provider.

Common Terminology Services 2 (CTS 2) Architecture (Preliminary)

Structure of the Preliminary CTS 2 Service

The CTS 2 specification defines several functional profiles, each of which is a focused subset of the functionality of a CTS 2 implementation. A functional profile groups the operations that an implementation must support in order to claim conformance to that profile.

The following functional profiles are addressed by LexEVS 6.x:

Terminology Query Profile

  • Searching and querying terminologies
  • Providing access to terminology content and representational structures (description logic) consistent with the terminology author's intent

Terminology Administration Profile

  • Restricting administrative access
  • Obtaining and loading terminologies
  • Maintaining terminology access
  • Controlling content access

Terminology Authoring Profile

  • Functional terminology analysis/query
  • Direct terminology edits

...

  • CTS2 Functional profiles: Query Profile, Administration Profile, and Authoring Profile.
  • CTS2 STM Specific Methods: LexEVS Local Runtime, Distributed LexEVS, and LexEVS Analytical Grid.

LexEVS CTS 2 Services

The following figure shows the LexEVS CTS 2 Services.

...

  • Administrative Services
  • Versioning Services
  • Authoring Services
  • Searching and Querying Services
  • Association and Mapping Services
  • Value Set and Pick List Definition Services

LexEVS CTS 2 API Architecture

The LexEVS CTS 2 API provides programmatic access to the LexEVS 6.x implementation of the preliminary CTS 2 features and services.

Documentation can be found here: LexEVS 6.0 CTS2 API

LexEVS API/Grid Service Interaction

See the LexEVS 4.2 Grid Service Design and Implementation.

...