Document Information

Author: Bauer, Scott
Email: bauer.scott@mayo.edu
Team: LexEVS
Contract: ST12-1106
Client: NCI CBIIT
National Institutes of Health
US Department of Health and Human Services

Revision History

Version | Date | Description of Changes | Author
1.0 | 2013/03/05 | Initial Version | Bauer, Scott

Overview

LexEVS has long relied on a relational database as the data store for semantic assertions made about the entity-level constructs in terminologies and ontologies. Graph database technology has now matured enough to store the relationships between entities defined by these assertions in a way that better reflects their nodes and edges. Benchmarking tests and practicality reviews have led the LexEVS team to conclude that a graph database back end for LexEVS associations will greatly improve traversal performance and potentially simplify implementation of the association API.

Database Hierarchy Performance Evaluation

New technologies such as the MVRB-tree algorithm implemented in the OrientDB graph database have proved, in the team's benchmarks, far more efficient and scalable than the traditional relational database management system.

Graph Traversals

LexEVS Association Logical Model

The LexGrid model defines a relationship in terms of a source node and a target node, with the edge defined separately in the AssociationPredicate model element. These are the building blocks for larger coded node graphs, which are currently represented in a relational schema. The performance restrictions of the relational schema have been documented above. The source and target structure of LexGrid will be mapped to the structure of the higher-performing graph database, OrientDB.
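The source, predicate, and target structure described above can be pictured with a minimal Java sketch. The class names AssociationTriple and AssociationSketch, the entity codes, and the predicate names are hypothetical illustrations and are not part of the LexGrid model or the LexEVS API. The linear scan in targetsOf only shows the shape of the data and makes no claim about how LexEVS stores or queries it.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for one association assertion:
// a source entity, the predicate naming the edge, and a target entity.
final class AssociationTriple {
    final String sourceCode;
    final String predicateName;
    final String targetCode;

    AssociationTriple(String sourceCode, String predicateName, String targetCode) {
        this.sourceCode = sourceCode;
        this.predicateName = predicateName;
        this.targetCode = targetCode;
    }
}

final class AssociationSketch {
    // Collect the targets one hop away from sourceCode along predicateName,
    // in the order the triples were asserted.
    static List<String> targetsOf(List<AssociationTriple> triples,
                                  String sourceCode, String predicateName) {
        List<String> targets = new ArrayList<>();
        for (AssociationTriple t : triples) {
            if (t.sourceCode.equals(sourceCode) && t.predicateName.equals(predicateName)) {
                targets.add(t.targetCode);
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        // Made-up codes and predicate names, for illustration only.
        List<AssociationTriple> triples = new ArrayList<>();
        triples.add(new AssociationTriple("C1", "hasSubtype", "C2"));
        triples.add(new AssociationTriple("C1", "hasSubtype", "C3"));
        triples.add(new AssociationTriple("C2", "partOf", "C1"));
        System.out.println(targetsOf(triples, "C1", "hasSubtype"));
    }
}
```

A coded node graph is then a collection of such triples, and a hierarchy traversal amounts to following them repeatedly from target to next source.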

associationinstance class diagram

While the graph database appears capable of handling the functions shown in the diagram above, some calls to LexEVS will continue to access model elements that define metadata about the association.

association class diagram

LexGrid in the LexEVS schema (from the MySQL Workbench)

Mapping LexGrid data model elements to OrientDB
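As a rough illustration of the mapping idea, the sketch below treats entity codes as vertices and each association as a directed edge labeled with its predicate name, then walks a hierarchy breadth-first. It is an in-memory stand-in written for this page. It does not use the OrientDB API, and the class InMemoryAssociationGraph and its methods are hypothetical names, not LexEVS or OrientDB types.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// In-memory stand-in for a graph store: entity codes are vertices and each
// association is a directed edge labeled with its predicate name.
final class InMemoryAssociationGraph {
    // predicate name -> (source code -> target codes)
    private final Map<String, Map<String, List<String>>> outgoing = new HashMap<>();

    void addEdge(String sourceCode, String predicateName, String targetCode) {
        outgoing.computeIfAbsent(predicateName, k -> new HashMap<>())
                .computeIfAbsent(sourceCode, k -> new ArrayList<>())
                .add(targetCode);
    }

    // Breadth-first walk from startCode along edges of a single predicate.
    // Returns every reachable code, excluding the start, in visit order.
    Set<String> reachable(String startCode, String predicateName) {
        Map<String, List<String>> edges =
                outgoing.getOrDefault(predicateName, Collections.emptyMap());
        Set<String> visited = new LinkedHashSet<>();
        Deque<String> queue = new ArrayDeque<>();
        queue.add(startCode);
        while (!queue.isEmpty()) {
            String current = queue.poll();
            for (String next : edges.getOrDefault(current, Collections.emptyList())) {
                // Skip the start node and anything already seen, so cycles terminate.
                if (!next.equals(startCode) && visited.add(next)) {
                    queue.add(next);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Made-up codes and predicate name, for illustration only.
        InMemoryAssociationGraph graph = new InMemoryAssociationGraph();
        graph.addEdge("C1", "hasSubtype", "C2");
        graph.addEdge("C2", "hasSubtype", "C3");
        System.out.println(graph.reachable("C1", "hasSubtype"));
    }
}
```

Each hop in this walk is a direct adjacency lookup, whereas a relational schema typically needs a join or a further query per level. That difference is the general motivation for storing associations as edges.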

LexEVS Hierarchy Performance Architecture

While the new implementation of the node graph will largely run against the OrientDB service, some portions of the legacy LexEVS API will be needed to access various metadata and property elements.

 

Portions of the legacy LexEVS API needed to access various metadata and property elements

Code Considerations

A CodedNodeFactory will determine whether a given implementation uses the graph database in conjunction with the relational database or the relational database alone. A newly implemented DAO and OrientDBCodedNodeGraph provide the underpinnings of what will be a higher-performance version of LexEVS's traversal of relationship hierarchies in stored terminologies.
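The factory decision described above might look roughly like the following sketch. Every type here (NodeGraphSketch, RelationalNodeGraphSketch, GraphBackedNodeGraphSketch, CodedNodeFactorySketch) and the boolean configuration flag are hypothetical simplifications chosen for illustration. The actual CodedNodeFactory and OrientDBCodedNodeGraph signatures are not given in this document and may differ.

```java
// Hypothetical, minimal stand-in for the traversal surface of a coded node graph.
interface NodeGraphSketch {
    String backendName();
}

// Stand-in for the existing implementation backed only by the relational schema.
final class RelationalNodeGraphSketch implements NodeGraphSketch {
    @Override
    public String backendName() {
        return "relational";
    }
}

// Stand-in for a graph-backed implementation that would still consult the
// relational store for association metadata and properties.
final class GraphBackedNodeGraphSketch implements NodeGraphSketch {
    @Override
    public String backendName() {
        return "graph+relational";
    }
}

final class CodedNodeFactorySketch {
    // Assumed configuration switch; how LexEVS detects a graph database is not specified here.
    private final boolean graphDatabaseConfigured;

    CodedNodeFactorySketch(boolean graphDatabaseConfigured) {
        this.graphDatabaseConfigured = graphDatabaseConfigured;
    }

    // Callers receive the common interface and never see which back end was chosen.
    NodeGraphSketch createNodeGraph() {
        return graphDatabaseConfigured
                ? new GraphBackedNodeGraphSketch()
                : new RelationalNodeGraphSketch();
    }

    public static void main(String[] args) {
        System.out.println(new CodedNodeFactorySketch(true).createNodeGraph().backendName());
        System.out.println(new CodedNodeFactorySketch(false).createNodeGraph().backendName());
    }
}
```

Keeping the choice inside one factory lets existing client code keep calling the same interface while the graph-backed path is introduced behind it.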

CodedNodeFactory and CodedNodeGraph
