NIH | National Cancer Institute | NCI Wiki  




 

Situation

Currently, NCI terminology administrative staff load ~900 value set definitions and the same number of resolved value sets into the LexEVS common terminology service. These value sets are defined in a series of XML files that can be auto-generated but sometimes must be checked individually to ensure accuracy or correct errors. Loading the files into LexEVS definition and coding scheme entries takes 18 to 24 hours, long enough for the process to be exposed to network and other system performance problems. Even minor updates tend to require that the entire process start over.

Background

Originally, value sets for LexEVS were expressed in terms similar to the ISO 11179 specification. Value domains were conceived as a set of rules that could be applied to any version of a given terminology, with the expectation that the result would express a current version of the value set. These value domains, expressed as value set definitions in LexGrid, were stored as entries in an XML file. Value set expression for the NCI Thesaurus (NCIt) has scaled up over the last few years. As the file set grew into the hundreds, it became more error prone and more difficult to keep up to date, and the LexEVS implementation model of loading Value Set Definitions and their Resolved Value Set Coding Scheme counterparts has become unwieldy.

Assessment

Value sets asserted by the NCIt terminology source (Source Asserted Value Sets) provide a complete definition of a value set in static terms. Defining one does not require resolving a graph or a union of a variety of sources. At the same time, the root (or target) entity of the asserted values carries adequate metadata to define what the former resolved value set coding scheme represented, without the workflow complexity of creating separate coding scheme database entries. This should result in higher-performing value set retrieval and eliminate very lengthy loading times. It would also change the way value sets are made available to end users: query mechanisms and Java and REST API results would retrieve far fewer value sets than before.
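As a rough illustration of why asserted value sets avoid a resolution step, the sketch below groups member concepts under their root entity in a single pass. It is written in Python for brevity and does not use the LexEVS API. The `Assertion` record, the function name, and all codes are hypothetical placeholders, not NCIt content.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """One source-asserted membership statement (hypothetical shape)."""
    member_code: str  # concept that belongs to the value set
    root_code: str    # root (target) entity that identifies the value set


def index_asserted_value_sets(assertions):
    """Group member codes by root code in one pass, with no graph traversal."""
    index = defaultdict(set)
    for assertion in assertions:
        index[assertion.root_code].add(assertion.member_code)
    # Sort members so the result is stable across runs.
    return {root: sorted(members) for root, members in index.items()}


# Made-up codes, for illustration only.
assertions = [
    Assertion("M1", "VS_A"),
    Assertion("M2", "VS_A"),
    Assertion("M3", "VS_B"),
]
value_sets = index_asserted_value_sets(assertions)
```

Because membership is stated directly by the source, the work is proportional to the number of assertions rather than to the size of a hierarchy that would otherwise have to be walked.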

Recommendation

Load value set definitions from source-defined (source asserted) value sets as well as XML files. Create a Source Asserted Value Set API and search mechanism. Integrate resolved value set coding schemes into the search mechanism. Update current Resolved Value Set and CTS2 functions to allow source asserted values to be retrieved along with traditional value sets. Provide remote API support for asserted value set functions where necessary.
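One way to picture the integrated search is a single lookup that consults both the asserted value sets and the traditional resolved value set coding schemes. The Python sketch below is an assumption-laden illustration, not the LexEVS design: the `ValueSetHit` type, the `search_value_sets` function, the name-substring matching, and the rule that an asserted entry wins when both stores share an identifier are all hypothetical choices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueSetHit:
    identifier: str
    name: str
    origin: str  # "asserted" or "resolved"


def search_value_sets(term, asserted, resolved):
    """Case-insensitive name match over both stores.

    Each store maps identifier -> name. Resolved entries are scanned first so
    that an asserted entry with the same identifier replaces them (an assumed
    precedence rule, chosen only for this sketch).
    """
    needle = term.lower()
    hits = {}
    for origin, store in (("resolved", resolved), ("asserted", asserted)):
        for identifier, name in store.items():
            if needle in name.lower():
                hits[identifier] = ValueSetHit(identifier, name, origin)
    return sorted(hits.values(), key=lambda hit: hit.identifier)


# Made-up identifiers and names, for illustration only.
asserted = {"VS_A": "Route of Administration", "VS_B": "Unit of Measure"}
resolved = {"VS_B": "Unit of Measure (legacy)", "VS_C": "Dosage Unit"}
results = search_value_sets("unit", asserted, resolved)
```

Tagging each hit with its origin lets callers see which mechanism supplied a value set while still receiving one combined result list.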

 

 
