

Document Information

Author: Craig Stancl, Scott Bauer, Cory Endle
Email: Stancl.craig@mayo.edu, bauer.scott@mayo.edu, endle.cory@mayo.edu
Team: LexEVS
Contract: S13-500 MOD4
Client: NCI CBIIT
National Institutes of Health
US Department of Health and Human Services


The purpose of this document is to collect, analyze, and define high-level needs for and designed features of the National Cancer Institute Center for Biomedical Informatics and Information Technology (NCI CBIIT) LexEVS Release 6.4.

The focus is on the functionalities proposed by the stakeholders and target users to make a better product.

Design Scope

The LexEVS 6.4 Scope Document can be found here: LexEVS 6.4 Scope Document

Requirements 

The LexEVS 6.4 Requirements Document can be found here: LexEVS 6.4 Requirements Definition Document

Detailed Design

The following sections specify how the design will satisfy the requirements for the Lucene search upgrade. This design reflects the wide-ranging changes that LexEVS will need in order to update fully over three full releases of Lucene. Because Lucene is the heart of the search mechanism that powers efficient searches in LexEVS, these changes are necessarily extensive. They can be broken down, to some extent, into three areas.

  • Code decoupling from the current Lucene version, to allow easier updates to the underlying search implementation.
  • Multi-index searches to replace single-index searches. This will allow easier maintenance than the large, monolithic index we currently use.
  • Code refactoring to the latest Lucene code base. This requires extensive changes, including replacing objects in the current code base with objects of similar behavior and adjusting to changes in the Lucene API. It also includes reimplementing a number of customized Lucene analyzers and HitCollectors to ensure compatibility with current unit tests and user expectations.
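
As a minimal sketch of the first two areas, the example below hides the search engine behind a small interface and fans a query out over several per-code-system indexes. Every name in it (EntitySearcher, InMemoryIndex, MultiIndexSearcher) is hypothetical and not part of LexEVS or Lucene. The in-memory index is only a stand-in for a class that would wrap a Lucene searcher.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical search abstraction: callers never see Lucene types. */
interface EntitySearcher {
    List<String> search(String text);
}

/** Stand-in for one per-code-system index; a real one would wrap a Lucene searcher. */
final class InMemoryIndex implements EntitySearcher {
    private final String codingScheme;
    // entity code -> description, kept in insertion order
    private final Map<String, String> entities = new LinkedHashMap<>();

    InMemoryIndex(String codingScheme) {
        this.codingScheme = codingScheme;
    }

    InMemoryIndex add(String code, String description) {
        entities.put(code, description);
        return this;
    }

    /** Case-insensitive substring match over the descriptions. */
    @Override
    public List<String> search(String text) {
        String needle = text.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : entities.entrySet()) {
            if (e.getValue().toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(codingScheme + ":" + e.getKey());
            }
        }
        return hits;
    }
}

/** Sends one query to every delegate index and concatenates the hits in delegate order. */
final class MultiIndexSearcher implements EntitySearcher {
    private final List<EntitySearcher> delegates;

    MultiIndexSearcher(List<EntitySearcher> delegates) {
        this.delegates = new ArrayList<>(delegates);
    }

    @Override
    public List<String> search(String text) {
        List<String> hits = new ArrayList<>();
        for (EntitySearcher delegate : delegates) {
            hits.addAll(delegate.search(text));
        }
        return hits;
    }
}
```

Because callers depend only on the interface, swapping the Lucene version (or dropping one index) would, under this assumption, touch only the implementing classes.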

Code Decoupling

Multi-Index Searches

The Current Implementation

Our current search for coding schemes within a monolithic index requires a Lucene Filter that depends on an XML file called metadata.xml. Access to this file goes through a hand-written class that guards against concurrent access, and it relies on processing DOM objects both to filter the more granular entities in the system and to list the code systems in general. As such, it is something of a bottleneck for access.

A Proposed, Revised Metadata Implementation

With the advent of an index-per-code-system design, the metadata structure can go away. In its place, a contextual file read of the index names, with additional metadata persistence where necessary, will replace the concurrent XML parsing.
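
One way to read "a contextual file read of the names of the indexes" is to treat each sub-directory of an index root as one code-system index and list those directory names. The sketch below assumes that layout. The class name IndexDirectoryScanner and the one-directory-per-index convention are illustrative assumptions, not the confirmed LexEVS design.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical helper: discovers indexes by reading directory names, with no XML parsing. */
final class IndexDirectoryScanner {
    private IndexDirectoryScanner() {
    }

    /** Returns the sorted names of the sub-directories of indexRoot; plain files are ignored. */
    static List<String> listIndexNames(Path indexRoot) throws IOException {
        List<String> names = new ArrayList<>();
        try (DirectoryStream<Path> children =
                 Files.newDirectoryStream(indexRoot, Files::isDirectory)) {
            for (Path child : children) {
                names.add(child.getFileName().toString());
            }
        }
        Collections.sort(names);
        return names;
    }
}
```

The listing is rebuilt from the file system on each call, so there is no shared document to lock, which is the bottleneck the metadata.xml approach is described as having.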


MetaData Dependencies
The many dependencies on the metadata.xml file and its accompanying MetaData class will have to be refactored to a new implementation. All classes in the Indexer will be dropped or pulled into the dao project.

Changing MetaData Dependency Class Call Outs

MetaData and Dependencies
//Not really an interface. As a class, it will need to be rethought and reimplemented to accommodate multi-index initialization.
org.lexevs.dao.index.connection.IndexInterface

//This class attempts to manage index events concurrently and is highly dependent on the parsing of an XML file
edu.mayo.informatics.indexer.utility.MetaData

//Along with the above, a multi-index implementation of this interface will have to be done.
//The pertinent implementation of this provides an in-memory collection of objects consistent with the metadata elements.
//Registration consists of updating this collection in conjunction with the metadata file.
org.lexevs.dao.index.indexregistry.IndexRegistry

//A good portion of the metadata file is created in this extension of the IndexCreator.
//Since the metadata.xml is going away, we’ll want to reimplement this.
org.lexevs.dao.index.indexer.EntityBatchingIndexCreator

//Creates and deletes indexes. Manages readers and writers. Adds and deletes at the document level.
//Gets searchers. This lives in the Indexer. If it’s on the code path it needs to be updated;
//otherwise it should be tossed out.
edu.mayo.informatics.indexer.api.IndexerService

// This and its interface EntityIndexService may or should replace the IndexerService.  Needs closer examination.
org.lexevs.dao.index.service.entity.LuceneEntityIndexService

//Central manager for Search, Metadata, and Common indexes as well as the metadata.xml managing class
//Since this class uses some of the properties recorded for the index we will need to see what depends on these values
//and how they can be otherwise provided.  
org.lexevs.dao.index.access.IndexDaoManager

//Index CRUD service.  Cleanup methods serve to do some updates. 
//Depends on Dao, MetaData and Registry classes and contains some Lucene objects
org.lexevs.dao.index.operation.DefaultLexEvsIndexOperations

// Spring wired factory class that implements Spring FactoryBean to create singleton MetaData class
org.lexevs.dao.index.lucenesupport.LuceneIndexMetadataFactory

//Works largely at the entity level of creation and deletion but also can drop full indexes,
//as well as create them and query indexes it has created.
org.lexevs.dao.index.service.search.LuceneSearchIndexService
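
The registry comments above describe an in-memory collection that is updated together with the metadata file. As a hedged sketch of what a multi-index replacement might look like once that file is gone, the class below keeps the mapping from coding scheme to index name in a ConcurrentHashMap instead of a hand-written concurrency guard. MultiIndexRegistry and its methods are hypothetical names. They do not reproduce the real org.lexevs.dao.index.indexregistry.IndexRegistry interface.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory registry: one entry per coding scheme name and version. */
final class MultiIndexRegistry {
    // "name|version" -> name of the index that holds that coding scheme
    private final Map<String, String> indexNameByScheme = new ConcurrentHashMap<>();

    private static String key(String schemeName, String version) {
        return schemeName + "|" + version;
    }

    /** Registers a scheme; returns false and changes nothing if it is already registered. */
    boolean register(String schemeName, String version, String indexName) {
        return indexNameByScheme.putIfAbsent(key(schemeName, version), indexName) == null;
    }

    /** Removes a scheme; returns false if it was not registered. */
    boolean unregister(String schemeName, String version) {
        return indexNameByScheme.remove(key(schemeName, version)) != null;
    }

    Optional<String> indexNameFor(String schemeName, String version) {
        return Optional.ofNullable(indexNameByScheme.get(key(schemeName, version)));
    }

    /** Sorted snapshot of all registered index names. */
    Set<String> registeredIndexNames() {
        return new TreeSet<>(indexNameByScheme.values());
    }
}
```

putIfAbsent and remove are atomic on ConcurrentHashMap, so concurrent loaders cannot register the same scheme twice without any explicit locking in this class.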

Code Path Maintenance and Additions

Some support for the remaining MetaData index will have to remain.  An effort will be made to leverage remnants of old multi-index implementations. In essence we'll be maintaining two code paths for this purpose.

 

Multiple Code Paths
//This index “template” interface directly calls Lucene reader/writer elements. Its base and multi-base implementations will need to be adjusted
//to some extent, but it’s clear that some support still exists for reading and writing multiple indexes. Some of both will have to be maintained for the
//remaining MetaData index search (different from the metadata.xml) and possibly the simple search.
org.lexevs.dao.index.lucenesupport.LuceneIndexTemplate

 

Changing the Relational Representation in Lucene

 

General Code Refactoring

 

 

Detailed Design - Provide the architecture and design for the new Lucene feature.

LEXEVS-724

 

The following JIRA items are all part of LEXEVS-724.

  • LEXEVS-813
  • LEXEVS-814
  • LEXEVS-815
  • LEXEVS-816
  • LEXEVS-817
  • LEXEVS-818
  • LEXEVS-819
  • LEXEVS-820
  • LEXEVS-821
  • LEXEVS-822
  • LEXEVS-823
  • LEXEVS-824
  • LEXEVS-825

 

 

 

 

Please view the detailed design: LexEVS 6.2 Design Document - Detailed Design - Make it easy to do retrieval of only active concepts in a terminology through the/ a service

 
