NIH | National Cancer Institute | NCI Wiki  


...

  • Increased load time and storage requirements due to additional Meta (RRF) content.
    • Due to the additional content to load from RRF files, there is a risk of increased loading times and increased storage requirements. The new loader framework should mitigate the increased loading times (providing a faster load while increasing content to be loaded).


Detailed Design

Specify how the solution architecture will satisfy the requirements. This should include high level descriptions of program logic (for example in structured English), identifying container services to be used, and so on.

Query Performance and Behavior Enhancements Detailed Design

Lucene Lazy Loading

Background - Lucene Documents
Lucene stores information in Documents, and these Documents have Fields that are used to hold information. Each Document has a unique id.

...

Instead of retrieving the information up front, LexEVS will simply store the Document id for later use. When this information is actually needed by the user (for example, the information needs to be displayed), it is retrieved on demand.
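
The deferral described above can be sketched in plain Java. This is only an illustration: the LazyHit class and the IntFunction loader are hypothetical stand-ins for the Lucene document lookup and are not LexEVS or Lucene names.

```java
import java.util.Map;
import java.util.function.IntFunction;

/** Hypothetical search hit that keeps only the document id until fields are needed. */
final class LazyHit {
    private final int docId;
    // Stand-in for a Lucene document lookup (assumption, not the real API).
    private final IntFunction<Map<String, String>> loader;
    private Map<String, String> fields; // stays null until the first field access

    LazyHit(int docId, IntFunction<Map<String, String>> loader) {
        this.docId = docId;
        this.loader = loader;
    }

    int docId() {
        return docId;
    }

    boolean isLoaded() {
        return fields != null;
    }

    /** Loads the stored fields on first use and reuses them on later calls. */
    String field(String name) {
        if (fields == null) {
            fields = loader.apply(docId);
        }
        return fields.get(name);
    }
}
```

Creating a LazyHit costs nothing beyond storing the id; the loader runs once, on the first call to field().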

Searching

To allow users to plug in custom search algorithms, the LexEVS Extension framework needed to be extended to include Searches.

The org.LexGrid.LexBIG.Extensions.Extendable.Search interface consists of one method to be implemented:

Class:

org.LexGrid.LexBIG.Extensions.Extendable.Search

Method:

public org.apache.lucene.search.Query buildQuery(String searchText)

Description:

Given a String search string, build a Query object to match indexed Lucene Documents

This enables the user to construct any type of Query given search text. Wildcards may be added, search terms may be grouped, etc.

AND vs. OR
Previously, for most search algorithms, Lucene applied an 'OR' to the terms when the search text contained multiple terms. For example, a search of 'heart attack' would match all documents containing 'heart' as well as all documents containing 'attack'. This led to non-intuitive results being returned to the user. Changing Lucene to default to an 'AND' strategy will increase search precision and, in most cases, reduce the number of results returned for a given query, which will in turn improve overall performance.
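
The difference between the two strategies can be shown on plain strings. The sketch below is not Lucene code; the TermMatching class and its whitespace tokenization are assumptions made only to illustrate why 'AND' returns fewer, more precise matches.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

/** Illustrates OR versus AND term matching on plain strings; this is not Lucene code. */
final class TermMatching {
    /** Splits text on whitespace into a set of lower-case terms. */
    static Set<String> terms(String text) {
        Set<String> result = new HashSet<>();
        for (String t : text.trim().toLowerCase(Locale.ROOT).split("\\s+")) {
            if (!t.isEmpty()) {
                result.add(t);
            }
        }
        return result;
    }

    /** OR strategy: the document matches if it contains at least one search term. */
    static boolean matchesAny(String document, String searchText) {
        Set<String> docTerms = terms(document);
        for (String t : terms(searchText)) {
            if (docTerms.contains(t)) {
                return true;
            }
        }
        return false;
    }

    /** AND strategy: the document matches only if it contains every search term. */
    static boolean matchesAll(String document, String searchText) {
        return terms(document).containsAll(terms(searchText));
    }
}
```

With the search text 'heart attack', a document reading 'heart failure' matches under matchesAny but not under matchesAll.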

...

1: get: user text input
2: user text input = '*' + user text input + '*'
3: score = lucene.score(user text input)
4: halt
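
Assuming the pseudocode above describes a 'contains'-style search, its wildcard step can be sketched as follows. The ContainsSearch class is hypothetical, and the matches method is a simple regular-expression stand-in for Lucene scoring, which is not reproduced here.

```java
import java.util.regex.Pattern;

/** Hypothetical sketch of the wildcard-wrapping step; not part of LexEVS. */
final class ContainsSearch {
    /** Step 2 of the pseudocode: surround the user's text with wildcards. */
    static String wrap(String userText) {
        return "*" + userText + "*";
    }

    /** Stand-in for scoring: true if value matches a pattern in which '*' means any text. */
    static boolean matches(String wildcardPattern, String value) {
        // Keep empty leading and trailing pieces so outer wildcards are honored.
        String[] parts = wildcardPattern.split("\\*", -1);
        StringBuilder regex = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) {
                regex.append(".*");
            }
            regex.append(Pattern.quote(parts[i]));
        }
        return Pattern.compile(regex.toString(), Pattern.CASE_INSENSITIVE)
                .matcher(value)
                .matches();
    }
}
```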

Sorting

Sorting matched results is an important part of interacting with the LexEVS API. Allowing users to plug in customized Sort algorithms makes LexEVS flexible enough to serve more groups of users. To implement a Sorting algorithm, a user must implement the org.LexGrid.LexBIG.Extensions.Extendable.Sort interface.

Class:

org.LexGrid.LexBIG.Extensions.Extendable.Sort

Method:

public <T> Comparator<T> getComparatorForSearchClass(Class<T> searchClass) throws LBParameterException

Description:

Given a Class that this Sort is valid for, return the correct Comparator to compare the results and sort.

Method:

public boolean isSortValidForClass(Class<?> clazz)

Description:

Return whether or not this Sort is valid for sorting on a given Class.

...

1: get: Sort requested by user
2: get: Context sort is being applied to
3: if: sort is not valid for Context
     halt
4: else:
5:   get: Class to be sorted on
6:   if: sort is not valid for Class
       halt
7:   get: Comparator for Sort - given (Class to be sorted on)
8:   sort results using Comparator for Sort
9: halt
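
The class check, comparator lookup, and sort from the steps above can be sketched in plain Java. SimpleSort is a local stand-in that mirrors the two methods of the Sort interface; TextSort and SortFlow are hypothetical, the Context check is omitted, and IllegalArgumentException replaces LBParameterException so that the example needs only the JDK.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Local stand-in mirroring the two methods of the Sort extension interface. */
interface SimpleSort {
    <T> Comparator<T> getComparatorForSearchClass(Class<T> searchClass);

    boolean isSortValidForClass(Class<?> clazz);
}

/** Hypothetical sort that orders Strings case-insensitively. */
final class TextSort implements SimpleSort {
    @Override
    public boolean isSortValidForClass(Class<?> clazz) {
        return String.class.equals(clazz);
    }

    @Override
    @SuppressWarnings("unchecked")
    public <T> Comparator<T> getComparatorForSearchClass(Class<T> searchClass) {
        if (!isSortValidForClass(searchClass)) {
            // The real interface declares LBParameterException; a JDK exception is used here.
            throw new IllegalArgumentException("Sort not valid for " + searchClass.getName());
        }
        // Safe cast: the check above guarantees that T is String.
        return (Comparator<T>) String.CASE_INSENSITIVE_ORDER;
    }
}

final class SortFlow {
    /** Validates the class, fetches the comparator, and returns a sorted copy. */
    static <T> List<T> apply(SimpleSort sort, Class<T> clazz, List<T> results) {
        if (!sort.isSortValidForClass(clazz)) {
            return results; // halt: leave the results in their original order
        }
        List<T> sorted = new ArrayList<>(results);
        sorted.sort(sort.getComparatorForSearchClass(clazz));
        return sorted;
    }
}
```

Returning a sorted copy instead of sorting in place is a design choice of this sketch; it keeps the caller's list untouched when the sort is rejected or applied.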

SQL Optimizations

The n+1 SELECTS Problem

The n+1 SELECTS Problem refers to how information can optimally be retrieved from the database, preferably using as few queries as possible. This is desirable because:

...

To avoid this, a JOIN query can be used.

The n+1 SELECTS Problem Example

Given two database tables, retrieve the Code, Name, and Qualifier for each Code

...

This sequence results in 1 Query to retrieve the data from the Codes table, and then n Queries from the Qualifiers table. This results in n+1 total Queries.
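
The query count can be made concrete with an in-memory stand-in for the database. Everything in this sketch is hypothetical: CountingDb, its table layout, and the SQL shown in the comments are assumptions used only to compare the n+1 access pattern with a single JOIN.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in for a database that counts how many queries are issued. */
final class CountingDb {
    // Hypothetical tables: Codes(code, name) and Qualifiers(code, qualifier).
    private final Map<String, String> codes = new LinkedHashMap<>();
    private final Map<String, String> qualifiers = new LinkedHashMap<>();
    int queryCount = 0;

    void insert(String code, String name, String qualifier) {
        codes.put(code, name);
        qualifiers.put(code, qualifier);
    }

    /** Models: SELECT code, name FROM Codes */
    List<String[]> selectCodes() {
        queryCount++;
        List<String[]> rows = new ArrayList<>();
        codes.forEach((code, name) -> rows.add(new String[] {code, name}));
        return rows;
    }

    /** Models: SELECT qualifier FROM Qualifiers WHERE code = ? */
    String selectQualifier(String code) {
        queryCount++;
        return qualifiers.get(code);
    }

    /** Models: SELECT c.code, c.name, q.qualifier FROM Codes c JOIN Qualifiers q ON q.code = c.code */
    List<String[]> selectJoined() {
        queryCount++;
        List<String[]> rows = new ArrayList<>();
        codes.forEach((code, name) -> rows.add(new String[] {code, name, qualifiers.get(code)}));
        return rows;
    }
}

final class NPlusOneDemo {
    /** One query for the codes, then one query per code: n+1 queries in total. */
    static List<String[]> loadWithNPlusOne(CountingDb db) {
        List<String[]> result = new ArrayList<>();
        for (String[] row : db.selectCodes()) {
            result.add(new String[] {row[0], row[1], db.selectQualifier(row[0])});
        }
        return result;
    }

    /** A single JOIN query returns the same rows. */
    static List<String[]> loadWithJoin(CountingDb db) {
        return db.selectJoined();
    }
}
```

For three codes, loadWithNPlusOne issues four queries while loadWithJoin issues one, and both return the same rows.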

The n+1 SELECTS Problem Example (Solution)

Given two database tables, retrieve the Code, Name, and Qualifier for each Code

...

  • The EntryState while building the CodedEntry.
  • The EntityDescription on AssociatedConcepts
  • AssociationQualifiers on AssociatedConcepts

Metathesaurus Content (RRF) Detailed Design

Loading NCI Metathesaurus RRF-formatted data into the LexGrid model requires a number of adjustments in order to accurately reflect the state of the data as it exists in the current RRF files.

Data Model Elements

Most data elements will be loaded as either properties or property qualifiers:

[Figure: property diagram]

A few will be loaded as qualifiers to associations.

Retrieval and API Documentation

No new API retrieval methods will be implemented in the scope of LexEVS 5.1. However, some may be required in the scope of 6.0 for any mapping elements implemented as new model elements or model extensions to LexGrid. No changes to user interfaces will occur. Service methods for loading these elements will be consistent with the new Spring Batch loader framework.

MRREL.RRF File

Problem:

REL and RELA column elements from the RRF source need to be connected.
Currently these are loaded as separate relationships, preventing the user from connecting to the REL/RELA combinations that actually occur in the NCI-META (e.g., RELA may be different for the same REL value in different sources).

...

Do not treat CUI1 = CUI2 relationships differently from CUI1 != CUI2 relationships. For API and query purposes, qualify the CUI1 = CUI2 relationships with a 'selfReferencing=true' Qualifier. In this way, we can still avoid cycles in the API, but maintain all relevant Qualifier information in the relation.
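
A minimal sketch of this rule, assuming a hypothetical RelationQualifiers helper that represents qualifiers as a simple name/value map (the actual LexGrid qualifier types are not used here):

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that builds the qualifier set for one MRREL row. */
final class RelationQualifiers {
    static Map<String, String> forRow(String cui1, String cui2, Map<String, String> rowQualifiers) {
        // Keep every qualifier from the row, whether or not the relation is self-referencing.
        Map<String, String> qualifiers = new LinkedHashMap<>(rowQualifiers);
        if (cui1.equals(cui2)) {
            // Mark the relation so that API and query code can avoid cycles.
            qualifiers.put("selfReferencing", "true");
        }
        return qualifiers;
    }
}
```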

MRSAT.RRF

Problem:

MRSAT.RRF is not loaded but only accessed for given preferred term algorithms. This data should be loaded as concept properties (STYPE=CUI), properties on properties (STYPE=AUI, SAUI, CODE, SCUI, SDUI), and qualifiers on associations (STYPE=RUI, SRUI). Some complexity may arise because concept properties can have additional qualifiers, but property-properties and association-qualifiers cannot.
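
The STYPE-to-target mapping listed above can be written as a small lookup. The MrsatTarget enum and its constant names are hypothetical; only the STYPE values and their groupings come from the text.

```java
/** Where an MRSAT row would be loaded, based on its STYPE value. */
enum MrsatTarget {
    CONCEPT_PROPERTY,
    PROPERTY_ON_PROPERTY,
    ASSOCIATION_QUALIFIER;

    static MrsatTarget forStype(String stype) {
        switch (stype) {
            case "CUI":
                return CONCEPT_PROPERTY;
            case "AUI":
            case "SAUI":
            case "CODE":
            case "SCUI":
            case "SDUI":
                return PROPERTY_ON_PROPERTY;
            case "RUI":
            case "SRUI":
                return ASSOCIATION_QUALIFIER;
            default:
                // Values not listed in the design are rejected instead of guessed.
                throw new IllegalArgumentException("Unhandled STYPE: " + stype);
        }
    }
}
```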

...

SUPPRESS - load as propertyQualifier if value != N

MRRANK.RRF

Problem:

SAB-specific ranking of representational form in MRRANK is not exposed to the user (it is used in an underlying ranking and in specifying preferred presentations for a given concept).

...

Available in the current LexEVS API.

MRSAB.RRF

Problem:

MRSAB.RRF file data is not loaded or is otherwise unavailable to the user.

...

The entire content of each row of the MRSAB file is loaded as metadata to an external XML file, with tags created from the column names and each value inserted between its tags.
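
A minimal sketch of this row-to-tags conversion, assuming pipe-delimited RRF input. The RrfRowToXml class, the wrapping row element, and the output layout are hypothetical; the actual metadata file format is not specified here.

```java
/** Hypothetical converter from one pipe-delimited RRF row to XML elements named after the columns. */
final class RrfRowToXml {
    static String toXml(String rowElement, String[] columns, String rrfLine) {
        // RRF fields are separated by '|'; the -1 limit keeps empty trailing fields.
        String[] values = rrfLine.split("\\|", -1);
        StringBuilder xml = new StringBuilder();
        xml.append('<').append(rowElement).append('>');
        for (int i = 0; i < columns.length; i++) {
            String value = i < values.length ? values[i] : "";
            xml.append('<').append(columns[i]).append('>')
               .append(escape(value))
               .append("</").append(columns[i]).append('>');
        }
        xml.append("</").append(rowElement).append('>');
        return xml.toString();
    }

    /** Escapes the characters that cannot appear literally in XML text content. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }
}
```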

MRMAP.RRF, MRSMAP.RRF

Problem:

MRMAP.RRF source load is not supported in current load. Currently this RRF file is not populated in NCI Metathesaurus distributions. Mapping is not explicitly supported in the LexGrid Model.

...

To be evaluated for a load to current model elements or possible new model mapping elements. The general agreement is that this is more appropriately implemented in 6.0.

MRHIER.RRF

Problem:

HCD is loaded as a property on the presentation, but the SAB isn't associated with it, so we do not know the source of the HCD. (Only look at rows that have the HCD field populated.)
Path to Root (PTR) is also not loaded, but is instead used to determine path to root operations in LexEVS.

...

Load the SAB field associated with HCD as a property qualifier when HCD is present. Load PTR as a property.
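
The rule can be sketched as follows. HierProperty and MrhierRowLoader are hypothetical, and using the column names PTR, HCD, and SAB as property and qualifier names is an assumption of this example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical property holder: a name, a value, and optional qualifiers. */
final class HierProperty {
    final String name;
    final String value;
    final Map<String, String> qualifiers = new LinkedHashMap<>();

    HierProperty(String name, String value) {
        this.name = name;
        this.value = value;
    }
}

final class MrhierRowLoader {
    /** Builds the properties for one MRHIER row from its SAB, PTR, and HCD column values. */
    static List<HierProperty> load(String sab, String ptr, String hcd) {
        List<HierProperty> properties = new ArrayList<>();
        if (ptr != null && !ptr.isEmpty()) {
            properties.add(new HierProperty("PTR", ptr));
        }
        // Only rows with the HCD field populated produce an HCD property.
        if (hcd != null && !hcd.isEmpty()) {
            HierProperty hcdProperty = new HierProperty("HCD", hcd);
            hcdProperty.qualifiers.put("SAB", sab); // records the source of the HCD
            properties.add(hcdProperty);
        }
        return properties;
    }
}
```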

MRDOC.RRF

Problem:

MRDOC contains metadata unavailable to the user. It is not loaded by LexEVS.

...

MRDOC's column names and content will be processed as tag/value mappings to a metadata file.

MRDEF.RRF

Problem:

Some values from each row are not loaded by LexEVS.

...

The ATUI, SUPPRESS, CVF, and SATUI column values will be loaded as property qualifiers on the Definition type property derived from MRDEF.

MRCONSO.RRF

Problem:

Some elements from the columns of MRCONSO.RRF are not loaded by LexEVS.

...

All noted values will be loaded as property qualifiers.

Value Domain Support Detailed Design

The LexEVS Value Domain and Pick List services will provide the ability to load Value Domain and Pick List Definitions into the LexGrid repository, to apply user restrictions, and to dynamically resolve the definitions at run time. Both the Value Domain and Pick List services are an integrated part of the LexEVS core API.

...

 

listValueDomains(String valueDomainName)

Description:

Return the URIs for the value domain definition(s) for the supplied domain name. If the name is null, returns everything. If the name is not null, returns the value domain(s) that have the assigned name.
Note: plural because there is no guarantee of valueDomain uniqueness. If the name is the empty string "", returns all unnamed valueDomains.
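
The three lookup cases (null, empty string, and a given name) can be illustrated with an in-memory stand-in. ValueDomainIndex and its register method are hypothetical and only model the behavior described above; they are not the LexEVS implementation.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** In-memory stand-in showing the null / empty / named lookup rules. */
final class ValueDomainIndex {
    // Hypothetical registry: definition URI -> name ("" when the definition has no name).
    private final Map<URI, String> namesByUri = new LinkedHashMap<>();

    void register(URI uri, String name) {
        namesByUri.put(uri, name == null ? "" : name);
    }

    URI[] listValueDomains(String valueDomainName) {
        List<URI> matches = new ArrayList<>();
        for (Map.Entry<URI, String> entry : namesByUri.entrySet()) {
            // null selects everything; "" selects unnamed definitions; otherwise match by name.
            if (valueDomainName == null || entry.getValue().equals(valueDomainName)) {
                matches.add(entry.getKey());
            }
        }
        return matches.toArray(new URI[0]);
    }
}
```

Because names are not guaranteed to be unique, a named lookup returns an array that may hold more than one URI.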

Input:

java.lang.String

Output:

java.net.URI[]

Exception:

org.LexGrid.LexBIG.Exceptions.LBException

Implementation Details:

Implementation:
Step 1: Call this method on the associated LexEVS Value Domain Service instance to get the list of Value Domain URIs that match the supplied name.

Sample Call:
Step 1: Using the LexBIGService instance, get the LexEVSValueDomainServices interface:
org.lexgrid.valuedomain.LexEVSValueDomainServices vds = lbs.getValueDomainService();
Step 2: Call the listValueDomains method:
URI[] uris = vds.listValueDomains("someValueDomainName");

...

 

getAllValueDomainsWithNoNames()

Description:

Return the URIs of all unnamed value domain definition(s).

Input:

none

Output:

java.net.URI[]

Exception:

org.LexGrid.LexBIG.Exceptions.LBException

Implementation Details:

Implementation:
Step 1: Call this method on the associated LexEVS Value Domain Service instance to get the list of Value Domain URIs that have no names.

Sample Call:
Step 1: Using the LexBIGService instance, get the LexEVSValueDomainServices interface:
org.lexgrid.valuedomain.LexEVSValueDomainServices vds = lbs.getValueDomainService();
Step 2: Call the getAllValueDomainsWithNoNames method:
URI[] uris = vds.getAllValueDomainsWithNoNames();

...

 

getPickListDefinitionsForDomain(URI valueDomainURI)

Description:

Returns all the pickList definitions that represent the supplied valueDomain URI.

Input:

java.net.URI

Output:

org.LexGrid.emf.valueDomains.PickListDefinition[]

Exception:

org.LexGrid.LexBIG.Exceptions.LBException

Implementation Details:

Implementation:
Step 1: Call this method on the associated LexEVS Pick List Service instance to get all the Pick List Definitions that represent the supplied Value Domain URI.

Sample Call:
Step 1: Using the LexBIGService instance, get the LexEVSPickListServices interface:
org.lexgrid.valuedomain.LexEVSPickListServices pls = lbs.getPickListService();
Step 2: Call the getPickListDefinitionsForDomain method:
PickListDefinition[] plDefs = pls.getPickListDefinitionsForDomain(valueDomainURI);

...

resolvePickListForTerm(String pickListId, String term, String matchAlgorithm, String language, String[] context, boolean sortByText)

Description:

Resolves the pickList definition by applying the supplied arguments.

Input:

java.lang.String,
java.lang.String,
java.lang.String,
java.lang.String,
java.lang.String[],
boolean

Output:

org.lexgrid.valuedomain.dto.ResolvedPickListEntryList

Exception:

org.LexGrid.LexBIG.Exceptions.LBException

Implementation Details:

Implementation:
Step 1: Call this method on the associated LexEVS Pick List Service instance to get the list of Pick List Entries that match the supplied term and meet the other supplied restrictions.

Sample Call:
Step 1: Using the LexBIGService instance, get the LexEVSPickListServices interface:
org.lexgrid.valuedomain.LexEVSPickListServices pls = lbs.getPickListService();
Step 2: Call the resolvePickListForTerm method:
ResolvedPickListEntryList pleList = pls.resolvePickListForTerm("AUTO:DomesticAutoMakers", "Jaguar", MatchAlgorithms.exactMatch.name(), "en", null, true);

...