Non Leading Wild Card Literal Substring Implementation Details

A search based on "*some sub-string here*" functions much like the Java String.indexOf method. Single-term searches will match '*term' and 'term*' but not '*term*'. This is because leading wildcards are very inefficient: a leading wildcard forces Lucene to examine every term in the field. Special characters are included.

Algorithm:

The Non Leading Wild Card Literal Substring search has the following characteristics:

  • This search is case-insensitive.
  • It searches on the literal property value and the literal reverse property value.
  • A trailing wild card is applied to the literal property value and the literal reverse property value.
  • The literal property part of the query (without the wild cards) is boosted by 50. This gives an exact literal match priority over wild card matches.
  • Parsing is done with the following analyzer:

    • literal_propertyValue - Uses our custom literal analyzer.  This literal analyzer uses Lucene's WhitespaceTokenizer with Lucene's LowerCaseFilter.
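
The characteristics above can be sketched as a small query-string builder. This is a minimal illustration only, not the LexEVS implementation: the class name and the buildQuery method are hypothetical, the field names and boost value are taken from the example queries on this page, and the sketch assumes a single search term with no characters that need Lucene escaping.

```java
import java.util.Locale;

// Hypothetical helper, not part of LexEVS: rebuilds the query string
// shown in the examples on this page for a single search term.
final class NonLeadingWildcardQuerySketch {

    // Field names and boost as they appear in the example Lucene queries.
    static final String FIELD = "literal_propertyValue";
    static final String REVERSE_FIELD = "literal_reverse_propertyValue";
    static final float BOOST = 50.0f;

    static String buildQuery(String searchText) {
        // Case-insensitive search: lower-case the term, as the LowerCaseFilter would.
        String term = searchText.trim().toLowerCase(Locale.ROOT);

        // The reversed term lets a trailing wild card on the reverse field
        // stand in for a leading wild card on the forward field.
        String reversed = new StringBuilder(term).reverse().toString();

        // Required clause: trailing wild card on either field.
        // Optional clause: the bare term, boosted so literal matches rank first.
        return "+(" + FIELD + ":" + term + "* "
                + REVERSE_FIELD + ":" + reversed + "*) "
                + FIELD + ":" + term + "^" + BOOST;
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("grap"));
        System.out.println(buildQuery("rap"));
    }
}
```

Reversing the term is what avoids a leading wild card: a value ending in the term becomes a reversed value starting with the reversed term, which a trailing wild card can match.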

     

Example of use:

The following examples are based on the Automobiles coding scheme.

Example 1:

Search string: grap

Lucene query: +(literal_propertyValue:grap* literal_reverse_propertyValue:parg*) literal_propertyValue:grap^50.0

Result: 1 result

  • entity code: NoRelationsConcept
  • entity description: A concept for testing Graph Building on Concepts with no relations

Example 2:

Search string: rap

Lucene query: +(literal_propertyValue:rap* literal_reverse_propertyValue:par*) literal_propertyValue:rap^50.0

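Result: 0 results

No indexed term begins with "rap" and no reversed term begins with "par", so neither wild card clause matches. The term "rap" occurs only in the middle of "Graph", which would require '*rap*'.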
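
The difference between the two examples can be reproduced with plain Java string operations. The sketch below is an approximation under stated assumptions, not LexEVS or Lucene code: the class and method names are hypothetical, tokens are split on whitespace and lower-cased to mimic the literal analyzer, the reverse field is assumed to hold each token reversed, and only the entity description from Example 1 is used as sample data.

```java
import java.util.Locale;

// Hypothetical simulation, not LexEVS code: approximates which values the
// required wild card clause would accept, using whitespace tokens.
final class NonLeadingWildcardMatchSketch {

    static boolean matches(String propertyValue, String searchText) {
        String term = searchText.trim().toLowerCase(Locale.ROOT);
        String reversedTerm = new StringBuilder(term).reverse().toString();

        for (String token : propertyValue.trim().split("\\s+")) {
            String lower = token.toLowerCase(Locale.ROOT);
            String reversedToken = new StringBuilder(lower).reverse().toString();

            // Forward field with trailing wild card: token starts with the term.
            // Reverse field with trailing wild card: token ends with the term.
            if (lower.startsWith(term) || reversedToken.startsWith(reversedTerm)) {
                return true;
            }
        }
        // The term may still occur mid-token, but that would need '*term*'.
        return false;
    }

    public static void main(String[] args) {
        String description =
                "A concept for testing Graph Building on Concepts with no relations";
        System.out.println(matches(description, "grap")); // prefix of "graph"
        System.out.println(matches(description, "rap"));  // only inside "graph"
    }
}
```

Under these assumptions "grap" is accepted because it is a prefix of "graph", while "rap" is rejected because it is neither a prefix nor a suffix of any token.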

Associated JUnits:

The JUnit tests can be found here: https://github.com/lexevs/lexevs/blob/master/lbTest/src/test/java/org/LexGrid/LexBIG/Impl/function/query/lucene/searchAlgorithms/TestSubStringNonLeadingWildcardLiteralSubString.java

 

 
