NIH | National Cancer Institute | NCI Wiki  

Versions Compared

Key

  • This line was added.
  • This line was removed.
  • Formatting was changed.
Comment: Migration of unmigrated content due to installation of a new plugin
Scrollbar
iconsfalse

...

Panel
titleContents of this Page
Table of Contents
minLevel2

...

The same as the substring search but with special characters enabled.

Algorithm:

The contains Literal Substring search has the following characteristics:

  • This search is case in-sensitive. 
  • It only searches on the literal property value , and the literal reverse property value.  
  • A leading and trailing wild card is added to the token in the search text.
  • The literal property part (without the wild cards) of the query is boosted by 50.  This gives a literal match priority.
  • Parsing is done with

    Lucene's StandardAnalyzer.

    the following analyzers:

    • literal_propertyValue - Uses our custom literal analyzer.  This literal analyzer uses Lucene's WhitespaceTokenizer with Lucene's LowerCaseFilter.

    • propertyValue - Uses our custom standard analyzer that has no stop words.

     

Example of use:

The following examples are based on the Automobiles coding scheme.

Example 1:

Search string: a^s

Lucene query: +literal_propertyValue:*a^s* literal_propertyValue:a^s^50.0Complete query:

...

:

...

+literal_propertyValue:*a^s* literal_propertyValue:a^s^50.0

...

Result: 1 result

  • entity code: SpecialCharactersConcept
  • entity description: Concept containing special characters

...

Lucene query: +spanNear([mask(spanWildcardQuery(literal_reverse_propertyValue:s^a*)) as propertyValue, mask(spanWildcardQuery(literal_propertyValue:sp*cial*)) as propertyValue], 0, true) ((+literal_propertyValue:a^s +literal_propertyValue:sp*cial)^50.0)

Complete query:

...

)

Result: 1 result

  • entity code: SpecialCharactersConcept
  • entity description: Concept containing special characters

...