NIH | National Cancer Institute | NCI Wiki  

Comment: Migration of unmigrated content due to installation of a new plugin

...


...

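When multiple terms are searched, the first term becomes a spanWildcardQuery on the reverse_propertyValue field with a trailing wildcard; the term is reversed to match that field (for example, graph becomes hparg*), which has the effect of a leading wildcard on the original term.  The middle terms are searched as exact propertyValue terms.  The last term becomes a spanWildcardQuery on propertyValue with a trailing wildcard.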
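The term-to-clause mapping described above can be sketched in a few lines of Python. This is only an illustration of the mapping, not the LexEVS implementation: the function name `span_clauses` is hypothetical, and splitting on whitespace stands in for the real analyzer.

```python
def span_clauses(search_text):
    """Map each search term to a (field, pattern) pair for the span query.

    Hypothetical helper: whitespace splitting approximates the analyzer.
    """
    terms = search_text.lower().split()
    if len(terms) < 2:
        raise ValueError("this sketch covers the multi-term case only")
    first, *middle, last = terms
    # First term: reversed, trailing wildcard, on the reversed field.
    clauses = [("reverse_propertyValue", first[::-1] + "*")]
    # Middle terms: exact matches on propertyValue.
    clauses += [("propertyValue", term) for term in middle]
    # Last term: trailing wildcard on propertyValue.
    clauses.append(("propertyValue", last + "*"))
    return clauses


for field, pattern in span_clauses("graph building on"):
    print(f"{field}:{pattern}")
```

For the terms graph, building, on, this prints `reverse_propertyValue:hparg*`, `propertyValue:building`, and `propertyValue:on*`, one per line.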

Algorithm:

The contains Substring search has the following characteristics:

  • This search is case-insensitive.
  • It searches only on the property value and the literal property value.
  • The literal property part of the query is boosted by 50.  This gives a literal match priority.
  • It performs a wildcardQuery.
  • The search text is lowercased and special characters are removed when the query parser parses it.
  • Parsing is done with the following analyzers:

    • propertyValue - Uses our custom standard analyzer that has no stop words.

    • literal_propertyValue - Uses our custom literal analyzer.  This literal analyzer uses Lucene's WhitespaceTokenizer with Lucene's StandardAnalyzer and LowerCaseFilter.
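As a rough illustration of the characteristics listed above, the following Python sketch assembles the printed form of a single-term query: a required wildcard clause on propertyValue plus an optional literal clause boosted by 50. The helper names and the exact definition of "special characters" are assumptions made for this sketch; the real query parser may normalize differently, and it may not treat both fields the same way.

```python
import re

LITERAL_BOOST = 50.0  # boost value stated in the characteristics list


def normalize(term):
    """Lowercase the term and drop special characters.

    Assumption: "special" means anything that is not a letter, digit,
    or underscore. The actual parser rule is not documented here.
    """
    return re.sub(r"[^\w]", "", term.lower())


def single_term_query(term):
    """Build the printed form of a one-term contains query (hypothetical helper)."""
    cleaned = normalize(term)
    wildcard = f"+propertyValue:*{cleaned}*"  # required substring match
    literal = f"literal_propertyValue:{cleaned}^{LITERAL_BOOST}"  # optional, boosted
    return f"{wildcard} {literal}"


print(single_term_query("Graph"))
```

Because the literal clause is optional but heavily boosted, a property value that matches the whole term literally scores above one that only matches through the wildcard.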

     

Example of use:

The following examples are based on the Automobiles coding scheme.

...

Example 1:
Search string: graph

Lucene query: +propertyValue:*graph* literal_propertyValue:graph^50.0

...

Result: 1 result

  • entity code: NoRelationsConcept
  • entity description: A concept for testing Graph Building on Concepts with no relations

...

Lucene query: +spanNear([mask(spanWildcardQuery(reverse_propertyValue:hparg*)) as propertyValue, mask(propertyValue:building) as propertyValue, mask(spanWildcardQuery(propertyValue:on*)) as propertyValue], 0, true) ((+literal_propertyValue:graph +literal_propertyValue:building +literal_propertyValue:on)^50.0)


Result: 1 result

  • entity code: NoRelationsConcept
  • entity description: A concept for testing Graph Building on Concepts with no relations
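To show how the pieces of the multi-term Lucene query above fit together, here is a small Python sketch that renders the same printed form from a search text. It only builds a string; it does not call Lucene. The function name is hypothetical, whitespace splitting stands in for the analyzer, and the search text `graph building on` is inferred from the terms visible in the query.

```python
def multi_term_query_string(search_text, boost=50.0):
    """Render the printed form of a multi-term contains query (illustrative only)."""
    terms = search_text.lower().split()
    if len(terms) < 2:
        raise ValueError("this sketch covers the multi-term case only")
    first, *middle, last = terms

    # Every span clause is masked as propertyValue so the clauses share one field.
    spans = [
        f"mask(spanWildcardQuery(reverse_propertyValue:{first[::-1]}*)) as propertyValue"
    ]
    spans += [f"mask(propertyValue:{term}) as propertyValue" for term in middle]
    spans.append(f"mask(spanWildcardQuery(propertyValue:{last}*)) as propertyValue")

    # Literal clause: every term required, whole group boosted.
    literal = " ".join(f"+literal_propertyValue:{term}" for term in terms)

    # "0, true" is slop 0 with in-order matching: terms must be adjacent and ordered.
    return f"+spanNear([{', '.join(spans)}], 0, true) (({literal})^{boost})"


print(multi_term_query_string("graph building on"))
```

The `0, true` arguments are what make this a phrase-like match: the terms must appear next to each other, in the order given.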

...

Example 3:
Search string: ncept for testing graph

Lucene query: +spanNear([mask(spanWildcardQuery(reverse_propertyValue:tpecn*)) as propertyValue, mask(propertyValue:for) as propertyValue, mask(propertyValue:testing) as propertyValue, mask(spanWildcardQuery(propertyValue:graph*)) as propertyValue], 0, true) ((+literal_propertyValue:ncept +literal_propertyValue:for +literal_propertyValue:testing +literal_propertyValue:graph)^50.0)


Result: 1 result

  • entity code: NoRelationsConcept
  • entity description: A concept for testing Graph Building on Concepts with no relations
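The reason these search strings all hit the same entity can be checked with a plain-Python emulation of the matching rule: the first term must end a token, the middle terms must equal the following tokens exactly, and the last term must begin the next token, with all tokens adjacent and in order. This is a simplified model under stated assumptions, not the Lucene scoring path; `tokens` is a rough stand-in for the analyzer and `contains_match` is a hypothetical name.

```python
import re


def tokens(text):
    # Rough analyzer stand-in: lowercase, split on non-word characters,
    # and keep every token (no stop words are removed).
    return re.findall(r"\w+", text.lower())


def contains_match(property_value, search_text):
    """Emulate the contains search against one property value (illustrative)."""
    doc = tokens(property_value)
    query = tokens(search_text)
    if not query:
        return False
    if len(query) == 1:
        # Single term: behaves like *term* against any token.
        return any(query[0] in tok for tok in doc)
    first, *middle, last = query
    size = len(query)
    for start in range(len(doc) - size + 1):
        window = doc[start:start + size]
        if (window[0].endswith(first)            # reversed field + trailing wildcard
                and window[1:-1] == middle       # exact middle terms
                and window[-1].startswith(last)):  # trailing wildcard
            return True
    return False


description = "A concept for testing Graph Building on Concepts with no relations"
print(contains_match(description, "ncept for testing graph"))
print(contains_match(description, "testing building"))
```

The first call prints `True` because "concept" ends with "ncept" and the remaining terms line up with adjacent tokens; the second prints `False` because "testing" and "building" are not adjacent in the description.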

Associated JUnits:

JUnit tests for the contains search can be found here: https://github.com/lexevs/lexevs/blob/master/lbTest/src/test/java/org/LexGrid/LexBIG/Impl/function/query/lucene/searchAlgorithms/TestSubString.java