Contains Algorithm Implementation Details
Equivalent to 'term*': each term gets a trailing wildcard but no leading wildcard, and a term can match at any position in the property value. Searches on the property value only.
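To make the matching behavior concrete, here is a minimal Java sketch of the semantics described above. It does not use Lucene and is not the LexEVS implementation; the class and method names (ContainsMatchSketch, tokens, matches) are hypothetical, and the regex-based tokenizer is only a rough stand-in for real analysis. It assumes every search token must be a prefix of some token in the property value.

```java
import java.util.Locale;

// Hypothetical illustration of 'term*' semantics; not LexEVS code.
final class ContainsMatchSketch {

    // Rough stand-in for analysis: lowercase, then split on anything
    // that is not a letter or digit.
    static String[] tokens(String text) {
        String cleaned = text.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{N}]+", " ")
                .trim();
        return cleaned.isEmpty() ? new String[0] : cleaned.split(" ");
    }

    // True when every search token is a prefix of at least one token
    // in the property value, regardless of that token's position.
    static boolean matches(String searchText, String propertyValue) {
        String[] searchTokens = tokens(searchText);
        String[] valueTokens = tokens(propertyValue);
        if (searchTokens.length == 0) {
            return false;
        }
        for (String s : searchTokens) {
            boolean found = false;
            for (String v : valueTokens) {
                if (v.startsWith(s)) { // trailing wildcard only
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }
}
```

Under these assumptions, matches("automob", "Automobile") is true, while matches("mobile", "Automobile") is false because there is no leading wildcard. matches("motors", "General Motors") is true because the term may appear at any position.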
Algorithm:
The contains search has the following characteristics:
- It only searches on the property value.
- A trailing wildcard is added to every token in the search text.
- The search text is lowercased and special characters are removed when the query parser parses it.
- The literal property part of the query is boosted by 50. This gives a literal match priority.
- Parsing is done with Lucene's StandardAnalyzer.
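The characteristics above can be sketched as plain string building. This is a hedged illustration only: it does not call Lucene or LexEVS, the class and method names (ContainsQuerySketch, tokenize, buildQuery) are hypothetical, and the regex tokenizer only approximates what StandardAnalyzer does. The output shape is copied from the Lucene queries shown in the examples in this section.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch that reproduces the query strings shown in the examples.
final class ContainsQuerySketch {
    static final String FIELD = "propertyValue";
    static final String LITERAL_FIELD = "literal_propertyValue";
    static final String BOOST = "50.0";

    // Assumption: lowercasing plus stripping non-alphanumerics stands in
    // for the real analyzer and query parser behavior.
    static List<String> tokenize(String searchText) {
        String cleaned = searchText.toLowerCase(Locale.ROOT)
                .replaceAll("[^\\p{L}\\p{N}]+", " ")
                .trim();
        List<String> tokens = new ArrayList<>();
        for (String t : cleaned.split("\\s+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    static String buildQuery(String searchText) {
        List<String> tokens = tokenize(searchText);
        if (tokens.isEmpty()) {
            return "";
        }
        if (tokens.size() == 1) {
            String t = tokens.get(0);
            // Required wildcard clause plus an optional boosted literal clause.
            return "+" + FIELD + ":" + t + "* " + LITERAL_FIELD + ":" + t + "^" + BOOST;
        }
        StringBuilder wildcard = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (String t : tokens) {
            if (wildcard.length() > 0) {
                wildcard.append(' ');
                literal.append(' ');
            }
            wildcard.append('+').append(FIELD).append(':').append(t).append('*');
            literal.append('+').append(LITERAL_FIELD).append(':').append(t);
        }
        // Wildcard group, then the literal group boosted by 50.
        return "(" + wildcard + ") ((" + literal + ")^" + BOOST + ")";
    }
}
```

Calling ContainsQuerySketch.buildQuery("automob") and ContainsQuerySketch.buildQuery("General Motors") yields the same query strings as Example 1 and Example 2.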
Description of Algorithm:
Example of use:
The following examples are based on the Automobiles coding scheme.
Example 1:
Search string: automob
Lucene query: +propertyValue:automob* literal_propertyValue:automob^50.0
Result: 1 result
- entity code: A0001
- entity description: Automobile
Example 2:
Search string: General Motors
Lucene query: (+propertyValue:general* +propertyValue:motors*) ((+literal_propertyValue:general +literal_propertyValue:motors)^50.0)
Result: 1 result
- entity code: GM
- entity description: General Motors
Implementation Details:
Associated JUnits:
JUnit tests for the contains search can be found here: https://github.com/lexevs/lexevs/blob/master/lbTest/src/test/java/org/LexGrid/LexBIG/Impl/function/query/lucene/searchAlgorithms/TestContains.java