Spelling Error Tolerant Substring Search

Spelling Error Tolerant Substring Implementation Details

This search adds spelling-error tolerance to the 'subString' search. It queries the double metaphone indexed value in addition to the literal property value.

The Spelling Error Tolerant Substring search has the following characteristics:

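This search is case-insensitive.
It searches on both the double metaphone property value and the literal property value.
The literal property part of the query (without the wildcards) is boosted by 0.5. Because this clause is scored in addition to the double metaphone clause, a value that matches literally as well as phonetically ranks ahead of one that matches only phonetically.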
Parsing is done with Lucene's StandardAnalyzer.
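
As a rough illustration of the characteristics above, the following Python sketch assembles the two-clause query string shown in the examples below. It is not the actual implementation: the function name build_spelling_tolerant_query is hypothetical, the double metaphone codes are passed in precomputed (the Python standard library has no double metaphone encoder), and lowercasing plus whitespace collapsing is only an approximation of what StandardAnalyzer does. The field names dm_propertyValue and literal_propertyValue are taken from the queries on this page.

```python
def build_spelling_tolerant_query(search_text, metaphone_codes, literal_boost=0.5):
    """Hypothetical helper: assemble the two-clause Lucene query string."""
    # Remove wildcard characters from the literal part of the query.
    literal = search_text.replace("*", "").replace("?", "")
    # Assumption: approximate analyzer normalization by lowercasing
    # and collapsing runs of whitespace.
    literal = " ".join(literal.lower().split())
    # The double metaphone codes are searched as a single phrase.
    dm_phrase = " ".join(metaphone_codes)
    return (
        f'dm_propertyValue:"{dm_phrase}" '
        f'literal_propertyValue:"{literal}"^{literal_boost}'
    )


print(build_spelling_tolerant_query("car", ["KR"]))
print(build_spelling_tolerant_query("General Motors", ["JNRL", "KNRL", "MTRS"]))
```

Passing the codes in as an argument keeps the sketch independent of any particular phonetic encoder.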

The following examples are based on the Automobiles coding scheme.

Example 1:

Search string: car

Lucene query: dm_propertyValue:"KR" literal_propertyValue:"car"^0.5

Complete query:

+*:* +(entityType:concept)
+*:* +isAnonymous:F
+*:* +(dm_propertyValue:"KR" literal_propertyValue:"car"^0.5) +isPreferred:T +(propertyType:presentation)

Result: 2 results
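
The complete query above wraps the text query in a fixed set of restrictions (entity type, anonymous flag, preferred flag, property type). The Python sketch below simply reproduces those three lines for a given text query. The function name complete_query_lines is hypothetical, and treating the three lines as separate restrictions is an assumption based only on how they are printed here.

```python
def complete_query_lines(text_query):
    """Hypothetical helper: reproduce the three 'Complete query' lines."""
    return [
        # Restrict results to concept entities.
        "+*:* +(entityType:concept)",
        # Exclude anonymous entities.
        "+*:* +isAnonymous:F",
        # Require the text query on preferred presentation properties.
        f"+*:* +({text_query}) +isPreferred:T +(propertyType:presentation)",
    ]


text_query = 'dm_propertyValue:"KR" literal_propertyValue:"car"^0.5'
for line in complete_query_lines(text_query):
    print(line)
```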

Example 2:

Search string: General Motors

Lucene query: dm_propertyValue:"JNRL KNRL MTRS" literal_propertyValue:"general motors"^0.5

Complete query:

+*:* +(entityType:concept)
+*:* +isAnonymous:F
+*:* +(dm_propertyValue:"JNRL KNRL MTRS" literal_propertyValue:"general motors"^0.5) +isPreferred:T +(propertyType:presentation)

Result: 1 result