Weighted Double Metaphone Implementation Details
Searches using the Lucene query syntax combined with a 'sounds like' algorithm. The exact text entered by the user is also taken into account, so a correctly spelled match ranks ahead of matches that only sound alike. This search uses the same indexed double metaphone property value as the other double metaphone search.
Algorithm:
The Weighted Double Metaphone search has the following characteristics:
- This search is case-insensitive.
- It searches on both the double metaphone property value (dm_propertyValue) and the literal property value (propertyValue).
- Preference is given to matches with the correct spelling.
- Parsing is done with Lucene's StandardAnalyzer.
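The characteristics above can be sketched as a small query builder. This is an illustrative sketch only, not the LexEVS implementation: the class name WeightedDmQuerySketch and the method buildQuery are hypothetical, the encoder argument stands in for a real double metaphone encoder, and only a single search term with no Lucene special characters is handled. The field names are the ones that appear in the example queries on this page.

```java
import java.util.Locale;
import java.util.function.Function;

public class WeightedDmQuerySketch {
    // Field names as they appear in the example queries in this document.
    static final String DM_FIELD = "dm_propertyValue";
    static final String TEXT_FIELD = "propertyValue";

    /**
     * Builds a two-clause query string for a single search term.
     * The encoder is a placeholder (assumption) for a double metaphone encoder.
     */
    static String buildQuery(String term, Function<String, String> encoder) {
        // Lower-casing here mimics the case-insensitive behaviour described above.
        String normalized = term.trim().toLowerCase(Locale.ROOT);
        String code = encoder.apply(normalized);
        // "+" marks the phonetic clause as required; the literal clause is
        // optional and only contributes to the score when it matches.
        return "+" + DM_FIELD + ":" + code + " " + TEXT_FIELD + ":" + normalized;
    }

    public static void main(String[] args) {
        // Stub encoder that always returns the code used in the examples on this page.
        Function<String, String> stubEncoder = s -> "KR";
        System.out.println(buildQuery("Car", stubEncoder));
    }
}
```

The required phonetic clause decides which entries match at all, while the optional literal clause only affects ordering. That split is what lets a correctly spelled entry rise above entries that merely sound the same.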
Examples of use:
The following examples are based on the Automobiles coding scheme.
Example 1:
Search string: car
Lucene query: +dm_propertyValue:KR propertyValue:car
Complete query:
- +*:* +(entityType:concept)
- +*:* +isAnonymous:F
- +*:* +(+dm_propertyValue:KR propertyValue:car) +isPreferred:T +(propertyType:presentation)
Result: 2 results
- Result 1:
  - entity code: C0001
  - entity description: Car
- Result 2:
  - entity code: C0002
  - entity description: Kar
Example 2:
Search string: kar
Lucene query: +dm_propertyValue:KR propertyValue:kar
Complete query:
- +*:* +(entityType:concept)
- +*:* +isAnonymous:F
- +*:* +(+dm_propertyValue:KR propertyValue:kar) +isPreferred:T +(propertyType:presentation)
Result: 2 results
- Result 1:
  - entity code: C0002
  - entity description: Kar
- Result 2:
  - entity code: C0001
  - entity description: Car
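The ordering difference between Example 1 and Example 2 can be illustrated with a toy in-memory ranking. This is a simplified sketch and does not reproduce Lucene's scoring: the class WeightedRankingSketch, its Entry type, and the integer scores are hypothetical, and the metaphone codes are hard-coded for illustration rather than computed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class WeightedRankingSketch {
    // Toy index entry: entity code, description, and a precomputed metaphone code.
    static final class Entry {
        final String code;
        final String description;
        final String dmCode;

        Entry(String code, String description, String dmCode) {
            this.code = code;
            this.description = description;
            this.dmCode = dmCode;
        }
    }

    // Toy score: 2 when the spelling matches exactly (ignoring case), otherwise 1.
    static int score(Entry e, String normalizedTerm) {
        return e.description.toLowerCase(Locale.ROOT).equals(normalizedTerm) ? 2 : 1;
    }

    // Returns the entity codes of matching entries, best match first.
    static List<String> search(List<Entry> index, String term, String termDmCode) {
        String normalized = term.toLowerCase(Locale.ROOT);
        List<Entry> hits = new ArrayList<>();
        for (Entry e : index) {
            // Required clause: the phonetic code must match.
            if (e.dmCode.equals(termDmCode)) {
                hits.add(e);
            }
        }
        // Optional clause: exact spelling raises the score. The sort is stable,
        // so entries with equal scores keep their index order.
        hits.sort((a, b) -> Integer.compare(score(b, normalized), score(a, normalized)));
        List<String> codes = new ArrayList<>();
        for (Entry e : hits) {
            codes.add(e.code);
        }
        return codes;
    }

    public static void main(String[] args) {
        List<Entry> index = Arrays.asList(
                new Entry("C0001", "Car", "KR"),
                new Entry("C0002", "Kar", "KR"),
                new Entry("C0003", "Truck", "TRK"));
        System.out.println(search(index, "car", "KR")); // [C0001, C0002]
        System.out.println(search(index, "kar", "KR")); // [C0002, C0001]
    }
}
```

Both searches return the same two entries because both terms share the code KR. Only the order changes, depending on which description matches the typed spelling.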
Associated JUnits:
The JUnit tests can be found here: https://github.com/lexevs/lexevs/blob/master/lbTest/src/test/java/org/LexGrid/LexBIG/Impl/function/query/lucene/searchAlgorithms/TestWeightedDoubleMetaphone.java