NIH | National Cancer Institute | NCI Wiki  

Regular Expression Implementation Details

The regular expression search matches against lowercased text. It also matches against the entire property value as a single token, rather than against the tokenized string.

Algorithm:

The Regular Expression search has the following characteristics:

  • The search runs only against lowercased text.
  • It matches against the untokenized, lowercased property value.
  • Analyzers are not applied to the expression. However, the expression is lowercased (an explicit step that LexEVS code performs outside of Lucene).
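
The characteristics above can be sketched in plain Java. This is only an illustration of the matching semantics, not the LexEVS or Lucene implementation: it uses java.util.regex as a stand-in (Lucene's regular expression syntax is not identical), the class and method names are hypothetical, and "Pickup Truck" is a made-up multi-word value used to show the effect of matching the untokenized string.

```java
import java.util.Locale;
import java.util.regex.Pattern;

/** Illustrative only: mimics the described semantics with java.util.regex. */
final class RegExpMatchSketch {

    /**
     * Returns true when the lowercased expression matches the ENTIRE
     * lowercased property value (no tokenization, no analyzer).
     */
    static boolean matches(String expression, String propertyValue) {
        // Explicit lowercasing of the expression, analogous to the step
        // LexEVS performs outside of Lucene.
        String lcExpression = expression.toLowerCase(Locale.ROOT);
        // Stand-in for the indexed untokenized, lowercased property value.
        String lcValue = propertyValue.toLowerCase(Locale.ROOT);
        // matches() requires the whole value to match, not a substring.
        return Pattern.compile(lcExpression).matcher(lcValue).matches();
    }

    public static void main(String[] args) {
        System.out.println(matches("Automobi.*", "Automobile")); // true
        System.out.println(matches("automobi", "Automobile"));   // false
        // Hypothetical multi-word value: the value is one token, so a
        // pattern for a single word does not match on its own.
        System.out.println(matches("truck", "Pickup Truck"));    // false
        System.out.println(matches(".*truck", "Pickup Truck"));  // true
    }
}
```

Under these assumptions, a pattern has to account for the whole property value, which is why the patterns in this sketch rely on a leading or trailing `.*`.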

Examples of use:

The following examples are based on the Automobiles coding scheme.

Example 1:

Search string: automobi.*

Lucene query: untokenizedLCPropertyValue:automobi.*

Result: 1 result

  • entity code: A0001
  • entity description: Automobile

Example 2:

Search string: .*utomobile

Lucene query: untokenizedLCPropertyValue:.*utomobile

Result: 1 result

  • entity code: A0001
  • entity description: Automobile
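
As a rough check of the two examples, the sketch below builds the query string in the form shown above and runs both search strings against a one-entry in-memory map. It is a hedged illustration, not LexEVS code: the map stands in for the index and holds only the entity listed in the results (A0001, Automobile), the class and method names are hypothetical, and java.util.regex stands in for Lucene's own regular expression support.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

/** Illustrative only: replays Example 1 and Example 2 in memory. */
final class RegExpExamplesSketch {

    // Field name taken from the Lucene queries shown in the examples.
    static final String FIELD = "untokenizedLCPropertyValue";

    /** Formats the query the way it is displayed in the examples. */
    static String toQueryString(String searchString) {
        return FIELD + ":" + searchString.toLowerCase(Locale.ROOT);
    }

    /** Returns the codes whose whole lowercased description matches. */
    static List<String> search(Map<String, String> entities, String searchString) {
        Pattern pattern = Pattern.compile(searchString.toLowerCase(Locale.ROOT));
        List<String> codes = new ArrayList<>();
        for (Map.Entry<String, String> entity : entities.entrySet()) {
            String lcValue = entity.getValue().toLowerCase(Locale.ROOT);
            if (pattern.matcher(lcValue).matches()) {
                codes.add(entity.getKey());
            }
        }
        return codes;
    }

    public static void main(String[] args) {
        // Only the entity that appears in the example results.
        Map<String, String> entities = new LinkedHashMap<>();
        entities.put("A0001", "Automobile");

        for (String searchString : new String[] {"automobi.*", ".*utomobile"}) {
            System.out.println(toQueryString(searchString)
                    + " -> " + search(entities, searchString));
        }
    }
}
```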

Associated JUnits:

The JUnit tests can be found here: https://github.com/lexevs/lexevs/blob/master/lbTest/src/test/java/org/LexGrid/LexBIG/Impl/function/query/lucene/searchAlgorithms/TestRegExp.java