NIH | National Cancer Institute | NCI Wiki  

/*
* Copyright: (c) 2004-2013 Mayo Foundation for Medical Education and
* Research (MFMER). All rights reserved. MAYO, MAYO CLINIC, and the
* triple-shield Mayo logo are trademarks and service marks of MFMER.
*
* Except as contained in the copyright notice above, or as used to identify
* MFMER as the author of this software, the trade names, trademarks, service
* marks, or product names of the copyright holder shall not be used in
* advertising, promotion or otherwise in connection with this software without
* prior written authorization of the copyright holder.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
package org.LexGrid.LexBIG.Extensions.Generic;

import java.util.Set;

import org.LexGrid.LexBIG.Exceptions.LBParameterException;
import org.LexGrid.LexBIG.Utility.Iterators.ResolvedConceptReferencesIterator;

/**
 * A simplified Search Extension.
 *
 * Query syntax is described by the
 * <a href="http://lucene.apache.org/core/old_versioned_docs/versions/2_9_1/queryparsersyntax.html">Lucene Query Syntax</a>.
 */
public interface SearchExtension extends GenericExtension {
    
    /**
     * Search based on a given text string over all coding schemes.
     *
     * @param text
     *             The search text
     * @return
     *             A ResolvedConceptReferencesIterator
     *
     * @throws LBParameterException
     */
    public ResolvedConceptReferencesIterator search(String text) throws LBParameterException;
    
    /**
     * Search based on a given text string over given coding schemes.
     *
     * @param text
     *             The search text
     * @param codingSchemes
     *             The coding schemes to include in the search
     * @return
     *             A ResolvedConceptReferencesIterator
     *
     * @throws LBParameterException
     */
    public ResolvedConceptReferencesIterator search(
            String text,
            Set<CodingSchemeReference> codingSchemes) throws LBParameterException;
    
    /**
     * Search based on a given text string over the given coding schemes,
     * excluding those listed in codingSchemesToExclude.
     *
     * NOTE: If a coding scheme appears in both codingSchemesToInclude
     * and codingSchemesToExclude, the exclusion takes priority.
     *
     * @param text
     *             The search text
     * @param codingSchemesToInclude
     *             The coding schemes to include in the search
     * @param codingSchemesToExclude
     *             The coding schemes to exclude from the search
     * @return
     *             A ResolvedConceptReferencesIterator
     *
     * @throws LBParameterException
     */
    public ResolvedConceptReferencesIterator search(
            String text,
            Set<CodingSchemeReference> codingSchemesToInclude,
            Set<CodingSchemeReference> codingSchemesToExclude) throws LBParameterException;
    
}

Query Syntax

The query format of the Search Extension follows the Lucene Query Syntax. Any search string allowed by this syntax is accepted by the Search Extension.

By default, all search terms are joined by an "AND" operator unless otherwise specified.

For instance, a search for "Heart Attack" translates by default to "Heart AND Attack".
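
The default "AND" behavior described above can be illustrated with a minimal sketch. The class and method names here (DefaultAndSketch, joinWithAnd) are hypothetical and are not part of the LexEVS API. The sketch only handles plain whitespace-separated terms; it assumes the input contains no explicit operators, quoted phrases, or field names, which the real query parser would treat differently.

```java
import java.util.Arrays;
import java.util.stream.Collectors;

/** Illustrative only: a hypothetical helper, not part of the Search Extension. */
final class DefaultAndSketch {

    /**
     * Joins whitespace-separated terms with " AND ", mirroring the default
     * behavior described in the text for simple multi-word searches.
     */
    static String joinWithAnd(String text) {
        return Arrays.stream(text.trim().split("\\s+"))
                .filter(term -> !term.isEmpty())   // an empty input yields an empty query
                .collect(Collectors.joining(" AND "));
    }

    public static void main(String[] args) {
        System.out.println(joinWithAnd("Heart Attack")); // prints: Heart AND Attack
    }
}
```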

The characters listed below are stripped and not indexed. If they appear in a search string, they play no role in matching a term.

For example, "@heart" and "$heart" are both indexed as "heart".

Characters to remove:

',', '.', '/', '\', '`', '\'', '"', '+', '*', '=', '@', '#', '$', '%', '^', '&', '?', '!'
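
As a rough illustration of the stripping rule, the following sketch removes the listed characters from a string. It is an assumption about the observable effect only, not the extension's actual indexing code; the class and method names (StripCharsSketch, strip) are hypothetical.

```java
/** Illustrative only: a hypothetical helper, not the extension's real analyzer. */
final class StripCharsSketch {

    // The characters listed in the text, held as a plain string (not a regex).
    private static final String REMOVED = ",./\\`'\"+*=@#$%^&?!";

    /** Returns the input with every listed character deleted. */
    static String strip(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (REMOVED.indexOf(c) < 0) {
                out.append(c); // keep anything that is not in the removal list
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("@heart")); // prints: heart
        System.out.println(strip("$heart")); // prints: heart
    }
}
```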

 

The characters listed below are translated into whitespace during indexing. This means that words containing them are split and the parts are indexed separately.

For example, "Heart-Attack" is indexed as the two separate terms "heart" and "attack".

Characters treated as whitespace: 

'-', ';', '(', ')', '{', '}', '[', ']', '<', '>', '|'
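
The two character rules can be combined into one small sketch of how a string might be broken into indexed terms. This is an assumption about the overall effect, not the extension's actual implementation. The names (IndexNormalizationSketch, tokens) are hypothetical, and the lowercasing step is inferred from the examples in the text ("Heart-Attack" becoming "heart" and "attack") rather than stated as a rule.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

/** Illustrative only: a hypothetical helper, not the extension's real analyzer. */
final class IndexNormalizationSketch {

    // Characters that are removed outright, as listed in the text.
    private static final String REMOVED = ",./\\`'\"+*=@#$%^&?!";
    // Characters that are treated as whitespace, as listed in the text.
    private static final String AS_WHITESPACE = "-;(){}[]<>|";

    /** Splits text into terms after applying both character rules. */
    static List<String> tokens(String text) {
        StringBuilder cleaned = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (REMOVED.indexOf(c) >= 0) {
                continue; // stripped characters vanish without splitting the word
            }
            cleaned.append(AS_WHITESPACE.indexOf(c) >= 0 ? ' ' : c);
        }
        List<String> result = new ArrayList<>();
        for (String token : cleaned.toString().trim().split("\\s+")) {
            if (!token.isEmpty()) {
                // Lowercasing is an assumption inferred from the text's examples.
                result.add(token.toLowerCase(Locale.ROOT));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tokens("Heart-Attack")); // prints: [heart, attack]
        System.out.println(tokens("@heart"));       // prints: [heart]
    }
}
```

Note the difference between the two rules: a stripped character such as '.' joins its neighbors ("a.b" becomes one term), while a whitespace character such as '-' separates them.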