Vocabulary Administration Overview
A set of administrative shell scripts is provided to manage the LexEVS Service. The scripts are available for Windows (.bat) and Linux (.sh) operating systems and call wrapper classes for the LexEVS administration API classes. They are located in the {LEXBIG_DIRECTORY}/admin and {LEXBIG_DIRECTORY}/test directories. A full description of the options, with an example, is provided for each command line script.
If you are loading a large vocabulary such as SNOMED or the NCI Metathesaurus, we recommend using these scripts rather than the GUI, because memory settings can be adjusted by editing the load script in a text editor.
Coding Scheme Metadata Access and Administration
Shell Script | Use and Function |
---|---|
ActivateScheme | Activates a coding scheme based on unique URN and version. |
DeactivateScheme | Deactivates a coding scheme based on unique URN and version. |
ListExtensions | List registered extensions to the LexEVS runtime environment. |
ListSchemes | List all currently registered vocabularies. |
TagScheme | Associates a tag ID (e.g. 'PRODUCTION' or 'TEST') with a coding scheme URN and version. |
Lucene Index Administration
Shell Script | Use and Function |
---|---|
CleanUpLuceneIndex | Clean up orphaned indexes. |
OptimizeLuceneIndex | Optimizes the Common Lucene Index. |
RebuildIndex | Rebuilds indexes associated with the specified coding scheme. |
RemoveIndex | Clears an optional named index associated with the specified coding scheme. Note: Built-in indices required by the LexEVS runtime cannot be removed. Options: |
Misc Database Administration Scripts
Shell Script | Use and Function |
---|---|
ExportDDLScripts | Exports the database create/drop scripts. |
PasswordEncryptor | Encrypts the given password. |
LexEVS Export Scripts
Shell Script | Use and Function |
---|---|
ExportLgXML | Exports content from the repository to a file in the LexGrid canonical XML format. |
ExportOBO | Exports content from the repository to a file in the Open Biomedical Ontologies (OBO) format. |
ExportOwlRdf | Exports content from the repository to a file in OWL format. |
LexEVS Loader Scripts
Shell Script | Use and Function |
---|---|
LoadLgXML | Loads a vocabulary file, provided in LexGrid canonical XML format. |
LoadNCIHistory | Imports NCI History data to the LexEVS repository. |
LoadOBO | Loads a file specified in the Open Biomedical Ontologies (OBO) format. |
LoadOWL | Loads an OWL file. Note: Load of the NCI Thesaurus should be performed via the LoadNCIThesOWL counterpart, since it will allow more precise handling of NCI semantics. Options: |
LoadRadLexProtegeFrames | Imports from a RadLex XML file to a LexBIG repository. The input is a URI or path specifying the location of the pprj file. Requires that the pprj file be configured with a reference to a RadLex XML file as follows: ([radlex_ProjectKB_Instance_66] of String (name "source_file_name") (string_value "radlex.xml")). Example: java org.LexGrid.LexBIG.admin.LoadRadLexProtegeFrames |
LoadUMLSSemnet | Loads the UMLS Semantic Network, provided as a collection of files in a single directory. The following files are expected to be provided from the National Library of Medicine (NLM) distribution: |
LoadFMA | Imports from an FMA database to a LexEVS repository. Requires that the pprj file be configured with a database URN, username, and password for an FMA MySQL-based database. The FMA.pprj file and MySQL dump file are available at http://sig.biostr.washington.edu/projects/fm/ upon registration. |
LoadHL7RIM | Converts an HL7 RIM MS Access database to a LexGrid database. |
LoadMetaData | Loads optional XML-based metadata to be associated with an existing coding scheme. |
LoadMrMap | Loads mapping files provided in UMLS RRF format, specifically MRMAP.RRF and MRSAT.RRF. |
Special Batch Loading Functions
Shell Script | Use and Function |
---|---|
LoadMetaBatch | Loads the NCI Metathesaurus, provided as a collection of RRF files, using a batch loading strategy that allows a faster, more memory-efficient load. |
LoadUMLBatch | Loads UMLS content, provided as a collection of RRF files in a single directory. Files may comprise the entire UMLS distribution or a subset pruned via the MetamorphoSys tool. A complete list of source vocabularies is available online at http://www.nlm.nih.gov/research/umls/metaa1.html. |
ResumeMetaBatch | Resumes an NCI Metathesaurus load. Loads will usually be restartable if they fail due to an error. The loader will keep all loaded content and restart at the point of failure. |
ResumeUmlsBatch | Resumes a UMLS load. Loads will usually be restartable if they fail due to an error. The loader will keep all loaded content and restart at the point of failure. |
Scheme and Metadata Removal
Shell Script | Use and Function |
---|---|
RemoveScheme | Removes a coding scheme based on unique URN and version. |
RemoveMetadata | Clears optionally loaded metadata associated with the specified coding scheme. |
Pick List and Value Set Load Administration
Shell Script | Use and Function |
---|---|
LoadPickListDefinition | Loads Pick List Definition content, provided in LexGrid canonical xml format. |
LoadValueSetDefinition | Loads Value Set Definition content, provided in LexGrid canonical xml format. |
LexEVS Validation Test Script
Shell Script | Use and Function |
---|---|
TestRunner | Located in {LEXBIG_DIRECTORY}/test. Runs the test suite by invoking the Ant launcher. Note: The LexEVS runtime and database environments must still be configured prior to invoking the test suite. Running any option other than -v will cause a large set of JUnit tests to be run. Usage: TestRunner |
Command Line Scripts and Wrappers Overview
LexEVS provides a set of shell scripts and code wrappers that give command line users standard options and an automatic printout of the current coding schemes when performing administrative functions.
Displaying Current Coding Schemes
When coding schemes are loaded into LexEVS, many command line scripts print a table of the loaded coding schemes, which serves as a selection menu for other functions.
Using Optional Parameters
A standard parameter interface allows users to customize command line calls to LexEVS. These options control things such as loader synchronicity, supplemental file locations, and fault conditions. Each shell script contains usage comments describing the options of the function wrapper it calls.
# Loads an OWL file. You can provide a manifest file to configure coding scheme
# meta data.
#
# Options:
#   -in,--input <uri>       URI or path specifying location of the source file
#   -mf,--manifest <uri>    URI or path specifying location of the manifest file
#   -lp,--loaderPrefs <uri> URI or path specifying location of the loader preference file
#   -a, --activate          ActivateScheme on successful load; if unspecified the
#                           vocabulary is loaded but not activated.
#   -t, --tag <id>          An optional tag ID (e.g. 'PRODUCTION' or 'TEST') to assign.
#
# Example: LoadOWL -in "file:///path/to/somefile.owl" -a
#
java -Xmx3000m -XX:MaxPermSize=256M -cp "../runtime/lbPatch.jar:../runtime/lbRuntime.jar" org.LexGrid.LexBIG.admin.LoadOWL $@
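As a minimal sketch of how the documented LoadOWL options combine, the helper below only prints a command line and runs nothing. The function name build_load_owl_cmd and the ./LoadOWL.sh invocation path are assumptions made for illustration; the flags themselves (-in, -mf, -a, -t) are the ones listed in the script comments.

```shell
#!/bin/sh
# Sketch: print (not run) a LoadOWL command line from the documented options.
# Arguments: SOURCE_URI [MANIFEST_URI] [TAG]; an empty string skips an option.
# build_load_owl_cmd is a hypothetical helper, not part of LexEVS.
build_load_owl_cmd() {
  source_uri="$1"
  manifest_uri="${2:-}"
  tag="${3:-}"

  cmd="./LoadOWL.sh -in \"$source_uri\""
  # -mf is only added when a manifest URI was supplied.
  if [ -n "$manifest_uri" ]; then
    cmd="$cmd -mf \"$manifest_uri\""
  fi
  # -a activates the scheme after a successful load.
  cmd="$cmd -a"
  # -t assigns an optional tag such as PRODUCTION or TEST.
  if [ -n "$tag" ]; then
    cmd="$cmd -t \"$tag\""
  fi
  printf '%s\n' "$cmd"
}

# Prints: ./LoadOWL.sh -in "file:///path/to/somefile.owl" -a -t "TEST"
build_load_owl_cmd "file:///path/to/somefile.owl" "" "TEST"
```

Printing the command first makes it easy to review the options before starting a load that may take hours.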
Setting Java Virtual Machine Options
Heap size:
Heap size is a particular concern when loading larger terminologies such as SNOMED or the NCI Thesaurus. Be prepared to increase the heap size if loads fail with out-of-memory (heap space) errors.
Example:
-Xmx4000m
Permanent Generation:
LexEVS now depends on some Permanent Generation memory management during runtime.
Typically set as follows:
-XX:MaxPermSize=256m
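The two settings are passed to the java command with no separator after -Xmx and an equals sign after -XX:MaxPermSize, as in the LoadOWL script line shown earlier. The sketch below assembles both options from megabyte values; jvm_opts is a hypothetical helper, and the defaults of 3000 and 256 simply mirror the values in that script. Note that the permanent generation was removed in Java 8, so the MaxPermSize option is only meaningful on older JVMs.

```shell
#!/bin/sh
# jvm_opts is a hypothetical helper, not part of LexEVS.
# Usage: jvm_opts [HEAP_MB] [PERM_MB]
jvm_opts() {
  heap_mb="${1:-3000}"   # default mirrors -Xmx3000m in the LoadOWL script
  perm_mb="${2:-256}"    # default mirrors -XX:MaxPermSize=256M
  printf '%s\n' "-Xmx${heap_mb}m -XX:MaxPermSize=${perm_mb}m"
}

# Prints: -Xmx4000m -XX:MaxPermSize=256m
jvm_opts 4000 256
```

The resulting string takes the place of the existing memory options on the java line of a load script.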
Installing Sample Vocabularies
This LexEVS installation provides a sample vocabulary, Automobiles.xml, that can be loaded into the database.
- In a Command Prompt window, enter the following to go to the example programs.
cd {LEXBIG_DIRECTORY}/examples
- To load an example vocabulary, run the LoadSampleData script (LoadSampleData.bat on Windows, LoadSampleData.sh on Linux).
- To load other example vocabularies, run the appropriate script for any sample vocabulary in {LEXBIG_DIRECTORY}/test/resources/testData.
Note
Vocabularies cannot be loaded until configuration of the LexEVS runtime and database server are complete.
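Because every script ships in a .bat and a .sh variant, a wrapper can pick the file name from the platform. This is a rough sketch only: script_for_os is a hypothetical helper, and treating CYGWIN, MINGW and MSYS environments as Windows is an assumption about how the platform name is reported.

```shell
#!/bin/sh
# script_for_os is a hypothetical helper, not part of LexEVS.
# Usage: script_for_os BASE_NAME [OS_NAME]
# OS_NAME defaults to the output of `uname -s`.
script_for_os() {
  name="$1"
  os="${2:-$(uname -s)}"
  case "$os" in
    # Assumption: these prefixes identify Windows-hosted shells.
    CYGWIN*|MINGW*|MSYS*|Windows*) printf '%s.bat\n' "$name" ;;
    *)                             printf '%s.sh\n' "$name" ;;
  esac
}

# Prints: LoadSampleData.sh
script_for_os LoadSampleData Linux
```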
Running the sample query programs
A set of sample programs is provided in the {LEXBIG_DIRECTORY}/examples directory. A vocabulary must be loaded before the sample query programs can run successfully.
- Enter
cd {LEXBIG_DIRECTORY}/examples
- Execute one of the sample programs, using the .bat version on Windows or the .sh version on Linux. For example:
FindPropsandAssocForCode.bat
FindRelatedCodes
Installing NCI Vocabularies
NCI Thesaurus Vocabulary
This section describes the steps to download and install a full version of the NCI Thesaurus for the LexEVS Service.
- Using a web or ftp client go to URL: ftp://ftp1.nci.nih.gov/pub/cacore/EVS/
- Select the version of NCI Thesaurus OWL you wish to download. Save the file to a directory on your machine.
- Extract the OWL file from the zip download and save it in a directory on your machine. This directory will be referred to as NCI_THESAURUS_DIRECTORY.
- Using the LexEVS utilities, load the NCI Thesaurus:
cd {LEXBIG_DIRECTORY}/admin
- For Windows installation use the following command
LoadOWL.bat -in "file:///{NCI_THESAURUS_DIRECTORY}/Thesaurus_10.10d.owl"
- For Linux installation use the following command
LoadOWL.sh -in "file:///{NCI_THESAURUS_DIRECTORY}/Thesaurus_10.10d.owl"
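The -in argument in the commands above is a file URI, not a bare path. A minimal sketch of turning a local path into that form follows; to_file_uri is a hypothetical helper, and it assumes the path contains no characters that would need percent-encoding (spaces, for example).

```shell
#!/bin/sh
# to_file_uri is a hypothetical helper, not part of LexEVS.
# It turns a local path into the file:/// form used with -in.
# Assumption: the path has no characters that need percent-encoding.
to_file_uri() {
  path="$1"
  case "$path" in
    /*) ;;                       # already absolute
    *)  path="$(pwd)/$path" ;;   # make relative paths absolute
  esac
  printf 'file://%s\n' "$path"
}

# Prints: file:///data/nci/Thesaurus_10.10d.owl
to_file_uri "/data/nci/Thesaurus_10.10d.owl"
```

The directory /data/nci is only a placeholder for NCI_THESAURUS_DIRECTORY.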
Note
The NCI Thesaurus has grown large enough that it can no longer be loaded on many typical desktop machines. We recommend a 64-bit operating system running on a multiprocessor computer with a minimum of 4 GB of memory. Server-class Linux machines are the typical target for these loads. The time to load NCI Thesaurus will vary depending on machine, memory, and disk speed. Expect a couple of hours on a higher-end machine.
The following sample shows example output from a load of NCI Thesaurus 05.12f:
[LexBIG] Processing TOP Node... Retired_Kind
[LexBIG] Clearing target of NCI_Thesaurus...
[LexBIG] Writing NCI_Thesaurus to target...
[LexBIG] Finished loading DB - loading transitive expansion table
[LexBIG] ComputeTransitive - Processing Anatomic_Structure_Has_Location
[LexBIG] ComputeTransitive - Processing Anatomic_Structure_is_Physical_Part_of
[LexBIG] ComputeTransitive - Processing Biological_Process_Has_Initiator_Process
[LexBIG] ComputeTransitive - Processing Biological_Process_Has_Result_Biological_Process
[LexBIG] ComputeTransitive - Processing Biological_Process_Is_Part_of_Process
[LexBIG] ComputeTransitive - Processing Conceptual_Part_Of
[LexBIG] ComputeTransitive - Processing Disease_Excludes_Finding
[LexBIG] ComputeTransitive - Processing Disease_Has_Associated_Disease
[LexBIG] ComputeTransitive - Processing Disease_Has_Finding
[LexBIG] ComputeTransitive - Processing Disease_May_Have_Associated_Disease
[LexBIG] ComputeTransitive - Processing Disease_May_Have_Finding
[LexBIG] ComputeTransitive - Processing Gene_Product_Has_Biochemical_Function
[LexBIG] ComputeTransitive - Processing Gene_Product_Has_Chemical_Classification
[LexBIG] ComputeTransitive - Processing Gene_Product_is_Physical_Part_of
[LexBIG] ComputeTransitive - Processing hasSubtype
[LexBIG] Finished building transitive expansion - building index
[LexBIG] Getting a results from sql (a page if using mysql)
[LexBIG] Indexed 0 concepts.
[LexBIG] Indexed 5000 concepts.
[LexBIG] Indexed 10000 concepts.
[LexBIG] Indexed 15000 concepts.
[LexBIG] Indexed 20000 concepts.
[LexBIG] Indexed 25000 concepts.
[LexBIG] Indexed 30000 concepts.
[LexBIG] Indexed 35000 concepts.
[LexBIG] Indexed 40000 concepts.
[LexBIG] Indexed 45000 concepts.
[LexBIG] Indexed 46000 concepts.
[LexBIG] Getting a results from sql (a page if using mysql)
[LexBIG] Closing Indexes Mon, 27 Feb 2006 01:44:22
[LexBIG] Finished indexing
NCI Metathesaurus Vocabulary
Loading the Metathesaurus
This section describes the steps to download and install a full version of the NCI Metathesaurus for the LexEVS Service.
- Using a web or ftp client go to URL: ftp://ftp1.nci.nih.gov/pub/cacore/EVS/
- Select the version of NCI Metathesaurus RRF you wish to download. Save the file to a directory on your machine.
- Extract the RRF files from the zip download and save in a directory on your machine. This directory will be referred to as NCI_METATHESAURUS_DIRECTORY. RELEASE_INFO.RRF is required to be present for the load utility to work.
- Using the LexEVS utilities, load the NCI Metathesaurus:
cd {LEXBIG_DIRECTORY}/admin
- For Windows installation use the following command:
LoadMetaBatch.bat -in "file:///{NCI_METATHESAURUS_DIRECTORY}/"
- For Linux installation use the following command:
LoadMetaBatch.sh -in "file:///{NCI_METATHESAURUS_DIRECTORY}/"
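Since the load utility requires RELEASE_INFO.RRF to be present, it can be worth checking the extracted directory before starting a load that may run for days. The following is a sketch of such a pre-flight check; check_meta_dir and its return codes are hypothetical and not part of LexEVS.

```shell
#!/bin/sh
# check_meta_dir is a hypothetical pre-flight helper, not part of LexEVS.
# Returns 0 if DIR exists and contains RELEASE_INFO.RRF,
# 1 if the file is missing, 2 if DIR is not a directory.
check_meta_dir() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "Not a directory: $dir" >&2
    return 2
  fi
  if [ ! -f "$dir/RELEASE_INFO.RRF" ]; then
    echo "RELEASE_INFO.RRF is missing from $dir" >&2
    return 1
  fi
  return 0
}

# Example use (commented out so the sketch has no side effects):
# check_meta_dir "/path/to/metathesaurus" && ./LoadMetaBatch.sh -in "file:///path/to/metathesaurus/"
```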
Note
NCI Metathesaurus contains many individual vocabularies, some of which are large vocabularies in and of themselves. It requires many hours to load and index, and can require 36 hours on a multiprocessor machine with 6 GB or more of memory. The total time to load NCI Metathesaurus will vary depending on machine, memory, and disk speed. Because this loader uses a batch loading strategy it is less dependent on memory, but some users will see load times of 3 or 4 days with average multiprocessor processing power.
Resuming Loads
Because this loader is resource hungry, a load can be resumed if your resource settings prove inadequate. Loads that have crashed or been interrupted by server problems can be resumed using the resume scripts (ResumeMetaBatch and ResumeUmlsBatch).
- Using the LexEVS utilities, resume the NCI Metathesaurus load:
cd {LEXBIG_DIRECTORY}/admin
- For Windows installation use the following command:
ResumeMetaBatch.bat -in "file:///{NCI_METATHESAURUS_DIRECTORY}/" -s "NCI Metathesaurus" -uri "urn:oid:2.16.840.1.113883.3.26.1.2" -version "200601"
- For Linux installation use the following command:
ResumeMetaBatch.sh -in "file:///{NCI_METATHESAURUS_DIRECTORY}/" -s "NCI Metathesaurus" -uri "urn:oid:2.16.840.1.113883.3.26.1.2" -version "200601"
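One way to use the resume scripts is to wrap the initial load so that a failure triggers a bounded number of resume attempts instead of a fresh start. The sketch below shows only the control flow; load_then_resume and the do_load and do_resume functions in the commented example are hypothetical names, and whether the scripts report failure through their exit status is an assumption that should be verified first.

```shell
#!/bin/sh
# load_then_resume MAX_RESUMES LOAD_FN RESUME_FN
# LOAD_FN and RESUME_FN are names of shell functions (or commands without
# arguments) supplied by the caller. All names here are hypothetical.
load_then_resume() {
  max_resumes="$1"
  load_fn="$2"
  resume_fn="$3"

  # First try the full load.
  if "$load_fn"; then
    return 0
  fi

  # On failure, resume instead of starting over, up to the given limit.
  attempt=1
  while [ "$attempt" -le "$max_resumes" ]; do
    echo "Load failed; resume attempt $attempt of $max_resumes" >&2
    if "$resume_fn"; then
      return 0
    fi
    attempt=$((attempt + 1))
  done
  return 1
}

# Example wiring (commented out so the sketch has no side effects):
# do_load()   { ./LoadMetaBatch.sh -in "file:///{NCI_METATHESAURUS_DIRECTORY}/"; }
# do_resume() { ./ResumeMetaBatch.sh -in "file:///{NCI_METATHESAURUS_DIRECTORY}/" \
#                 -s "NCI Metathesaurus" -uri "urn:oid:2.16.840.1.113883.3.26.1.2" -version "200601"; }
# load_then_resume 3 do_load do_resume
```

Capping the number of attempts keeps a persistent fault, such as a full disk, from looping indefinitely.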
NCI History
This section describes the steps to download and install a history file for NCI Thesaurus.
- Using a web or ftp client go to URL:
ftp://ftp1.nci.nih.gov/pub/cacore/EVS/
- Select the version of NCI History you wish to download. Save the file to a directory on your machine. Download the VersionFile to the same directory as the history file.
- Extract the History files from the zip download and save them in a directory on your machine. This directory will be referred to as NCI_HISTORY_DIRECTORY.
- Using the LexEVS utilities, load the NCI History:
cd {LEXBIG_DIRECTORY}/admin
- For Windows installation use the following command:
LoadNCIHistory.bat -nf -in "file:///{NCI_HISTORY_DIRECTORY}" -vf "file:///{NCI_HISTORY_DIRECTORY}/VersionFile"
- For Linux installation use the following command:
LoadNCIHistory.sh -nf -in "file:///{NCI_HISTORY_DIRECTORY}" -vf "file:///{NCI_HISTORY_DIRECTORY}/VersionFile"
Note
If a 'releaseId' occurs twice in the file, the last occurrence is stored. If LexEVS already knows about a releaseId (from a previous history load), the information is updated to match what is provided in the file. The file must be provided to the load API on every load, so you will need to maintain it as each new release is made. A version of this file, valid at the time of writing and built from the information found in the archive folder on the FTP server, is available in the 'resources' directory of the LexEVS install.
Deactivating and Removing a Vocabulary
This section describes the steps to deactivate a coding scheme and remove it from the LexEVS Service.
- Change directory to LexEVS administration directory. Enter:
cd {LEXBIG_DIRECTORY}/admin
- Use the DeactivateScheme utility to prevent access to a coding scheme. Once a coding scheme is deactivated, client programs will not be able to access the content for the specific coding scheme and version. Example:
DeactivateScheme -u "urn:oid:2.16.840.1.113883.3.26.1.1" -v "10.10d"
- Use the RemoveScheme utility to remove the coding scheme from the LexEVS service and database. Example:
RemoveScheme -u "urn:oid:2.16.840.1.113883.3.26.1.1" -v "10.10d"
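The two steps take the same URN and version, so they can be generated together to avoid a mismatch between them. The sketch below prints the commands in the order given above without running them; retire_scheme_cmds is a hypothetical helper and the .sh suffix assumes a Linux installation.

```shell
#!/bin/sh
# retire_scheme_cmds is a hypothetical helper, not part of LexEVS.
# It prints the deactivate and remove commands, in that order,
# for one coding scheme URN and version.
retire_scheme_cmds() {
  urn="$1"
  version="$2"
  printf './DeactivateScheme.sh -u "%s" -v "%s"\n' "$urn" "$version"
  printf './RemoveScheme.sh -u "%s" -v "%s"\n' "$urn" "$version"
}

# Prints the two commands for NCI Thesaurus 10.10d.
retire_scheme_cmds "urn:oid:2.16.840.1.113883.3.26.1.1" "10.10d"
```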
Tagging a Vocabulary
This section describes the steps to tag a coding scheme so that the tag can be used via the LexEVS API.
- Change directory to LexEVS administration directory. Enter:
cd {LEXBIG_DIRECTORY}/admin
- Use the TagScheme utility to tag a coding scheme and version with a local tag name (e.g. PRODUCTION). This tag name can be used via the LexEVS API for query restriction. Example:
TagScheme -u "urn:oid:2.16.840.1.113883.3.26.1.1" -v "10.10d" -t "PRODUCTION"
Index Management
LexEVS indexes can be manually updated, removed, and optimized by users. This is particularly useful after metadata loading, authoring, or post-processing has taken place.
Remove Index
- Change directory to LexEVS administration directory. Enter:
cd {LEXBIG_DIRECTORY}/admin
- Use the RemoveIndex script to remove an index for a particular coding scheme.
Example:
Windows:
RemoveIndex.bat
Linux:
RemoveIndex.sh
- Interactive command:
Enter the number of the Coding Scheme to use, then <Enter>.
User responds with the number shown for the appropriate coding scheme.
- Interactive command:
CLEAR? ('Y' to confirm, any other key to cancel).
User responds with 'Y'.
- Command Line example:
Note
Concepts cannot be resolved without an index. After an index is removed, it must be rebuilt before the coding scheme can be queried.
Rebuild Index
- Change directory to LexEVS administration directory. Enter:
cd {LEXBIG_DIRECTORY}/admin
- Use the RebuildIndex script to rebuild an index for a particular coding scheme.
Example:
Windows:
RebuildIndex.bat
Linux:
RebuildIndex.sh
- Interactive command:
Enter the number of the Coding Scheme to use, then <Enter>.
User responds with the number shown for the appropriate coding scheme.
- Interactive command:
REBUILD INDEX FOR URI? <uri displayed here> ('Y' to confirm, any other key to cancel).
User responds with 'Y'.
- Command Line example:
Note
Running the OptimizeLuceneIndex script after removing and rebuilding an index may improve performance.