NIH | National Cancer Institute | NCI Wiki  

{page-info:title}

LexEVS is an enterprise-wide terminology server. When first installed, it comes with no terminologies loaded. This documentation covers how to load most of the supported content types. LexEVS was built to accept a wide variety of input formats and unify them into a common form, which requires a variety of loaders, each designed for a specific incoming format.


LexEVS provides both an administrative GUI and loader commands for loading terminologies. While the GUI is very functional, a system administrator may prefer the command line interface because the command scripts can be edited to increase memory and tune other Java virtual machine settings, ensuring that loads of larger terminologies have adequate resources. For example, a user may open a loading script in an editor, increase the Java heap size and PermGen memory according to the machine's resources, save the script, and then run it with the appropriate options on the command line. Still, the GUI can be convenient for loading smaller terminologies and, in many cases, works fine for loading moderately large terminologies like the NCI Thesaurus. Loading a terminology requires some knowledge of its source.
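As a rough illustration of the memory tuning described above, the following sketch assembles the two JVM options most often adjusted before a large load. The function name `build_java_opts` and the default sizes are assumptions made for this example; they are not taken from the scripts shipped with LexEVS, so check the actual `java` line in your loader script before editing it. Note that `-XX:MaxPermSize` applies only to Java 7 and earlier, since PermGen was removed in Java 8.

```shell
#!/bin/sh
# Sketch only: build the JVM memory options a loader script might pass to java.
# The function name and the default sizes are illustrative assumptions, not
# values taken from the LexEVS distribution.
build_java_opts() {
  heap_max="${1:-2048m}"   # maximum Java heap size (-Xmx)
  perm_max="${2:-256m}"    # maximum PermGen size (-XX:MaxPermSize, Java 7 and earlier)
  printf '%s %s\n' "-Xmx${heap_max}" "-XX:MaxPermSize=${perm_max}"
}

# Print the options for a machine with memory to spare.
build_java_opts 4096m 512m
```

The printed options would replace or extend the memory settings already present on the `java` command inside the loader script.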

h4. Generic loading

Most terminology loads can be easily accomplished by pointing either the LexEVS commands or the LexEVS administrative GUI at the terminology source file and running the loader.   Generic loading instructions can be found for the [LexEVS administrative GUI|https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/LexEVS_6.0_Administration_Using_the_GUI_Tool#Load_Terminology_Menu] or the [LexEVS loader commands|https://cabig-kc.nci.nih.gov/Vocab/KC/index.php/Administering_LexEVS_6.0_with_the_Command_Line#Command_Line_Scripts_and_Wrappers_Overview]. For many sources you can use a variation of the following LexEVS command:

Linux
{code}
./LoadOWL.sh -in "file:///ontologies/owl/amino-acid.owl"
{code}

Windows

{code}
LoadOWL.bat -in "file:///ontologies/owl/amino-acid.owl"
{code}
This LexEVS loader command loads input in OWL format. For other formats, substitute the loader command that matches the format being used, and point the loader to a local source file. In the LexEVS administrative GUI, loading is done through the "Load Terminology" menu. The administrative options must first be enabled in the Command menu.
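The `-in` option in the commands above takes a `file://` URI rather than a bare path. A minimal sketch of a helper that builds such a URI from a local path might look like the following. `to_file_uri` is a hypothetical function written for this example and is not part of LexEVS; it does not percent-encode spaces or other special characters.

```shell
#!/bin/sh
# Sketch only: turn a local path into a file:// URI for the loader's -in option.
# to_file_uri is a hypothetical helper, not part of LexEVS. It does not
# percent-encode spaces or other special characters.
to_file_uri() {
  case "$1" in
    /*) printf 'file://%s\n' "$1" ;;            # path is already absolute
    *)  printf 'file://%s/%s\n' "$PWD" "$1" ;;  # resolve against the current directory
  esac
}

# Prints file:///ontologies/owl/amino-acid.owl
to_file_uri /ontologies/owl/amino-acid.owl
```

On Linux the result could then be passed to a loader, for example `./LoadOWL.sh -in "$(to_file_uri /ontologies/owl/amino-acid.owl)"`.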


h4. Large Terminologies

Loading larger terminologies can be very time-consuming and resource-intensive. The following recommendation for database optimization can help. The primary LexEVS configuration file, {LEXEVS_HOME}/resources/config/lbconfig.props, controls how primary keys for the database tables are generated. The default setting is the following:
{code}# DB_PRIMARY_KEY_STRATEGY indicates which strategy will be used
# for the primary key of the database tables.
# WARNING - This cannot be changed after the initial
# schema installation.
#
# Allowable values include:
#
#	"GUID"
#		- Primary Keys are implemented as random GUIDs.
#	"SEQUENTIAL_INTEGER"
#		- Primary Keys will be sequentially incremented
#		- as Integer values.
DB_PRIMARY_KEY_STRATEGY=GUID{code}
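Because GUID keys are very taxing on index processing at the end of a load, we recommend changing the value to SEQUENTIAL_INTEGER unless you specifically need globally unique identifiers. As the comment in the file warns, this setting cannot be changed after the initial schema installation, so choose the strategy before installing the schema.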
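One way to make this change from the command line is sketched below. `set_pk_strategy` is a hypothetical helper written for this example, not a LexEVS command, and it assumes the property appears on a line of its own beginning with `DB_PRIMARY_KEY_STRATEGY=`, as in the excerpt above. Editing the file by hand works just as well; either way, keeping a backup copy of lbconfig.props is advisable.

```shell
#!/bin/sh
# Sketch only: rewrite the DB_PRIMARY_KEY_STRATEGY line in lbconfig.props.
# set_pk_strategy is a hypothetical helper, not a LexEVS command.
set_pk_strategy() {
  props_file="$1"
  strategy="$2"   # GUID or SEQUENTIAL_INTEGER
  # Write to a temporary file first so a failed edit never truncates the original.
  sed "s/^DB_PRIMARY_KEY_STRATEGY=.*/DB_PRIMARY_KEY_STRATEGY=${strategy}/" \
    "$props_file" > "${props_file}.tmp" && mv "${props_file}.tmp" "$props_file"
}

# Example call (adjust LEXEVS_HOME for your installation):
# set_pk_strategy "$LEXEVS_HOME/resources/config/lbconfig.props" SEQUENTIAL_INTEGER
```

Comment lines that mention the property name are left alone because the pattern only matches lines that start with the property itself.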

h4. Special Case Loading

Some terminologies are special cases and need special handling. This category includes the NCI Thesaurus in OWL format and any files loaded from UMLS RRF-formatted sources. The NCI Metathesaurus is the largest terminology we load, and as such it also requires special handling. OWL terminologies do not normally require special handling, but LexEVS offers some advanced loading options that users may take advantage of. Tutorials for each of these are linked at the bottom of the page.

The guide includes the following sections:
{children}