NIH | National Cancer Institute | NCI Wiki  

...

1

Install Metathesaurus to a local folder:
Loading from these sources requires that the UMLS Metathesaurus be installed locally so it can be accessed from LexEVS.  This can be done according to documentation available on the UMLS website.  

2

Subset the desired terminology (optional):
Once the Metathesaurus is downloaded and installed, users can either load from the entire set of files by pointing to the containing file directory or they can use the UMLS tools to subset a terminology (recommended).  Subsetting the terminology beforehand provides improved performance during loads.

3

Set command line options in the loading script:
For larger ontologies, we recommend setting command line options in the loading script; these control the memory allocated to loads of large terminologies such as SNOMED. Options can be added to the scripts contained in <LexEVS install base>/admin. On a Linux environment with a 64-bit architecture, use the LoadUmlsBatch.sh file. A server-class computer with, say, 16 gigabytes of memory and eight four-core processors offers fairly substantial resources for loading content. Open the .sh file in a text editor and set the values for -Xmx and -XX:MaxPermSize to "-Xmx6000M -XX:MaxPermSize=256M", or more if you have adequate resources available. If you have not set the DB_PRIMARY_KEY value to SEQUENTIAL_INTEGER as described earlier, it could take 33 hours to load a terminology as large as SNOMED, which could otherwise complete in 4 hours.
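
The manual edit described above can also be scripted. The following is a minimal sketch, not part of the LexEVS distribution: the helper name set_jvm_memory is hypothetical, and it assumes the script already contains -Xmx and -XX:MaxPermSize flags whose values end in a unit suffix (K, M, or G). It keeps a .bak copy of the original file.

```shell
# Hypothetical helper (not shipped with LexEVS): rewrite the JVM memory flags
# that are already present in a loader script, keeping a backup copy.
# Assumption: the file contains -Xmx<size> and -XX:MaxPermSize=<size>,
# each with a K/M/G unit suffix.
set_jvm_memory() {
  # $1 = script path, $2 = heap size (e.g. 6000M), $3 = MaxPermSize (e.g. 256M)
  cp "$1" "$1.bak" &&
  sed -e "s/-Xmx[0-9]*[kKmMgG]/-Xmx$2/" \
      -e "s/-XX:MaxPermSize=[0-9]*[kKmMgG]/-XX:MaxPermSize=$3/" \
      "$1.bak" > "$1"
}

# Example call, run from <LexEVS install base>/admin:
#   set_jvm_memory LoadUmlsBatch.sh 6000M 256M
```

Review the resulting file before running a load, since the actual layout of the script may differ from what this sketch assumes.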

4

Find the SAB (RSAB) in the MRSAB.RRF file:
Both the lbGUI and the command line require the user to enter a SAB, or source abbreviation. You must either know this abbreviation already or find it in the MRSAB.RRF file, which is located in the folder of the UMLS installation or of the subset you made for the terminology you wish to load. We recommend opening this file in a text editor and searching on the terminology name, for instance SNOMED; you should find a line containing that name in a row of text separated by the "pipe" character ("|").


In the current UMLS format, the RSAB is in column four; in the case of SNOMED it is SNOMEDCT. Note that this is a licensed terminology, and all use must be in accordance with the licensing agreement.
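
As an alternative to searching in a text editor, the lookup can be done from the command line. This is a minimal sketch: the function name find_rsab is hypothetical, and it relies only on what is stated above, namely that MRSAB.RRF is pipe-delimited with the RSAB in column four.

```shell
# Hypothetical helper: print the RSAB (column 4) of every MRSAB.RRF row
# that mentions the search term anywhere on the line, ignoring case.
find_rsab() {
  # $1 = search term (e.g. SNOMED), $2 = path to MRSAB.RRF
  awk -F'|' -v term="$1" \
    'index(tolower($0), tolower(term)) { print $4 }' "$2" | sort -u
}

# Example call (the path is a placeholder for your installation or subset):
#   find_rsab SNOMED /path/to/umls/MRSAB.RRF
```

If more than one abbreviation is printed, open the file and check the matching rows to pick the source you intend to load.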

5

Load the terminology from the command line, referencing the SAB:
 

Code Block
./LoadUmlsBatch.sh -in "file:///data/phont/ontologies/2011AA" -s "SNOMEDCT"

Note: The file path points to the directory that directly contains the .RRF files.
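
Because the loader is given a directory rather than a single file, it can help to confirm that the directory really holds the .RRF files before starting a long load. The check below is a minimal sketch: rrf_dir_ok is a hypothetical helper, and testing only for MRSAB.RRF is an assumption made for illustration, not the loader's actual validation.

```shell
# Hypothetical pre-flight check: succeed only if the given path is a
# directory that directly contains MRSAB.RRF.
rrf_dir_ok() {
  # $1 = directory expected to hold the .RRF files
  [ -d "$1" ] && [ -f "$1/MRSAB.RRF" ]
}

# Example use with the path from the load command above:
#   dir=/data/phont/ontologies/2011AA
#   if rrf_dir_ok "$dir"; then
#     ./LoadUmlsBatch.sh -in "file://$dir" -s "SNOMEDCT"
#   else
#     echo "No MRSAB.RRF found in $dir" >&2
#   fi
```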

6

Monitor output (optional):
The output from the UMLS batch loader shows the steps of the batch load and can be monitored in the log at <LexEVS Install Root>/logs/LexBIG_load_log.text.
On Linux, this can be done from the logs directory using:


Code Block
watch -n 0.1 -d tail LexBIG_load_log.text

Sample output of an early load step is as follows:

...