NIH | National Cancer Institute | NCI Wiki  

...

Note

NCI Metathesaurus contains many individual vocabularies, some of which are large in their own right, so it takes many hours to load and index. A load can take 36 hours on a multiprocessor machine with 6 GB or more of memory. The total load time varies with machine, memory, and disk speed. Because this loader uses a batch loading strategy, it is less dependent on memory, but some users will see load times of 3 or 4 days with average multiprocessor processing power. While the batch loading process itself is not heavily memory dependent, creating the index after the load requires at least 3 GB of memory.

Step

Action

1

Using a web or FTP client, go to the URL: http://ncicb.nci.nih.gov/download/evsportal.jsp
Note that a valid UMLS license is required to download the NCI Metathesaurus because it contains UMLS content. If you do not have a license, the site above explains how to obtain one.

2

Select the version of the NCI Metathesaurus RRF you wish to download. There may be only one. Save the file to a directory on your machine.

3

Extract the files from the ZIP download into a directory on your machine. This directory will be referred to as NCI_METATHESAURUS_DIRECTORY. RELEASE_INFO.RRF must be present for the load utility to work.
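
Because a load can run for days, it may be worth confirming that the release information file is present before starting. The following is a minimal sketch for a POSIX shell, not part of the LexEVS utilities: the function name check_release_info is hypothetical, and the case-insensitive match is an assumption made in case the file name's capitalization differs in your download.

Code Block
```shell
# Hypothetical helper (not part of LexEVS): reports whether a release-info
# RRF file sits directly inside the directory given as the first argument.
check_release_info() {
    # ls lists one entry per line when piped; grep -i ignores case,
    # -x requires the whole line to match, -q suppresses output.
    if ls "$1" 2>/dev/null | grep -qix 'release_info\.rrf'; then
        echo "found"
    else
        echo "missing"
        return 1
    fi
}
```

For example, check_release_info "$NCI_METATHESAURUS_DIRECTORY" prints "found" when the file is there, assuming a shell variable of that name holds the directory from step 3.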

4

Check that you are able to open a large number of files before starting the load.

Code Block
ulimit -Hn

Around 10,000 available open files is usually sufficient. If your limit is set too low, it will need to be raised.
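
The comparison against the 10,000 figure can be scripted. This is a minimal sketch under stated assumptions: limit_ok is a hypothetical helper, not a LexEVS or system command, and it treats a reported value of "unlimited" as sufficient.

Code Block
```shell
# Hypothetical helper: compares a limit value ($1) with a required minimum ($2).
# Prints "ok" when the limit is "unlimited" or at least the minimum,
# otherwise prints "too low" and returns a non-zero status.
limit_ok() {
    limit="$1"
    min="$2"
    if [ "$limit" = "unlimited" ] || [ "$limit" -ge "$min" ]; then
        echo "ok"
    else
        echo "too low"
        return 1
    fi
}

# Check the hard limit on open files and warn if it is below 10000.
if ! limit_ok "$(ulimit -Hn)" 10000 >/dev/null; then
    echo "Hard open-file limit is below 10000; raise it before loading." >&2
fi
```

How the limit is raised depends on the system. On many Linux installations the hard limit is set by an administrator (for example in /etc/security/limits.conf), while ulimit -n can raise the soft limit for the current shell up to the hard limit.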

5

Using the LexEVS utilities, load the NCI Metathesaurus. Run the load script from the admin directory:

Code Block
cd {LEXEVS_HOME}/admin

For Windows installation use the following command:

Code Block
LoadMetaBatch.bat -in "file:///{NCI_METATHESAURUS_DIRECTORY}/"

For Linux installation use the following command:

Code Block
LoadMetaBatch.sh -in "file:///{NCI_METATHESAURUS_DIRECTORY}/"

...