NIH | National Cancer Institute | NCI Wiki  

...

While terminologies are being loaded, you can monitor progress using the LexEVS logs (both the 'load' and 'full' logs). If you are using MySQL, you can also use InnoDB tools to monitor inserts per second (SHOW INNODB STATUS).
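As a rough illustration of watching inserts per second, the sketch below pulls the inserts/s figure out of InnoDB status text that you have already captured, for example by saving the output of the status command to a string. The function name `inserts_per_second` and the sample text are assumptions made for this example and are not part of LexEVS. Note also that newer MySQL versions spell the command SHOW ENGINE INNODB STATUS.

```python
import re
from typing import Optional

# Matches the rate in the ROW OPERATIONS section of InnoDB status output, e.g.
# "1250.33 inserts/s, 0.00 updates/s, 0.00 deletes/s, 310.12 reads/s"
_INSERT_RATE = re.compile(r"([0-9]+(?:\.[0-9]+)?)\s+inserts/s")


def inserts_per_second(status_text: str) -> Optional[float]:
    """Return the first inserts/s figure found in the status text, or None."""
    match = _INSERT_RATE.search(status_text)
    return float(match.group(1)) if match else None


# Illustrative excerpt only (made-up numbers); real output is much longer.
SAMPLE_STATUS = """\
--------------
ROW OPERATIONS
--------------
Number of rows inserted 52340, updated 0, deleted 0, read 1200
1250.33 inserts/s, 0.00 updates/s, 0.00 deletes/s, 310.12 reads/s
"""

if __name__ == "__main__":
    print(inserts_per_second(SAMPLE_STATUS))
```

Sampling this value every few minutes during a long load gives a simple indication of whether the load is still making progress or has stalled.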

...

Some terminologies are special cases and need special handling. Each of these has its own documentation:

Special Case

How to handle

Installing NCI Vocabularies in LexEVS 6.x

The NCI Thesaurus differs from other OWL-formatted resources, so you should follow this documentation when loading it.
The NCI Metathesaurus is the largest terminology to be loaded and therefore also requires special handling.

Installing OWL Formatted terminologies

OWL terminologies do not normally require special handling, but LexEVS offers some advanced loading options users may take advantage of.

Installing Vocabularies from the UMLS Metathesaurus (RRF)

Terminologies in RRF format typically come from the National Library of Medicine's (NLM) Unified Medical Language System (UMLS). Many terminologies, such as LOINC, SNOMED, MedDRA, HUGO, GO, and ICD, are subsets of the UMLS. The terminology you're interested in is a subset of the UMLS if:

  • The terminology is documented by the NLM as a source for the UMLS, or
  • The "Download" column on the Vocabulary Knowledge Center's Index of Terminologies includes RRF, or
  • The FORMAT type listed in BioPortal is RRF.

Loading Asserted Value Set Definitions and Indexes

Source Asserted Value Set Loads require that value set definitions based on source asserted value sets be loaded. Supporting indexes for searches on these value sets also need a separate load from the source terminology load. (At the moment this is focused on the NCI Thesaurus.)

None of the above match and you cannot find a suitable source format.

Many terminology providers produce more than one downloadable source format. Source formats such as text only, CSV, tab delimited, and spreadsheets are not acceptable source formats for LexEVS. If you cannot find an acceptable source format for a terminology to load into LexEVS, one option is to download versions that have been placed on BioPortal. These may not be the latest versions available but are easy to download. If you end up with a terminology that is not one of the special cases above, return to the generic loading instructions.

Common Errors When Loading

Error

Remedy or Indication

Out of Memory Error
Heap Error
Perm Gen Error

These are generally memory-related errors. They indicate that the heap space and/or perm gen space needs to be increased when starting the Java VM.

Data Truncation Error

Source formats change over the years, and sometimes this results in this database-related error. It indicates that a column size is too small for an element pulled from the source file. A fix requires an update to LexEVS; a temporary workaround is to edit the source so that the element is short enough to fit the database column.

Database Connection Error

Check that the DBMS is up and running, that your database exists, that the connection parameters are correct, and that the connecting user has the proper privileges. In the logs, this error may appear as DAO-related errors generated by the Spring framework at some levels of execution, including a failure to create a DAO list.

Too Many Files Open

This Linux system error requires setting system properties. It happens when the Lucene index is large and is being reindexed on loading a new terminology, or is being optimized or cleaned up.

Loader Hangs with no Errors

This can happen when processor capacity is maxed out or when there is network latency of one variety or another. For very large terminologies it may be necessary to just wait this out, but it helps to work with local Lucene files (highly recommended) and a local database, and to move load operations to a system with multiple processor cores and adequate memory (16 GB or more). Even for memory-efficient loads, such as those using the Spring Batch functions, indexing is still memory bound and can go much faster with more memory.
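
Two of the conditions in the table above, the limit on open files and the reachability of the database server, can be checked before starting a long load. The following is a minimal sketch and not part of LexEVS: the function names are made up for this example, the host and port are placeholder values for a local MySQL server, and the `resource` module is available only on Unix-like systems. A successful TCP connection shows only that something is listening on the port; it does not verify the database name, the connection parameters, or user privileges.

```python
import resource  # Unix-like systems only
import socket


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors for this process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def database_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    soft, hard = open_file_limits()
    print(f"open-file limit: soft={soft}, hard={hard}")
    # Placeholder values: a MySQL server on this machine, default port 3306.
    print("database reachable:", database_reachable("127.0.0.1", 3306))
```

A low soft limit reported here is one possible explanation for the open-files error during reindexing of a large Lucene index; how to raise the limit depends on the Linux distribution and on how the loader is started.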
