NIH | National Cancer Institute | NCI Wiki  

...

LexGrid Vocabulary Services for caBIG® (LexBIG)

Version 1.0
Last Modified: April 13, 2009

...

The large medical thesaurus of the UMLS is distributed as a set of text-based, pipe-delimited ("|") files in Rich Release Format (RRF), which can be subset into individual terminologies depending on the user's needs. NCI's MetaThesaurus is also RRF formatted. We map individual terminologies, the entire NCI MetaThesaurus, and the UMLS terminology SEMNET into LexGrid, using specific loaders and mappings for each.
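
As an illustration of the file format only, the following minimal sketch reads pipe-delimited RRF rows and keeps those belonging to one source terminology. It is not the LexBIG loader; the function name `read_rrf` and the sample rows are hypothetical, and the column list assumes the MRCONSO.RRF layout documented for the UMLS.

```python
# Column order assumed from the UMLS documentation for MRCONSO.RRF.
MRCONSO_COLUMNS = [
    "CUI", "LAT", "TS", "LUI", "STT", "SUI", "ISPREF", "AUI", "SAUI",
    "SCUI", "SDUI", "SAB", "TTY", "CODE", "STR", "SRL", "SUPPRESS", "CVF",
]


def read_rrf(lines, columns, sab=None):
    """Yield one dict per RRF row, optionally keeping only rows for one SAB.

    Hypothetical helper for illustration; not part of LexBIG.
    """
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("|")
        # Every RRF row ends with a trailing '|', which produces one
        # extra empty field after splitting; drop it.
        if fields[-1] == "":
            fields.pop()
        if len(fields) != len(columns):
            raise ValueError(f"expected {len(columns)} fields, got {len(fields)}")
        row = dict(zip(columns, fields))
        if sab is None or row["SAB"] == sab:
            yield row


# Made-up rows in MRCONSO layout (not taken from a real release).
sample = [
    "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||||MSH|MH|D000001|Example heading|0|N||",
    "C0000001|ENG|S|L0000002|PF|S0000002|N|A0000002||||SNOMEDCT|PT|100001|Example term|9|N||",
]

mesh_rows = list(read_rrf(sample, MRCONSO_COLUMNS, sab="MSH"))
print(mesh_rows[0]["STR"])
```

Filtering on the SAB column is one simple way to subset a combined RRF distribution into a single terminology before mapping it.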

  • Supported Coding Scheme Attributes:
    • These are not mapped as categories to a model element. That is, a supported association has an attributeTag column with a corresponding name, but its context is implied by the name of the supported attribute. For instance, supported associations have an attributeTag of "association", but that tag corresponds to no element in the model element SupportedAssociation. Instead, the context is implied by the name of the element SupportedAssociation.
  • Preferred Presentation Selection:
    • Preferred Presentation is determined first by sorting the presentations so that those in the default language of the terminology come first. If more than one presentation exists in the default language, the "most preferred" is then determined in the following manner:
      Using the "ISPREF" column, the "TS" and "STT" columns in the MRCONSO RRF file, or a combination of these columns. The MRRANK file overrides these columns.
  • Preferred Definition Selection:
    • Definitions in the UMLS are not ranked; the first definition found for a concept in the source file MRDEF.RRF is set as preferred.
  • Special SNOMED adjustments for concept presentation language:
    • SNOMED handles its default language settings differently from other UMLS terminologies; as a result, we hard-code its default language as "en".
    • Presentation language is determined by combining the values of SUI, LUI and CUI from MRCONSO and selecting the ATV value from MRSAT where SAB always equals SNOMEDCT and the ATN value is either LANGUAGECODE or SUBSETLANGUAGECODE.
  • Association Qualifiers for MedDRA and others:
    • MedDRA employs SMQs (Standardised MedDRA Queries) as a method of classifying portions of the terminology. These are expressed in MRSAT.RRF when the AUI in the METAUI column is replaced by a RUI code. In LexBIG, this RUI is identified in the MRREL.RRF source as relationships are loaded, and the associated ATN and ATV values from the MRSAT.RRF row are populated as the association qualifier name and value.
  • Hierarchies expressed in source contexts:
    • Hierarchies in the UMLS are expressed in the MRREL.RRF file as source-target pairs. However, source hierarchies may also be expressed in the MRHIER.RRF file. These context-based hierarchies are realized in LexBIG by accessing the MRHIER source where the HCD column value is populated. When this is the case, as in MeSH, the path of AUIs to the root from the code in the HCD column is processed as a hierarchy. LexBIG's behavior is as follows:
      • Entries in MRHIER that define multiple contexts (HCD field) per CUI will trigger additional tracking within the LexBIG environment.
      • Each link is tracked via the corresponding contextual chain (Path To Root field). To do this, we add association qualifiers that tag the association between each pair of participating concepts. The qualifier name is 'HCD' and the value is the HCD field value from the MRHIER file.
      • An individual association between two concepts can participate in multiple context chains by being assigned additional association qualifiers. A complete flow across the entire chain of links (essentially reconstructing the PTR field) can be derived by recursive evaluation of surrounding links that have the same context qualifications. Since each concept can carry multiple text presentations, property qualifiers are used to track the individual terms used in each context.
      • As with associations, multiple qualifiers can be assigned to each text property. Once again, the qualifier name will be 'HCD' and the value will be the HCD field value from the MRHIER file.
      • To query context-specific relationships, we can first use the API to filter the relationships a concept participates in, then query neighboring nodes to determine the complete context path, and finally map back to specific terms through the registered HCD qualifiers.
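
The HCD qualifier scheme described above can be sketched in a few lines. This is a simplified model under stated assumptions, not LexBIG code or its API: links are keyed by AUI rather than by concept, each child is assumed to have at most one parent per HCD value, and the function names, AUIs, and HCD values (`CTX-1` and so on) are all made up for illustration.

```python
from collections import defaultdict


def build_qualified_links(mrhier_rows):
    """Map each (child AUI, parent AUI) link to the set of HCD values tagging it."""
    links = defaultdict(set)
    for row in mrhier_rows:
        hcd = row["HCD"]
        if not hcd or not row["PTR"]:
            continue  # only rows with a populated HCD take part in context tracking
        # PTR lists AUIs from the root down to the immediate parent, dot-separated.
        chain = row["PTR"].split(".") + [row["AUI"]]
        for parent, child in zip(chain, chain[1:]):
            links[(child, parent)].add(hcd)
    return links


def path_to_root(links, aui, hcd):
    """Follow links qualified with `hcd` upward from `aui`; return AUIs root-first."""
    # Assumption: within one HCD value, each child has a single parent.
    parents = {child: parent
               for (child, parent), hcds in links.items() if hcd in hcds}
    path = [aui]
    seen = {aui}
    while path[-1] in parents:
        nxt = parents[path[-1]]
        if nxt in seen:
            raise ValueError("cycle detected in context chain")
        path.append(nxt)
        seen.add(nxt)
    return list(reversed(path))


# Made-up MRHIER-like rows: A3 appears in two contexts, and A4 shares a link with A3.
rows = [
    {"AUI": "A3", "PTR": "A0.A1", "HCD": "CTX-1"},
    {"AUI": "A3", "PTR": "A0.A2", "HCD": "CTX-2"},
    {"AUI": "A4", "PTR": "A0.A1", "HCD": "CTX-3"},
]
links = build_qualified_links(rows)

print(path_to_root(links, "A3", "CTX-1"))  # ['A0', 'A1', 'A3']
print(path_to_root(links, "A3", "CTX-2"))  # ['A0', 'A2', 'A3']
print(sorted(links[("A1", "A0")]))         # ['CTX-1', 'CTX-3']
```

The last line shows a single link carrying two qualifiers, which is how one association can take part in more than one context chain while each chain remains recoverable by filtering on its HCD value.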

...