
The purpose of this page is to explain the differences between the NCI Thesaurus and the NCI Metathesaurus. This page was provided by the Vocabulary Knowledge Center.

Point of Comparison

NCI Thesaurus

NCI Metathesaurus


The NCI Thesaurus is a standard reference terminology/ontology used by the NCI to support cancer research. It is curated and maintained by NCI staff and contractors. It contains the concepts and relationships needed to support NCI business and scientific programs, as well as those of other users and collaborators such as the FDA and CDISC. Although it is cancer-centric, other terminology has been added to support NCI collaboration with other institutions.

The NCI Metathesaurus is a synthesis of many different terminologies. It is based on the National Library of Medicine's UMLS Metathesaurus, with some proprietary and out-of-scope terminologies removed and additional terminologies of interest to the NCI research community added. It integrates terms and definitions from these different terminologies.


The original intent was to create a stand-alone, controlled terminology in support of the NCI's software systems used for annotation, database search, data mining, text indexing, and natural language processing.

The original intent was to create a comprehensive repository, serving as a dictionary and thesaurus, that contains most of the terminologies from the UMLS Metathesaurus as well as many other biomedical terminologies created by or of interest to NCI and its partners.

Release Format


RRF (Rich Release Format), modeled after the UMLS Metathesaurus


Creative Commons Attribution 4.0 International license (CC BY 4.0)

Each source terminology is subject to its own license. Some proprietary terminologies are included, with permission, and have restrictions on their use. See the current list of licenses for details.

Size as of May 2011

Hierarchical arrangement of nearly 250,000 terms in 89,000 concepts. Over 200,000 cross-links between concepts.

Integrates 3,600,000 terms from 76 sources (NCI Thesaurus is one of the sources, although its representation is not identical) into 1,400,000 biomedical concepts that represent their meaning. Contains 20,000,000 cross-links between content elements.

Subsets

Concepts in subsets are identified within NCI Thesaurus by associations and properties, but are also published separately for a number of users who require them in this format. In the future, many of these will also be converted and published as value sets.

Subsets are not made available by NCI; however, individual sources can be extracted from the NCI Metathesaurus build using publicly available UMLS tools.



Bi-annually and as needed

General Public Description

Related Links

Browse the NCI Thesaurus
BioPortal listing

Browse the NCI Metathesaurus
NCI Metathesaurus Wikipedia article
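
The Subsets comparison above notes that individual sources can be extracted from the NCI Metathesaurus build. As a rough illustration of what that involves, the sketch below filters concept-name rows by source abbreviation. It assumes the NCI Metathesaurus RRF release follows the UMLS MRCONSO.RRF layout (pipe-delimited, with the source abbreviation in the SAB column); the function name `extract_source` and the sample rows are hypothetical and are not part of any NCI or UMLS tool.

```python
import io
from typing import Iterable, Iterator

# Column positions in MRCONSO.RRF as documented for the UMLS Metathesaurus.
# Assumption: the NCI Metathesaurus release uses the same column layout.
CUI, SAB, TTY, CODE, STR = 0, 11, 12, 13, 14


def extract_source(lines: Iterable[str], source_abbrev: str) -> Iterator[dict]:
    """Yield one record per MRCONSO row contributed by the given source."""
    for line in lines:
        fields = line.rstrip("\n").split("|")
        if len(fields) <= STR:
            continue  # skip blank or malformed rows
        if fields[SAB] == source_abbrev:
            yield {
                "cui": fields[CUI],
                "tty": fields[TTY],
                "code": fields[CODE],
                "term": fields[STR],
            }


# Two made-up rows for illustration only; the identifiers are not real.
SAMPLE = (
    "C0000001|ENG|P|L0000001|PF|S0000001|Y|A0000001||X1||NCI|PT|X1|Example term one|0|N||\n"
    "C0000001|ENG|S|L0000002|PF|S0000002|Y|A0000002|||D1|MSH|MH|D1|Example term two|0|N||\n"
)

records = list(extract_source(io.StringIO(SAMPLE), "NCI"))
print(records)
```

For a real release, an open file handle (for example `open("MRCONSO.RRF", encoding="utf-8")`, where the file name is assumed from UMLS conventions) can be passed in place of the in-memory sample. Streaming line by line keeps memory use flat even for files with millions of rows.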
