What is the NCI Thesaurus?

The NCI Thesaurus is a reference terminology and biomedical ontology used in a growing number of NCI and other systems. It covers vocabulary for clinical care, translational and basic research, and public information and administrative activities. The NCI Thesaurus provides definitions, synonyms, and other information on nearly 10,000 cancers and related diseases, 8,000 single agents and combination therapies, and a wide range of other topics related to cancer and biomedical research. It is maintained by a multidisciplinary team of editors, who add about 900 new entries each month, and it is published monthly by NCI.

The NCI Thesaurus is different from the NCI Metathesaurus. Find out how in this comparison.

NCI Thesaurus Processing

The NCI Thesaurus is authored using the open-source Protégé editor. Each month the Thesaurus is exported as an OWL file for publication. This OWL file goes through several processing steps before being loaded into LexBIG.

  • The OWL file is scrubbed of any internal properties such as editor notes and review status.
  • The OWL file goes through a formatting program that ensures all concepts treed under Retired_Concepts are declared as owl:DeprecatedClass.
  • The OWL file is run through a QA program that detects and flags any violations of the EVS business rules, allowing us to notify the editors.
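
The three steps above can be illustrated with a minimal Python sketch. This is not the EVS processing code: the namespace, the property names Editor_Note and Review_Status, the sample data, and the single QA rule (every concept needs an rdfs:label) are all hypothetical stand-ins, since the actual internal properties and business rules are not listed here. The sketch also assumes that Retired_Concepts itself stays an owl:Class and only its descendants are redeclared.

```python
import xml.etree.ElementTree as ET

OWL = "http://www.w3.org/2002/07/owl#"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/thesaurus#"  # hypothetical namespace, not the real one

# Hypothetical names for internal (editor-only) properties.
INTERNAL_PROPERTIES = {f"{{{EX}}}Editor_Note", f"{{{EX}}}Review_Status"}

# Tiny made-up export used only to exercise the sketch.
SAMPLE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:ex="http://example.org/thesaurus#">
  <owl:Class rdf:about="http://example.org/thesaurus#Retired_Concepts">
    <rdfs:label>Retired Concepts</rdfs:label>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/thesaurus#C1">
    <rdfs:label>Old term</rdfs:label>
    <rdfs:subClassOf rdf:resource="http://example.org/thesaurus#Retired_Concepts"/>
    <ex:Editor_Note>check spelling</ex:Editor_Note>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/thesaurus#C2">
    <rdfs:subClassOf rdf:resource="http://example.org/thesaurus#C1"/>
  </owl:Class>
  <owl:Class rdf:about="http://example.org/thesaurus#C3">
    <rdfs:label>Active term</rdfs:label>
    <ex:Review_Status>pending</ex:Review_Status>
  </owl:Class>
</rdf:RDF>"""


def scrub_internal_properties(root):
    """Step 1: drop internal properties from every top-level concept."""
    for concept in root:
        for child in list(concept):
            if child.tag in INTERNAL_PROPERTIES:
                concept.remove(child)


def mark_retired(root, retired_root):
    """Step 2: redeclare every descendant of retired_root as owl:DeprecatedClass."""
    about, sub = f"{{{RDF}}}about", f"{{{RDFS}}}subClassOf"
    retired = {retired_root}
    changed = True
    while changed:  # repeat until no new descendants are found
        changed = False
        for concept in root:
            parents = {s.get(f"{{{RDF}}}resource") for s in concept.findall(sub)}
            if concept.get(about) not in retired and parents & retired:
                retired.add(concept.get(about))
                changed = True
    for concept in root:
        if concept.get(about) in retired and concept.get(about) != retired_root:
            concept.tag = f"{{{OWL}}}DeprecatedClass"


def find_violations(root):
    """Step 3: flag concepts that break the (hypothetical) label rule."""
    return [c.get(f"{{{RDF}}}about") for c in root
            if c.find(f"{{{RDFS}}}label") is None]


def process(xml_text):
    """Run the three steps in order and return the tree plus flagged concepts."""
    root = ET.fromstring(xml_text)
    scrub_internal_properties(root)
    mark_retired(root, EX + "Retired_Concepts")
    return root, find_violations(root)


if __name__ == "__main__":
    _, flagged = process(SAMPLE)
    print("Concepts to report to the editors:", flagged)
```

Flagged concepts are returned rather than removed, which mirrors the description above: the QA step detects and reports violations so the editors can be notified, rather than altering the content itself.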

Once the OWL file is processed, it is loaded into LexBIG using the OWL loader.

For a list of significant changes made to the NCI Thesaurus over time, see NCI Thesaurus Changes Over Time.

