  • cTAKES 2.5 (See Apache cTAKES for the latest version)


NOTE: For the latest version of cTAKES, see Apache cTAKES.

Mayo Clinic's Clinical Text Analysis and Knowledge Extraction System (cTAKES) is a system through which one creates one or more pipelines to process clinical notes and to identify clinical named entities. For example, a pipeline can identify mentions of drugs, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity that is found is given attributes for the text span, the ontology mapping code, the context (probable/possible, family history of, or history of), and negated/not negated.
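As an illustration only, the attributes listed above can be pictured as a small record attached to each mention. The sketch below is written in Python and is not the cTAKES type system (cTAKES defines its own types on top of UIMA). The class name NamedEntityMention, the Context enum and its NONE member, and the placeholder ontology codes are all hypothetical and chosen just for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Context(Enum):
    """Context values named in the description; NONE is an assumed default."""
    NONE = "none"
    PROBABLE_POSSIBLE = "probable/possible"
    FAMILY_HISTORY_OF = "family history of"
    HISTORY_OF = "history of"


@dataclass(frozen=True)
class NamedEntityMention:
    """Hypothetical record holding the four attributes described above."""
    begin: int            # text span: start offset in the note (inclusive)
    end: int              # text span: end offset in the note (exclusive)
    ontology_code: str    # ontology mapping code (placeholder values below)
    context: Context      # probable/possible, family history of, history of
    negated: bool         # True if the mention is negated

    def covered_text(self, note: str) -> str:
        """Return the slice of the note that this mention spans."""
        return note[self.begin:self.end]


def make_mention(note, phrase, ontology_code, context, negated):
    """Build a mention by locating the first occurrence of phrase in note."""
    begin = note.index(phrase)
    return NamedEntityMention(begin, begin + len(phrase),
                              ontology_code, context, negated)


note = "Patient denies chest pain. Mother had diabetes."
mentions = [
    # Placeholder codes; real codes would come from the ontology in use.
    make_mention(note, "chest pain", "EXAMPLE-CODE-1", Context.NONE, True),
    make_mention(note, "diabetes", "EXAMPLE-CODE-2",
                 Context.FAMILY_HISTORY_OF, False),
]

for m in mentions:
    print(m.covered_text(note), m.ontology_code, m.context.value, m.negated)
```

Storing offsets rather than the text itself keeps each mention tied to its position in the original note, which is how a span attribute is normally used.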


cTAKES is built on the UIMA framework. cTAKES 2.5 does not provide a GUI of its own for installation or processing. The cTAKES documentation shows how to use the GUIs provided by the UIMA framework, and how to run cTAKES from a command line.

Before using cTAKES you need to know that cTAKES does not provide any mechanisms of its own to handle patient data securely. It is assumed that cTAKES is installed on a system that can process patient data, or that any data being processed by cTAKES has already been through a deidentification step in order to comply with any applicable laws.

After cTAKES 2.5 - Apache cTAKES

cTAKES is now an Apache top-level project. The first release of Apache cTAKES was 3.0. For the latest version of cTAKES, see Apache cTAKES.

What Was New in 2.5

The cTAKES community brings you the following functions in cTAKES 2.5:

  • Attributes extractor component (aka Assertion Annotator)
  • Semantic Role Labeler Component (into Dependency parser)
  • Co-ref Resolver updates
  • A new model for Part of Speech tagging (using the newly treebanked data)
  • Sectionizer (part of Core)

Download and Install 2.5

For the latest version of cTAKES, see Apache cTAKES.

For cTAKES 2.5:

There are two kinds of cTAKES users. The first is the user who wants to use the tool without any compile steps or development environment. Their goal is primarily to get the tool up and running and to apply it to clinical documents. These users install the software like any other product that they might buy or download. Note that some configuration files may still need to be changed with a text editor.

The second is the developer who takes the cTAKES code with the intent to extend or modify it to suit project needs. More parameters can be customized in the developer's environment for fine-tuned control, but this requires an integrated development environment in which to modify, build and deploy the code.

There is also a third install path. It is intended for developers, so it does not count as a third user type. It is possible to build a back-end server out of the software. In that case there is no graphical user interface, and the developer has only application programming interfaces (APIs) with which to interact. Instructions for this are embedded in the Developer install instructions. During the install, pay attention to the Important icon, which marks steps that are required only if you want a back-end server.

It is important to reiterate that the cTAKES downloaded as a ZIP file has the new integrated structure. Developers who obtain the code through the SVN code repository will still get the original structure of cTAKES. We hope to address this in the future.

Instructions for downloading are embedded in the install instructions.



cTAKES 2.5 documentation:


After cTAKES 2.5 was released, cTAKES development moved to Apache. See Apache cTAKES for the latest mailing list information and for bug tracking.

Prior to the move to Apache, we encouraged using the following forums as the first line for inquiries. You might wish to search these for cTAKES 2.5 information, but please post any new questions and report newly found bugs via Apache cTAKES.

If you have an issue that cannot be placed into public forums or mailing lists, then we also have an email address for you to use: