
cTAKES 2.0

Introduction

Mayo Clinic's Clinical Text Analysis and Knowledge Extraction System (cTAKES) lets you build one or more pipelines that process clinical notes and identify clinical named entities. For example, a pipeline can identify mentions of drugs, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity found is given four attributes: the text span, the ontology mapping code, the context (probable/possible, family history of, or history of), and whether the mention is negated.
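To make the four attributes concrete, here is a minimal sketch of how one extracted mention could be represented. It is not the cTAKES type system: the class name, field names, the `NONE` context value and the placeholder ontology code are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Context(Enum):
    """Context values listed in the text; NONE is an assumed 'no special context' value."""
    NONE = "none"
    PROBABLE_OR_POSSIBLE = "probable/possible"
    FAMILY_HISTORY_OF = "family history of"
    HISTORY_OF = "history of"


@dataclass(frozen=True)
class NamedEntityMention:
    """Hypothetical container for the attributes attached to one named entity."""
    begin: int            # start offset of the text span (inclusive)
    end: int              # end offset of the text span (exclusive)
    ontology_code: str    # ontology mapping code (placeholder value below)
    context: Context      # one of the contexts named in the text
    negated: bool         # True if the mention is negated

    def covered_text(self, note: str) -> str:
        """Return the slice of the note that this mention covers."""
        return note[self.begin:self.end]


note = "Patient denies chest pain."
start = note.index("chest pain")
mention = NamedEntityMention(
    begin=start,
    end=start + len("chest pain"),
    ontology_code="EXAMPLE-0001",  # hypothetical code, not a real ontology entry
    context=Context.NONE,
    negated=True,                  # "denies" negates the sign/symptom
)
print(mention.covered_text(note), mention.negated)
```

Storing offsets rather than the matched string keeps each mention tied to its exact position in the note, which matters when the same term appears more than once.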

Overview

cTAKES is built on the UIMA framework. cTAKES 2.0 does not provide a GUI of its own for installation or processing. The cTAKES documentation shows how to use the GUIs provided by the UIMA framework, and how to run cTAKES from a command line.

Before using cTAKES, be aware that it does not provide any mechanisms of its own to handle patient data securely. It is assumed either that cTAKES is installed on a system approved for processing patient data, or that any data processed by cTAKES has already been through a de-identification step, so that applicable laws are complied with.

What's New

The cTAKES community brings you the following functions in 2.0:

  • Consolidation to a Common Type System
  • A merge of the original and integrated versions of cTAKES into a single integrated cTAKES for download and use
  • A new tokenizer that follows Penn Treebank tokenization rules
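As a rough illustration of what Penn Treebank style tokenization means in practice, the sketch below splits contractions and sets punctuation off as separate tokens. It is a simplified approximation written for this page, not the cTAKES tokenizer, and it covers only a few of the Penn Treebank rules; the function name `ptb_like_tokenize` is hypothetical.

```python
import re
from typing import List


def ptb_like_tokenize(text: str) -> List[str]:
    """Tokenize text using a small subset of Penn Treebank conventions."""
    # Split the negative contraction from its host word: "doesn't" -> "does n't"
    text = re.sub(r"(?i)(\w)(n't)\b", r"\1 \2", text)
    # Split common clitics: "patient's" -> "patient 's", "they're" -> "they 're"
    text = re.sub(r"(?i)(\w)('(?:s|re|ve|ll|d|m))\b", r"\1 \2", text)
    # Set off common punctuation marks as their own tokens
    text = re.sub(r'([,;:?!()\[\]"])', r" \1 ", text)
    # Split a period only at the very end of the text, leaving abbreviations alone
    text = re.sub(r"\.(\s*)$", r" .\1", text)
    return text.split()


tokens = ptb_like_tokenize(
    "The patient doesn't have a fever, but the patient's cough persists."
)
print(tokens)
```

For the sentence above this yields `does` and `n't` as separate tokens, `'s` split from `patient`, and the comma and final period as tokens of their own.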

Download and Install

There are two kinds of cTAKES users. The first wishes to use the tool without any compile steps or development environment. The primary goal for these users is to get the tool up and running and to apply it to clinical documents. They install the software like any other product that is bought or downloaded. Note that some configuration files may still need to be changed with a text editor.

The second is a developer who takes the cTAKES code with the intent to extend or modify it to suit project needs. The developer's environment exposes more parameters for fine-tuned control, but it requires an integrated development environment in which to modify, build and deploy the code.

There is also a third install path. It is intended for developers, so it does not count as a third user type. It is possible to build a back-end server out of the software. In that case no graphical user interface is available, and the developer interacts with the software only through application programming interfaces (APIs). Instructions for this are embedded in the Developer install instructions. During the install, pay attention to the Important icon, which marks steps required only if you want a back-end server.

Note that the cTAKES downloaded as a ZIP file has the new integrated structure. Developers who obtain the code through the SVN code repository will still obtain the original structure of cTAKES. We hope to address this in the future.

Instructions for downloading are embedded in the install instructions.

Documentation

Support

We encourage use of our forums as the first line for inquiries. The forums are divided into discussion areas that correspond to the user types.

If you have an issue that cannot be posted in the public forums, you can also email us at clinicalnlp@mayo.edu