
This document provides information about the National Cancer Institute Common Data Elements (CDEs) developed with the Cancer Therapy Evaluation Program (CTEP). For questions concerning CTEP data in the caDSR, please contact the NCI CTEP CDE Compliance Review Team.

Introduction to CDEs

Common Data Elements (CDEs) are standardized terms for the collection and exchange of data. CDEs are metadata; they describe the type of data being collected, not the data itself. A basic example of metadata is the question presented on a form, "Patient Name," whereas an example of data would be "Jane Smith."

Overview of the CDE Project

The National Cancer Institute (NCI) developed the CDE initiative to address the need for consistent cancer research terminology. To date, the Cancer Therapy Evaluation Program (CTEP) has focused its CDE efforts on metadata used in data collection and reporting for phase 3 clinical trials, by standardizing terminology for questions and values on case report forms (CRFs). The goals of this project are the following:

  • to identify discrete, defined items for data collection
  • to promote consistent data collection in the field
  • to eliminate unneeded or redundant data collection
  • to promote consistent reporting and analysis
  • to reduce the possibility of error related to data translation and transmission
  • to facilitate data sharing

Developing CDEs

To build its collection of CDEs, CTEP has established a collaborative process to engage members of expert committees to identify and define disease-specific terminology. Members of these committees include representatives from NCI and the Clinical Trials Cooperative Groups who are involved in study design, implementation, data collection, and analysis.

The CDE disease committees consider terms that are being used in their field of study to determine whether there is a general need for each, as well as what other terms may be needed. The committee then develops consensus to standardize the language for the question and any associated values. Where possible, committees base the language and values on established standards, such as those of the Commission on Cancer, the American Joint Committee on Cancer, and the World Health Organization, and on NCI resources.

As a result of the committee meetings, CDEs are identified, defined, refined, and classified. Template CRFs, which provide a graphic representation of the "core" CDEs for each disease, are also created during these meetings. The designation of a CDE as "core" for a disease indicates the committee's determination that it is likely to be used for most phase 3 clinical trials. Other CDEs, for which a less frequent need is anticipated, are marked as "non-core" for the disease.

Collections of disease-specific CDEs have been developed and released for public use for bladder, breast, colorectal, gynecologic, lung, prostate, and upper gastrointestinal cancers, as well as for melanoma and leukemia. In addition, expert committees were established in collaboration with the Specialized Programs of Research Excellence (SPOREs) program to develop CDEs related to pathology and specimen banking; these collections of CDEs have also been released and are available for use by the oncology community.

Disease committees were convened in 2002 to develop CDEs for brain and head and neck cancers, as well as for lymphoma, myeloma, and sarcoma, with release in Fall 2003. An effort is also in progress to develop CDEs specific to pediatric clinical trials, and there is a plan to expand the effort to create CDEs for phase 1 and 2 clinical trials.

Using CDEs on CRFs

Once CDEs for a disease are released by CTEP, these CDEs must be implemented on CRFs for all phase 3 studies of that disease submitted to CTEP. A review process has been established to compare the CRF questions and values for a submitted protocol with existing CDEs. The result of this comparison is a series of reports that indicates whether CRF questions and values match the standard language of existing CDEs.

If the language of a question or its corresponding values does not match that of a CDE with a related definition, it is recommended that the CRF be revised to replace the CRF question and values with the existing CDE. If there is a match for neither language nor meaning, a new term is developed for temporary use. This new CDE may be used on the CRFs for the submitted protocol but will not be released for general use until it has been reviewed and approved by the appropriate CDE disease committee.

NCI CDEs are stored in the caDSR, a robust metadata registry developed and maintained by the NCI Center for Biomedical Informatics and Information Technology (CBIIT). The caDSR stores important attributes that are useful both to those constructing CRFs and to those developing information systems. The CDE Browser is the primary user interface for searching, browsing, and exporting CDEs from the caDSR, and it offers information regarding the development and use of CDEs for the oncology community.

Application of ISO 11179 to CDEs

Understanding ISO 11179

The framework of the caDSR is based on ISO 11179: Information Technology - Metadata Registries. Just as the goal of CDEs is to facilitate the sharing of data through common language, the goal of ISO 11179 is to facilitate the sharing of metadata through a common data model. As such, this standard specifies the data (that is, attributes and associated administered components) that need to be stored for each CDE and how the data should be stored. CBIIT has provided documentation on the caDSR wiki about how it has implemented this standard.

An ISO 11179 database is organized into Contexts. A Context may represent a business unit or some other content division. All administered components within the database are associated with a Context, either the one in which they originated or one in which they are used. In the caDSR, Contexts represent NCI programs and divisions. All CDEs that were created by CTEP are associated with the "CTEP" Context. The caDSR also allows for Contexts to indicate their endorsement of a CDE created by another program or division. Such a designation indicates to users that the CDE is approved for use in this other Context as well.

ISO 11179 Terminology

An administered component is an item about which administrative data is collected. Four types of administered components are integral to an ISO 11179 database. Additional types of administered components also exist within the ISO 11179 data model and the caDSR.

The most familiar of these four is the Data Element. A Data Element is the basic unit of data that is being collected in an ISO 11179 database, a metadata descriptor. It represents a semantic concept and indicates the specific type of data to be collected. Data Elements are named and defined in a standardized manner according to Context-specific naming conventions. Within the "CTEP" Context, a Data Element can be thought of as a question on a CRF.

A Data Element Concept is similar in nature to a Data Element. It represents a semantic concept but is not tied to a specific data type. A Data Element Concept may, therefore, be associated with several Data Elements representing the same semantic concept. For example, the Data Elements "Patient Residence Country Code" and "Patient Residence Country Name" both represent the same semantic concept of "Patient Residence Country."

A Value Domain describes in detail the type of data to be collected, independent of the semantic concept. Attributes of a Value Domain include data type, maximum and minimum field lengths, high and low values, unit of measure, and number of decimal places. A Value Domain may also include an enumerated list of specific Valid Values. Within the "CTEP" Context, a Value Domain describes the type of data that is being collected by a question on a CRF. If there is an enumerated list of Valid Values, it is those Valid Values that may appear on the CRF as potential answers.

A Conceptual Domain is a collection or description of related Value Meanings. A Value Meaning is the essence of the data that is being collected, rather than the actual data itself. For example, a response to the question, "Patient Name" might be "Jane Smith". "Jane Smith" is actual data, whereas the essence of the data is "the name of a person." Another example is the question, "Country of Residence," which includes as responses the two-letter code for each country in the world. The codes would be Valid Values in a Value Domain, but the Value Meanings would be the list of countries in the world.
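
The distinction between Valid Values and Value Meanings can be sketched in a few lines of Python. This is only an illustration of the concepts described above, not the caDSR data model: the class name, its fields, and the abbreviated country list are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ValueDomain:
    """Describes the type of data to be collected (hypothetical, simplified)."""
    name: str
    data_type: str
    max_length: Optional[int] = None
    # Enumerated Valid Values mapped to their Value Meanings.
    # Left empty when the Value Domain is not enumerated.
    valid_values: Dict[str, str] = field(default_factory=dict)

    def accepts(self, value: str) -> bool:
        """Return True if `value` is permitted by this Value Domain."""
        if self.max_length is not None and len(value) > self.max_length:
            return False
        if self.valid_values:
            return value in self.valid_values
        return True

    def meaning_of(self, value: str) -> Optional[str]:
        """Return the Value Meaning behind a Valid Value, if there is one."""
        return self.valid_values.get(value)


# Abbreviated, illustrative list; a real Value Domain would cover every country.
country_code = ValueDomain(
    name="Country Code",
    data_type="CHARACTER",
    max_length=2,
    valid_values={"US": "United States", "CA": "Canada", "FR": "France"},
)

print(country_code.accepts("US"))      # True
print(country_code.meaning_of("CA"))   # Canada
```

Here the two-letter codes play the role of Valid Values, while the country names they map to play the role of Value Meanings.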

Basic Relationships of Administered Components

ISO 11179 specifies that each Data Element is associated with one and only one Data Element Concept and with one and only one Value Domain. In this way, the combination of a Data Element Concept and a Value Domain defines a Data Element.

Each Data Element Concept and each Value Domain is associated with one and only one Conceptual Domain. For a given element, the Data Element Concept and Value Domain do not have to be associated with the same Conceptual Domain, although they might be.
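
These one-to-one associations can be pictured as a small set of linked records. The following Python sketch models only the links described above; the class layout and the names "Countries", "Country Code", and "Country Name" are assumptions for illustration and do not reflect the caDSR schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualDomain:
    name: str


@dataclass(frozen=True)
class DataElementConcept:
    name: str
    conceptual_domain: ConceptualDomain  # one and only one


@dataclass(frozen=True)
class ValueDomain:
    name: str
    conceptual_domain: ConceptualDomain  # one and only one


@dataclass(frozen=True)
class DataElement:
    long_name: str
    concept: DataElementConcept  # one and only one
    value_domain: ValueDomain    # one and only one


# Illustrative instances; the domain names are invented for this sketch.
countries = ConceptualDomain("Countries")
residence_country = DataElementConcept("Patient Residence Country", countries)

code_element = DataElement(
    "Patient Residence Country Code",
    residence_country,
    ValueDomain("Country Code", countries),
)
name_element = DataElement(
    "Patient Residence Country Name",
    residence_country,
    ValueDomain("Country Name", countries),
)
```

Both Data Elements share a single Data Element Concept but point to different Value Domains, which is what distinguishes them from one another.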

Naming Data Elements

ISO 11179 requires that Data Elements be named in a consistent manner, allowing for easier searching and retrieval of data. CTEP has developed naming conventions for Data Elements associated with the "CTEP" Context of the caDSR. Please refer to CTEP's Naming Conventions for a full explanation of these rules and guidelines.

Most names are composed of one or some combination of the following types of terms, defined by ISO 11179 as the basic components of Data Elements and other administered components.

| Component of Data Element Names | Definition | Example |
| --- | --- | --- |
| Object Class term | thing about which data is being collected; within the "CTEP" Context, typically represents an object or activity | Treatment |
| Property term | a characteristic or possession of the Object Class | Report Period |
| Representation term | specifies the form of the data that is being collected | Date |
| Qualifier | a modifier that describes any other term, similar to an adjective; within the "CTEP" Context, qualifiers should be used sparingly because of limited name lengths | End |

Data Element Long Name

A Data Element Long Name is composed of one Object Class term, one Property term, and one Representation term. A maximum of three Qualifiers (optional), one modifying each of the other terms, may be added to the name if needed to further clarify the name or to make it distinct from other Data Element Long Names. Data Element Long Names must be unique and distinct from one another.

The Object Class term shall occupy the first position in the name and the Property term shall occupy the second position. A Qualifier shall directly precede the term it modifies. The Representation term shall occupy the last position in the name.

Words or terms in the name are to be separated by spaces. No punctuation or abbreviations are to be used. Each word should have the initial letter in uppercase, with all others in lower-case, unless the word is commonly written otherwise.

The total length of the name, including spaces, is restricted to a maximum of 120 characters.

Example: Treatment Report Period End Date
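
The Long Name rules above lend themselves to a simple automated check. The sketch below is one possible reading of those rules in Python; the function names are hypothetical, and the punctuation test is deliberately literal, so Representation terms written with a hyphen or slash would need an explicit exemption.

```python
import re
from typing import List, Optional

MAX_LONG_NAME_LENGTH = 120


def check_long_name(name: str) -> List[str]:
    """Return a list of rule violations; an empty list means the name passes."""
    problems = []
    if len(name) > MAX_LONG_NAME_LENGTH:
        problems.append(f"longer than {MAX_LONG_NAME_LENGTH} characters")
    # Literal reading of "no punctuation": letters and digits separated by
    # single spaces. Terms such as "Date/Time" would need an exemption.
    if not re.fullmatch(r"[A-Za-z0-9]+( [A-Za-z0-9]+)*", name):
        problems.append("must be words separated by single spaces, no punctuation")
    if any(word[:1].islower() for word in name.split()):
        problems.append("each word should start with an uppercase letter")
    return problems


def build_long_name(
    object_class: str,
    property_term: str,
    representation: str,
    object_class_qualifier: Optional[str] = None,
    property_qualifier: Optional[str] = None,
    representation_qualifier: Optional[str] = None,
) -> str:
    """Assemble a Long Name, placing each Qualifier directly before its term."""
    parts = [
        object_class_qualifier, object_class,
        property_qualifier, property_term,
        representation_qualifier, representation,
    ]
    name = " ".join(part.strip() for part in parts if part)
    problems = check_long_name(name)
    if problems:
        raise ValueError("; ".join(problems))
    return name


print(build_long_name("Treatment", "Report Period", "Date",
                      representation_qualifier="End"))
# Treatment Report Period End Date
```

The check does not test uniqueness, which would require comparing the candidate against the names already registered.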

Data Element Short Name

A Data Element Short Name is an abbreviated form of the Data Element Long Name; it is, therefore, composed of one Object Class term, one Property term, and one Representation term, and up to a maximum of three Qualifiers, one modifying each of the other terms. The Data Element Long Name is to be determined first, and then abbreviated as described below. Data Element Short Names (stored in the Preferred Name field) must be unique and distinct from one another.

The Object Class term shall occupy the first position in the name and the Property term shall occupy the second position. A Qualifier shall directly precede the term it modifies. The Representation term shall occupy the last position in the name.

Words or terms in the name are to be separated by underscores. No punctuation is to be used. The name should be written in uppercase.

The total length of the name, including underscores, is restricted to a maximum of 20 characters. All words are to be abbreviated if a standard abbreviation has been determined by CTEP. If the name still exceeds 20 characters after abbreviations are applied, unabbreviated terms will be truncated to 3 letters as needed, in the following order: Qualifiers, Property term, Representation term, Object Class term.

Example: TX_REPPD_END_DT
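
The abbreviation and truncation steps can be sketched as follows. This is a rough Python illustration under stated assumptions, not CTEP's curation tool: the abbreviation table is hypothetical (inferred from the example above), the role labels are invented for the sketch, and multi-word terms are assumed to be truncated word by word.

```python
from typing import Dict, List, Tuple

MAX_SHORT_NAME_LENGTH = 20
# Order in which unabbreviated terms are truncated, per the rule above.
TRUNCATION_ORDER = ("qualifier", "property", "representation", "object_class")


def build_short_name(terms: List[Tuple[str, str]],
                     abbreviations: Dict[str, str]) -> str:
    """Abbreviate a Long Name given as (role, term) pairs in name order."""
    tokens = []  # each entry: [role, text, has_standard_abbreviation]
    for role, term in terms:
        if term in abbreviations:
            tokens.append([role, abbreviations[term].upper(), True])
        else:
            tokens.append([role, term.upper().replace(" ", "_"), False])

    def joined() -> str:
        return "_".join(token[1] for token in tokens)

    for role in TRUNCATION_ORDER:
        for token in tokens:
            if len(joined()) <= MAX_SHORT_NAME_LENGTH:
                return joined()
            if token[0] == role and not token[2]:
                # Assumption: multi-word terms are truncated word by word.
                token[1] = "_".join(word[:3] for word in token[1].split("_"))
    if len(joined()) > MAX_SHORT_NAME_LENGTH:
        raise ValueError(
            f"cannot fit {joined()!r} into {MAX_SHORT_NAME_LENGTH} characters"
        )
    return joined()


# Hypothetical abbreviation table, inferred from the example above.
ABBREVIATIONS = {"Treatment": "TX", "Report Period": "REPPD", "Date": "DT"}

short_name = build_short_name(
    [("object_class", "Treatment"), ("property", "Report Period"),
     ("qualifier", "End"), ("representation", "Date")],
    ABBREVIATIONS,
)
print(short_name)  # TX_REPPD_END_DT
```

Truncation stops as soon as the name fits, so terms later in the truncation order are left intact whenever possible.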

Best-practice recommendations endorse the use of system-generated short names (preferred names) in curation of CTEP CDEs. The following recommendations were approved at the Content Meeting on 12/14/09:

  • Curators should retain system-generated short names created by the curation tool
  • If user-entered short names are required, follow the guidelines imposed by the consuming application and register an alternate name in caDSR with the appropriate Alt Name Type
  • For items that are moved into OC systems, alternate names will need to be created
  • Standards-based short names (e.g., DICOM, HL7) should follow best practice

Representation Terms

Below is the current list of Representation terms used by CTEP in naming Data Elements.

| Representation term | Definition |
| --- | --- |
| Date | calendar date |
| Time | time of day |
| Date/Time | combined date and time |
| Interval | length of time between specified events |
| Duration | length of occurrence of an event (number/time period) |
| Frequency | how often an event occurs (e.g., daily, weekly) |
| Age-Months | age in months |
| Age-Years | age in years |
| Number | assigned identifier (e.g., patient number, specimen number, telephone number, cycle number, treatment arm number) |
| Count | quantity, number of items |
| Dose | amount of therapy administered or prescribed to or taken by a patient |
| Measurement | dimensions or capacity of an object, or resulting calculation (e.g., diameter, area, volume) |
| Value | numeric laboratory measurement |
| LLN | lower limit normal |
| ULN | upper limit normal |
| UOM | units of measure |
| Rate | relationship between two numbers (e.g., blood pressure rate) |
| Average | mathematical mean |
| Grade | numerical scale to describe extent of something, assigned according to standard criteria |
| Stage | disease staging, assigned according to standard criteria |
| Score | number assigned from standardized test or procedure |
| Amount | numeric value of otherwise unspecified type |
| Ind | response to a yes/no question; includes yes, no, unknown, not available, not assessed, etc. |
| Ind-2 | response to a yes/no question; includes yes, no |
| Ind-3 | response to a yes/no question; includes yes, no, unknown |
| Name | designation for a person or object |
| Code | values that substitute for others |
| E-mail | e-mail address |
| Procedure | enumerated list of treatment procedures |
| Site | anatomic site |
| Reason | explanatory action |
| Source | source of information provided |
| Category | classification |
| Scale | spectrum of values |
| Status | response to a binary question (e.g., positive/negative, left/right) |
| Type | list of values of otherwise unspecified type |
| Specify | free-text description where needed value was not available in associated/related question (i.e., "Other, specify") |
| Text | free-text description of procedure or event |

Character Set and Symbols

The caDSR and CTEP's naming conventions make use of the standard ASCII character set in all administered component names, Valid Values, and Value Meanings. This character set includes all letters in the Latin alphabet (A through Z), in both lower- and upper-case, and the numbers 0 through 9. The following additional characters are included:

<space>  `  ~  !  @  #  $  %  ^  &  *  (  )  -  _  =  +  \  |  [  ]  {  }  ;  :  '  "  ,  <  .  >  /  ?

All Long Names and Preferred Names may only begin with a letter. Valid Values and Value Meanings may begin with symbols unless restricted by the software being used.

Conventions have been established by CTEP for other symbols and formatting that may be needed for Document Text entries or Valid Values.

  • Superscript will be indicated by the symbol ^ (e.g., 10^3)
  • Subscript will be indicated by the symbol \ (e.g., A\2)
  • Symbol for degrees will be written out as "degrees"
  • Symbol for plus or minus will be indicated by +/-
  • Symbol for check mark will be written out as "check"
  • Symbol for less than or equal to will be indicated by <=
  • Symbol for greater than or equal to will be indicated by >=
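
The symbol conventions listed above can be applied mechanically when text is copied from a source that uses special characters. The following Python sketch shows one way to do so; the function names are hypothetical, only digit superscripts and subscripts are handled, and the spacing inserted around written-out words ("degrees", "check") is an assumption.

```python
import re

SUPERSCRIPT_DIGITS = str.maketrans("⁰¹²³⁴⁵⁶⁷⁸⁹", "0123456789")
SUBSCRIPT_DIGITS = str.maketrans("₀₁₂₃₄₅₆₇₈₉", "0123456789")

# Symbols replaced by an ASCII sequence.
SYMBOL_ASCII = {"±": "+/-", "≤": "<=", "≥": ">="}
# Symbols written out as a word (surrounding spaces are an assumption).
SYMBOL_WORDS = {"°": "degrees", "✓": "check"}


def to_ctep_text(text: str) -> str:
    """Rewrite special symbols using the conventions listed above."""
    text = re.sub(r"[⁰¹²³⁴⁵⁶⁷⁸⁹]+",
                  lambda m: "^" + m.group().translate(SUPERSCRIPT_DIGITS), text)
    text = re.sub(r"[₀₁₂₃₄₅₆₇₈₉]+",
                  lambda m: "\\" + m.group().translate(SUBSCRIPT_DIGITS), text)
    for symbol, replacement in SYMBOL_ASCII.items():
        text = text.replace(symbol, replacement)
    for symbol, word in SYMBOL_WORDS.items():
        text = text.replace(symbol, f" {word} ")
    return " ".join(text.split())  # collapse any doubled spaces


def uses_allowed_characters(text: str) -> bool:
    """True if every character is a printable ASCII character or a space."""
    return all(" " <= ch <= "~" for ch in text)


def may_start_name(name: str) -> bool:
    """Long Names and Preferred Names may only begin with a letter."""
    return bool(name) and name[0].isascii() and name[0].isalpha()


print(to_ctep_text("10³"))    # 10^3
print(to_ctep_text("37°C"))   # 37 degrees C
```

Running `uses_allowed_characters` on the converted text gives a quick confirmation that nothing outside the permitted character set remains.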

Using the CDE Browser

Overview of the CDE Browser

The CDE Browser (http://cdebrowser.nci.nih.gov) is the primary user interface for the caDSR. It is a public web site that has a real-time connection to the caDSR, so users of the CDE Browser see updates and edits immediately as they occur.

Using the CDE Browser, users can search, browse, and export Data Elements from the caDSR. The CDE Browser also provides users access to view CDE collections and template CRFs developed by the CTEP CDE disease committees and others.

Definition of Terms

Fields that appear in the CDE Browser are defined below.

Field in CDE Browser

Definition

Data Element

the basic unit of data that is being collected in an ISO 11179 database, a metadata descriptor. Within the "CTEP" Context, a Data Element can be thought of as a question on a CRF; may also be referred to as a CDE.

CDE ID

a unique seven-digit identifier assigned to each Data Element, may also be referred to as the Data Element's Public ID; each Data Element has one and only one CDE ID.

Preferred Name

the field that stores the short, 20-30 character name ("computer" name) of a Data Element; other administered components (i.e., Value Domains, Data Element Concepts, and Conceptual Domains) also have Preferred Names.

Long Name

the field that stores the primary name of a Data Element; other administered components (i.e., Value Domains, Data Element Concepts, and Conceptual Domains) also have Long Names.

Document Text

the field in which additional Data Element names or documentation may be stored. Document Text associated with a Document Type of "Long Name" ("Long Name" in old CDE Browser) or "Historic Short CDE Name" ("Short Name" in old CDE Browser) are additional Data Element names, often containing the text or question most likely to be used on a CRF. Instructions will be associated with a Document Type of "Comment".

Context

the business unit or other content division that is responsible for creating and managing associated content; in the caDSR, Contexts currently represent NCI programs and divisions.

Workflow Status

the administrative status of a Data Element or other administered component. Within the "CTEP" Context, this refers to the Data Element's progress in the CDE disease committee review process. Please refer to the caDSR Business Rules for definition and usage information for each workflow status.

Version

the version number of an administered component; the version number is incremented when significant changes are made to an administered component. Please refer to the caDSR Business Rules for an explanation of rules governing the creation of new versions of administered components.

Origin

the source of the administered component or standard on which it is based.

Historical CDE ID

a number that was previously assigned to a Data Element as an identifier; a Data Element may have many Historical CDE IDs.

Public ID

the unique seven-digit identifier assigned to each administered component, for a Data Element may also be referred to as the CDE ID; each administered component has one and only one Public ID.

Designation

the indication by a Context of their endorsement of a Data Element created by another program or division, may also be referred to as "Used By"; indicates to users that the Data Element is approved for use in this other Context as well. Please refer to the caDSR Business Rules for more information about the use of designations.

Data Element Concept

the representation of a semantic concept without ties to a specific data type, similar in nature to a Data Element.

Value Domain

the collection of attributes that describe in detail the type of data to be collected. Within the "CTEP" Context, a Value Domain describes the type of data that is being collected by a question on a CRF. If there is an enumerated list of Valid Values, it is these Valid Values that appear on the CRF as potential answers.

Valid Values

the enumerated responses, defined by Value Meanings, associated with a Data Element through its associated Value Domain, may also be referred to as "Permissible Values". Within the "CTEP" Context, values that appear on the CRF as potential answers to a question.

Value Meaning

the essence of the data that is being collected, rather than the actual data itself.

Conceptual Domain

a collection or description of related Value Meanings

Classification

the relational categorization of Data Elements or other administered components for purposes of organization and ease of searching. Within the "CTEP" Context, classifications indicate the collections of Data Elements approved through the CDE disease committee review process and group Data Elements according to probable form use.

Classification Scheme

a defined system for categorizing Data Elements or other administered components, may also be referred to as "CS"; a Classification Scheme is composed of related Classification Scheme Items that serve as categories defining the scope of the scheme. Within the "CTEP" Context, there are three main Classification Schemes: "Disease", "Trial Type Usage", and "Category".

Classification Scheme Item

a category within a Classification Scheme to which Data Elements or other administered components may be assigned, may also be referred to as "CSI".

Core

the designation by a disease committee of a Data Element, indicating the committee's determination that it is likely to be used in most phase 3 clinical trials; used on one or more template CRFs.

Non-core

the designation by a disease committee of a Data Element for which a less frequent need is anticipated; does not appear on any template CRFs for the disease.

Locating CDEs

The CDE Browser provides two main mechanisms for locating CDEs.

The first of these is to enter search criteria in the text boxes on the right frame of the screen. You may search by keyword or CDE ID. In addition or instead, you may search by associated Value Domain or Data Element Concept, Workflow Status(es), and Classification assignments. You may also specify whether you wish to retrieve all versions of each matching CDE or only the most recent version; in most cases, the latest version will be the approved or "Released" version of the Data Element. Keyword searches can be limited to one or more types of Data Element names, if preferred. Searches by CDE ID will search both CDE IDs and Historical CDE IDs.

The second mechanism for locating CDEs is to use the navigation tree in the left frame. Each time you click on a node in the tree, all CDEs associated with that Context, Classification Scheme, Classification Scheme Item, or Protocol Form Template (template CRF) are displayed in the right frame under the search criteria. Navigation Links are displayed at the bottom of the frame so you may view a different page of CDEs.

You may also use a combination of these mechanisms, clicking on a node in the navigation tree and then entering search criteria. The criteria will only be matched against the CDEs associated with the selected Context, Classification Scheme, Classification Scheme Item, or Protocol Form Template. If you do not click on any nodes or if you click on "caDSR Contexts," your search criteria will be matched against all CDEs in the caDSR.
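
The search behavior described above (keyword or CDE ID criteria, optional restriction to a selected Context, and a choice between all versions and the latest version) can be mimicked on a local list of records. The Python sketch below is only an illustration of that logic; the record layout, the function name, and every ID in the sample data are invented, and it does not call the CDE Browser.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CdeRecord:
    cde_id: str
    long_name: str
    version: float
    context: str
    historical_ids: List[str] = field(default_factory=list)


def search(records: List[CdeRecord],
           keyword: Optional[str] = None,
           cde_id: Optional[str] = None,
           context: Optional[str] = None,
           latest_only: bool = True) -> List[CdeRecord]:
    matches = []
    for record in records:
        if context is not None and record.context != context:
            continue
        if keyword is not None and keyword.lower() not in record.long_name.lower():
            continue
        # A CDE ID search checks both current and Historical CDE IDs.
        if (cde_id is not None and cde_id != record.cde_id
                and cde_id not in record.historical_ids):
            continue
        matches.append(record)
    if latest_only:
        latest = {}
        for record in matches:
            best = latest.get(record.cde_id)
            if best is None or record.version > best.version:
                latest[record.cde_id] = record
        matches = list(latest.values())
    return matches


# Invented sample data for illustration only.
records = [
    CdeRecord("1000001", "Treatment Report Period End Date", 1.0, "CTEP"),
    CdeRecord("1000001", "Treatment Report Period End Date", 2.0, "CTEP", ["555"]),
    CdeRecord("1000002", "Patient Residence Country Code", 1.0, "CTEP"),
    CdeRecord("1000003", "Specimen Collection Date", 1.0, "OTHER"),
]

print([r.version for r in search(records, keyword="date", context="CTEP")])  # [2.0]
```

Leaving `context` unset corresponds to searching across all Contexts.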

You can find CTEP CDEs by clicking on "NCI Cancer Therapy Evaluation Program (CTEP)" in the tree. You may further refine your search by clicking on the CTEP folder icon. From here, you may click on "Protocol Form Templates" or "Classifications".

"Protocol Form Templates" allow you to search, by Phase or Disease, the CDEs contained on each template CRF developed by the CDE disease committees. These template CRFs are intended to provide examples of use of the most common CDEs for a particular disease. Within each disease, forms are classified further by type of form and study phase. Once you have selected a template CRF, you can view the template CRF in Microsoft Word or download the associated CDEs using XML or Excel; these links are under the "Search Data Elements" and "Clear" buttons. You may also browse the details of the associated CDEs online.

"Classifications" allow you to search the CDEs by category. "Type of Category" classifies CDEs by their typical use in a clinical trial (e.g., Patient Demographics, Labs, Adverse Events). "Type of Disease" categorizes CDEs by disease, according to the decisions of the CDE disease committees. "Trial Type Usages" provides classification by disease phase or disease description, according to use on template CRFs.

"Type of Disease" allows you to search the CDEs by CDE disease committee designation as "core" or "non-core".

Once a list of CDEs has been selected, either through searching or use of the navigation tree, you may view the details of each individual element online or the entire list may be downloaded using XML or Excel. To download selected CDEs, click on the appropriate link for the preferred format (XML or Excel); these links are under the "Search Data Elements" and "Clear" buttons. The Excel download contains the most pertinent details for each Data Element, including the names of the associated Data Element Concept and Value Domain, and all of the associated Valid Values. The download in XML provides significant detail, including many of the attributes specified in ISO 11179 for each Data Element. The DTD used by the CDE Browser to download Data Elements shows the fields from which data is included and how the data is ordered.

CDE Details

Once you have located the desired CDE by searching or using the navigation tree, you may click on its underlined Preferred Name to view its details online. This will open a pop-up window with five tabs. Those that will be of most use to the general CTEP user include Data Element (details about the CDE), Valid Values (attributes of the associated Value Domain and Valid Values), and Classifications (categorization of the CDE).

Building Case Report Forms (CRFs)

Using CDEs on CRFs

Once CDEs for a disease are released by CTEP, they must be implemented on CRFs for all phase 3 studies of that disease submitted to CTEP. Collections of disease-specific CDEs have been developed and released for public use for bladder, breast, colorectal, gynecologic, lung, prostate, and upper gastrointestinal cancers, as well as for melanoma and leukemia. The exact language of the CDE, both Data Element name and Valid Values, must be used.

CDEs that have been approved for use on CRFs being submitted to CTEP include those that have "CTEP" as their Context or designating Context and also have "Released" or "Released-non-compliant" as their Workflow Status. CDEs in the CTEP Context with other Workflow Statuses are either in the process of being reviewed by a disease committee or have been retired or removed from use. Please consult the Workflow Status definitions for more information.
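
The approval rule in the preceding paragraph reduces to a two-part test, which the following Python sketch spells out. The record type and function name are hypothetical; only the Context and Workflow Status values come from the text above.

```python
from dataclasses import dataclass, field
from typing import Set

APPROVED_STATUSES = {"Released", "Released-non-compliant"}


@dataclass
class DataElementSummary:
    long_name: str
    context: str
    workflow_status: str
    # Contexts that have designated (endorsed) this Data Element.
    designating_contexts: Set[str] = field(default_factory=set)


def is_approved_for_ctep_crf(element: DataElementSummary) -> bool:
    """True if the element may be used on CRFs submitted to CTEP."""
    in_ctep = element.context == "CTEP" or "CTEP" in element.designating_contexts
    return in_ctep and element.workflow_status in APPROVED_STATUSES


released = DataElementSummary("Treatment Report Period End Date", "CTEP", "Released")
draft = DataElementSummary("Treatment Report Period End Date", "CTEP", "Draft New")
print(is_approved_for_ctep_crf(released))  # True
print(is_approved_for_ctep_crf(draft))     # False
```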

Entries in the following fields may be used as questions on CRFs: Long Name, or entries in Document Text of Type "Long Name" or "Historic Short CDE Name". When choosing a CDE to use, look carefully at its names and definition to determine whether it is appropriate for your needs. For a given CDE, the entire set or a subset of the Valid Values may be used as answers to a question.
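
A CRF author can pre-check a question against a chosen CDE before submission. The Python sketch below assumes the exact-language requirement means a character-for-character match; the `Cde` type, its field names, and the sample values are invented for illustration, and the sketch is not the compliance review itself.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Cde:
    long_name: str
    # Document Text entries of Type "Long Name" or "Historic Short CDE Name".
    alternate_question_texts: Set[str] = field(default_factory=set)
    valid_values: Set[str] = field(default_factory=set)


def check_crf_question(question: str, answers: Iterable[str], cde: Cde) -> List[str]:
    """Return a list of mismatches between a CRF question and a CDE."""
    problems = []
    allowed_texts = {cde.long_name} | cde.alternate_question_texts
    if question not in allowed_texts:
        problems.append("question text does not match any name of the CDE")
    if cde.valid_values:
        extra = sorted(set(answers) - cde.valid_values)
        if extra:
            problems.append("values not among the CDE's Valid Values: " + ", ".join(extra))
    return problems


# Invented example CDE; the alternate text and values are placeholders.
country = Cde(
    long_name="Patient Residence Country Code",
    alternate_question_texts={"Country of Residence"},
    valid_values={"US", "CA", "FR"},
)

print(check_crf_question("Country of Residence", ["US", "CA"], country))  # []
```

Because a subset of the Valid Values is acceptable, the check only flags answers that fall outside the CDE's list.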

If you cannot locate a "Released" or "Released-non-compliant" CDE that is appropriate for the question you would like to ask on your CRF, expand your search to include CDEs with other Workflow Statuses. Many of these CDEs are currently being reviewed by the CDE disease committees, but special approval may be given for their use on your CRFs if suitable and there are no "Released" terms that might be recommended. Specifically, you might look for CDEs with the following Workflow Statuses: Committee Approved, Committee Submitted Used, Approved for Trial Use, Draft Mod, Draft New, Committee Submitted, Retired Withdrawn.

If you are unable to locate any CDEs that are appropriate for the question you would like to ask, please word the question in a manner similar to other CDEs and submit it on your CRF. The reviewers will conduct an extensive search to determine if there is an existing CDE to recommend or if there is need for a "Draft New" CDE to be created.

For questions concerning CTEP data in the caDSR, please contact the NCI CTEP CDE Compliance Review Team.

CDE Compliance Review

When your CRFs have been developed, submit them to CTEP's Protocol and Information Office (PIO), who will forward them to the CDE reviewers for compliance review. It is preferred that the CRFs be submitted as e-mail attachments without security that prevents copying text from the files.

The initial review produces three Excel spreadsheets that report the results of the CDE Compliance Review, along with a statement of whether the CRFs are considered CDE-compliant.

The Question Comparison Report indicates for each CRF question whether it was considered an exact match to an existing CDE, whether it should be replaced with a recommended term, or whether it has been created as a new element to meet the specific needs of the protocol. Where indicated, a response will be required of you in the Group Comments column, such as whether you agree with and will use the recommendations or would like to suggest another Data Element for use.

The Valid Value Comparison Report indicates whether, for each CRF question, each CRF valid value was considered an exact match to an existing value, whether it should be replaced with a recommended term, or whether it has been created as a new value to meet the specific needs of the protocol. Again, where indicated, a response will be required as to whether you agree with and will use recommendations or would like to suggest another Valid Value or Data Element for use.

The Proposed New Data Elements Report includes those CRF questions for which there was no possible match in the dictionary. It is requested that you develop the definition for each of these so new Data Elements may be created. Use of these new Data Elements is allowed on a one-time basis for the particular protocol for which they were created; they will also be forwarded to the appropriate CDE disease committee as part of the CDE change management process. If approved by the committee, these Data Elements will be published and will be available for use in future studies.

Unless the CRFs are CDE-compliant, a response is required of you. Please indicate your responses, where required, on the three spreadsheets and submit these and your revised forms for re-review. The same review process will be conducted on any non-compliant questions, as well as any questions or values new to or modified on the CRFs.

When the CRFs are CDE-compliant, final spreadsheets indicating the CDE ID numbers used will be sent to you through PIO and to the Cancer Trials Support Unit (CTSU), if appropriate.

It is required that you resubmit your CRFs when any changes are made to them. To speed the review process, please include a memo outlining what changes were made.

CDE Development

The development of CDEs has been a collaboration of CTEP and the Clinical Trials Cooperative Group Program, Specialized Programs of Research Excellence (SPOREs), the Cancer Biomarkers Research Group, the Early Detection Research Network, NCI Center for Bioinformatics, Oracle Corporation, and The EMMES Corporation.

Standards Leveraged by the CDE Project

The CDE project extensively leverages all existing work supporting the collection of common data, such as the CRFs, surveys, and data reporting formats developed by a variety of groups.

  • Expanded Participation Project (EPP)
  • Clinical Data Update System (CDUS) through the Cancer Therapy Evaluation Program (CTEP)
  • Clinical Trials Cooperative Group Program
  • Cancer Family Registry
  • Cancer Genetics Network
  • Early Detection Research Network (EDRN)
  • Lung Cancer Biomarkers and Chemoprevention Consortium (LCBCC)
  • Specialized Programs of Research Excellence (SPOREs)

The following cancer-specific standards have also been considered in developing CDEs:

  • American Joint Committee on Cancer (AJCC)
  • American College of Surgeons Commission on Cancer (COC)
  • NCI's Surveillance, Epidemiology, and End Results (SEER) program
  • North American Association of Central Cancer Registries (NAACCR)

The following national and international standards and standards organizations were consulted in developing CDEs:

  • World Health Organization (WHO)
  • International Classification of Diseases (ICD)
  • Standard Industrial Classification (SIC)
  • National Drug Codes (NDC)
  • International Medical Terminology (IMT)
  • Medical Dictionary for Regulatory Activities (MedDRA)
  • Unified Medical Language System (UMLS)
  • Digital Imaging and Communications in Medicine (DICOM)

Refer to the Glossary at the end of this document for a more complete list of all organizations contributing to the standards used in CDE development.

Disease Committee Participation

The disease and special topics committees include representatives from one or more groups listed below. Oracle Corporation and The EMMES Corporation provide technical support to the CDE project.

  • Cancer Therapy Evaluation Program (CTEP)
  • Lung Cancer Biomarkers and Chemoprevention Consortium (LCBCC)
  • American College of Surgeons Oncology Group (ACOSOG)
  • Cancer and Leukemia Group B (CALGB)
  • Children's Oncology Group (COG)
  • Eastern Cooperative Oncology Group (ECOG)
  • European Organization for Research and Treatment of Cancer (EORTC)
  • Gynecologic Oncology Group (GOG)
  • National Cancer Institute (NCI)
  • National Cancer Institute of Canada (NCIC)
  • National Surgical Adjuvant Breast and Bowel Project (NSABP)
  • New Approaches to Brain Tumor Therapy (NABTT)
  • North Central Cancer Treatment Group (NCCTG)
  • Radiation Therapy Oncology Group (RTOG)
  • Southwest Oncology Group (SWOG)

Glossary

Acronyms and Abbreviations

Acronym

Definition

AJCC

American Joint Committee on Cancer http://www.cancerstaging.org/

CaPCURE

Association for the Cure of Cancer of the Prostate http://www.capcure.org

CDC

U.S. Centers for Disease Control and Prevention http://www.cdc.gov

CDEs

common data elements http://cdebrowser.nci.nih.gov

CDUS

Clinical Data Update System http://ctep.cancer.gov/reporting/cdus.html

CFR

U.S. Code of Federal Regulations http://www.access.gpo.gov/nara/cfr/cfr-table-search.html

COC

Commission on Cancer (American College of Surgeons) http://www.facs.org/cancer/index.html

CRFs

case report forms

CTC (2.0)

Common Toxicity Criteria http://ctep.cancer.gov/reporting/ctc.html

CTCAE (3.0)

Common Terminology Criteria for Adverse Events http://ctep.cancer.gov/reporting/ctc.html

CTEP

Cancer Therapy Evaluation Program http://ctep.cancer.gov/

CTSU

Cancer Trials Support Unit http://www.ctsu.org/

DICOM

Digital Imaging and Communications in Medicine (Radiological Society of North America) http://www.rsna.org/practice/dicom/

EDRN

Early Detection Research Network http://www3.cancer.gov/prevention/cbrg/edrn/

ELCAP

Early Lung Cancer Action Program http://icscreen.med.cornell.edu/

EPP

Expanded Participation Project http://spitfire.emmes.com/study/epp/

FDA

U.S. Food and Drug Administration http://www.fda.gov/

FIGO

International Federation of Gynecology and Obstetrics http://www.figo.org/

HIPAA

Health Insurance Portability and Accountability Act http://www.jhita.org/admsimp.htm

HISB

Healthcare Informatics Standards Board (American National Standards Institute) http://www.ansi.org/standards_activities/standards_boards_panels/hisb/overview.aspx?menuid=3

HL7

Health Level Seven http://www.hl7.org/

IBCSG

International Breast Cancer Study Group http://www.ibcsg.org/

ICD

International Classification of Diseases http://www.who.int/whosis/icd10/

ICH

International Conference on Harmonisation [sic] of Technical Requirements for Registration of Pharmaceuticals for Human Use http://www.ich.org/

IEC

International Electrotechnical Commission http://www.iec.ch/

IMT

International Medical Terminology

ISCO 88

International Standard Classification of Occupations http://www.ilo.org/public/english/bureau/stat/class/isco.htm

ISO

International Organization for Standardization http://www.iso.org/

ISO 3166

International Organization for Standardization, Codes for the Representation of Names of Countries and Their Subdivisions http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/index.html

ISO 8601

International Organization for Standardization, Data Elements and Interchange Formats - Information Interchange - Representation of Dates and Times http://www.iso.ch/iso/en/prods-services/popstds/datesandtime.html

ISO 11179

International Organization for Standardization, Information Technology - Specification and Standardization of Data Elements
http://isotc.iso.ch/livelink/livelink/fetch/2000/2489/Ittf_Home/PubliclyAvailableStandards.htm??Redirect=1

LCBCC

Lung Cancer Biomarkers and Chemoprevention Consortium

LOINC

Logical Observation Identifiers, Names and Codes http://www.loinc.org

MedDRA

Medical Dictionary for Regulatory Activities http://www.fda.gov/MedWatch/report/meddra.htm

NAACCR

North American Association of Central Cancer Registries http://www.naaccr.org

NABTT

New Approaches to Brain Tumor Therapy http://www.nabtt.org

NAICS

North American Industry Classification System http://www.census.gov/epcd/www/naics.html

NCCTG

North Central Cancer Treatment Group http://ncctg.mayo.edu/

NCI

U.S. National Cancer Institute http://www.cancer.gov

NCICB

U.S. National Cancer Institute Center for Bioinformatics http://ncicb.nci.nih.gov/

NDC

National Drug Code http://www.fda.gov/cder/ndc

NHANES

National Health and Nutrition Examination Survey http://www.cdc.gov/nchs/nhanes.htm

NIH

U.S. National Institutes of Health http://www.nih.gov/

NLM

U.S. National Library of Medicine http://www.nlm.nih.gov

RECIST

Response Evaluation Criteria in Solid Tumors http://www.nci.nih.gov/bip/RECIST.htm

SEER

Surveillance, Epidemiology, and End Results Program http://seer.cancer.gov/

SIC

Standard Industrial Classification http://www.osha.gov/oshstats/sicser.html

SPOREs

Specialized Programs of Research Excellence http://spores.nci.nih.gov

UMLS

Unified Medical Language System http://www.nlm.nih.gov/research/umls/

USHIK

United States Health Information Knowledgebase http://www.ushik.org/

WHO

World Health Organization http://www.who.int

Clinical Trials Cooperative Groups

CTEP CDE Category Definitions

Category

Definition

Adverse Events

CDEs that characterize the untoward effects of the therapeutic intervention using the Common Toxicity Criteria (CTC) and Common Terminology Criteria for Adverse Events (CTCAE), to consistently grade the severity of the event and to provide information as to whether the treatment may have been a cause.

Cytogenetics

CDEs that characterize the results of cellular analysis to identify genetic abnormalities present in cancer by examining cellular components concerned with the structure and function of chromosomes responsible for development and differentiation of cells.

Disease Description

CDEs that characterize the disease, such as diagnosis, location, and extent.

Eligibility Criteria

CDEs that characterize the criteria used to assess whether the individual is eligible for the clinical trial or study.

Follow-up

CDEs that characterize the sequential assessment of the disease or vital status of the individual, such as progression, long-term toxicity, and date of death.

Immunophenotype

CDEs that characterize the results of analysis to divide leukemias and lymphomas into clonal subgroups on the basis of differences in their cell surfaces and cytoplasmic antigens, detecting these differences using monoclonal antibodies, flow cytometry, etc.

Labs

CDEs that characterize the results of laboratory tests such as LDH, WBC, creatinine, and glucose.

Molecular Analysis

CDEs that characterize the results of genetic tests to identify abnormalities present in cancer that are related to the chemical structure, function, replication, and mutation of DNA and RNA molecules in the transmission of genetic information, and health effects relating to gene arrangement and RNA transcription to direct the formation of proteins.

Patient Characteristics

CDEs that characterize the health and emotional state of the individual enrolled on the study including treatments for a prior cancer.

Patient Demographics

CDEs that characterize the individual enrolled on the study, such as name, address, date of birth, weight, and height.

Protocol/Administrative

CDEs that characterize the regulatory, reporting, and data management aspects of clinical trials, such as IRB date and protocol number.

Response

CDEs that characterize the outcome of the study, such as overall tumor response, partial response, and first date observed.

Treatment

CDEs that characterize the properties of the intervention regimen, such as agent, dose, procedure, and modalities.

Tumor Markers

CDEs that characterize disease-specific biological markers indicating the presence or level of involvement of disease, such as PSA and CA125.

CDE Terms

CDE Term

Definition

Administered component

an item about which administrative data is collected. A Data Element is the most familiar type of administered component, although many administered component types exist within an ISO 11179 database and within the caDSR.

caDSR

Cancer Data Standards Registry and Repository, a robust metadata registry, developed and maintained by the NCI Center for Bioinformatics and Information Technology, that stores NCI CDEs and related attributes.

Case report form (CRF)

a data collection form.

Case report form (CRF) module

a collection and sequence of elements grouped to provide context for the information requested by the CRF questions.

CDE Browser

the primary user interface (http://cdebrowser.nci.nih.gov) to search, browse, and export Data Elements from the caDSR.

CDE ID

a unique seven-digit identifier assigned to each Data Element, also referred to as the Data Element's Public ID; each Data Element has one and only one CDE ID.

Classification

the relational categorization of Data Elements or other administered components for purposes of organization and ease of searching. Within the "CTEP" Context, classifications indicate the collections of Data Elements approved through the CDE disease committee review process and group Data Elements according to probable form use.

Classification Scheme

a defined system for categorizing Data Elements or other administered components, also referred to as "CS"; a Classification Scheme is composed of related Classification Scheme Items that serve as categories defining the scope of the scheme. Within the "CTEP" Context, there are three main Classification Schemes: "Disease", "Trial Type Usage", and "Category".

Classification Scheme Item

a category within a Classification Scheme to which Data Elements or other administered components may be assigned; also referred to as "CSI".

Common data element (CDE)

a standardized term for the collection and exchange of data. CDEs are metadata; they describe the type of data being collected, not the data itself. A basic example of metadata is the question presented on a form, "Patient Name," whereas an example of data would be "Jane Smith".

Conceptual Domain

a collection of or description of related Value Meanings; may be either enumerated or descriptive only.

Context

an organizational division within an ISO 11179 database. A Context may represent a business unit or some other content division that is responsible for creating and managing associated content. All administered components within the database are associated with a Context, either that in which they originated or are used. All CDEs that were created by CTEP are associated with the "CTEP" context.

Core CDE

a Data Element that is included on one or more template CRFs. The designation of a CDE as "core" for a disease indicates the committee's determination that it is likely to be used in most phase 3 clinical trials.

Data Element

the basic unit of data that is being collected in an ISO 11179 database, a metadata descriptor. It represents a semantic concept and indicates the specific type of data to be collected. Data Elements are named and defined in a standardized manner according to Context-specific naming conventions.

Data Element Concept

the representation of a semantic concept without ties to a specific data type, similar in nature to a Data Element.

Data type

the form in which data is being collected (e.g., character, date, integer) in response to a Data Element; specified for each Value Domain.

Decimal Place

the number of places behind the decimal point in the response to a Data Element; specified for a Value Domain.

Definition

the detailed meaning of a Data Element or other administered component.

Designation

the indication by a Context of its endorsement of a Data Element created by another program or division, also referred to as "Used By"; indicates to users that the Data Element is approved for use in the designating Context as well. Please refer to the caDSR Business Rules for more information about the use of designations.

Document Text

the field in which additional Data Element names or documentation may be stored; names in this field are not bound by naming conventions.

Historical CDE ID

a number that was previously assigned to a Data Element as an identifier; a Data Element may have many Historical CDE IDs.

ISO 11179

Information Technology - Metadata Registries (http://standards.iso.org/ittf/PubliclyAvailableStandards/index.html), developed by the International Organization for Standardization and the International Electrotechnical Commission.

Long Name

the field that stores the primary name of a Data Element or other administered component.

Maximum length

the maximum number of storage units (of the corresponding data type) that may be used in representing the response to a Data Element; specified for a Value Domain.

Metadata

data (attributes) that describe the type of data being collected.

Minimum length

the minimum number of storage units (of the corresponding data type) that must be used in representing the response to a Data Element; specified for a Value Domain.

Non-core CDE

a Data Element for which a CDE disease committee anticipates a less frequent need. Non-core CDEs are not included on any template CRFs for a disease.

Object Class term

an administered component, frequently used in naming Data Elements; a thing about which data is being collected.

Origin

the source of the administered component or standard on which it is based.

Phase 1 Clinical Trials

these first studies in people evaluate how a new drug should be given (by mouth, injected into the blood, or injected into the muscle), how often, and what dose is safe. A phase 1 trial usually enrolls only a small number of patients.

Phase 2 Clinical Trials

a phase 2 trial continues to test the safety of the drug and begins to evaluate how well the new drug works. Phase 2 studies usually focus on a particular type of cancer.

Phase 3 Clinical Trials

these studies test a new drug, a new combination of drugs, or a new surgical procedure in comparison to the current standard for treatment. A participant will usually be assigned to the standard treatment group or the new treatment group at random (called randomization). Phase 3 trials often enroll large numbers of people and may be conducted at many doctors' offices, clinics, and cancer centers nationwide.

Preferred Name

the field that stores the short, 20- or 30-character name ("computer" name) of a Data Element or other administered component.

Property term

an administered component, frequently used in naming Data Elements; a characteristic or possession of the object class.

Public ID

the unique seven-digit identifier assigned to each administered component; for a Data Element, also referred to as the CDE ID. Each administered component has one and only one Public ID.

Qualifier

an attribute, frequently used in naming Data Elements and other administered components; a modifier that describes any other term, similar to an adjective.

Representation term

an attribute, frequently used in naming Data Elements and other administered components; specifies the form of the data that is being collected.

Template CRFs

a CRF developed as a guideline by a CDE disease committee to provide a graphic representation of the "core" CDEs for each disease. The designation of a CDE as "core" for a disease indicates the committee's determination that it is likely to be used in most phase 3 clinical trials.

Valid Values

the enumerated responses, defined by Value Meanings, that are associated with a Data Element through its Value Domain; also referred to as "Permissible Values". Within the "CTEP" Context, these are the values that appear on the CRF as potential answers to a question.

Value Domain

the collection of attributes that describe in detail the type of data to be collected. Attributes of a Value Domain include data type, maximum and minimum field lengths, high and low values, unit of measure, and number of decimal places. A Value Domain may also include an enumerated list of specific Valid Values.

Value Meaning

the essence of the data that is being collected, rather than the actual data itself. For example, a response to the question "Patient Name" might be "Jane Smith." "Jane Smith" is actual data, whereas the essence of the data is "the name of a person." Another example is the question "Country of Residence," which includes as responses the two-letter code for each country in the world. The codes would be Valid Values in a Value Domain, but the Value Meanings would be the list of countries in the world.

Version

the version number of an administered component; the version number is incremented when significant changes are made to an administered component. Please refer to the caDSR Business Rules for an explanation of rules governing the creation of new versions of administered components.

Workflow Status

indication of the administrative status of a Data Element or other administered component. Within the "CTEP" Context, this refers to the Data Element's progress in the CDE disease committee review process. Please refer to the caDSR Business Rules for definition and usage information for each workflow status.
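The relationship among the Value Domain, Valid Values, and Value Meaning terms defined in the table above can be sketched as a small data structure. This is a simplified illustration under stated assumptions, not the caDSR data model: the class name `ValueDomain`, its field names, the `problems` method, and the sample domains (including the weight limits and unit) are hypothetical.

```python
from dataclasses import dataclass, field
from decimal import Decimal, InvalidOperation
from typing import Dict, List, Optional


@dataclass
class ValueDomain:
    """Hypothetical, simplified model of a Value Domain's attributes."""
    data_type: str                          # e.g. "NUMBER" or "CHARACTER"
    max_length: Optional[int] = None
    min_length: Optional[int] = None
    high_value: Optional[Decimal] = None
    low_value: Optional[Decimal] = None
    unit_of_measure: Optional[str] = None
    decimal_places: Optional[int] = None
    # Enumerated Valid Values mapped to their Value Meanings (empty if not enumerated).
    valid_values: Dict[str, str] = field(default_factory=dict)

    def problems(self, response: str) -> List[str]:
        """Return the reasons a response does not fit this Value Domain."""
        found = []
        if self.max_length is not None and len(response) > self.max_length:
            found.append("longer than maximum length")
        if self.min_length is not None and len(response) < self.min_length:
            found.append("shorter than minimum length")
        if self.valid_values and response not in self.valid_values:
            found.append("not an enumerated Valid Value")
        if self.data_type == "NUMBER":
            try:
                number = Decimal(response)
            except InvalidOperation:
                number = None
            if number is None or not number.is_finite():
                found.append("not a finite number")
                return found
            if self.high_value is not None and number > self.high_value:
                found.append("above high value")
            if self.low_value is not None and number < self.low_value:
                found.append("below low value")
            # Count digits after the decimal point in the response as written.
            places = max(0, -number.as_tuple().exponent)
            if self.decimal_places is not None and places > self.decimal_places:
                found.append("too many decimal places")
        return found


# "Country of Residence": two-letter codes are the Valid Values,
# country names are the Value Meanings (only three shown here).
country = ValueDomain(
    data_type="CHARACTER", max_length=2, min_length=2,
    valid_values={"US": "United States of America", "CA": "Canada", "FR": "France"},
)

# A non-enumerated numeric domain with made-up limits.
weight = ValueDomain(
    data_type="NUMBER", max_length=5, low_value=Decimal("0"),
    high_value=Decimal("500"), unit_of_measure="kg", decimal_places=1,
)

print(country.problems("CA"), country.valid_values["CA"])
print(weight.problems("72.55"))
```

Keeping the Valid Value to Value Meaning mapping inside the Value Domain mirrors the glossary: the code recorded on the form is the data, while the meaning it stands for is metadata.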

Additional Resources