NIH | National Cancer Institute | NCI Wiki  

Each line of the data file contains one record, is prefixed by the table name, and is comma delimited.

Important!

Leave blank any fields that are included in the CDUS standard but not used by CTRP.

Valid Record Formats and Field Sequence
COLLECTIONS,<Study_Identifier>,,,,,,,,,<Change_Code>
PATIENTS,<Study_Identifier>,<Study_Subject_Identifier>,<Zip_Code>,<Country_Code>,<Birth_Date>,<Gender>,<Ethnicity>,<Payment_Method>,<Subject_Registration_Date>,<Registering_Group_Identifier>,<Study_Site_Identifier>,,,,,,,,,,<Subject_Disease_Code>,,
PATIENT_RACES,<Study_Identifier>,<Study_Subject_Identifier>,<Race>
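
The positional layout above can be sketched as a small Python parser. This is a minimal illustration, not part of CTRP: the names `TEMPLATES` and `parse_record` are hypothetical, and the field-count check is an assumption of this sketch rather than a documented CTRP validation rule.

```python
import csv

# Record templates copied from the formats above. An empty name marks a
# CDUS position that is left blank for CTRP.
TEMPLATES = {
    "COLLECTIONS": "COLLECTIONS,<Study_Identifier>,,,,,,,,,<Change_Code>",
    "PATIENTS": (
        "PATIENTS,<Study_Identifier>,<Study_Subject_Identifier>,<Zip_Code>,"
        "<Country_Code>,<Birth_Date>,<Gender>,<Ethnicity>,<Payment_Method>,"
        "<Subject_Registration_Date>,<Registering_Group_Identifier>,"
        "<Study_Site_Identifier>,,,,,,,,,,<Subject_Disease_Code>,,"
    ),
    "PATIENT_RACES": "PATIENT_RACES,<Study_Identifier>,<Study_Subject_Identifier>,<Race>",
}


def parse_record(line):
    """Split one batch-file line and label its fields by position.

    Returns (table_name, {field_name: value}) for the named positions only.
    """
    # csv.reader removes the optional double quotes around a field.
    fields = next(csv.reader([line]))
    table = fields[0]
    if table not in TEMPLATES:
        raise ValueError(f"unknown table name: {table!r}")
    names = TEMPLATES[table].split(",")
    # Assumption of this sketch: a record must have exactly as many
    # positions as its template.
    if len(fields) != len(names):
        raise ValueError(
            f"{table}: expected {len(names)} fields, got {len(fields)}"
        )
    record = {
        name.strip("<>"): value
        for name, value in zip(names[1:], fields[1:])
        if name  # skip the blank CDUS-only positions
    }
    return table, record
```

Calling `parse_record` on a `PATIENT_RACES` line returns the table name together with a dictionary keyed by `Study_Identifier`, `Study_Subject_Identifier`, and `Race`.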

The following example batch file is for a study with three study subjects and one race per subject. It uses CTRP-accepted text values: for example, "Male" appears instead of the CDUS-accepted numeric value "1".

Example Batch File
COLLECTIONS,"NCI-2011-03861",,,,,,,,,1
PATIENTS,"NCI-2011-03861",873222899999999,84124,,196311,Male,Unknown,Private Insurance,20060809,CALGB,149280,,,,,,,,,,238.7,,
PATIENTS,"NCI-2011-03861",8732228,84124,,196311,Male,Unknown,Private Insurance,20060809,CALGB,149280,,,,,,,,,,238.7,,
PATIENTS,"NCI-2011-03861",1,84124,,196311,Male,Unknown,Private Insurance,20060809,CALGB,149280,,,,,,,,,,185.0,,
"PATIENT_RACES","NCI-2011-03861",8732228,White
"PATIENT_RACES","NCI-2011-03861",873222899999999,Asian
"PATIENT_RACES","NCI-2011-03861",1,White

The following example batch file, also accepted by CTRP, is for the same study but uses CDUS-accepted numeric codes instead of the text values used in the example above.

Example Batch File
COLLECTIONS,"NCI-2011-03861",,,,,,,,,1
PATIENTS,"NCI-2011-03861",873222899999999,84124,,196311,1,9,1,20060809,CALGB,149280,,,,,,,,,,238.7,,
PATIENTS,"NCI-2011-03861",8732228,84124,,196311,1,9,1,20060809,CALGB,149280,,,,,,,,,,238.7,,
PATIENTS,"NCI-2011-03861",1,84124,,196311,1,9,1,20060809,CALGB,149280,,,,,,,,,,185.0,,
"PATIENT_RACES","NCI-2011-03861",8732228,01
"PATIENT_RACES","NCI-2011-03861",873222899999999,05
"PATIENT_RACES","NCI-2011-03861",1,01

Special Characters

If you include any of the following characters in a value, enclose the field with double quotes:

! " # $ % & ' ( ) * + , - . / : ; < > = ? @ [ ] \ ^ _ { } | ~

If you enclose a field in double quotes (as in "NCI-2012-00225"), CTRP interprets the string inside the quotes exactly as presented. If the field does not contain any special characters, the quotation marks are optional.
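
The quoting rule can be sketched in Python as follows. The helper names `quote_field` and `build_record` are hypothetical. Two caveats apply. First, the page does not say how to write a double quote inside a quoted field, so doubling it (common CSV practice) is an assumption. Second, the sketch applies the rule strictly, so it also quotes values such as 238.7, which the example files on this page leave unquoted.

```python
# The special characters listed above.
SPECIAL_CHARS = set("!\"#$%&'()*+,-./:;<>=?@[]\\^_{}|~")


def quote_field(value):
    """Return the field text, wrapped in double quotes if it needs them."""
    text = str(value)
    if any(ch in SPECIAL_CHARS for ch in text):
        # Assumption: an embedded double quote is escaped by doubling it.
        return '"' + text.replace('"', '""') + '"'
    return text


def build_record(*fields):
    """Join field values into one comma-delimited batch-file line."""
    return ",".join(quote_field(field) for field in fields)


print(build_record("PATIENT_RACES", "NCI-2011-03861", 8732228, "White"))
```

Because the underscore is in the special-character list, the table name `PATIENT_RACES` is quoted as well, which matches the way the `PATIENT_RACES` lines are written in the example batch files.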


ICD-O-3 Trial Format for Topography Codes

For trials using ICD-O-3 codes, use the Subject Disease Code position for the ICD-O-3 Topography and Morphology codes (Morphology must include both the Histology and Behavior codes). When you use both Topography and Morphology codes, separate them with a semicolon, as in the example below.
Format: site code; histology code
Code: C64.9;8000/3

Examples:

SDC Topography
PATIENTS,"NCI-2011-03861",8732228,84124,,196311,1,9,1,20060809,CALGB,149280,,,,,,,,,,238.7,,
ICD-O-3 Topography and Morphology
PATIENTS,"NCI-2011-03861",8732228,84124,,196311,1,9,1,20060809,CALGB,149280,,,,,,,,,,C64.9;8000/3,,
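
The combined ICD-O-3 value can be taken apart as in the following Python sketch. The function name `split_icdo3` is hypothetical, and the checks reflect only what this page states (a semicolon between Topography and Morphology, and a Morphology code that carries both Histology and Behavior, written as in 8000/3). It is not a validator for the ICD-O-3 code lists themselves.

```python
def split_icdo3(code):
    """Split 'topography;histology/behavior' into its parts.

    Histology and behavior are None when only a topography code is given.
    """
    topography, semicolon, morphology = code.partition(";")
    result = {
        "topography": topography.strip(),
        "histology": None,
        "behavior": None,
    }
    if not result["topography"]:
        raise ValueError("missing topography (site) code")
    if semicolon:
        histology, slash, behavior = morphology.strip().partition("/")
        # Morphology must include both parts, as in 8000/3.
        if not (histology and slash and behavior):
            raise ValueError(
                "morphology must include histology and behavior codes"
            )
        result["histology"] = histology
        result["behavior"] = behavior
    return result


print(split_icdo3("C64.9;8000/3"))
```

For the value in the example above, the function separates the site code C64.9 from the histology code 8000 and the behavior code 3.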