National Cancer Institute U.S. National Institutes of Health www.cancer.gov

This page describes the different ways data can be classified in TCGA: by data type, by data level, and by the relationship between the two.

Data Type

A data type is a label used to categorize the many forms of platform data within the TCGA Network.

Each platform can produce several kinds of data (data types). For example, SNP-based array platforms are the most complex because each yields three data types: Copy Number Results, LOH and SNP. The following table identifies data types produced by the six listed platforms.

| Platform | Copy Number Results | LOH | SNP |
|---|---|---|---|
| Agilent Human Genome CGH Custom Microarray 2x415K | yes | | |
| Agilent Human Genome CGH Microarray 244A | yes | | |
| Agilent SurePrint G3 Human CGH Microarray Kit 1x1M | yes | | |
| Affymetrix Genome-Wide Human SNP Array 6.0 | yes | yes | yes |
| Illumina 550K Infinium HumanHap550 SNP Chip | yes | yes | yes |
| Illumina Human1M-Duo BeadChip | yes | yes | yes |
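The platform and data type pairing above can also be expressed as a lookup table. The following is a minimal Python sketch, not part of any TCGA tool: the names `PLATFORM_DATA_TYPES` and `platforms_producing` are hypothetical, and the attribution of LOH and SNP to the three SNP-based array platforms is an assumption drawn from the statement that SNP-based arrays yield all three data types.

```python
# Data types named on this page.
COPY_NUMBER = "Copy Number Results"
LOH = "LOH"
SNP = "SNP"

# Assumption: only the three SNP-based array platforms produce LOH and SNP;
# the three CGH platforms are taken to produce Copy Number Results only.
PLATFORM_DATA_TYPES = {
    "Agilent Human Genome CGH Custom Microarray 2x415K": {COPY_NUMBER},
    "Agilent Human Genome CGH Microarray 244A": {COPY_NUMBER},
    "Agilent SurePrint G3 Human CGH Microarray Kit 1x1M": {COPY_NUMBER},
    "Affymetrix Genome-Wide Human SNP Array 6.0": {COPY_NUMBER, LOH, SNP},
    "Illumina 550K Infinium HumanHap550 SNP Chip": {COPY_NUMBER, LOH, SNP},
    "Illumina Human1M-Duo BeadChip": {COPY_NUMBER, LOH, SNP},
}


def platforms_producing(data_type):
    """Return the sorted names of platforms that produce the given data type."""
    return sorted(
        platform
        for platform, data_types in PLATFORM_DATA_TYPES.items()
        if data_type in data_types
    )
```

Under these assumptions, `platforms_producing(COPY_NUMBER)` returns all six platforms, while `platforms_producing(LOH)` returns only the Affymetrix and Illumina arrays.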

Data Level Classification

Data level is a method of data categorization used within the TCGA network to help researchers communicate about and locate their data of interest.

Data levels are assigned for each data type, platform and center. There are four data levels: Level 1 (for Raw Data), Level 2 (for Processed Data), Level 3 (for Segmented or Interpreted Data) and Level 4 (for Region of Interest Data).

The following table outlines and describes the four TCGA data levels.

| Data Level | Level Type | Description |
|---|---|---|
| 1 | Raw | Low-level data for a single sample; not normalized |
| 2 | Processed | Normalized single-sample data; interpreted for presence or absence of specific molecular abnormalities |
| 3 | Segmented/Interpreted | Aggregate of processed data from a single sample; grouped by probed loci to form larger contiguous regions (in some cases) |
| 4 | Summary/Regions of Interest (ROI) | Quantified association across classes of samples; associations based on two or more of: molecular abnormalities, sample characteristics, clinical variables |
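The four levels above map naturally onto an ordered enumeration. The following Python sketch is illustrative only: `DataLevel`, `LEVEL_TYPE`, `describe`, and `is_single_sample` are hypothetical names, not part of any TCGA software. The single-sample check reflects the descriptions above, where Levels 1 to 3 refer to a single sample and Level 4 refers to classes of samples.

```python
from enum import IntEnum


class DataLevel(IntEnum):
    """The four TCGA data levels, ordered from raw to summarized."""
    RAW = 1
    PROCESSED = 2
    SEGMENTED = 3  # "Segmented/Interpreted" in the table
    SUMMARY = 4    # "Summary/Regions of Interest (ROI)" in the table


# Level type labels as they appear in the table.
LEVEL_TYPE = {
    DataLevel.RAW: "Raw",
    DataLevel.PROCESSED: "Processed",
    DataLevel.SEGMENTED: "Segmented/Interpreted",
    DataLevel.SUMMARY: "Summary/Regions of Interest (ROI)",
}


def describe(level):
    """Return a label such as 'Level 1: Raw'; raises ValueError for unknown levels."""
    lvl = DataLevel(level)
    return f"Level {lvl.value}: {LEVEL_TYPE[lvl]}"


def is_single_sample(level):
    """Levels 1 to 3 describe a single sample; Level 4 spans classes of samples."""
    return DataLevel(level) < DataLevel.SUMMARY
```

Using `IntEnum` keeps the levels comparable, so a check such as `DataLevel.RAW < DataLevel.PROCESSED` reads the same way the table does.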

Relationships Between Data Type and Data Level

Each platform can produce multiple data types. To understand data categorization, it is important to clarify the relationship between data type and data level.

Each data type is associated with sets of data that span one or more data levels. Each center and platform may have a slightly different concept of data level, depending on its data types and the algorithms used for analysis.

For descriptions of data types and corresponding data levels, see the Data Types and Data Levels section in the TCGA Data Portal.
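One way to picture this relationship is a catalog keyed by center, platform, and data type, where each key holds the set of levels available for it. The sketch below is a hypothetical illustration: the `LevelCatalog` class, the center name `example-center`, and the levels registered in the usage lines are all made up for the example and do not describe real TCGA holdings.

```python
from collections import defaultdict

VALID_LEVELS = (1, 2, 3, 4)


class LevelCatalog:
    """Tracks which data levels exist for each (center, platform, data type)."""

    def __init__(self):
        self._levels = defaultdict(set)

    def register(self, center, platform, data_type, level):
        """Record that data at the given level exists for this combination."""
        if level not in VALID_LEVELS:
            raise ValueError(f"unknown data level: {level!r}")
        self._levels[(center, platform, data_type)].add(level)

    def levels_for(self, center, platform, data_type):
        """Return the sorted levels registered for this combination."""
        return sorted(self._levels.get((center, platform, data_type), set()))


# Illustrative usage with a hypothetical center and made-up levels.
catalog = LevelCatalog()
platform = "Affymetrix Genome-Wide Human SNP Array 6.0"
catalog.register("example-center", platform, "SNP", 2)
catalog.register("example-center", platform, "SNP", 1)
print(catalog.levels_for("example-center", platform, "SNP"))  # prints [1, 2]
```

Keying on all three fields reflects the point that the same data type may span different levels at different centers or on different platforms.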
