NIH | National Cancer Institute | NCI Wiki
{multi-excerpt:name=definition}
*Data level* is a method of data categorization used within the [TCGA|https://wiki.nci.nih.gov/x/bZNXAg] network to help researchers communicate about and locate their data of interest.

Data levels are assigned for each data type, platform, and center. There are four data levels: Level 1 (Raw Data), Level 2 (Processed Data), Level 3 (Segmented or Interpreted Data), and Level 4 ([Region of Interest|https://wiki.nci.nih.gov/x/AZhXAg] Data).
{multi-excerpt}

Data Level Classification

Data levels distinguish raw data from derived data, and derived data from higher-level analyses or interpreted results, for each data type, platform, and center.

The following table lists and describes each TCGA data level.

||Data Level||Level Type||Description||Example||
|1|Raw|{multi-excerpt:name=level_raw}* Low-level data for a single sample
* Not normalized{multi-excerpt}|
* Sequence trace file
* Affymetrix CEL file (see footnote)
* BAM file|
|2|Processed|{multi-excerpt:name=processed}* Normalized single-sample data
* Interpreted for presence or absence of specific molecular abnormalities{multi-excerpt}|
* Putative mutation call for a single sample
* Probed locus amplification/deletion/Loss of Heterozygosity (LOH) calls in a sample
* Signal of a probe or probe set for a sample|
|3|Segmented/Interpreted|{multi-excerpt:name=segmented}* Aggregate of processed data from a single sample
* Grouped by probed loci to form larger contiguous regions (in some cases){multi-excerpt}|
* Validated mutation call for a single sample
* Amplification/deletion/Loss of Heterozygosity (LOH) calls for a sample region
* Expression signal of a gene for a sample
* Genomic copy-number data|
|4|Summary/Regions of Interest (ROI)|{multi-excerpt:name=roi}* Quantified association across classes of samples
* Associations based on two or more
** Molecular abnormalities
** Sample characteristics
** Clinical variables{multi-excerpt}|
* Discovery that a genomic region is amplified in 10% of TCGA glioma samples|

Footnote: For the Affymetrix platform, if CEL files (Level 1 data) are normalized by the RMA (Robust Multi-chip Average) method, the outputs are expression values normalized across all samples (Level 3 data).
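
The classification above can be sketched in code as a small enumeration. This is only an illustration, not part of any TCGA tooling: the names `DataLevel`, `LEVEL_TYPES`, `EXAMPLES`, and `describe` are hypothetical, and the example strings are copied from the table.

```python
from enum import IntEnum


class DataLevel(IntEnum):
    """The four data levels, numbered as in the table (hypothetical names)."""
    RAW = 1
    PROCESSED = 2
    SEGMENTED_INTERPRETED = 3
    SUMMARY_ROI = 4


# Level type labels as they appear in the "Level Type" column.
LEVEL_TYPES = {
    DataLevel.RAW: "Raw",
    DataLevel.PROCESSED: "Processed",
    DataLevel.SEGMENTED_INTERPRETED: "Segmented/Interpreted",
    DataLevel.SUMMARY_ROI: "Summary/Regions of Interest (ROI)",
}

# One example per level, taken from the "Example" column.
EXAMPLES = {
    DataLevel.RAW: "Affymetrix CEL file",
    DataLevel.PROCESSED: "Putative mutation call for a single sample",
    DataLevel.SEGMENTED_INTERPRETED: "Expression signal of a gene for a sample",
    DataLevel.SUMMARY_ROI: "Genomic region amplified in 10% of TCGA glioma samples",
}


def describe(level: int) -> str:
    """Return a one-line description for a level number (1 to 4)."""
    lvl = DataLevel(level)  # raises ValueError for numbers outside 1..4
    return f"Level {lvl.value} ({LEVEL_TYPES[lvl]}): e.g. {EXAMPLES[lvl]}"
```

Because `IntEnum` members compare as integers, the ordering from raw to summary data is preserved (`DataLevel.RAW < DataLevel.SUMMARY_ROI`).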


Relationships Between Data Type and Data Level

Each platform can produce multiple data types. To understand data categorization, it is important to clarify the relationship between data type and data level.

Each data type is associated with sets of data that span one or more data levels. Each center and platform may have a slightly different concept of data level, depending on its data types and the algorithms used for analysis.
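
One way to picture this relationship is a lookup keyed by center, platform, and data type, where each entry lists the levels that data spans. The sketch below is purely illustrative: every center, platform, and data type name in it is a made-up placeholder, and the level assignments are not taken from any real TCGA submission.

```python
# Hypothetical placeholders only: no real center, platform, or data type is implied.
LEVELS_BY_SOURCE = {
    ("center_a", "platform_x", "data_type_1"): frozenset({1, 2, 3}),
    ("center_b", "platform_y", "data_type_1"): frozenset({2, 3}),
    ("center_b", "platform_y", "data_type_2"): frozenset({3}),
}


def levels_for(center: str, platform: str, data_type: str) -> list[int]:
    """Return the sorted data levels recorded for one center/platform/data type."""
    return sorted(LEVELS_BY_SOURCE.get((center, platform, data_type), frozenset()))


def sources_at_level(level: int) -> list[tuple[str, str, str]]:
    """Return every (center, platform, data type) key that has data at `level`."""
    return sorted(key for key, levels in LEVELS_BY_SOURCE.items() if level in levels)
```

Keying on all three fields reflects the point that the same data type can span different levels at different centers or on different platforms.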

For descriptions of data types and their corresponding data levels, see Data Types and Data Levels.