NIH | National Cancer Institute | NCI Wiki  

...

Introduction to the CSSI DCC Portal

Data repositories are important tools in cancer research: they provide safe and sustainable locations to store data, supply input data for meta-analyses, and allow researchers to collaborate and share information through a common resource. The Center for Strategic Scientific Initiatives (CSSI) sponsors a diverse array of projects that generate datasets that vary in content and format, yet are related across certain defining characteristics or metadata. Integrated management of the datasets across all sponsored projects makes the data more accessible and easier for the cancer research community to reuse.

The CSSI Data Coordinating Center (CSSI DCC) stores and manages access to data generated in support of cancer research funded or supported by the CSSI. The CSSI DCC is a public repository for archive files that describe a scientific investigation, its study or studies, and each study's assay(s). Archives in this repository are in the standard Investigation-Study-Assay tab-delimited (ISA-TAB) format, which allows you to curate, manage, and reuse your own datasets and those that others create. For more information on the ISA-TAB format, refer to the following section, Chapter 1: Getting Started with the CSSI DCC Portal, as well as the ISA-TAB specification.

You can download the full archive files for any public investigation without logging in. Full archive files include the metadata describing the investigation, its study or studies, and each study's assay(s), as well as any data files associated with the investigation. You can also download the metadata and selected data files separately from the full archive.

The CSSI DCC Portal is the repository for CSSI DCC data. It serves the following purposes:

  • Provides a common location and web access to data of disparate types, including gene expression results from Next Generation Sequencing, microarray experiments, histopathological images, metabolomics data, and proteomics data, allowing easy access by multiple collaborators and researchers in different geographic locations.
  • Is flexible enough to handle new and unspecified data types.
  • Stores the data in one common location so that you can gain biological insights that might otherwise be missed when data is scattered across multiple locations.
  • Applies the information gained from one study to multiple studies and projects.
  • Allows you to search the metadata from each study to identify datasets of interest.
  • Develops data storage and data mining modules that can be applied across studies, avoiding duplication of effort and saving costs.
  • Develops and/or adopts common vocabularies, data standards, and ontologies for data representation, storage, and comparison.  

What is ISA-TAB?

Investigation-Study-Assay tab-delimited format (ISA-TAB) is a format, based on the ISA-TAB specification, that is used to capture and communicate the complex metadata required to interpret investigations (experiments) that employ combinations of technologies. Metadata in ISA-TAB format facilitates standards-compliant collection, curation, management, and reuse of datasets in a wide variety of life science domains. ISA-TAB builds on the existing paradigm of the Microarray Gene Expression Tabular format (MAGE-TAB), a tab-delimited format for exchanging microarray data.
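
Because ISA-TAB metadata is tab-delimited text, it can be read with general-purpose tools. The following Python sketch illustrates the idea of loading a small study table and looking up a row by a metadata value. It is only an illustration, not part of the CSSI DCC Portal: the table contents, the column headers shown, and the helper names read_tab_table and find_rows are assumptions made for this example, and real study files carry many more columns.

```python
import csv
import io

# Hypothetical miniature study table, invented for this example.
# Columns are separated by tabs, one record per line.
SAMPLE_STUDY = (
    "Source Name\tCharacteristics[organism]\tSample Name\n"
    "patient-1\tHomo sapiens\tsample-1\n"
    "patient-2\tHomo sapiens\tsample-2\n"
)


def read_tab_table(handle):
    """Return the rows of a tab-delimited table as dicts keyed by column header."""
    return list(csv.DictReader(handle, delimiter="\t"))


def find_rows(rows, column, value):
    """Return the rows whose `column` equals `value` (exact, case-sensitive match)."""
    return [row for row in rows if row.get(column) == value]


rows = read_tab_table(io.StringIO(SAMPLE_STUDY))
matches = find_rows(rows, "Sample Name", "sample-2")
print(matches[0]["Source Name"])  # prints "patient-2"
```

To read a file downloaded from an archive instead of the inline sample, pass an open file handle to read_tab_table, for example open(path, newline="", encoding="utf-8"), where path is wherever you saved the file.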

...