NIH | National Cancer Institute | NCI Wiki  

Question: Which files can be parsed into caArray? What is the benefit of file parsing?

Topic: caArray Usage

Release: caArray 2.2.0 and above

Date entered: 03/30/2009

Answer

Files Recognized by caArray

caArray can upload array design files and experiment data from many array providers, even when no parser is available yet for a given format. Such files are imported into caArray without being validated or parsed. Even if a file is not parsed, users can still download it (through the user interface as well as through the programmatic API) and associate it with samples, extracts, and hybridizations. This feature allows data to be shared and helps identify which new parsers need to be developed. For more information on how files are processed by caArray, review caArray 017 - What is the meaning of the caArray Status of Importing - Imported versus Imported Not Parsed?.

The following table (File types from caArray, from Chapter 7 in the caArray User Guide) summarizes the file types that caArray currently supports with full validation and parsing, as well as those that can be imported without validation and parsing. The user's guide also summarizes the array design file types.

Files That Can Be Imported into caArray

File Types

Imported after validation and processing

Imported without validation and parsing

Raw/processed data files

Imported after validation and processing:

  • Affymetrix CEL, CHP
  • GenePix GPR*
  • Illumina CSV

* For GenePix GPR files, the sample names are implicit in the data files themselves. If such files are imported as part of a set of files that includes a MAGE-TAB SDRF, the SDRF file must contain all of those implicit sample names. Otherwise, a validation error occurs.

Imported without validation and parsing:

  • Affymetrix DAT, RPT, TXT, and EXP
  • Agilent TSV, TXT
  • Illumina IDAT, TXT
  • ImaGene TIF, TXT
  • Nimblegen GFF, TXT
  • ScanArray CSV
  • GEO SOFT
  • GEO GSM

Array Design files

Imported after validation and processing:

  • Affymetrix CDF, PGF, CLF
  • Illumina Design CSV
  • GenePix GAL

Note: These can be uploaded, validated, and imported only through the Manage Array Design feature described in Managing Array

Imported without validation and parsing:

  • Agilent CSV, XML
  • UCSF Spot SPT
  • ImaGene TPL
  • Nimblegen NDF

Note: These can be uploaded and imported only through the Manage Array Design feature described in Managing Array

MAGE-TAB files

  • MAGE-TAB SDRF (Sample and Data Relationship Format)
  • MAGE-TAB IDF (Investigation Description Format) only, no referenced SDRFs
    Note: Only one IDF is allowed per import, since the import is in the context of a single experiment.
  • MAGE-TAB ADF
  • MAGE-TAB Data Matrix
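
As a rough illustration of the raw/processed data rows in the table above, the following Python sketch looks up whether a given provider and file type would be parsed or only imported, and checks the GenePix rule that a MAGE-TAB SDRF must list every sample name implicit in the GPR files. It is a minimal sketch, not caArray code: the names `import_mode` and `missing_sdrf_samples` are hypothetical, the implicit sample names are assumed to be already extracted from the GPR files, and the SDRF is assumed to be tab-delimited with a `Sample Name` column.

```python
import csv
import io

# Raw/processed data file types, as read from the table above.
# Hypothetical lookup for illustration only; this is not caArray's own registry.
PARSED = {
    ("affymetrix", "cel"), ("affymetrix", "chp"),
    ("genepix", "gpr"),
    ("illumina", "csv"),
}
NOT_PARSED = {
    ("affymetrix", "dat"), ("affymetrix", "rpt"),
    ("affymetrix", "txt"), ("affymetrix", "exp"),
    ("agilent", "tsv"), ("agilent", "txt"),
    ("illumina", "idat"), ("illumina", "txt"),
    ("imagene", "tif"), ("imagene", "txt"),
    ("nimblegen", "gff"), ("nimblegen", "txt"),
    ("scanarray", "csv"),
    ("geo", "soft"), ("geo", "gsm"),
}


def import_mode(provider, file_type):
    """Classify a raw/processed data file by provider and file type."""
    key = (provider.lower(), file_type.lower().lstrip("."))
    if key in PARSED:
        return "validated and parsed"
    if key in NOT_PARSED:
        return "imported without parsing"
    return "unknown"


def missing_sdrf_samples(sdrf_text, implicit_sample_names):
    """Return the implicit GPR sample names that the SDRF does not list.

    A non-empty result corresponds to the validation error described
    in the GenePix note above.
    """
    reader = csv.DictReader(io.StringIO(sdrf_text), delimiter="\t")
    listed = {
        row["Sample Name"].strip()
        for row in reader
        if row.get("Sample Name")
    }
    return sorted(set(implicit_sample_names) - listed)
```

For example, `import_mode("Agilent", ".txt")` falls in the "imported without parsing" group, and an SDRF that lists only some of the implicit sample names yields the missing ones from `missing_sdrf_samples`.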

Benefit of File Parsing

When data are parsed into caArray, an analytical service such as geWorkbench can retrieve the data through the programmatic API and then analyze or plot them. Another example is WebGenome, a caArray client that pulls parsed data from caArray experiments and plots log ratio values against chromosome location. With parsed data, a client can request only the quantitation types (columns) of interest instead of retrieving the entire contents of the data file.
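
The column-level access described above can be pictured with a small Python sketch. It does not use the caArray API: `parsed_data` is a hypothetical in-memory stand-in for one parsed data file, the column names are invented for illustration, and `select_columns` is an assumed helper that returns only the requested columns.

```python
# Hypothetical stand-in for one parsed data file: each named column
# holds one value per probe. Column names are invented for illustration.
parsed_data = {
    "ProbeId": ["p1", "p2", "p3"],
    "Signal": [120.5, 87.0, 301.2],
    "Detection": ["P", "A", "P"],
    "LogRatio": [0.12, -0.40, 1.05],
}


def select_columns(data, wanted):
    """Return only the requested columns, failing loudly on unknown names."""
    missing = [name for name in wanted if name not in data]
    if missing:
        raise KeyError(f"unknown columns: {missing}")
    return {name: data[name] for name in wanted}


# A plotting client that only needs log ratios can skip every other column.
subset = select_columns(parsed_data, ["ProbeId", "LogRatio"])
```

With an unparsed file, by contrast, the client would have to download the whole file and extract the columns itself.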

Have a comment?

Please leave your comment in the caArray End User Forum.
