
Question: What is the format of caArray supplementary files that can be loaded into caIntegrator?

Topic: caIntegrator Usage

Release: caIntegrator 1.2 and above

Date entered: 08/31/2010

Answer

This knowledge base entry describes the format for creating supplemental files for use by caIntegrator. The supplemental files described here are to be added to an experiment in caArray prior to configuring a study in caIntegrator.

The file itself is a tab-delimited text file. The file extension can be anything, though .txt is typically used. The name of each supplemental file has to be unique within a caArray experiment.

Each row in the file contains the data from one reporter. Each column must have a unique header name. (That is, you cannot give two different columns the same name.)
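The unique-header rule can be checked before a file is uploaded. The following is a minimal sketch, not part of caArray or caIntegrator; the function name `duplicate_headers` and the column names `ProbeID` and `Value` are hypothetical and chosen only for illustration.

```python
import csv
import io
from collections import Counter


def duplicate_headers(text):
    """Return the header names that occur more than once in the first row."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader, [])  # an empty file yields an empty header
    counts = Counter(header)
    return sorted(name for name, n in counts.items() if n > 1)


# Hypothetical file contents: one valid header row, one with a repeated name.
good = "ProbeID\tValue\nprobe_1\t0.42\n"
bad = "ProbeID\tValue\tValue\nprobe_1\t0.42\t0.40\n"

print(duplicate_headers(good))
print(duplicate_headers(bad))
```

An empty result means every column name is unique; any names returned need to be renamed before the file is added to the experiment.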

There are 2 supported formats:

  1. SINGLE SAMPLE FORMAT
  2. MULTIPLE SAMPLE FORMAT

SINGLE SAMPLE FORMAT

The format requirements for the single sample format file are as follows:

  1. The file must have at least two columns.
  2. One column must contain the reporter/probe name.
  3. One column must contain the value reported by the reporter.
  4. The file can have additional columns, but every column other than the reporter/probe name and value columns is ignored.
  5. There must be one SINGLE SAMPLE FORMAT file for each sample in the experiment.

Refer to the illustration below for an example of SINGLE SAMPLE FORMAT.

screenshot of example of SINGLE SAMPLE FORMAT where the exact data shown is not significant
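As a rough illustration of the requirements above, the sketch below builds the text of one SINGLE SAMPLE FORMAT file from a mapping of reporter names to values. The header names `ProbeID` and `Value`, the function name, and the sample data are assumptions made for this example; the entry does not prescribe specific header names.

```python
import csv
import io


def single_sample_text(values, reporter_header="ProbeID", value_header="Value"):
    """Build tab-delimited text with one reporter column and one value column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([reporter_header, value_header])  # unique column headers
    for reporter, value in values.items():
        writer.writerow([reporter, value])  # one row per reporter
    return buffer.getvalue()


# Hypothetical data for a single sample.
sample_a = {"probe_1": 0.42, "probe_2": -1.3}
text = single_sample_text(sample_a)
print(text)
```

Writing the returned text to a uniquely named file (for example with a .txt extension) and repeating this for each sample gives one file per sample, as item 5 requires.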

MULTIPLE SAMPLE FORMAT

The format requirements for the multiple sample format file are as follows:

  1. One column must contain the reporter/probe name.
  2. Each additional column contains the reporter values for one sample, so that there is one column per sample.
  3. There must be one MULTIPLE SAMPLE FORMAT file for the whole experiment.
  4. CAVEAT: Currently the MULTIPLE SAMPLE FORMAT is slower to load than the SINGLE SAMPLE FORMAT for platforms other than Agilent Copy Number. This performance issue is expected to be improved in a future release.

Refer to the illustration below for an example of MULTIPLE SAMPLE FORMAT.

screenshot of example of MULTIPLE SAMPLE FORMAT where the exact data shown is not significant
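The multiple sample layout can be sketched in the same way: one reporter column followed by one value column per sample. In this example the header name `ProbeID`, the use of sample names as column headers, the function name, and the data are all assumptions for illustration. The sketch also assumes every sample reports the same set of reporters and raises an error otherwise, because this entry does not say how missing values should be written.

```python
import csv
import io


def multiple_sample_text(samples, reporter_header="ProbeID"):
    """Build tab-delimited text with one reporter column and one column per sample."""
    sample_names = list(samples)
    # Take the reporter order from the first sample (empty if there are none).
    reporters = list(samples[sample_names[0]]) if sample_names else []
    for name in sample_names:
        if set(samples[name]) != set(reporters):
            raise ValueError(f"sample {name!r} has a different set of reporters")
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow([reporter_header] + sample_names)  # one column per sample
    for reporter in reporters:
        writer.writerow([reporter] + [samples[name][reporter] for name in sample_names])
    return buffer.getvalue()


# Hypothetical data for two samples in the same experiment.
samples = {
    "Sample_A": {"probe_1": 0.42, "probe_2": -1.3},
    "Sample_B": {"probe_1": 0.5, "probe_2": 2.0},
}
matrix_text = multiple_sample_text(samples)
print(matrix_text)
```

Because each sample name is used once as a column header, the output also satisfies the unique-header rule as long as no sample shares its name with the reporter column.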
