NIH | National Cancer Institute | NCI Wiki  

Checklist for Building caArray's MAGE-TAB Annotation Files

Topic: caArray Usage

Release: Up to caArray 2.2

Date entered: 02/17/2009

Introduction

caArray handles MAGE-TAB data according to certain conventions. The caArray User Guide devotes more than one chapter to MAGE-TAB files, covering topics from recognized data fields to validation rules (Chapter 6: Submitting Data to an Experiment, page 87, and Appendix A: MAGE-TAB in caArray, page 101). The highlights are outlined below as a quick start.

Recognized Fields and Validation Rules for MAGE-TAB Annotation Files

  • A MAGE-TAB file lets you describe your experiment in great detail. For example, a biomaterial (Source, Sample, Extract, or Labeled Extract) can be followed by any number of Characteristics columns containing annotation information about that biomaterial.
    However, caArray has requirements about which fields may be used and in which order they must be listed. For example, in an SDRF file, the biomaterial columns must follow the order Source Name - Sample Name - Extract Name - Labeled Extract Name.
    For a detailed list of recognized fields and validation rules for IDF and SDRF files, refer to the caArray User Guide (Appendix A, pages 107-111).
  • Controlled vocabularies should be used in IDF and SDRF files. Commonly used vocabularies can be found in the MGED Ontology (MO) or the NCI Thesaurus. Each Term Source REF should have an entry in the IDF file. If the term source is unknown, use "caArray" in the Term Source REF column.
  • Not all column types are mandatory in the SDRF. caArray can auto-generate missing biomaterials and associate protocols automatically.
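Two of the rules above (biomaterial column order, and every Term Source REF having an entry in the IDF) can be pre-checked before upload. The following is a minimal sketch, not caArray's own validator: the function names are hypothetical, and it assumes tab-delimited files in which the IDF declares its term sources on a "Term Source Name" row.

```python
import csv
import io

# Biomaterial columns in the order caArray requires in an SDRF.
BIOMATERIAL_ORDER = ["Source Name", "Sample Name", "Extract Name", "Labeled Extract Name"]


def biomaterial_order_ok(header_line):
    """Return True if the biomaterial columns that are present keep the required order.

    Missing biomaterial columns are tolerated, since not all column types are mandatory.
    """
    columns = [c.strip() for c in header_line.rstrip("\n").split("\t")]
    present = [c for c in columns if c in BIOMATERIAL_ORDER]
    expected = [c for c in BIOMATERIAL_ORDER if c in present]
    return present == expected


def undeclared_term_sources(idf_text, sdrf_text):
    """Return the Term Source REF values used in the SDRF that the IDF does not declare."""
    declared = set()
    for row in csv.reader(io.StringIO(idf_text), delimiter="\t"):
        # Assumption: term sources are listed on the IDF's "Term Source Name" row.
        if row and row[0].strip() == "Term Source Name":
            declared.update(v.strip() for v in row[1:] if v.strip())

    rows = list(csv.reader(io.StringIO(sdrf_text), delimiter="\t"))
    if not rows:
        return set()
    header = rows[0]
    ref_cols = [i for i, name in enumerate(header) if name.strip() == "Term Source REF"]
    used = {
        row[i].strip()
        for row in rows[1:]
        for i in ref_cols
        if i < len(row) and row[i].strip()
    }
    return used - declared
```

Any name returned by undeclared_term_sources either needs an entry in the IDF or, if the source is genuinely unknown, can be replaced by "caArray" as described above.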

MAGE-TAB Files: Upload and Import

  1. MAGE-TAB files must be imported into an existing experiment. That is, the experiment has to be created before the MAGE-TAB files are imported.
  2. Although only one IDF is allowed per import session, multiple IDF files are allowed per experiment.
  3. The IDF, the SDRF(s), and the data file(s) referred to in the SDRF files have to be validated and imported together.
  4. Data files that are not referred to by SDRF files can also be imported into the same experiment, but they have to be uploaded, validated, and imported in a separate session from the IDF/SDRF files.
  5. ADF and Data Matrix do not need to be validated. They can be imported directly.
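The session rules above can be checked on a list of file names before anything is uploaded. This is a rough sketch only: check_import_session is a hypothetical helper, the ".idf" and ".sdrf" extensions are an assumed naming convention, and the list of referenced data files is supplied by the caller rather than parsed from the SDRF. It does not model rule 5, so ADF and data matrix files should be left out of the list.

```python
def check_import_session(filenames, referenced):
    """Return a list of problems found in one planned import session.

    filenames:  all files intended for this session.
    referenced: data file names that the session's SDRF file(s) refer to.
    """
    problems = []
    # Assumption: MAGE-TAB files are recognizable by their extension.
    idfs = [f for f in filenames if f.lower().endswith(".idf")]
    sdrfs = [f for f in filenames if f.lower().endswith(".sdrf")]
    others = [f for f in filenames if f not in idfs and f not in sdrfs]

    # Rule 2: only one IDF per import session.
    if len(idfs) > 1:
        problems.append("more than one IDF in the session")

    if idfs or sdrfs:
        # Rule 3: referenced data files travel with the IDF and SDRF(s).
        missing = sorted(set(referenced) - set(others))
        if missing:
            problems.append("referenced data files missing: " + ", ".join(missing))
        # Rule 4: unreferenced data files belong in a separate session.
        extra = sorted(set(others) - set(referenced))
        if extra:
            problems.append(
                "unreferenced data files must go in a separate session: " + ", ".join(extra)
            )
    return problems
```

An empty list means the planned session is consistent with rules 2 to 4 as stated here; it says nothing about whether caArray's own validation will pass.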

SDRF File Has Higher Priority

Annotation can be set via the "Annotation" user interface or via an SDRF file (refer to caArray008, Annotation Tab vs. MAGE-TAB Annotation Files in caArray). When the two disagree, the designation in the SDRF is authoritative. For example, suppose the tissue site for a sample is set to "Lung" in the annotation interface but to "Brain" in the SDRF file. Upon data upload, the tissue site will be shown as "Brain", since the SDRF file has higher priority than the "Annotation" interface.
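The precedence rule can be pictured as a dictionary merge in which SDRF values are applied last. This is only an illustration of the rule; effective_annotation and the "tissue site" key are hypothetical and do not reflect how caArray stores annotations internally.

```python
def effective_annotation(ui_values, sdrf_values):
    """Combine annotations so that SDRF values override those set in the interface."""
    merged = dict(ui_values)    # start from what was entered in the "Annotation" interface
    merged.update(sdrf_values)  # SDRF values win wherever both define the same field
    return merged


sample = effective_annotation({"tissue site": "Lung"}, {"tissue site": "Brain"})
print(sample["tissue site"])  # Brain
```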

Known Bug List in caArray

Known Bug List for MAGE-TAB Files

Error Message | Known Bug
"Term Source Ref is not preceded by valid data type" | A Factor Value column cannot be followed by a "Term Source REF" column.
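To avoid this error, an SDRF header can be scanned for a "Term Source REF" column placed directly after a Factor Value column. The sketch below is one reading of the known bug; the function name is hypothetical and it assumes a tab-delimited header line.

```python
def factor_value_term_source_columns(header_line):
    """Return 0-based indices of Term Source REF columns that directly follow a Factor Value column."""
    columns = [c.strip() for c in header_line.rstrip("\n").split("\t")]
    return [
        i
        for i in range(1, len(columns))
        if columns[i] == "Term Source REF" and columns[i - 1].startswith("Factor Value")
    ]
```

A non-empty result points at the columns to remove (or move) before validating the file in caArray.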

Further Readings on MAGE-TAB Files

Have a comment?

Please leave your comment in the caArray End User Forum.
