NIH | National Cancer Institute | NCI Wiki  

{scrollbar:icons=false}

h1. {page-info:title}

{composition-setup}
cloak.toggle.type = text
cloak.toggle.open=[show]
cloak.toggle.close=[hide]
{composition-setup}
{panel:title=Contents}
{toggle-cloak:id=TOC}
{cloak:id=TOC}
{toc:minLevel=2}
{cloak}{panel}

Question: What are the MAGE-TAB Files?

*Topic*: caArray Usage

*Release*: caArray 2.X

*Date entered*: 02/12/2009

h2. Question

Question: What are the MAGE-TAB Files?

h2. Answer

MAGE-TAB files are simple tab-delimited, spreadsheet-like files used for annotating and communicating microarray data in a [MIAME-compliant fashion|http://www.mged.org/Workgroups/MIAME/miame.html].

MAGE-TAB format files (see the [SourceForge overview|http://tab2mage.sourceforge.net/docs/magetab_docs.html]) can be divided into two groups: array data files and descriptive files.

There are two types of array data: raw and derived. The caArray MAGE-TAB parser supports a variety of raw data files produced by several different scanner makes and models, although these raw data files may not themselves be in MAGE-TAB format. Derived array data is either normalized array data or a data file that combines data from more than one hybridization or scan. For derived data, the caArray MAGE-TAB parser supports the Affymetrix .CHP format (which is not a MAGE-TAB format). Derived data in any other format needs to be reformatted as a MAGE-TAB Data Matrix, as described in the table below.

MAGE-TAB descriptive files can be further divided into three subgroups: the Array Design File (ADF), the Investigation Design File (IDF), and the Sample Data Relationship File (SDRF). For more information about the IDF and SDRF annotation files, refer to [caArray008]. The table below summarizes each MAGE-TAB format file.

A MAGE-TAB ADF is not mandatory, because array design files for common arrays are usually available from their respective array providers. If the array provider supplies an array design file (which may not be in MAGE-TAB format), use it in preference to an ADF. Third-party array design files are uploaded through the "Manage Array Design" interface under caArray's "Curation" tab. An ADF, on the other hand, is uploaded together with the rest of the MAGE-TAB files. caArray does not validate or parse an ADF; it imports the file directly, as indicated in the table that follows.

*MAGE-TAB Formatted Files*
|| Abbreviation || File Type || Comments || caArray compatible? || Processed by caArray? ||
| IDF | Investigation Design File | Provides an overview of the experiment, including the experimental variables (factors) used, protocols, quality control strategy, publication information, and contact details. | Yes | Yes: parsed and validated before import |
| SDRF | Sample Data Relationship File | Describes the relationships between samples, arrays, data files, protocols, factor values, etc. It is a table in which each row represents a hybridization channel and the columns represent the steps of the experiment. The order of the columns is important: they read left to right in chronological order. | Yes | Yes: parsed and validated before import |
| ADF | Array Design File | Provides the array-level annotation for the experiment. It relates the row-level identifiers in the data files to biological sequence annotation. | Yes | No: imported directly |
| TXT or other | Data Matrix | Contains processed array data in tab-delimited text format. Rows may represent genes, exons, or genomic locations. Columns represent samples or experimental conditions. | Yes | No: imported directly |
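
The row-and-column layout of an SDRF described in the table can be illustrated with a minimal Python sketch. It is not the caArray parser; it only shows how one row can be read as a left-to-right chain of experimental steps. The column headings are assumed from the MAGE-TAB v1.0 specification, and every sample, hybridization, and file name is invented for illustration.

```python
import csv
import io

# Hypothetical two-row SDRF. Column headings are assumed from the MAGE-TAB
# v1.0 specification; all values are invented for illustration.
SDRF_TEXT = (
    "Source Name\tSample Name\tExtract Name\tLabeled Extract Name\t"
    "Hybridization Name\tArray Data File\n"
    "patient1\tbiopsy1\textract1\tle1\thyb1\thyb1.CEL\n"
    "patient2\tbiopsy2\textract2\tle2\thyb2\thyb2.CEL\n"
)


def read_sdrf(text):
    """Return (header, rows) from tab-delimited SDRF text, keeping column order."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    # Skip rows that are entirely blank.
    rows = [row for row in reader if any(cell.strip() for cell in row)]
    return header, rows


def describe_row(header, row):
    """Render one row as a left-to-right chain of experimental steps."""
    return " -> ".join(f"{name}={value}" for name, value in zip(header, row))


header, rows = read_sdrf(SDRF_TEXT)
for row in rows:
    print(describe_row(header, row))
```

The sketch uses {{csv.reader}} rather than {{csv.DictReader}} because an SDRF may repeat a column heading, and a dictionary keyed by heading would silently keep only one of the repeated columns.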

h3. What are MAGE-TAB Files?

The term "MAGE-TAB files" (see [caArray002 - How do I upload MicroArray Gene Expression Data into caArray?], step 2, for an example) has been used to refer not only to the MAGE-TAB formatted files summarized in the table, but also to the files supported by the caArray MAGE-TAB parser, as described in the Answer section above. Specifically, "MAGE-TAB files" also include third-party raw array data files, derived data files, and array design files from array providers, as shown in the illustration.

In summary, MAGE-TAB files refer to each other. Together they represent the complete experiment. 
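
As a rough illustration of how the files refer to each other, the Python sketch below collects the file names that an IDF and an SDRF point to and reports any that are absent from a set of files prepared for upload. It is a sketch under stated assumptions, not a description of caArray's own validation: the {{SDRF File}} tag and the {{Array Data File}} and {{Derived Array Data File}} column headings are assumed from the MAGE-TAB v1.0 specification, and all file names and helper functions are hypothetical.

```python
import csv
import io

# Hypothetical IDF fragment: each row is a tag followed by one or more values.
IDF_TEXT = (
    "Investigation Title\tExample study\n"
    "SDRF File\texample.sdrf.txt\n"
)

# Hypothetical SDRF fragment; the second hybridization has no derived file.
SDRF_TEXT = (
    "Hybridization Name\tArray Data File\tDerived Array Data File\n"
    "hyb1\thyb1.CEL\thyb1.CHP\n"
    "hyb2\thyb2.CEL\t\n"
)

# SDRF columns assumed to hold file names (from the MAGE-TAB v1.0 spec).
FILE_COLUMNS = {"Array Data File", "Derived Array Data File"}


def idf_values(text, tag):
    """Return the non-empty values on the IDF row that starts with `tag`."""
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if row and row[0].strip() == tag:
            return [value for value in row[1:] if value.strip()]
    return []


def sdrf_file_references(text):
    """Return the set of file names found in the SDRF file columns."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    wanted = [i for i, name in enumerate(header) if name.strip() in FILE_COLUMNS]
    refs = set()
    for row in reader:
        for i in wanted:
            if i < len(row) and row[i].strip():
                refs.add(row[i].strip())
    return refs


def missing_files(idf_text, sdrf_text, uploaded):
    """List referenced files that are not among the uploaded file names."""
    needed = set(idf_values(idf_text, "SDRF File")) | sdrf_file_references(sdrf_text)
    return sorted(needed - set(uploaded))


uploaded = ["example.idf.txt", "example.sdrf.txt", "hyb1.CEL", "hyb2.CEL"]
print(missing_files(IDF_TEXT, SDRF_TEXT, uploaded))  # prints ['hyb1.CHP']
```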

*Diagram Identifying Content of MAGE-TAB Files*
!MAGE-TAB-Files.jpg|align=center,alt="Diagram Identifying Content of MAGE-TAB Files"!


h3. Building MAGE-TAB Formatted files

The MAGE-TAB specification is available from the [MGED homepage|http://www.mged.org/mage-tab/spec1.0.html].
To get started, you can generate a MAGE-TAB template file from [EMBL-EBI's MAGE TAB site|http://www.mged.org/mage-tab/tools.html], or create your own IDF and SDRF files based on the [SourceForge MAGE-TAB documentation|http://tab2mage.sourceforge.net/docs/magetab_docs.html].
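
For readers who prefer to script their own files, the following Python sketch shows one way to write a skeletal IDF and SDRF as tab-delimited text. The tags and column headings are assumed from the MAGE-TAB v1.0 specification and are far from a complete set; the helper functions, the study title, and the file names are hypothetical. Files produced this way would still need to be completed according to the specification before upload.

```python
import csv
import io


def write_tsv(rows):
    """Serialize a list of rows as tab-delimited text with Unix line endings."""
    buf = io.StringIO()
    csv.writer(buf, delimiter="\t", lineterminator="\n").writerows(rows)
    return buf.getvalue()


def build_idf(title, sdrf_name):
    """Build a skeletal IDF: one tag per row, followed by its value."""
    return write_tsv([
        ["Investigation Title", title],
        ["SDRF File", sdrf_name],
    ])


def build_sdrf(hybridizations):
    """Build a skeletal SDRF: a header row, then one row per hybridization."""
    header = ["Source Name", "Hybridization Name", "Array Data File"]
    rows = [[h["source"], h["hyb"], h["data_file"]] for h in hybridizations]
    return write_tsv([header] + rows)


# All values below are invented for illustration.
idf_text = build_idf("Example study", "example.sdrf.txt")
sdrf_text = build_sdrf([
    {"source": "patient1", "hyb": "hyb1", "data_file": "hyb1.CEL"},
])
print(idf_text)
print(sdrf_text)
```

To save the result, write each string to a text file, for example with {{open("example.idf.txt", "w", newline="")}}, so that the line endings are not translated a second time.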

MAGE-TAB training and demos for caArray users are currently under development. Links will be added here once they become available.

For more information about when to use MAGE-TAB annotation files in caArray, refer to [caArray008]. For more information on how to upload MAGE-TAB files, refer to [caArray002 - How do I upload MicroArray Gene Expression Data into caArray?].

h2. Have a comment?

Please leave your comment in the [caArray End User Forum|https://cabig-kc.nci.nih.gov/Molecular/forums/viewtopic.php?f=6&t=577].

{scrollbar:icons=false}