NIH | National Cancer Institute | NCI Wiki  


...

Each of the resulting chunks can then be packaged into a separate ZIP archive and uploaded, validated, and imported individually. The general procedure for breaking down the dataset is as follows:

  1. Divide the array data files into smaller batches, each of which is no larger than 2 GB in combined size.
  2. Split the original SDRF file into multiple SDRF files, each of which references only the array data files from a single batch.
  3. Create multiple IDF files derived from the original IDF, with each one referencing one of the SDRF files created in the previous step.
  4. Create multiple ZIP archives, each consisting of a single IDF and its associated SDRF and raw and array data files.
  5. Upload each ZIP archive individually, then validate and import the files from each.
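
Steps 1 and 2 above can be sketched in Python as follows. This is only a minimal illustration, not part of caArray: the function names `batch_by_size` and `split_sdrf_rows` are hypothetical, the 2 GB limit is assumed to mean 2 × 1024³ bytes, and the SDRF column holding the data file names is assumed to be called `Array Data File` (adjust it if your SDRF uses a different heading).

```python
from typing import Dict, List

# Assumption: "2 GB" is interpreted as 2 * 1024**3 bytes.
MAX_BATCH_BYTES = 2 * 1024**3


def batch_by_size(sizes: Dict[str, int], limit: int = MAX_BATCH_BYTES) -> List[List[str]]:
    """Greedily group file names so each batch's combined size stays within limit."""
    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0
    for name in sorted(sizes):
        size = sizes[name]
        if size > limit:
            raise ValueError(f"{name} alone exceeds the batch limit")
        # Start a new batch when adding this file would exceed the limit.
        if current and current_size + size > limit:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size
    if current:
        batches.append(current)
    return batches


def split_sdrf_rows(rows: List[List[str]], batch: List[str],
                    column: str = "Array Data File") -> List[List[str]]:
    """Keep the header row plus only those SDRF rows that reference a file in batch.

    Assumption: `column` is the SDRF heading that holds the data file names.
    """
    header, body = rows[0], rows[1:]
    idx = header.index(column)
    keep = set(batch)
    return [header] + [row for row in body if row[idx] in keep]
```

In practice the `sizes` mapping could be built with `os.path.getsize` for each data file, and the SDRF rows could be read and written with the `csv` module using `delimiter="\t"`. The greedy grouping does not minimize the number of batches; it only guarantees that no batch exceeds the limit.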

Prerequisites

This tutorial assumes that you have basic familiarity with uploading data into caArray. Specifically, it assumes that you have already created an experiment for your data, uploaded the corresponding array design, and associated the experiment with that design. If you are not familiar with uploading data into caArray, please refer to the official caArray User's Guide on the NCI wiki at https://wiki.nci.nih.gov/x/LBo9Ag.

...

In preparing your data for upload, the first step is to find all the files associated with a given IDF file. To do so, open any of the IDF files from your experiment in Microsoft Excel or another application suited for viewing tab-delimited data. The partial screenshot below shows the first of twelve IDF files from our example experiment as viewed in Excel.


[Screenshot: the first IDF file from the example experiment, as viewed in Excel]

The field 'SDRF files' towards the bottom of your IDF file displays the name of the SDRF file that is associated with the IDF.
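
If you prefer not to open each IDF by hand, the same lookup can be scripted. The following is a minimal sketch, assuming the IDF is tab-delimited with the field name in the first column and its values in the following columns; the function name `find_sdrf_names` is hypothetical, and the match is a case-insensitive prefix test on "sdrf file" so that it accepts either the singular or plural form of the field name.

```python
import csv
import io
from typing import List


def find_sdrf_names(idf_text: str, field_prefix: str = "sdrf file") -> List[str]:
    """Return the SDRF file names listed in a tab-delimited IDF.

    Assumption: the field name is in the first column of its row and the
    SDRF file names occupy the remaining non-empty cells of that row.
    """
    reader = csv.reader(io.StringIO(idf_text), delimiter="\t")
    for row in reader:
        if row and row[0].strip().lower().startswith(field_prefix):
            return [cell.strip() for cell in row[1:] if cell.strip()]
    return []
```

To use it on a file, read the IDF's contents into a string first (for example with `open(path, encoding="utf-8").read()`, where the encoding is also an assumption) and pass that string to the function. An empty list means no matching field was found.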

...