NIH | National Cancer Institute | NCI Wiki  


Uploading investigation data to the CSSI DCC Portal allows you to curate, manage, and reuse datasets in a standards-compliant way. The ISA-Tab specification describes this standard and how to structure your investigation data to create an Investigation-Study-Assay tab-delimited (ISA-Tab) archive file. You may need to configure the archive using open-source software made for this purpose.

An ISA-Tab archive file is a compressed (.zip) file containing multiple text files and data files. Each tab-delimited text file in the archive describes the structure (the column headers and row values, when viewed as a spreadsheet) of the investigation, study, or assay component of the archive. The data files correspond to the assays and are included in the archive in their native format, for example, Microsoft Excel. An ISA-Tab archive file may also contain images and other files.
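
As an illustration of the layout described above, the following Python sketch groups the members of a .zip archive by their role. It assumes the file-naming convention recommended by the ISA-Tab specification (i_*.txt for the investigation file, s_*.txt for study files, a_*.txt for assay files). The function names classify_archive and build_demo_archive, and the demo file names, are hypothetical and are not part of the CSSI DCC Portal.

```python
import io
import zipfile
from collections import defaultdict

# Assumption: files follow the naming convention recommended by the
# ISA-Tab specification (i_*.txt, s_*.txt, a_*.txt).
PREFIXES = {"i_": "investigation", "s_": "study", "a_": "assay"}


def classify_archive(zip_source):
    """Group the members of an ISA-Tab .zip by their role in the archive."""
    groups = defaultdict(list)
    with zipfile.ZipFile(zip_source) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # skip directory entries
            base = name.rsplit("/", 1)[-1]
            role = "other"  # data files, images, and anything else
            if base.lower().endswith(".txt"):
                role = PREFIXES.get(base[:2].lower(), "other")
            groups[role].append(name)
    return dict(groups)


def build_demo_archive():
    """Build a tiny in-memory archive (hypothetical content) for the demo."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("i_investigation.txt", "Investigation Title\tDemo\n")
        archive.writestr("s_study.txt", "Source Name\tSample Name\n")
        archive.writestr("a_assay.txt", "Sample Name\tRaw Data File\n")
        archive.writestr("results.xlsx", b"placeholder")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for role, names in classify_archive(build_demo_archive()).items():
        print(role, names)
```

Running a check like this before uploading can help confirm that the archive contains the investigation, study, and assay files you expect.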

Only users who have registered for and logged in to the CSSI DCC Portal can upload investigation data. During registration, you must also select the I would like to upload Investigation data checkbox and be approved. For more information, refer to Registering to Use the CSSI DCC Portal. If you need an administrator to delete an investigation version from a project, click the Contact Us link at the bottom of the screen.

Uploading ISA-Tab Files

Before uploading a file to the CSSI DCC Portal, consult the ISA-Tab specification. This specification describes how to structure your investigation data to create an Investigation-Study-Assay tab-delimited (ISA-Tab) archive file. You may need to configure the archive using open-source software made for this purpose.

ISA-Tab files that you upload must be in .zip format. Use Globus to upload files larger than 10 GB.
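
The two requirements above (.zip format and the 10 GB threshold) can be checked locally before you start. The following Python sketch is one way to do so. The function name choose_upload_method is hypothetical, and the sketch assumes that 10 GB means 10 × 1024³ bytes; the portal's exact cutoff, and its behavior for a file of exactly that size, are not stated here.

```python
import os
import zipfile

# Assumption: "10 GB" is interpreted as 10 * 1024**3 bytes.
SIZE_LIMIT_BYTES = 10 * 1024**3


def choose_upload_method(path, size_limit=SIZE_LIMIT_BYTES):
    """Return "browser" or "globus" for a local ISA-Tab archive.

    Raises ValueError if the file is not a valid .zip archive.
    """
    if not zipfile.is_zipfile(path):
        raise ValueError(f"{path} is not a .zip archive")
    size = os.path.getsize(path)
    # Files larger than the limit go through Globus; the boundary case
    # (exactly equal) is treated as a browser upload in this sketch.
    return "globus" if size > size_limit else "browser"
```

For example, choose_upload_method("my_investigation.zip") returns "browser" for a small archive, where my_investigation.zip is a placeholder file name.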

To upload ISA-Tab files

  1. Log in to the CSSI DCC Portal.
  2. Select Investigations > Upload.
    The Upload ISA TAB Files page appears.
    Upload ISA Tab Files page
  3. Click New Project.
    The Project Properties page appears.

    A project is simply a container for the ISA-Tab file. It allows you to track multiple versions of ISA-Tab file uploads to this project.

    It is a best practice for each investigation to have its own project.


    Project properties page

  4. Enter a title for your new project. Note that this title and that of the ISA-Tab file that the project contains can be different.
  5. Click Save Project.
    The new project is listed under Investigation Projects.

    If the File Upload section of the page is not visible, click Open next to the project name to show it.


    Upload ISA TAB Files page with an unpublished project

    Changing the Project Name

    If needed now or later, click Edit to change the project name. Depending on your privileges, you may not see the Delete button.

  6. Select the file you want to upload to this project in one of the following ways:

    • Drag and drop the ISA-Tab archive file from your computer to the Drop your files here box surrounded by the dashed lines:
      Drop your files here or click to browse

    • Click the Drop your files here box and browse to where the file is stored.
      The file is listed in the File Upload area and the status is listed as Ready.

  7. If the file is smaller than 10 GB, click Upload selected files. If the file is larger than 10 GB, use Globus instead. 

    The file begins processing and the status moves through the following stages:

    • Uploading
    • Processing Queued    
    • Queued
    • Preparing uploaded files
    • Parsing ISA TAB Metadata    
    • Preparing Data Files    
    • Validating assay files and file sizes    
    • File processed successfully    
    • Success: file processed successfully  

    Once an administrator approves the upload request, the uploaded file appears as a new project version in the Investigation Projects section of the page. It is not yet published, so you must now publish it to use it in CSSI DCC. For instructions on previewing and publishing it, refer to Publishing ISA-Tab Files to the CSSI DCC Portal.
    Project with uploaded file processed successfully

Uploading Large Files with Globus

Globus is a service that enables secure transfer of large files. You must have a Globus account before you use it to upload investigation files to CSSI DCC. If you do not already have an account, you are prompted to create one when you start the upload process.

To upload files using Globus  

  1. Begin uploading your ISA-Tab file by creating a project in CSSI DCC and selecting a file. For more information on this process, go to Uploading ISA-Tab Files.
  2. On the Upload ISA TAB Files page, click the Upload with Globus button.
    If you have not yet logged in to Globus, a login page appears.
    Log in to use Globus web app

    Globus Documentation and Support

    For more information about using Globus, consult the Globus documentation and/or support.

    After you successfully log in, the Globus Transfer Files page appears. One of the endpoints you configured is already populated, though you can change it.
    Globus Transfer Files page

  3. Select the starting endpoint (on the left) where the file(s) you want to upload reside(s). Narrow down to the path if necessary.

  4. Confirm or change the destination endpoint (on the right).

  5. Click the right arrow button that points to the destination to begin the transfer request.
    A message appears on the screen when the transfer request is submitted successfully. You receive an email when the request is granted and the transfer succeeds.
    Globus Transfer Files page, transfer request submitted successfully

    Once an administrator approves the upload request, the uploaded, unpublished file appears in the Investigation Projects section of the page.

  6. Since you have not yet published the uploaded file, you must publish it to use it in CSSI DCC. For instructions on previewing and publishing it, refer to Publishing ISA-Tab Files to the CSSI DCC Portal.
    Upload ISA TAB Files, Investigation Projects section

Understanding Upload Errors

The CSSI DCC Portal validates uploaded ISA archives using standards and conventions described in the ISA-Tab specification. The ISA tools site provides additional technical information about validating ISA archives.

If your ISA archive does not meet those standards and conventions, you may encounter an error when you upload it to the CSSI DCC Portal. The archive will likely still process successfully, but the application details the error in the Status column. You can then address the error and upload the archive again. A sample error message follows:

Missing Data Files: File processed successfully. 1 files referenced in the assay(s) were not found. Click here to view the missing file lists (limited to 1000 entries each).

CSSI DCC looks for errors in three types of files that are commonly in an ISA archive:

  • Data files: If there are more than 1000 missing data files for an assay file, only the first 1000 will be listed.
  • Additional files: If files are found in your uploaded archive, but were not referenced in the metadata, they are listed.
  • External file references: The upload summary lists all of the external file references in your uploaded archive and notes whether or not they are verified.
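
The three categories above can be approximated locally with set comparisons. The following Python sketch is not the portal's validation logic. It assumes that file references live in assay columns whose header ends in "File" (for example, Raw Data File), that any value containing "://" is an external reference, and that archive member names match the referenced names exactly. The function names and the metadata_names parameter are hypothetical.

```python
import csv
import io

LIST_LIMIT = 1000  # the portal lists at most 1000 missing files per assay file


def referenced_files(assay_text):
    """Collect values from every assay column whose header ends in 'File'."""
    reader = csv.reader(io.StringIO(assay_text), delimiter="\t")
    header = next(reader, [])
    file_columns = [
        index for index, name in enumerate(header)
        if name.strip().endswith("File")
    ]
    found = set()
    for row in reader:
        for index in file_columns:
            if index < len(row) and row[index].strip():
                found.add(row[index].strip())
    return found


def compare(assay_text, archive_names, metadata_names=()):
    """Report missing, additional, and external files for one assay file.

    metadata_names lists the ISA-Tab text files themselves, so that they
    are not reported as additional files.
    """
    referenced = referenced_files(assay_text)
    external = {ref for ref in referenced if "://" in ref}
    local = referenced - external
    present = set(archive_names)
    missing = sorted(local - present)
    additional = sorted(present - local - set(metadata_names))
    return {
        "missing": missing[:LIST_LIMIT],  # truncated list, as in the portal
        "missing_total": len(missing),
        "additional": additional,
        "external": sorted(external),
    }
```

Comparing the report against your archive before uploading can help you catch a misspelled file name or a forgotten data file early.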

After the upload completes, the error message appears in the Status column of the File Upload area on the Upload Single ISA Archive tab. It provides an overview of the error and a link to the details.

Error message in the Status column of the File Upload area on the Upload Single ISA Archive tab. 

Click the link in the status message. A page opens that details the status of the three types of files in the uploaded ISA archive.

The Missing Files section details each missing data file referenced in the metadata of the assays:

Source File | Total Files | Missing Files | Missing File List

The Additional Files section details files you uploaded that you did not reference in the archive's metadata.

Additional Files
The files below were found in your uploaded archive, but were not referenced in the ISA TAB metadata.
List
Realistic Demo.zip  additional.txt

The External File References section lists external file references in your uploaded archive and notes whether they are verified or not.

External File References
The external file references below were found in your uploaded archive.
Source File | External File List

If the portal can detect that the URL is valid without authenticating it, it appears in this list as “Verified.”

In all other cases, it appears as “Not Verified.”

File size is not tallied for external file references.
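
To preview how an external reference might be classified, you can probe the URL without credentials. The following Python sketch is an assumption about what such a check could look like, not the portal's implementation: it sends an unauthenticated HTTP HEAD request and treats any 2xx response as "Verified". The function names http_head_status and verification_label are hypothetical.

```python
import urllib.error
import urllib.request


def http_head_status(url, timeout=10):
    """Return the HTTP status of an unauthenticated HEAD request, or None."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # for example, 401 or 403 when login is required
    except (urllib.error.URLError, ValueError, OSError):
        return None  # unreachable host, malformed URL, or timeout


def verification_label(url, probe=http_head_status):
    """Label a URL the way the upload summary does (assumed 2xx rule)."""
    status = probe(url)
    if status is not None and 200 <= status < 300:
        return "Verified"
    return "Not Verified"
```

The probe parameter lets you substitute a different check, for example one that returns a fixed status during testing, without making network requests.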

Controlling Access to Investigation Data

When uploading an ISA archive to a CSSI DCC folder, you can specify who can access its investigation data. You can do this before or after you request open access for your data. You can control access to the entire investigation or only selected assays and studies.

You must have the role of uploader or higher to control data access. Your administrator determines your role.

When you assign access, you assign it to one or more groups rather than individual users, though a single group could contain a single user. Anyone with upload privileges can create and manage these groups, but you must create the group(s) before you upload.

The following page family describes how to control access to investigation data.



Publishing ISA-Tab Files to the CSSI DCC Portal


Viewing Publish History

 
