NIH | National Cancer Institute | NCI Wiki  


You are viewing an old version (Version 122) of this page.


Uploading investigation data to the CSSI DCC Portal allows you to curate, manage, and reuse datasets in a standards-compliant way. The ISA-Tab specification describes this standard and how to structure your investigation data to create an Investigation-Study-Assay tab-delimited (ISA-Tab) archive file. You may need to prepare the archive using open-source software made for this purpose.

An ISA-Tab archive file is a compressed (.zip) file containing multiple text files and data files. Each tab-delimited text file describes the structure of the investigation, study, or assay component of the archive, that is, its column headers and row values when viewed as a spreadsheet. The data files correspond to the assays and are included in the archive in their native format, for example, Microsoft Excel. An ISA-Tab archive file may also contain images and other files.
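As a rough illustration of the archive layout described above, the following Python sketch zips a folder of ISA-Tab files and groups the members by role. The function names are hypothetical and are not part of the CSSI DCC Portal. The grouping assumes the i_, s_, and a_ filename prefixes that ISA-Tab files commonly use for investigation, study, and assay files; your files may be named differently.

```python
import zipfile
from pathlib import Path


def build_isatab_archive(source_dir: Path, archive_path: Path) -> list[str]:
    """Zip every file under source_dir and return the archive member names.

    Write archive_path outside source_dir so the archive does not include itself.
    """
    members = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(source_dir.rglob("*")):
            if path.is_file():
                # Store paths relative to the folder, with forward slashes.
                name = path.relative_to(source_dir).as_posix()
                archive.write(path, arcname=name)
                members.append(name)
    return members


def classify_members(names: list[str]) -> dict[str, list[str]]:
    """Group member names by role, assuming i_/s_/a_ prefixes on .txt files."""
    groups = {"investigation": [], "study": [], "assay": [], "other": []}
    prefixes = {"i_": "investigation", "s_": "study", "a_": "assay"}
    for name in names:
        base = name.rsplit("/", 1)[-1]
        role = prefixes.get(base[:2]) if base.endswith(".txt") else None
        # Data files, images, and anything else fall into "other".
        groups[role or "other"].append(name)
    return groups
```

For example, calling classify_members on the list returned by build_isatab_archive shows at a glance whether the archive contains the investigation, study, and assay text files you expect alongside the data files.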

Only users who have registered for and logged in to the CSSI DCC Portal can upload investigation data. During registration, you must also select the I would like to upload Investigation data checkbox and be approved. For more information, refer to Registering to Use the CSSI DCC Portal.

Uploading ISA-Tab Files

Before uploading a file to the CSSI DCC Portal, consult the ISA-Tab specification, which describes how to structure your investigation data to create an Investigation-Study-Assay tab-delimited (ISA-Tab) archive file. You may need to prepare the archive using open-source software made for this purpose.

ISA-Tab files that you upload must be in .zip format. Use Globus to upload files larger than 10 GB.
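The two rules above can be checked before you start an upload. The sketch below is illustrative only. The function names are hypothetical, "10 GB" is interpreted as 10 * 1024**3 bytes, and a file of exactly that size is routed to the browser upload; the portal may define either detail differently.

```python
import os

# Assumption: "10 GB" is read as 10 GiB; the portal may use 10 * 1000**3 instead.
GLOBUS_THRESHOLD_BYTES = 10 * 1024**3


def choose_upload_method(size_bytes: int) -> str:
    """Return "globus" for files larger than the threshold, else "browser"."""
    return "globus" if size_bytes > GLOBUS_THRESHOLD_BYTES else "browser"


def upload_method_for_file(archive_path: str) -> str:
    """Check the .zip requirement, then pick a method from the file size on disk."""
    if not archive_path.lower().endswith(".zip"):
        raise ValueError("ISA-Tab archives must be uploaded in .zip format")
    return choose_upload_method(os.path.getsize(archive_path))
```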

To upload ISA-Tab files

  1. Log in to the CSSI DCC Portal.
  2. Select Investigations > Upload.
    The Upload ISA TAB Files page appears.
    Upload ISA Tab Files page
  3. Click New Project.
    The Project Properties page appears.

    A project is simply a container for the ISA-Tab file. It allows you to track multiple versions of ISA-Tab file uploads to this project.

    It is a best practice for each investigation to have its own project.


    Project properties page

  4. Enter a title for your new project. Note that this title and that of the ISA-Tab file that the project contains can be different.
  5. Click Save Project.
    The new project is listed under Investigation Projects.

    If the File Upload section of the page is not visible, click Open next to the project name to show it.


    Upload ISA TAB Files page with an unpublished project

    Changing the Project Name

    If needed now or later, click Edit to change the project name. Depending on your privileges, you may not see the Delete button.

  6. Select the file you want to upload to this project in one of the following ways:

    • Drag and drop the ISA-Tab archive file from your computer to the Drop your files here box surrounded by the dashed lines:
      Drop your files here or click to browse

    • Click the Drop your files here box and browse to where the file is stored.
      The file is listed in the File Upload area and the status is listed as Ready.

  7. If the file is smaller than 10 GB, click Upload selected files. If the file is larger than 10 GB, use Globus instead. 

    The file begins processing and the status moves through the following stages:

    • Uploading
    • Processing Queued    
    • Queued
    • Preparing uploaded files
    • Parsing ISA TAB Metadata    
    • Preparing Data Files    
    • Validating assay files and file sizes    
    • File processed successfully    
    • Success: file processed successfully  

    Once an administrator approves the upload request, the uploaded, unpublished file appears in the Investigation Projects section of the page. It is not yet published, so you must now publish it to use it in CSSI DCC. 
    Project with uploaded file processed successfully

Uploading Large Files with Globus

Globus is a service that enables large file transfers securely. You must have an account with Globus before you use it to upload investigation files to CSSI DCC. If you do not already have an account, you are prompted to create one when you start the upload process.

To upload files using Globus  

  1. Begin uploading your ISA-Tab file by creating a project in CSSI DCC and selecting a file. For more information on this process, go to Uploading ISA-Tab Files.
  2. On the Upload ISA TAB Files page, click the Upload with Globus button.
    If you have not yet logged in to Globus, a login page appears.
    Log in to use Globus web app

    Globus Documentation and Support

    For more information about using Globus, consult the Globus documentation or Globus support.

    After you successfully log in, the Globus Transfer Files page appears. One of the Endpoints you configured is already populated, though you can change it.
    Globus Transfer Files page

  3. Select the starting endpoint (on the left) where the files you want to upload reside. Narrow down to the path if necessary.

  4. Confirm or change the destination endpoint (on the right).

  5. Click the right arrow button that points to the destination to begin the transfer request.
    A message appears on the screen when the transfer request is submitted successfully. You receive an email when the request is granted and the transfer succeeds.
    Globus Transfer Files page, transfer request submitted successfully

    Once an administrator approves the upload request, the uploaded, unpublished file appears in the Investigation Projects section of the page.

  6. Publish the uploaded file. Until you publish it, you cannot use it in CSSI DCC.
    Upload ISA TAB Files, Investigation Projects section

Understanding Upload Errors

The CSSI DCC Portal validates uploaded ISA archives using standards and conventions described in the ISA-Tab specification. The ISA tools site provides additional technical information about validating ISA archives.

If your ISA archive does not meet those standards and conventions, you may encounter an error when you upload it to the CSSI DCC Portal. Your ISA archive will likely process successfully but the application will detail the error in the Status column. You may then want to address the error and upload again. A sample error message follows:

Missing Data Files: File processed successfully. 1 files referenced in the assay(s) were not found. Click here to view the missing file lists (limited to 1000 entries each).

CSSI DCC looks for errors in three types of files that are commonly in an ISA archive:

  • Data files: If there are more than 1000 missing data files for an assay file, only the first 1000 will be listed.
  • Additional files: If files are found in your uploaded archive, but were not referenced in the metadata, they are listed.
  • External file references: The upload summary lists all of the external file references in your uploaded archive and notes whether or not they are verified.
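You can approximate these three checks locally before uploading. The sketch below is not the portal's validation logic. It assumes that assay files are named a_*.txt, that any column whose header ends in "File" (for example, Raw Data File) holds a file reference, that references starting with http, https, or ftp are external, and that local references match archive members by file name only. All function names are hypothetical.

```python
import csv
import io
import zipfile
from urllib.parse import urlparse


def check_archive(archive_path):
    """Return missing, additional, and external file references for a .zip archive."""
    referenced, external, metadata = set(), set(), set()
    with zipfile.ZipFile(archive_path) as archive:
        members = [n for n in archive.namelist() if not n.endswith("/")]
        for name in members:
            base = name.rsplit("/", 1)[-1]
            # Assumption: metadata files use the i_/s_/a_ prefixes.
            if base.endswith(".txt") and base[:2] in ("i_", "s_", "a_"):
                metadata.add(name)
            if not (base.startswith("a_") and base.endswith(".txt")):
                continue
            text = archive.read(name).decode("utf-8", errors="replace")
            rows = csv.reader(io.StringIO(text), delimiter="\t")
            header = next(rows, [])
            # Assumption: headers ending in "File" hold file references.
            file_cols = [i for i, h in enumerate(header) if h.strip().endswith("File")]
            for row in rows:
                for i in file_cols:
                    value = row[i].strip() if i < len(row) else ""
                    if not value:
                        continue
                    if urlparse(value).scheme in ("http", "https", "ftp"):
                        external.add(value)
                    else:
                        referenced.add(value.rsplit("/", 1)[-1])
    # Compare by file name only, ignoring folders inside the archive.
    present = {n.rsplit("/", 1)[-1] for n in members if n not in metadata}
    return {
        "missing": sorted(referenced - present),
        "additional": sorted(present - referenced),
        "external": sorted(external),
    }
```

A non-empty "missing" list suggests the upload would report Missing Data Files, and a non-empty "additional" list suggests files that the metadata does not reference. The sketch does not try to verify external URLs.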

After the upload completes, the error message appears in the Status column of the File Upload area on the Upload Single ISA Archive tab. It provides an overview of the error and a link to the details.

Error message in the Status column of the File Upload area on the Upload Single ISA Archive tab. 

Click the link in the status message. A page opens that details the status of the three types of files in the uploaded ISA archive.

The Missing Files section details each missing data file referenced in the metadata of the assays:

Source File  Total Files  Missing Files  Missing File List

The Additional Files section details files you uploaded that you did not reference in the archive's metadata.

Additional Files
The files below were found in your uploaded archive, but were not referenced in the ISA TAB metadata.
List
Realistic Demo.zip
additional.txt

The External File References section lists external file references in your uploaded archive and notes whether they are verified or not.

External File References
The external file references below were found in your uploaded archive.
Source File  External File List

If the portal can detect that the URL is valid without authenticating it, it appears in this list as “Verified.”

In all other cases, it appears as “Not Verified.”

File size is not tallied for external file references.

Controlling Access to Investigation Data

When uploading an ISA archive to a CSSI DCC folder, you can specify who can access its investigation data. You can do this before or after you request open access for your data. You can control access to the entire investigation or only selected assays and studies.

You must have the role of uploader or higher to control data access. Your administrator determines your role.

When you assign access, you assign it to one or more groups rather than individual users, though a single group could contain a single user. Anyone with upload privileges can create and manage these groups, but you must create the group(s) before you upload.

The following page family describes how to control access to investigation data.



Publishing ISA-Tab Files to the CSSI DCC Portal


Viewing Publish History

The first time you upload an ISA-Tab file to a new project, it is the first version of that file.

While you should associate only one investigation with a project, you can upload multiple versions of an investigation file to that project, and then publish or unpublish a version as many times as you like. You may want to do this if you change your investigation data or need to fix it due to an upload error. Each time you upload a new investigation file to a project and publish it, the version number increases by one. 

CSSI DCC tracks the history of each file you publish by date and time. Use the publish history timeline to switch between current and previous versions of the investigation file. You can download the full data or selected metadata of a previous version. You can only see the publish history for investigation files you have uploaded.
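The versioning behavior described above can be pictured as a list of publish intervals. The sketch below is a hypothetical model, not the portal's data structure. It assumes each version records when it was published and, if applicable, when it was unpublished, so gaps between intervals correspond to times when no version was published.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class PublishedVersion:
    number: int
    published: datetime
    unpublished: Optional[datetime] = None  # None means the version is still current


def next_version_number(history):
    """The first published version is 1; each later one increases by one."""
    return max((v.number for v in history), default=0) + 1


def version_at(history, when):
    """Return the version that was published at the given time, or None for a gap."""
    for version in history:
        started = version.published <= when
        still_open = version.unpublished is None or when < version.unpublished
        if started and still_open:
            return version
    return None


# Hypothetical example dates, not taken from a real investigation.
history = [
    PublishedVersion(1, datetime(2017, 6, 9), datetime(2017, 6, 15)),
    PublishedVersion(2, datetime(2017, 6, 22)),
]
```

In this model, version_at(history, datetime(2017, 6, 18)) falls in the gap between the two versions and returns None, which matches the white space on the timeline.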

To view the publish history of an investigation file

  1. Browse or search for an investigation.
    The Investigation Details page appears.

  2. Click History to show the timeline.
    History tab of the Investigation Details page

    • The current version number appears above the timeline. You can hover over the version number to see the date when this version became the current version. In this example, Version 1 became current on 6/9/2017.



    • When a file has multiple versions, they appear on the timeline as well. The version that is currently selected is always green. In the example below, the current version is selected, so it appears in green. The current version is also open-ended to the future until you publish another version. 

      The white space between versions, as in the example below, indicates times when no version of the investigation file was published.



      History tab with previous and current versions

    • To select a previous version, hover the mouse over the timeline. If the version you want to select is not immediately visible, move the scroll wheel on the mouse down to move left and up to move right until you see a previous version. In this example, a previous version was current only on 6/22/2017.

      History tab with previous version listed

    • Click the previous version to select it. The Investigation Details page appears, showing the selected version number at the top. You can download the full data or selected metadata of this previous version. You cannot download selected data of a previous version.

Unpublishing ISA-Tab Files

After publishing a file, you may want to unpublish a version of it to remove that version completely from the CSSI DCC server.

Access Control and Unpublishing

When you unpublish a file, all access restrictions that may have been applied to it are removed from the investigation. If you decide to publish this investigation later, you must reapply any access restrictions you want.

To unpublish an ISA-Tab file

  1. Open a previously published file.
  2. Next to the version you want to unpublish, click the Unpublish button.
    The status of this version becomes Unpublication Request Pending. Once an administrator approves the request, the version returns to its unpublished state.