Uploading investigation data to the CSSI DCC Portal allows you to curate, manage, and reuse datasets in a standards-compliant way. The ISA-Tab specification describes this standard and how to structure your investigation data to create an Investigation-Study-Assay tab-delimited (ISA-Tab) archive file. Creating the archive may involve configuring your data with open-source software made for this purpose.
An ISA-Tab archive file is a compressed (.zip) file containing multiple text files and data files. Each tab-delimited text file in the archive describes the structure (the column headers and row values, when viewed as a spreadsheet) of the investigation, study, or assay component of the archive. The data files correspond to the assays and are included in the archive in their native format, for example, Microsoft Excel. An ISA-Tab archive file may also contain images and other files.
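The packaging described above can be sketched in a few lines of Python. This is a minimal illustration, not a tool provided by the portal: the function name `build_isatab_archive` and the folder layout are assumptions, and the sketch simply zips every file in a folder that you have already prepared according to the ISA-Tab specification.

```python
import zipfile
from pathlib import Path


def build_isatab_archive(source_dir: str, archive_path: str) -> list[str]:
    """Zip every file under source_dir and return the archived names.

    Hypothetical helper: it does not validate the ISA-Tab content, it only
    packages the text files and data files into one .zip archive.
    """
    source = Path(source_dir)
    # Resolve the output path once so the archive is never added to itself
    # when it is written inside source_dir.
    target = Path(archive_path).resolve()
    names = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(source.rglob("*")):
            if path.is_file() and path.resolve() != target:
                # Store names relative to the folder so the tab-delimited
                # text files sit at the top level of the archive.
                arcname = path.relative_to(source).as_posix()
                zf.write(path, arcname)
                names.append(arcname)
    return names
```

For example, `build_isatab_archive("my_investigation", "my_investigation.zip")` would package a folder named `my_investigation` (a placeholder name) into a single archive ready for upload.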
Only users who have registered for and logged in to the CSSI DCC Portal can upload investigation data. During registration, you must also select the I would like to upload Investigation data checkbox and be approved. For more information, refer to Registering to Use the CSSI DCC Portal. If you need an administrator to delete an investigation version from a project, click the Contact Us link at the bottom of the screen.
Uploading ISA-Tab Files
Before uploading a file to the CSSI DCC Portal, consult the ISA-Tab specification. This specification describes how to structure your investigation data to create an Investigation-Study-Assay tab-delimited (ISA-Tab) archive file. Creating the archive may involve configuring your data with open-source software made for this purpose.
ISA-Tab files that you upload must be in .zip format. Use Globus to upload files larger than 100 GB.
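If you prepare uploads with a script, the two rules above can be checked before you open the portal. The following is a sketch under stated assumptions: the function names are hypothetical, and "100 GB" is read as 100 × 10^9 bytes, which may differ from how the portal counts gigabytes.

```python
import os

# Assumption: "100 GB" is interpreted as 100 * 10**9 bytes. The portal may
# measure the limit differently, so treat borderline sizes with care.
MAX_BROWSER_UPLOAD_BYTES = 100 * 10**9


def choose_upload_method(size_bytes: int) -> str:
    """Return "browser" for files up to the limit, otherwise "globus"."""
    return "browser" if size_bytes <= MAX_BROWSER_UPLOAD_BYTES else "globus"


def upload_method_for(path: str) -> str:
    """Check the .zip extension, then pick a method from the file size."""
    if not path.lower().endswith(".zip"):
        raise ValueError("ISA-Tab uploads must be .zip archives")
    return choose_upload_method(os.path.getsize(path))
```

A file of exactly 100 GB is routed to the browser upload here, matching the "100 GB or smaller" wording in the procedure below.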
To upload ISA-Tab files
- Log in to the CSSI DCC Portal.
- Select Investigations > Upload.
The Upload ISA TAB Files page appears.
- Click New Project.
The Project Properties page appears. A project is simply a container for the ISA-Tab file. It allows you to track multiple versions of ISA-Tab file uploads to this project.
It is a best practice for each investigation to have its own project.
- Enter a unique title for your new project. Note that this title and that of the ISA-Tab file that the project contains can be different.
- Click Save Project.
The new project is listed under Investigation Projects.
If the File Upload section of the page is not visible, click Open next to the project name to show it.
- Select the file you want to upload to this project in one of the following ways:
  - Drag and drop the ISA-Tab archive file from your computer to the Drop your files here box, which is surrounded by dashed lines.
  - Click the Drop your files here box and browse to where the file is stored.
The file is listed in the File Upload area and the status is listed as Ready.
- If the file is 100 GB or smaller, click Upload selected files. If the file is larger than 100 GB, use Globus instead.
The file begins processing and the status moves through the following stages:
- Uploading
- Processing Queued
- Queued
- Preparing uploaded files
- Parsing ISA TAB Metadata
- Preparing Data Files
- Validating assay files and file sizes
- File processed successfully
Once an administrator approves the upload request, the uploaded file appears as a new project version in the Investigation Projects section of the page. It is not yet published, so you must now publish it to use it in CSSI DCC. For instructions on previewing and publishing it, refer to Publishing ISA-Tab Files to the CSSI DCC Portal.
If needed now or later, click Edit to change the project name.
Uploading Large Files with Globus
Globus is a service that enables large file transfers securely. You must have an account with Globus before you use it to upload investigation files to CSSI DCC. If you do not already have an account, you are prompted to create one when you start the upload process.
To upload files using Globus
- Begin uploading your ISA-Tab file by creating a project in CSSI DCC and selecting a file. For more information on this process, go to Uploading ISA-Tab Files.
On the Upload ISA TAB Files page, click .
If you have not yet logged into Globus, a log in page appears.
Globus Documentation and Support
For more information about using Globus, consult the Globus documentation or Globus support.
After you successfully log in, the Globus Transfer Files page appears. One of the Endpoints you configured is already populated, though you can change it.
- Select the starting endpoint (on the left) where the files you want to upload reside. Narrow down to the path if necessary.
- Confirm or change the destination endpoint (on the right).
- Click the right arrow button that points to the destination to begin the transfer request. A message briefly appears on the screen when the transfer request is submitted successfully. When the transfer succeeds, Globus sends a notification email message.
- To view the resulting list of files in the destination endpoint, click refresh list on the right.
Once an administrator approves the upload request, the uploaded, not-yet-published file appears as a new project version in the Investigation Projects section of the page.
Since you have not yet published the uploaded file, you must publish it to use it in CSSI DCC. For instructions on previewing and publishing it, refer to Publishing ISA-Tab Files to the CSSI DCC Portal.
Understanding Upload Errors
If your ISA archive does not meet the ISA-Tab standards and conventions, you may encounter an error when you upload it to the CSSI DCC Portal. Your ISA archive will likely process successfully, but the application details the error in the Status column. You may then want to address the error and upload again. A sample error message follows:
Missing Data Files: File processed successfully. 1 files referenced in the assay(s) were not found. Click here to view the missing file lists (limited to 1000 entries each).
After the upload completes, the error message appears in the Status column of the File Upload area on the Upload Single ISA Archive tab. It provides an overview of the error and a link to the details.
Click the link in the status message. A page opens that details the status of the three types of files in the uploaded ISA archive. CSSI DCC looks for errors in three types of files that are commonly in an ISA archive:
- Missing Files: This section details each missing data file referenced in the metadata of the assays.
- Additional Files: This section details files you uploaded that you did not reference in the archive's metadata.
- External File References: This section lists external file references in your uploaded archive and notes whether they are verified. If the portal can detect that the URL is valid without authenticating it, it appears in this list as "Verified." In all other cases, it appears as "Not Verified." File size is not tallied for external file references.
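To catch these problems before uploading, you could run a similar comparison locally. The sketch below is not the portal's validation logic; it is an approximation under several assumptions: assay files are named `a_*.txt` and other ISA-Tab text files `i_*.txt` or `s_*.txt`, data files are referenced in assay columns whose header ends with "Data File", files are matched by base name, and any reference containing `://` is treated as external. The function name `check_archive` is hypothetical.

```python
import csv
import io
import zipfile
from pathlib import PurePosixPath


def _is_isatab_text(name: str) -> bool:
    """Assumed naming convention: i_*.txt, s_*.txt, and a_*.txt."""
    base = PurePosixPath(name).name
    return base.startswith(("i_", "s_", "a_")) and base.endswith(".txt")


def check_archive(archive_path: str) -> dict[str, list[str]]:
    """Report missing, additional, and external file references."""
    referenced = set()
    with zipfile.ZipFile(archive_path) as zf:
        members = [n for n in zf.namelist() if not n.endswith("/")]
        for name in members:
            base = PurePosixPath(name).name
            if not (base.startswith("a_") and base.endswith(".txt")):
                continue
            text = zf.read(name).decode("utf-8-sig")
            rows = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
            header = next(rows, [])
            # Assumption: data files live in columns such as "Raw Data File".
            cols = [i for i, h in enumerate(header)
                    if h.strip().endswith("Data File")]
            for row in rows:
                for i in cols:
                    if i < len(row) and row[i].strip():
                        referenced.add(row[i].strip())

    external = {r for r in referenced if "://" in r}
    local_names = {PurePosixPath(r).name for r in referenced - external}
    present = {PurePosixPath(n).name for n in members}
    return {
        "missing": sorted(local_names - present),
        "additional": sorted(
            n for n in members
            if not _is_isatab_text(n)
            and PurePosixPath(n).name not in local_names
        ),
        "external": sorted(external),
    }
```

An empty `missing` list from this sketch does not guarantee that the portal reports no errors, since the portal's own rules may be stricter or differ in how they match file names.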
Controlling Access to Investigation Data
You must have the role of uploader or higher to control data access. Your administrator determines your role. When you assign access, you assign it to one or more groups rather than individual users, though a single group could contain a single user. Anyone with upload privileges can create and manage these groups, but you must create the groups before you upload. The following pages describe how to control access to investigation data:
- Publishing ISA-Tab Files to the CSSI DCC Portal
- Editing or Removing the Embargo Date for an Investigation
In the Comments field, provide any comments you want to associate with this request. Click Submit. The system changes the status of the investigation version to Request pending as of <date time>. An administrator processes your new request. When that administrator has approved your new request, the system changes the status of the investigation version (on the Upload ISA Archives page) back to Open Access approved, Embargoed. When the system has made the investigation version open access, it sends you an email notification with "CSSI DCC Portal Investigation Granted Open Access" as the subject line.