National Cancer Institute U.S. National Institutes of Health www.cancer.gov

TCGA Data Matrix Web Service User's Guide


This document explains the basics of using the TCGA's Data Matrix Web Services.

Getting Support

Overview

The Data Matrix Web Service is a REST service that allows for programmatic generation of data archives, as produced by the Data Matrix.

The web service currently provides two output formats, XML and JSON, and uses the same query parameters for filtering as the web-based Data Matrix. As a client of the service, you submit a job request to create an archive containing the data that match the criteria you provide. After a successful request, the service schedules an archive build, and you can query the service to find out whether the build is complete.

When the archive is complete, the service provides a download link to it. You can also include an email address in the query; the web service sends a notification email to that address when the archive has been built and is ready to be retrieved.

Archives generated by the service are available for download for 24 hours.
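The submit, poll, and download cycle described above can be sketched as a small polling loop. This is a minimal sketch, not the service's own client: `check_status` is a hypothetical caller-supplied function that queries the service once and returns the download link, or None while the archive is still being built. The 10-second default interval is chosen to respect the limit given under Query Frequency Limitations.

```python
import time


def wait_for_archive(check_status, interval=10.0, max_checks=60, sleep=time.sleep):
    """Poll until the archive is ready and return its download link.

    check_status is a hypothetical callable (an assumption of this sketch):
    it performs one status query and returns the archive URL, or None while
    the build is still in progress.
    """
    for attempt in range(max_checks):
        url = check_status()
        if url is not None:
            return url
        # Do not sleep after the final unsuccessful check.
        if attempt < max_checks - 1:
            sleep(interval)
    raise TimeoutError(f"archive not ready after {max_checks} status checks")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real waiting.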

Further detailed information on the Data Matrix application can be found in the Data Matrix User's Guide.

Building Client Requests

Service Access URL

The base URL for the Data Matrix web service is:

The web application description (WADL) file can be found at http://tcga-data.nci.nih.gov/tcga/damws/application.wadl.

  

Filter Request Parameters List (Non-clinical)

The filtering criteria used to obtain a fine-grained selection of archive data correspond directly to those given on the filter page of the Data Matrix application. The following table describes each filter and gives examples of valid values. You must use each filter name exactly as given, including letter case, to build a valid REST request.

Notes:

  • The arguments marked in bold are mandatory (except for a clinical type search; see below).
  • Parameter values should not be enclosed in single or double quotes.

    URL Encoding

    All URLs sent to DCC web services must have proper URL encoding of reserved characters. For example, the at sign (@) in an email address must be URL encoded as %40.
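As an illustration of the encoding rule, the following sketch percent-encodes individual parameter values with Python's standard library. The helper name `encode_value` is introduced here for illustration only.

```python
from urllib.parse import quote


def encode_value(value: str) -> str:
    """Percent-encode one query parameter value.

    safe="" means no reserved character is left as-is, so "@", "/" and
    spaces are all encoded.
    """
    return quote(value, safe="")


print(encode_value("john.doe@oblivion.com"))  # john.doe%40oblivion.com
print(encode_value("Batch 9"))                # Batch%209
```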

| Filter | Arguments | Example |
|---|---|---|
| email | User email | email=john.doe@oblivion.com |
| center | Center abbreviation or other designator | center=broad.mit.edu or center=BI or center=3 |
| disease | Disease abbreviation; defaults to GBM | disease=GBM |
| availability | A, P, N | availability=A |
| batch | Batch number or other designator | batch=9 or batch=Batch%209 |
| consolidateFiles | Set to true to consolidate files, false otherwise (defaults to true) | consolidateFiles=false |
| flattenDir | Set to true to flatten the directory structure, false otherwise (defaults to false) | flattenDir=true |
| startDate | mm/dd/yyyy | startDate=01/05/2009 |
| endDate | mm/dd/yyyy | endDate=02/04/2009 |
| tumorNormal | T, TN, NT, N | tumorNormal=TN |
| noMeta | Include experiment metadata in generated archive: true, false | noMeta=true |
| protectedStatus | P, N | protectedStatus=NP |
| sampleList | Aliquot ID, or comma-separated list of aliquot IDs (no spaces). Wildcards are allowed. See this page in the data primer for an explanation of the current barcode format. | sampleList=TCGA-01-1234-02,TCGA-02-5678-09 |
| platform | Platform alias, ID, or other designator | platform=ABI or platform=17 |
| platformType | Integer designating platform type or other designator. See the Data Matrix User's Guide. | platformType=Expression-Genes or platformType=3 |
| level | 1, 2, 3, or C (clinical data). Querying multiple levels at once is supported (use a comma-separated list). See Data level in the TCGA Encyclopedia for a brief description of data levels. | level=3 or level=1,2,3 |
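A query string can be assembled from these filters with a few lines of Python. This is a minimal sketch: `BASE_URL` is a placeholder and not the real service address, and the particular combination of filters is illustrative only, using values taken from the examples in the table.

```python
from urllib.parse import quote, urlencode

# Placeholder only (an assumption of this sketch); substitute the real
# base URL of the web service.
BASE_URL = "http://example.invalid/damws"


def build_request_url(base_url: str, filters: dict) -> str:
    """Join filter name/value pairs into a percent-encoded query string."""
    # quote_via=quote encodes a space as %20 (as in batch=Batch%209)
    # instead of "+"; safe="," leaves comma-separated lists readable.
    query = urlencode(filters, quote_via=quote, safe=",")
    return f"{base_url}?{query}"


filters = {
    "disease": "GBM",
    "level": "1,2,3",
    "batch": "Batch 9",
    "email": "john.doe@oblivion.com",
}
print(build_request_url(BASE_URL, filters))
```

Filter names are passed through unchanged, so they must already be spelled with the exact letter case the service expects.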

Filter Request Parameters List (Clinical)

For a clinical type search, only platformType and disease are mandatory. Do not supply values for center, level, or platform in a clinical type search; the service returns an error message if any of them is present.
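A client can check these rules before sending a clinical request. The sketch below is a client-side convenience under the rules stated above, not a description of the service's own validation; the function name and the wording of the problem messages are made up for illustration.

```python
CLINICAL_REQUIRED = ("platformType", "disease")
CLINICAL_FORBIDDEN = ("center", "level", "platform")


def check_clinical_filters(filters: dict) -> list:
    """Return a list of problems found in a clinical search request.

    An empty list means the request satisfies the documented rules.
    """
    problems = []
    for name in CLINICAL_REQUIRED:
        if not filters.get(name):
            problems.append(f"missing mandatory filter: {name}")
    for name in CLINICAL_FORBIDDEN:
        if name in filters:
            problems.append(f"filter not allowed in a clinical search: {name}")
    return problems


print(check_clinical_filters({"platformType": "C", "disease": "GBM"}))  # []
```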

| Filter | Arguments | Example |
|---|---|---|
| disease | Disease abbreviation; defaults to GBM | disease=GBM |
| platformType | Integer designating platform type or other designator. See the Data Matrix User's Guide. | platformType=c or platformType=C |

Choosing A Response Format

To obtain a JSON-formatted response, add /json to the base URL:

To obtain an XML-formatted response, add /xml to the base URL:
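In client code, the suffix rule can be captured in a small helper. This is a sketch under the assumption that the format segment is appended directly to the base URL; the base URL used below is a placeholder and not the real service address.

```python
def format_endpoint(base_url: str, response_format: str) -> str:
    """Append the response-format path segment ("json" or "xml")."""
    fmt = response_format.lower()
    if fmt not in ("json", "xml"):
        raise ValueError(f"unsupported response format: {response_format!r}")
    # Strip any trailing slash so the result never contains "//".
    return base_url.rstrip("/") + "/" + fmt


# Placeholder base URL (an assumption of this sketch).
print(format_endpoint("http://example.invalid/damws", "json"))
```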

Complete Request Example (Non-clinical)

Example Data Matrix web service request:

Complete Request Example (Clinical)

Example Data Matrix web service request:

Query Frequency Limitations

The Data Matrix Web Service has a query limitation of 1 connection every 10 seconds. Exceeding this quota will cause the system to return HTTP Status Code 413 until the system again falls below the quota limit.
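One way for a client to stay within this limit is to space out its own calls. The following is a minimal sketch of such a throttle; the class name and its interface are inventions of this example, and it does not cover other clients sharing the same connection quota.

```python
import time

MIN_INTERVAL_SECONDS = 10.0  # one request every 10 seconds, as stated above


class Throttle:
    """Space out calls so they are at least min_interval seconds apart."""

    def __init__(self, min_interval=MIN_INTERVAL_SECONDS,
                 clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # time of the previous permitted call

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        delay = 0.0
        now = self._clock()
        if self._last_call is not None:
            delay = max(0.0, self.min_interval - (now - self._last_call))
            if delay > 0:
                self._sleep(delay)
        self._last_call = now + delay
        return delay
```

Call `wait()` on a shared `Throttle` instance immediately before each request. If the service still answers with status 413, waiting a further interval before retrying is a reasonable response.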

Notes About Invoking the Web Service

  1. When using the command line tools wget or curl to invoke the web service, enclose the web service URL in double quotes. Otherwise the shell treats each unquoted & as a command separator, so every parameter after the first & is dropped from the request.
  2. Prefer curl or a standard browser. wget does not report the service's error messages properly when errors occur.
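When a command line is generated by a script, the quoting in note 1 can be delegated to Python's `shlex` module. This sketch uses a placeholder URL. Note that `shlex` emits single quotes rather than double quotes; in a POSIX shell both protect the & characters in the same way.

```python
import shlex


def curl_command(url: str) -> str:
    """Return a curl command line with the URL quoted for a POSIX shell."""
    return shlex.join(["curl", url])


# Placeholder URL (an assumption of this sketch).
print(curl_command("http://example.invalid/damws/xml?disease=GBM&level=3"))
```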

Examples:

Web Service Responses

XML Response

Example of XML output from the following job submission:

  • The initial acknowledgment message (note, whitespace is added here for readability):
Note

When the job has completed successfully, the status shows the code for a successful execution of the archive builder, and a new archive-url element appears in the job-status node. This element contains the link to the newly created archive, which remains available for download for 24 hours.
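A client can look for the archive-url element with Python's standard XML parser. This is a sketch only: the sample document below is hypothetical, including the `job-process` root element and the absence of XML namespaces; only the element names job-status and archive-url come from the note above.

```python
import xml.etree.ElementTree as ET

# Hypothetical response body, made up for this sketch; a real response
# contains more elements and may be structured differently.
SAMPLE_XML = (
    "<job-process><job-status>"
    "<archive-url>http://example.invalid/archive.tar.gz</archive-url>"
    "</job-status></job-process>"
)


def archive_url_from_xml(body: str):
    """Return the archive link, or None if the build has not finished."""
    root = ET.fromstring(body)
    # iter() also yields the root itself if it is the job-status element.
    for status in root.iter("job-status"):
        url = status.findtext("archive-url")
        if url:
            return url.strip()
    return None


print(archive_url_from_xml(SAMPLE_XML))
```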

JSON response

Example of JSON output from the following job submission:

  • The initial acknowledgment message (note, whitespace is added here for readability):
Note

As with the XML output, when the job has finished successfully, the job-status object has an archive-url member containing the link to the archive, which is available for download for 24 hours.
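The same lookup for the JSON format might look as follows. The sample body is hypothetical, and placing job-status at the top level of the response is an assumption of this sketch; only the names job-status and archive-url come from the note above.

```python
import json

# Hypothetical response body, made up for this sketch.
SAMPLE_JSON = '{"job-status": {"archive-url": "http://example.invalid/archive.tar.gz"}}'


def archive_url_from_json(body: str):
    """Return the archive link, or None if the build has not finished."""
    data = json.loads(body)
    status = data.get("job-status") or {}
    return status.get("archive-url")


print(archive_url_from_json(SAMPLE_JSON))
```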

Job Process XML Schema

The XML Schema definition of the job process is attached and is also shown here:

[Image: XML schema definition]
