NIH | National Cancer Institute | NCI Wiki  

Comment: Revised the wiki page title & deleted excerpt, in preparation for release.

New page for HPCDATAMGM-1862: CLU for dbGaP download tasks

If your user account has the Read permission level on various collections, you can download one or more data files in those collections from DME to dbGaP (Aspera).

To download one or more data files to dbGaP:

  1. Prepare an S3 bucket, as described in Preparing to Use AWS S3 Bucket for the CLU

  2. Consider whether you want to download a single data file or multiple data files: 

    • To download a single data file: Plan to specify the path for that data file in the command. 
    • To download multiple data files: In your local system, use a command line editor (such as vi editor) to create a file that lists the paths for all of the DME data files you want to download, delimited by newline. Plan to use the -f option to specify that file in the command. 
  3. Run the following command:


    dm_download_dataobject_s3 aspera [optional parameters] [DME data path] <destination bucket> <destination path> [Aspera credentials file path]


    The following table describes each parameter:

    Parameter    Description
    [-D <REST-response>]

    An optional parameter, specifying a path and filename in your local system. The system always creates a response file:

    • If you specify this parameter, the system saves the response from the server to the specified file in the specified location.
    • If you omit this parameter, the system saves the file as download-dataobject-response-header.tmp in your home directory.
    [-o <output-json-file>]

    An optional parameter, specifying a path and filename in your local system. The system always creates an output file: 

    • If you specify this parameter, the system saves the output to the specified file in the specified location.
    • If you omit this parameter, the system saves the output as download-dataobject-response-message.json.tmp in your home directory.

    If the command is successful, the output file is empty.

    [-f <paths-file>]

    or

    [DME data path]

    One or more paths within DME. Select one of the following methods to specify the data file or data files that you want to download:

    • To specify multiple data files, use the -f parameter to specify a path and filename in your local system, of a file that lists the paths for all of the DME data files you want to download, delimited by newline.
    • To specify a single data file, specify the path for the DME data file you want to download.
    <destination bucket>

    The name of the destination Aspera bucket.

    <destination path>

    The path to and the name of the folder in the destination Aspera bucket. Do not begin the path with a slash. If the destination folder structure you specify does not already exist, DME creates it.

    [Aspera credentials file path]

    The location of the credentials file.

    If your credentials file is in the default location, as noted in Preparing to Use AWS S3 Bucket for the CLU, you can omit this parameter.
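
The -o description above says the output file is empty after a successful run. A minimal sketch of how you might check that from a shell, assuming you omitted -o so the default file name in your home directory applies. The function name output_is_empty is hypothetical and is not part of the CLU:

```shell
# Hypothetical helper (not part of the CLU): returns 0 when the given
# output file exists and is empty, which the table describes as success.
output_is_empty() {
  [ -f "$1" ] && [ ! -s "$1" ]
}

# Default output location when -o is omitted, per the table above.
out_file="$HOME/download-dataobject-response-message.json.tmp"

if output_is_empty "$out_file"; then
  echo "Output file is empty: the command reported success."
elif [ -f "$out_file" ]; then
  echo "Output file has content; review it:"
  cat "$out_file"
else
  echo "No output file found at $out_file"
fi
```

If you specified -o, pass that path to the helper instead of the default.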

     

The examples below assume a credentials file with the following contents:

ASPERA_HOST=gap-submit.ncbi.nlm.nih.gov
ASPERA_USER=asp-dbgap
ASPERA_SCP_PASS=1234abcf-12c6-de34-9876abc7e543

If you omit ASPERA_HOST or ASPERA_USER, the command uses the dbGaP-supplied values.
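
One way to create such a credentials file from a shell is sketched below. The directory under TMPDIR is a hypothetical location chosen only for illustration, and the values are the sample values shown above, not real credentials. Restricting the file permissions is a general precaution for files that hold a secret, not a documented CLU requirement:

```shell
# Hypothetical location for illustration; substitute the path you actually use.
cred_dir="${TMPDIR:-/tmp}/aspera-example"
cred_file="$cred_dir/credentials"

mkdir -p "$cred_dir"

# Sample values copied from this page; replace them with your own.
cat > "$cred_file" <<'EOF'
ASPERA_HOST=gap-submit.ncbi.nlm.nih.gov
ASPERA_USER=asp-dbgap
ASPERA_SCP_PASS=1234abcf-12c6-de34-9876abc7e543
EOF

# Limit read and write access to the file owner, since the file holds a secret.
chmod 600 "$cred_file"
```

You would then pass the path of this file as the last argument of the download command.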

Single File Example

The following example uses the credentials file in a non-default location to download a data file from DME. 


dm_download_dataobject_s3 aspera /Example_Archive/PI_Lab1/Project_1/data.txt bucket1 folder1/subfolder1/file1.txt /NCI/JaneDoe/Aspera/credentials


In this example, the command performs the following:

...


dm_download_dataobject_s3 aspera -f file-list.txt bucket1 folder1/subfolder1/

...
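
The file passed with -f lists one DME path per line. As a sketch of an alternative to typing the list in an editor, the following writes such a file with printf. The second and third paths are hypothetical and only illustrate the newline-delimited layout:

```shell
# Write one DME path per line. Only the first path appears in the
# examples on this page; the other two are hypothetical placeholders.
printf '%s\n' \
  /Example_Archive/PI_Lab1/Project_1/data.txt \
  /Example_Archive/PI_Lab1/Project_1/data2.txt \
  /Example_Archive/PI_Lab1/Project_1/data3.txt \
  > file-list.txt

# Count the lines to confirm that each path is on its own line.
wc -l < file-list.txt
```

Using printf with a '%s\n' format ends every path, including the last, with a newline.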