NIH | National Cancer Institute | NCI Wiki  

Comment: Deleted based on 1/19 discussion.

If your user account has the Read, Write, or Own permission level on various collections, you can download one or more of those collections from DME to an Amazon Web Services (AWS) S3 bucket.

To download one or more collections to S3:

  1. Prepare an S3 bucket, as described in Preparing to Use AWS S3 Bucket for the CLU.

  2. Consider whether you want to download a single collection or multiple collections: 

    • To download a single collection: Plan to specify the path for that collection in the command. 
    • To download multiple collections: On your local system, use a command-line editor (such as vi) to create a file that lists the paths of all the DME collections you want to download, one path per line. Plan to use the -f option to specify that file in the command.
  3. Run the following command:

    dm_download_collection_s3 [optional parameters] [DME data path] <destination S3 bucket> <destination-path> [AWS credentials file path]


    The following table describes each parameter:

    Parameter
        Description

    [-h]
        If you want to print a usage (help) message for this command, specify this option.

    [-D <REST-response>]
        An optional parameter, specifying a path and filename in your local system. The system always creates a response file:
        • If you specify this parameter, the system saves the response from the server to the specified file in the specified location.
        • If you omit this parameter, the system saves the file as download-collection-response-header.tmp in your home directory.

    [-o <output-json-file>]
        An optional parameter, specifying a path and filename in your local system. The system always creates an output file:
        • If you specify this parameter, the system saves the output to the specified file in the specified location.
        • If you omit this parameter, the system saves the output as download-collection-response-message.json.tmp in your home directory.
        If the command is successful, the output file is empty.

    [-f <paths-file>] or [DME data path]
        One or more paths within DME. Select one of the following methods to specify the collection or collections that you want to download:
        • To specify multiple collections, use the -f parameter to specify the path and filename, in your local system, of a file that lists the paths of all the DME collections you want to download, one path per line.
        • To specify a single collection, specify the path of the DME collection you want to download.

    <destination S3 bucket>
        The name of the destination S3 bucket. This parameter is mandatory.

    <destination-path>
        The path to and the name of the folder in the destination bucket. This parameter is mandatory. Do not begin the path with a slash. Always specify a name for the destination folder. If the destination folder structure you specify does not already exist, DME creates it.

    [AWS credentials file path]
        The location of the credentials file. If your credentials file is in the default location, as noted in Preparing to Use AWS S3 Bucket for the CLU, you can omit this parameter.
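The syntax and parameter rules above can be sketched as a small shell helper that assembles the command line for review before you run it. This is a minimal sketch, not part of the DME CLU: the function names check_destination_path and build_download_command are hypothetical, and the sketch assumes that none of the arguments contain spaces. It only prints the command; it does not run it.

```shell
# Hypothetical helpers; they are not shipped with the DME CLU.

# Reject a destination path that is empty or begins with a slash,
# as the <destination-path> description requires.
check_destination_path() {
  case "$1" in
    /*) echo "destination path must not begin with a slash: $1" >&2; return 1 ;;
    "") echo "destination path must not be empty" >&2; return 1 ;;
  esac
}

# Usage:
#   build_download_command <DME data path> <bucket> <destination-path> [credentials-file]
#   build_download_command -f <paths-file> <bucket> <destination-path> [credentials-file]
# Prints the assembled dm_download_collection_s3 command line without running it.
build_download_command() {
  local cmd="dm_download_collection_s3"
  if [ "$1" = "-f" ]; then
    # Multiple collections: pass the paths file with -f.
    cmd="$cmd -f $2"
    shift 2
  else
    # Single collection: pass the DME path directly.
    cmd="$cmd $1"
    shift
  fi
  local bucket="$1" dest="$2" creds="${3:-}"
  check_destination_path "$dest" || return 1
  cmd="$cmd $bucket $dest"
  # The credentials file path is optional; append it only when given.
  if [ -n "$creds" ]; then
    cmd="$cmd $creds"
  fi
  printf '%s\n' "$cmd"
}
```

For example, `build_download_command -f collection-list.txt bucket1 folder1/subfolder1/` prints `dm_download_collection_s3 -f collection-list.txt bucket1 folder1/subfolder1/`, while a destination of `/folder1` is rejected with an error message.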


...

For some examples, consider the following code specified in a credentials file:

[default]
aws_access_key_id = SAMPLEACCESSKEY
aws_secret_access_key = SampleSecretAccessKey
region = us-east-1
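
Before running a download, it can help to confirm that the credentials file has the same shape as the sample above. The following is a minimal sketch, assuming a hypothetical helper name (has_default_profile_keys). It checks only that a [default] section header and the three keys from the sample appear somewhere in the file; it does not verify that the keys sit inside the [default] section or that the values are valid. Whether DME requires all three keys is not stated here, so treat the key list as an assumption taken from the sample.

```shell
# Hypothetical helper; it is not part of the DME CLU or the AWS CLI.
# Usage: has_default_profile_keys <credentials-file>
has_default_profile_keys() {
  local file="$1" key
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  # Look for the [default] section header at the start of a line.
  grep -q '^\[default\]' "$file" || { echo "missing [default] section" >&2; return 1; }
  # Look for each key from the sample, followed by an equals sign.
  for key in aws_access_key_id aws_secret_access_key region; do
    grep -Eq "^${key}[[:space:]]*=" "$file" || { echo "missing $key" >&2; return 1; }
  done
  return 0
}
```

A call such as `has_default_profile_keys /NCI/JaneDoe/aws/credentials` returns a zero exit status when all of the checks pass and prints the first missing item otherwise.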

Single Collection Example

The following example uses the credentials file in a non-default location to download a collection from DME. 


dm_download_collection_s3 /Example_Archive/PI_Lab1/Project_1 bucket1 folder1/subfolder1/ /NCI/JaneDoe/aws/credentials


In this example, the command performs the following:

  • Locates or creates a folder1 folder in the bucket1 bucket.
  • Locates or creates a subfolder1 folder within the folder1 folder.
  • Downloads from DME all files in the Project_1 collection.
  • Saves those files in the subfolder1 folder with the same file names they have in DME. 
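
Because the output file is empty when the command succeeds, a script can use the -o option and test that file afterwards. The following is a minimal sketch under that assumption. The helper name output_file_reports_success and the /tmp/download-output.json path are hypothetical, and the dm_download_collection_s3 call is left commented out so that the sketch runs on a system without the DME CLU.

```shell
# Hypothetical helper; it is not part of the DME CLU.
# Usage: output_file_reports_success <output-json-file>
# Returns 0 if the file exists and is empty, 1 if it has content,
# and 2 if it does not exist.
output_file_reports_success() {
  local file="$1"
  [ -e "$file" ] || { echo "no output file at $file" >&2; return 2; }
  if [ -s "$file" ]; then
    echo "output file is not empty; inspect $file" >&2
    return 1
  fi
  return 0
}

# Example use with the single-collection command above:
# dm_download_collection_s3 -o /tmp/download-output.json /Example_Archive/PI_Lab1/Project_1 bucket1 folder1/subfolder1/ /NCI/JaneDoe/aws/credentials
# output_file_reports_success /tmp/download-output.json && echo "command reported success"
```

The distinct return codes let a calling script tell a missing output file apart from one that contains a message.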

Multiple Collections Example

For another example, consider the following command.


dm_download_collection_s3 -f collection-list.txt bucket1 folder1/subfolder1/


With the following code in the specified collection-list.txt file, the above command uses the credentials file in the default location to download multiple collections from various locations in DME.

/Example_Archive/PI_

...

Lab2/Project_1

...


/Example_Archive/PI_

...

Lab2/Project_2

...


/Example_Archive/PI_

...

Lab3/Project_1

...


In this example, the command performs the following:

  • Locates or creates a folder1 folder in the bucket1 bucket.
  • Locates or creates a subfolder1 folder within the folder1 folder.
  • Downloads from DME all files in the collections listed in the specified collection-list.txt file.
  • Saves those files in the subfolder1 folder with the same file names they have in DME. 
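
The paths file for the -f option can also be written and checked from a script instead of an editor. The following is a minimal sketch with hypothetical helper names (write_example_paths_file, validate_paths_file). The two collection paths are placeholders for illustration only. The check that every line begins with a slash is an assumption based on the example paths on this page, not a documented DME rule.

```shell
# Hypothetical helpers; they are not part of the DME CLU.

# Write a newline-delimited list of placeholder collection paths to the given file.
write_example_paths_file() {
  printf '%s\n' \
    /Example_Archive/PI_Lab1/Project_1 \
    /Example_Archive/PI_Lab1/Project_2 > "$1"
}

# Check that the file is readable, has at least one line, and that every
# line is non-empty and begins with a slash.
validate_paths_file() {
  local file="$1" line n=0
  [ -r "$file" ] || { echo "cannot read $file" >&2; return 1; }
  # The "|| [ -n ... ]" clause also handles a last line with no trailing newline.
  while IFS= read -r line || [ -n "$line" ]; do
    n=$((n + 1))
    case "$line" in
      /*) ;;
      *) echo "line $n is not an absolute DME path: '$line'" >&2; return 1 ;;
    esac
  done < "$file"
  [ "$n" -gt 0 ] || { echo "$file is empty" >&2; return 1; }
}
```

A typical sequence is `write_example_paths_file collection-list.txt` followed by `validate_paths_file collection-list.txt`, after which the file can be passed to the command with -f.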

...