NIH | National Cancer Institute | NCI Wiki  

Comment: For HPCDATAMGM-1336: Added S3. For HPCDATAMGM-1343: Added metadata extraction & updated syntax. For HPCDATAMGM-1345: Updated syntax.

...

  1. Choose whether to upload synchronously (from your file system) or asynchronously (from a Globus endpoint or AWS S3).
    • To upload from your file system, plan to use the source file parameter to specify the file that you want to upload.
    • To upload from Globus or AWS S3, plan to use a JSON file to specify the file that you want to upload.
  2. In your file system, create a JSON file that specifies the metadata for the new data file. The contents of this file depend on the source of your data:

    • If you are uploading from your file system, specify the metadata that you want to upload, using the following syntax:

      Code Block
      { 
          "extractMetadata" : true,
          "metadataEntries": [
            {
              "attribute": "description",
              "value": "my-dataObject-description"
            },
            {
              "attribute": "my-second-attribute-name",
              "value": "my-second-attribute-value"
            }
          ]
      }

      To extract metadata from the header of TIFF or BMP image files, include the following line: 

      Code Block
      "extractMetadata" : true,
    • If you are uploading from Globus, specify the Globus endpoint, the file path on that endpoint, and the metadata that you want to upload, using the following syntax:

      Code Block
      {
        "globusUploadSource": {
          "sourceLocation": {
            "fileContainerId": "globus-shared-endpoint-uid",
            "fileId": "file-path-on-shared-globus-endpoint"
          }
        },
        "metadataEntries": [
          {
            "attribute": "description",
            "value": "my-file-description"
          }, 
          {
            "attribute": "my-second-attribute-name",
            "value": "my-second-attribute-value"
          }
        ]
      }
    • If you are uploading from AWS S3, specify the S3 bucket, object key (path), access key, secret access key, region, and the metadata that you want to upload, using the following syntax:

      Code Block
      {
        "s3UploadSource": {
          "sourceLocation": {
            "fileContainerId": "s3-bucket-name",
            "fileId": "s3-object-key"
          },
          "account": {
            "accessKey": "aws-access-key",
            "secretKey": "aws-secret-key",
            "region": "aws-region"
          }
        },
        "metadataEntries": [
          {
            "attribute": "description",
            "value": "my-file-description"
          }, 
          {
            "attribute": "my-second-attribute-name",
            "value": "my-second-attribute-value"
          }
        ]
      }
  3. Run the following command:

    Code Block
    dm_register_dataobject [optional parameters] <description.json> <destination-path> [source-file]

    The following table describes each parameter:

    Parameter
    Description
    [-h]
    If you want to print a usage (help) message for this command, specify this option.
    [-D <REST-response>]

    An optional parameter, specifying a path and filename in your file system. The system always creates a response file:

    • If you specify this parameter, the system saves the response from the server to the file and location that you specify.
    • If you omit this parameter, the system saves the response as dataObject-registration-response-header.tmp in your home directory.
    [-o <output-json-file>]

    An optional parameter, specifying a path and filename in your file system. The system always creates an output file:

    • If you specify this parameter, the system saves the output to the specified file in the specified location.
    • If you omit this parameter, the system saves the output as dataObject-registration-response-message.json.tmp in your home directory.

    If the command is successful, the output file is empty.

    <description.json>
    A path to the JSON file that specifies the metadata for the new data file.
    <destination-path>
    The path within DME where you want the system to create the new data file, including the name of the file you intend to upload. (If you specify an existing data file, this command updates the metadata for that data file. For details, refer to Updating Data File Metadata via the CLU.)
    [source-file]

    A path to a file in your file system:

    • If you are uploading from your file system, use this parameter to specify the file that you want to upload.
    • If you are uploading from Globus or S3, omit this parameter.
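
The description file layouts from step 2 can also be generated with a script. The following is a minimal Python sketch, not part of the CLU. The function names build_description and write_description and their parameters are hypothetical, and the sketch assumes that globusUploadSource and s3UploadSource are JSON objects that contain a sourceLocation; confirm the layout against your DME installation before relying on it.

```python
import json


def build_description(metadata, extract_metadata=False, globus=None, s3=None):
    """Build the contents of a description JSON file as a dict.

    metadata: dict mapping attribute names to values.
    globus:   optional dict with "endpoint" and "path" (hypothetical keys).
    s3:       optional dict with "bucket", "key", "access_key",
              "secret_key", and "region" (hypothetical keys).
    """
    if globus is not None and s3 is not None:
        raise ValueError("Specify at most one upload source.")

    doc = {}
    if extract_metadata:
        # Requests metadata extraction from TIFF or BMP image headers.
        # The documentation shows this flag only for file system uploads.
        doc["extractMetadata"] = True
    if globus is not None:
        # Assumption: the upload source is an object, not an array.
        doc["globusUploadSource"] = {
            "sourceLocation": {
                "fileContainerId": globus["endpoint"],
                "fileId": globus["path"],
            }
        }
    if s3 is not None:
        doc["s3UploadSource"] = {
            "sourceLocation": {
                "fileContainerId": s3["bucket"],
                "fileId": s3["key"],
            },
            "account": {
                "accessKey": s3["access_key"],
                "secretKey": s3["secret_key"],
                "region": s3["region"],
            },
        }
    doc["metadataEntries"] = [
        {"attribute": name, "value": value} for name, value in metadata.items()
    ]
    return doc


def write_description(path, doc):
    """Write the description dict to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(doc, handle, indent=2)
```

For a file system upload, a call such as write_description("description.json", build_description({"description": "my-dataObject-description"})) would produce a file that contains only the metadataEntries list. Read S3 credentials from a secure source at run time rather than storing them in a script.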

For example, the following command uploads the data.txt file from the JaneDoe folder in the file system to the Project_New collection in DME:

...

If you uploaded the file from a Globus endpoint, you can view the progress of the upload in the GUI. For instructions, refer to Viewing Upload Registration Status.
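
The command line from step 3 and a check of the output file can be sketched in Python as follows. This is an illustrative sketch only: build_register_command, output_indicates_success, and check_output_file are hypothetical helper names, any paths passed to them are placeholders, and the success check relies solely on the statement in the parameter table that the output file is empty when the command succeeds.

```python
from pathlib import Path


def build_register_command(description_json, destination_path,
                           source_file=None, response_file=None,
                           output_file=None):
    """Assemble the dm_register_dataobject argument list.

    Optional parameters come first, followed by the description file,
    the destination path in DME, and (for file system uploads only)
    the source file.
    """
    cmd = ["dm_register_dataobject"]
    if response_file is not None:
        cmd += ["-D", str(response_file)]
    if output_file is not None:
        cmd += ["-o", str(output_file)]
    cmd += [str(description_json), str(destination_path)]
    if source_file is not None:
        # Omit the source file when uploading from Globus or S3.
        cmd.append(str(source_file))
    return cmd


def output_indicates_success(output_text):
    """Return True if the output text is empty (ignoring whitespace)."""
    return output_text.strip() == ""


def check_output_file(output_file):
    """Read the output file and report whether it is empty."""
    text = Path(output_file).read_text(encoding="utf-8")
    return output_indicates_success(text)
```

The returned list can be passed to subprocess.run from the Python standard library on a machine where the CLU is installed. Passing an explicit output_file makes it straightforward to call check_output_file on the same path afterward.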