Below are the two approaches available to archive bulk data to DME.

Command-Line Utilities (CLU)

The DME command-line utilities (CLU) provide shell commands for programmatic access from bioinformatics pipelines and workflows. If you register your data via the CLU, you can integrate the upload process into your workflow: you supply metadata in JSON files and upload those files with the CLU commands. You can obtain the CLU package from the following GitHub repository:

https://github.com/CBIIT/HPC_DME_APIs
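
As an illustration of the JSON metadata step described above, the following minimal sketch writes a metadata file from a pipeline script. The `metadataEntries` layout, the helper name `write_metadata_file`, and the attribute names are assumptions made for illustration; check the CLU documentation for the schema that your DME collection requires.

```python
import json
from pathlib import Path


def write_metadata_file(path, attributes):
    """Write a JSON metadata file and return its path.

    Assumption: metadata is a list of attribute/value pairs under a
    "metadataEntries" key. Verify this layout against the CLU documentation.
    """
    entries = [
        {"attribute": name, "value": str(value)}
        for name, value in attributes.items()
    ]
    target = Path(path)
    target.write_text(json.dumps({"metadataEntries": entries}, indent=2))
    return target
```

A pipeline step could call, for example, `write_metadata_file("sample.metadata.json", {"sample_id": "S-001"})` before invoking the registration command. The file name and the `sample_id` attribute are hypothetical.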

Install and run the CLU commands on the server where the data is located or mounted. For installation instructions, refer to Getting Started with DME CLU.

The recommended registration commands are dm_register_dataobject_multipart (for a single file) and dm_register_directory (for bulk uploads). For details, refer to the following pages:

Automated Archival Workflow

The DME archival workflow supports users who need recurring bulk uploads. It fully automates the archival of datasets on a preconfigured schedule. The system locates the files to archive by scanning the source directories that you specify. Built-in fault tolerance and multi-threading provide reliability and high throughput. The system extracts metadata from metadata input files according to rules that you configure in a custom user module. Supported input file formats are JSON, XML, and CSV/Excel. You can customize the workflow through its configuration options, which include:

  • The source path where you want the system to pick up the data.
  • Whether you want the system to perform any pre-processing such as tarring the folder.
  • Whether you want the system to apply any patterns to include and exclude some files/folders.
  • Whether you want the system to wait for a specific marker file that indicates the data is ready to be picked up.

For details on configuration options, refer to the following page:

https://github.com/CBIIT/dme-archival-workflow/blob/master/workflow_config.md
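
To make the effect of the source path, include/exclude pattern, and ready-file options more concrete, the sketch below shows how such a scan could select files. It is not the workflow's implementation. The function name `find_ready_files`, its parameters, and the marker name `READY` are hypothetical, and the pre-processing (tarring) option is not covered.

```python
from fnmatch import fnmatch
from pathlib import Path
from typing import Iterable, List


def find_ready_files(
    source_dir: Path,
    include: Iterable[str] = ("*",),
    exclude: Iterable[str] = (),
    ready_marker: str = "",
) -> List[Path]:
    """Return the files under source_dir that pass all selection rules.

    Hypothetical illustration only; the real workflow reads these
    settings from its own configuration.
    """
    include = tuple(include)
    exclude = tuple(exclude)
    selected = []
    for path in sorted(source_dir.rglob("*")):
        if not path.is_file():
            continue
        if ready_marker:
            if path.name == ready_marker:
                continue  # the marker file itself is not archived
            if not (path.parent / ready_marker).exists():
                continue  # this folder has not been flagged as ready
        rel = path.relative_to(source_dir).as_posix()
        if not any(fnmatch(rel, pattern) for pattern in include):
            continue  # matches no include pattern
        if any(fnmatch(rel, pattern) for pattern in exclude):
            continue  # matches an exclude pattern
        selected.append(path)
    return selected
```

In this sketch a file is selected only if its folder contains the marker file, its relative path matches at least one include pattern, and it matches no exclude pattern. Patterns use shell-style wildcards as implemented by Python's `fnmatch`, which may differ from the pattern syntax the workflow itself accepts.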