NIH | National Cancer Institute | NCI Wiki  

Two approaches are available for automated archival of data to DME.

Command-Line Utilities (CLU)

The DME command-line utilities (CLU) provide shell commands for programmatic access from bioinformatics pipelines and workflows. If you want to register your data via the CLU, you can integrate the upload process into your workflow. You would supply metadata in JSON files and upload those files through the CLU commands. You can obtain the CLU package from the following GitHub repository:

https://github.com/CBIIT/HPC_DME_APIs

You need to install and run the CLU commands on the server where the data is located or mounted. For installation instructions, refer to Getting Started with DME CLU.

Recommended commands for registration are dm_register_dataobject_multipart (for a single file) and dm_register_directory (for bulk uploads). For details, refer to the documentation pages for these commands.
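The JSON metadata files mentioned above could be generated from within a pipeline before a registration command is called. The following is a minimal sketch only: the `metadataEntries` layout with `attribute`/`value` pairs is an assumption about the expected file format, and the attribute names (`sample_id`, `data_type`) and the file name are hypothetical. Check the CLU documentation for the exact format and for the arguments each command takes.

```python
import json
from pathlib import Path
from typing import Dict


def build_metadata(attributes: Dict[str, str]) -> dict:
    """Wrap attribute/value pairs in the ASSUMED metadata layout."""
    return {
        "metadataEntries": [
            {"attribute": name, "value": value}
            for name, value in attributes.items()
        ]
    }


def write_metadata_file(attributes: Dict[str, str], path) -> Path:
    """Write the metadata as a JSON file and return its path."""
    path = Path(path)
    path.write_text(json.dumps(build_metadata(attributes), indent=2))
    return path


if __name__ == "__main__":
    # Hypothetical attributes for a single file to be registered.
    out = write_metadata_file(
        {"sample_id": "S-001", "data_type": "FASTQ"},
        "sample.metadata.json",
    )
    print(out.read_text())
```

The resulting file would then be passed to the registration command along with the source and destination paths.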

Automated Archival Workflow

The DME Archival workflow supports users requiring recurring bulk uploads. It enables fully automated archival of datasets on a pre-configured schedule. The system locates the files to archive by scanning source directories that you specify. Fault tolerance and multi-threading capabilities are built in to achieve reliability and high throughput. The system extracts metadata from metadata input files based on the rules that you can configure in a user module. Supported input file formats are JSON, XML, and CSV/Excel. You can customize the workflow using flexible configuration options. These options include:

  • The source path where you want the system to pick up the data.
  • Whether you want the system to perform any pre-processing, such as tarring the folder.
  • Whether you want the system to apply any patterns to include and exclude some files/folders.
  • Whether you want the system to look for a specific file to indicate that the data is ready for the system to pick up.

For details on configuration options, refer to the following page:

https://github.com/CBIIT/dme-archival-workflow/blob/master/workflow_config.md
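To picture how the include/exclude patterns and the ready-marker file described above might interact, here is a small illustrative sketch. It is not the workflow's implementation: the function names, the glob-style pattern syntax, and the marker file name are all hypothetical, and the real option names are documented on the configuration page.

```python
import fnmatch
from pathlib import Path
from typing import List, Optional, Sequence


def is_ready(folder: Path, marker_name: str) -> bool:
    """A folder counts as ready once the marker file exists in it."""
    return (folder / marker_name).is_file()


def select_files(
    folder,
    include: Sequence[str] = (),
    exclude: Sequence[str] = (),
    marker_name: Optional[str] = None,
) -> List[Path]:
    """Return the files under `folder` that pass the (hypothetical) rules."""
    folder = Path(folder)
    # Skip the whole folder until the marker file shows up.
    if marker_name is not None and not is_ready(folder, marker_name):
        return []
    selected = []
    for path in sorted(folder.rglob("*")):
        if not path.is_file():
            continue
        # The marker itself is a signal, not data to archive.
        if marker_name is not None and path == folder / marker_name:
            continue
        rel = path.relative_to(folder).as_posix()
        # An empty include list means "include everything".
        if include and not any(fnmatch.fnmatchcase(rel, p) for p in include):
            continue
        if any(fnmatch.fnmatchcase(rel, p) for p in exclude):
            continue
        selected.append(path)
    return selected
```

For example, `select_files("/data/run42", include=["*.fastq"], exclude=["tmp/*"], marker_name="READY")` would return nothing until a file named `READY` appears in `/data/run42`, and afterwards only the `.fastq` files outside `tmp/`. Note that `fnmatch` patterns are not path-aware, so `*` also matches across directory separators.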