The current major release is caDSR 4.0, deployed December 8, 2008.

This is the wiki home page for caDSR. You may edit pages if you are working on them with the authors. You are welcome to leave comments. The caDSR Wikis are:

  • caDSR Content
    Focuses on the Metadata Content within the caDSR and is most useful for consumers. The pages include Data Standards, Harmonization, Reuse and Business Rules. The information on these pages directs the capabilities of the caDSR Tools.
  • caDSR Database and Tools
    Focuses on the features of the Tools from the user's perspective. The pages include descriptions for each tool, Release Notes and reference materials. The information on these pages is specific to the use case implementation of Business Rules recorded in the caDSR Content pages.
  • caDSR Product Information
    Focuses on Open Source adoption of the caDSR Database and Tools. The pages include technical details on XML messages produced and consumed by the caDSR Tools, references to API documentation, implementation extensions and Downloads.
  • caDSR Projects
    Focuses on specific projects currently in work. The pages include projects that span tools and products and projects that involve sharing between metadata registries. For projects isolated to a specific tool, please refer to the caDSR Database and Tools section.

Quick Links

Quick Reference to caDSR Tools

  • Administration Tool
    Wiki home page: https://wiki.nci.nih.gov/x/PYEI
    Production tool: http://cadsradmin.nci.nih.gov/ (Login required)
  • Sentinel Tool
    Wiki home page: https://wiki.nci.nih.gov/x/PIEI
    Production tool: http://cadsrsentinel.nci.nih.gov/cadsrsentinel/do/logon (Login required)
  • CDE Browser
    Wiki home page: https://wiki.nci.nih.gov/x/OoEI
    Production tool: https://cdebrowser.nci.nih.gov
  • Form Builder
    Wiki home page: https://wiki.nci.nih.gov/x/Q4EI
    Production tool: https://formbuilder.nci.nih.gov (Login required)
  • Freestyle Search
    Wiki home page: https://wiki.nci.nih.gov/x/QIEI
    Production tool: http://freestyle.nci.nih.gov/
  • UML Model Browser
    Wiki home page: https://wiki.nci.nih.gov/x/PoEI
    Production tool: http://umlmodelbrowser.nci.nih.gov/umlmodelbrowser/
  • CDE Curation Tool
    Wiki home page: https://wiki.nci.nih.gov/x/O4EI
    Production tool: http://cdecurate.nci.nih.gov/cdecurate/ (Login required)
  • Semantic Integration Workbench (SIW)
    Wiki home page: https://wiki.nci.nih.gov/x/QYEI
    Production tool: http://cadsrsiw.nci.nih.gov/
  • UML Loader
    Wiki home page: https://wiki.nci.nih.gov/x/QoEI

Contact NCICB Application Support, ncicb@pop.nci.nih.gov

Several tools support creating, managing, and deploying CDEs. Other tools support reviewing externally generated forms to see whether they are CDE-compliant, that is, composed of approved CDEs found in the caDSR. The public CDE Browser lets you search for data elements, create forms, and download CDEs. The UML Model Browser is designed specifically for browsing registered UML information models. Online help is available, but the tools are easier to use if you have first read the description of the caDSR implementation of the ISO/IEC 11179 standard.
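
As a rough illustration of what "CDE-compliant" means above, the following Python sketch treats a form as compliant when every data element it uses appears in a set of approved CDEs. The function names, the identifiers, and the `approved_cdes` set are all hypothetical; this is not how the caDSR tools themselves are implemented.

```python
def non_compliant_elements(form_cde_ids, approved_cde_ids):
    """Return the form's identifiers that are not in the approved set, sorted."""
    return sorted(set(form_cde_ids) - set(approved_cde_ids))


def is_cde_compliant(form_cde_ids, approved_cde_ids):
    """Treat a form as compliant when it uses only approved CDEs."""
    return not non_compliant_elements(form_cde_ids, approved_cde_ids)


# Hypothetical identifiers, made up for illustration only.
approved_cdes = {"CDE-001", "CDE-002", "CDE-003"}
external_form = ["CDE-001", "CDE-003", "LOCAL-17"]

print(is_cde_compliant(external_form, approved_cdes))        # False
print(non_compliant_elements(external_form, approved_cdes))  # ['LOCAL-17']
```

Reporting the offending elements, rather than only a yes/no answer, shows a reviewer which questions on the form would need to be replaced by registered CDEs.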

caDSR Overview

The Cancer Data Standards Registry and Repository (caDSR) is a database and a set of APIs and tools based on the ISO/IEC 11179 Information Technology Metadata Registries (MDR) standard. The caDSR provides the means to create, edit, control, and deploy data elements for metadata consumers.

Metadata is defined as "data about data" or "the description of a piece of information." Standardizing and registering metadata addresses a significant problem in biomedical data management: the wide variety of ways that similar data are collected and described.

The ISO/IEC 11179 standard defines a framework for how metadata can be specified, maintained in a consistent manner, shared, and used across diverse domains. The caDSR is a conforming implementation of the ISO/IEC 11179 metadata standard, designed to support the creation, maintenance, registration, and use of metadata in accordance with that standard. The NCI has also extended the standard, most notably in the use of controlled terminology. Together, these features enable metadata consumers to register the descriptive information needed to render cancer research data reusable and interoperable.

The fundamental unit of data in the ISO/IEC 11179 standard is called a data element. According to the ISO metadata standard, any item represented by a data element has two distinct parts: an explicit definition that is independent of any particular implementation, and an explicit description of implementation-specific details regarding how the item is represented in computer storage. Capturing these two aspects makes it possible to compare data elements that describe the same thing across different applications, and to understand what data transformation may be necessary to make the data comparable.
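
The two-part structure described above can be sketched in Python as follows. This is a simplified illustration, not the ISO/IEC 11179 metamodel or the caDSR schema: the class names, the fields, and the use of plain string equality to decide that two definitions match are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    """Implementation-specific details: how a value is stored."""
    datatype: str
    unit: str


@dataclass(frozen=True)
class DataElement:
    """An implementation-independent definition plus a representation."""
    definition: str
    representation: Representation


def compare(a, b):
    """Classify how two data elements relate to each other."""
    if a.definition != b.definition:
        return "different meaning"
    if a.representation == b.representation:
        return "directly comparable"
    return "same meaning, transformation needed"


# Hypothetical elements from two applications, made up for illustration.
weight_kg = DataElement("Body weight of the patient", Representation("decimal", "kg"))
weight_lb = DataElement("Body weight of the patient", Representation("decimal", "lb"))

print(compare(weight_kg, weight_lb))  # same meaning, transformation needed
```

Keeping the definition separate from the representation is what lets the comparison report that the two elements mean the same thing even though their stored values cannot be compared without a unit conversion.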

caCORE-like systems follow an object-oriented paradigm in which classes of data are described using UML models. A UML model serialized as XMI can then be used to transform the model's objects into caDSR registered items. Once registered, the items in caDSR can be reused in other systems' models. If different systems use the same registered terms (metadata) for the data in their models, those systems can more easily communicate and share information.
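
To make the last point concrete, here is a minimal Python sketch in which two systems annotate their own model attributes with registered identifiers and then discover which attributes line up. The attribute names, the `CDE-...` identifiers, and the `shared_fields` helper are hypothetical and serve only to illustrate the idea.

```python
# Each system maps its own attribute names to registered CDE identifiers.
# All names and identifiers below are hypothetical.
system_a = {
    "Patient.birthDate": "CDE-100",
    "Patient.sex": "CDE-200",
    "Sample.site": "CDE-300",
}
system_b = {
    "Subject.dob": "CDE-100",
    "Subject.gender": "CDE-200",
    "Subject.zip": "CDE-900",
}


def shared_fields(model_a, model_b):
    """Pair up attributes from two models that carry the same CDE identifier."""
    by_cde = {cde: attr for attr, cde in model_b.items()}
    return {attr: by_cde[cde] for attr, cde in model_a.items() if cde in by_cde}


print(shared_fields(system_a, system_b))
```

Although the two systems name their attributes differently, the shared identifiers are enough to pair `Patient.birthDate` with `Subject.dob` and `Patient.sex` with `Subject.gender`.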

The caDSR itself is a database that contains Administered Items. As defined in the ISO/IEC 11179 standard, an Administered Item is an item (a Data Element or one of the associated components that comprise a Data Element) for which administrative information must be recorded. caDSR administered items are supported by the use of externally defined terminologies and controlled vocabularies, such as the NCI Thesaurus.
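
A minimal Python sketch of this idea is shown below: an item that carries administrative information and points at concepts in an external vocabulary. The field names, the status label, and the concept codes are invented for illustration; they are not taken from the caDSR schema or from the NCI Thesaurus.

```python
from dataclasses import dataclass, field


@dataclass
class AdministeredItem:
    """An item for which administrative information is recorded."""
    public_id: str            # hypothetical identifier
    version: str
    registration_status: str  # hypothetical status label
    concept_codes: list = field(default_factory=list)  # codes from an external terminology


def unknown_concepts(item, terminology):
    """Return the item's concept codes that are missing from the terminology."""
    return [code for code in item.concept_codes if code not in terminology]


# Stand-in for an external vocabulary; the codes are invented for this example.
terminology = {"X0001": "Weight", "X0002": "Patient"}

item = AdministeredItem("DE-42", "1.0", "Draft", ["X0001", "X0002", "X9999"])
print(unknown_concepts(item, terminology))  # ['X9999']
```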

To support the database, the caDSR also has a suite of tools for creating, sharing, and deploying data elements (also called common data elements or CDEs). These tools include a public CDE Browser that enables you to search for data elements, create forms, and download CDEs, and a UML Model Browser that makes it easier to find CDEs that are registered as part of UML modeling projects. All of the caDSR tools and interfaces connect to the same central database. Links to further information regarding the caDSR tools appear in the section "Quick Reference to caDSR Tools."

By complying with the ISO/IEC 11179 standard, caDSR provides, among other things, a semantic bridge between the data elements contained in registered data objects and standard vocabularies and ontologies. caDSR was originally designed to support the development and deployment of data elements as metadata descriptors for NCI-sponsored research, but now supports an ever-widening group of users and metadata consumers in caBIG®.

caDSR Database and Implementation

The caDSR is based on an Oracle database. All of the various tools and interfaces connect to the same central database.

The software applications that access caDSR content are based on open source standards and are freely available for download and use by other government agencies and other interested parties.

caDSR follows the ISO/IEC 11179 standard to harmonize, register and integrate user-defined UML information models with existing and new caDSR content and to represent the CDEs in the database. This standard is somewhat complex, but it offers a richly expressive model for metadata that does a good job of supporting the variations needed for biomedical applications. If you are interested in working with the caDSR, please review the background material on the way we have implemented the ISO/IEC 11179 standard.

In addition to implementing the ISO/IEC 11179 model, we have added a few other types of content to the caDSR. The two most important are Forms and Protocols.

A Form is a collection of CDEs, and a Protocol is a collection of Forms. For clinical trials applications, the Forms correspond to Case Report Forms (CRFs), and Protocols correspond to a clinical trial protocol.
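
The containment relationship described above can be sketched in Python as follows. The class layout, the form names, and the identifiers are hypothetical; the sketch only shows a Protocol holding Forms that in turn hold CDEs, not the actual caDSR data model.

```python
from dataclasses import dataclass, field


@dataclass
class Form:
    """A Form is a collection of CDEs (here, just their identifiers)."""
    name: str
    cde_ids: list = field(default_factory=list)


@dataclass
class Protocol:
    """A Protocol is a collection of Forms."""
    name: str
    forms: list = field(default_factory=list)

    def all_cde_ids(self):
        """Distinct CDE identifiers used in the protocol, in first-seen order."""
        seen = []
        for form in self.forms:
            for cde_id in form.cde_ids:
                if cde_id not in seen:
                    seen.append(cde_id)
        return seen


# Hypothetical case report forms and identifiers.
demographics = Form("Demographics CRF", ["CDE-100", "CDE-200"])
vitals = Form("Vital Signs CRF", ["CDE-100", "CDE-300"])
trial = Protocol("Example trial protocol", [demographics, vitals])

print(trial.all_cde_ids())  # ['CDE-100', 'CDE-200', 'CDE-300']
```

Because the same CDE can appear on several forms, collecting the distinct identifiers gives the full set of data elements a trial would need to capture.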

Template forms are generic forms that can be used as the basis for creating the actual forms used in a Protocol. Templates are stored both as the collection of CDEs that make up the form and as an MS Word or PDF file that shows the CDEs laid out.