Requirements from researchers and/or their supporting informatics groups drive the creation of metadata in the caDSR, not the other way around. Metadata content development usually starts when a researcher planning a clinical or research data collection requests assistance. Metadata curators work with the user and EVS to identify an appropriate vocabulary and a mix of new and existing CDE content to support the scientific requirement. Curators always attempt to reuse existing metadata (where that content supports the scientific requirement) to help scientists ensure that their data are compatible with other data collected across the enterprise.
The word cloud below illustrates the broad variety of data element collections stored in the caDSR for various communities and types of studies.
Because so many groups need metadata content, and to ensure that the recording of metadata does not become a bottleneck in the research process, all caDSR roles are open to individuals in the community upon completion of appropriate training. Training is role-based and includes courses on infrastructure, methodologies, and tool usage. Most of the training is delivered through self-paced modules, while the tool-use modules are delivered through web sessions. More information can be found on the caCORE training wiki.
CBIIT’s management of metadata began as part of an effort to support CTEP’s reporting for breast cancer trials, and from a need to develop and disseminate standards that would ensure consistency and accuracy in reporting across the Cooperative Groups. This led to the establishment of a centralized resource and associated web-based tools for clearly documenting and sharing human- and machine-readable data descriptions. The need to maintain and share data about data, or metadata, became the basis for the NCI’s repository of CDEs, metadata, and data standards, now known as the caDSR. A CDE Steering Committee was formed to define what kind of metadata the repository needed. In response to the community’s need to create, share, and manage CDEs, a set of metadata attributes was established, including name, definition, valid values, and workflow status. Consultation with appropriate experts identified ISO 11179, an international standard for data-element registries, as meeting the needs identified by the CDE Steering Committee. As time went on, more groups wanted to record their data elements and share them via the caDSR, so additional features were added, including extensions of ISO 11179 to enable storage templates for CRFs that use CDEs.
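As a rough illustration of the four attributes named above (name, definition, valid values, and workflow status), a CDE record could be sketched as follows. This is a minimal sketch, not the caDSR data model or API: the class name `CommonDataElement`, the `accepts` helper, the `"DRAFT"` status, and the sample element are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CommonDataElement:
    """Hypothetical record holding the four attributes named in the text."""
    name: str
    definition: str
    valid_values: List[str] = field(default_factory=list)
    workflow_status: str = "DRAFT"  # assumed status label, not a caDSR value

    def accepts(self, value: str) -> bool:
        # Assumption: an empty valid-value list means the element is not
        # restricted to an enumerated set of values.
        return not self.valid_values or value in self.valid_values


# Illustrative element; the content is made up for this sketch.
cde = CommonDataElement(
    name="Tumor Laterality",
    definition="The side of the body on which the tumor is located.",
    valid_values=["Left", "Right", "Bilateral", "Unknown"],
)

print(cde.accepts("Left"))
print(cde.accepts("Central"))
```

Keeping the valid values on the element itself is what lets two data collections that reuse the same CDE check their data against the same permitted set.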