Overview
The Metadata Maintenance task will clean up existing content in the caDSR database and develop automated programs that keep that content synchronized with vocabulary changes in the NCI terminology software system (NCIt).
Scope
The scope of the Metadata Maintenance work comprises three primary tasks:
- Retire redundant Representation Terms, Object Classes and Properties
- Synchronize NCI CBIIT and caDSR concepts: names and definitions
- Develop a monthly maintenance program to keep caDSR concepts synchronized with the NCIt monthly maintenance
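The synchronization task in the list above can be pictured as a comparison keyed on concept code. The following is only a minimal sketch in Python, assuming that both sides can be exported as simple (code, name, definition) records; the `Concept` class and the `find_mismatches` function are hypothetical names used for illustration and are not part of caDSR or NCIt.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """Hypothetical flat record for one concept (illustration only)."""
    code: str        # concept code, e.g. "C25367"
    name: str
    definition: str


def find_mismatches(cadsr, ncit):
    """Return (code, field, cadsr_value, ncit_value) for every difference.

    A caDSR concept whose code is absent from the NCIt export is reported
    with the field name "missing_in_ncit".
    """
    ncit_by_code = {c.code: c for c in ncit}
    mismatches = []
    for concept in cadsr:
        reference = ncit_by_code.get(concept.code)
        if reference is None:
            mismatches.append((concept.code, "missing_in_ncit", concept.name, None))
            continue
        # Compare only the two attributes named in the scope: name and definition.
        for field in ("name", "definition"):
            local = getattr(concept, field)
            remote = getattr(reference, field)
            if local != remote:
                mismatches.append((concept.code, field, local, remote))
    return mismatches
```

A report built from the returned tuples could then be reviewed before any update is applied; how the records are actually extracted from each system is left open here.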
Related Metadata Cleanup GForge Trackers
GF ID | Summary | Priority | Comments |
---|---|---|---|
7706, 10738 | Add filter for special characters to Concepts "before row" trigger | 3 | The special characters are non-printing control characters that cause data rendering problems in the caDSR tools and formatting problems in XML documents. |
6338 | Write Script to Retire Unused OCs, Property and Rep Terms | 5 | 'Retire Withdrawn' all unused OCs, Property, Rep Term components that are not in caBIG Context. For this Tracker, ignore unused Concepts. |
2694, 2708 | Write script to set EVS_Source to NCI_CONCEPT_CODE | 3 | Not possible through the user interface |
7741 | Retire VMs with strange characters in the Short Meaning, Description and preferredDefinition | 3 | Some characters may be extended ASCII (8 bit vs. current 7 bit); moving to 8 bit character set may resolve this. |
819 | When object classes or property is changed, the preferred name for data element concept is not changed | 3 | --- |
1051 | Make sure that Effective Date is being set for Concepts created by the Excel Loader | 3 | --- |
2565 | Write scripts to correct the origin and database columns for concepts | 3 | --- |
4387 | There are some concept derivation rules with no concepts associated | 3 | --- |
4598 | Trigger on value meaning not copying preferred definition to description | 3 | --- |
5440 | Some data has leading/trailing spaces. | 3 | --- |
6361 | Write script to change all OCs and Properties with "Side Effect" in the concept derivation rule to use Adverse Event instead | 4 | --- |
6985 | Retire C25367 "Assessment" and merge components into C25217 | 4 | --- |
9983 | QUEST_CONTENTS_EXT table lacks trigger logic to set BEGIN_DATE | 3 | --- |
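
Several of the trackers above (7706, 10738, 7741 and 5440) concern unwanted characters in text fields. The Python sketch below only illustrates that kind of filtering logic and is not the database trigger itself. It assumes that "special characters" means the C0 control characters other than tab, line feed and carriage return, plus DEL; the function names `clean_text` and `has_non_ascii` are hypothetical.

```python
import re

# Assumption: the characters to filter are the non-printing C0 controls
# (excluding tab, line feed and carriage return) plus DEL (0x7F).
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")


def clean_text(value):
    """Remove control characters, then trim leading/trailing whitespace."""
    if value is None:
        return None
    return _CONTROL_CHARS.sub("", value).strip()


def has_non_ascii(value):
    """Flag text containing characters outside 7-bit ASCII for review."""
    return any(ord(ch) > 127 for ch in value)
```

Values flagged by `has_non_ascii` are only reported rather than altered, since the tracker comments suggest that a move to an 8-bit character set may resolve those cases.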
Objectives
- Reduce the cost of manual identification of redundant metadata
- Reduce the cost of retrospective metadata harmonization
- Reduce the time and effort required to identify and reuse semantically identical or similar content
Approach
The following steps should be taken to maintain the metadata:
- CBLOAD will be used as the development environment
- All cleanup scripts will be owned by cadsr_maint. This account has already been created on CBLOAD
- CBLOAD is loaded with the SBR and SBREXT schemas from CBPROD. Attached are instructions on preparing the CBLOAD environment so that cadsr_maint can write and test scripts
- Scripts written in CBLOAD and owned by cadsr_maint are tested. This step can be repeated as many times as needed
- Once the scripts are tested, reports are generated to obtain approval from the content group
- Once the reports are approved, the scripts in cadsr_maint are moved to cadsr_maint on the staging environment
- The changes are made on staging, where the content group can test them
- Once approved, make a backup of CBPROD and run the scripts on CBPROD
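The ordering of the steps above matters: a script should not reach CBPROD without passing each earlier gate. As a rough sketch only, the sequence could be tracked with a small state machine in Python. The `Stage` and `ScriptPromotion` names, and the exact breakdown into seven stages, are assumptions made for illustration and do not describe an existing tool.

```python
from enum import Enum


class Stage(Enum):
    """Hypothetical stages mirroring the steps listed above."""
    WRITTEN = 1
    TESTED_ON_CBLOAD = 2
    REPORT_APPROVED = 3
    MOVED_TO_STAGE = 4
    APPROVED_ON_STAGE = 5
    CBPROD_BACKED_UP = 6
    RUN_ON_CBPROD = 7


class ScriptPromotion:
    """Tracks one cleanup script and refuses to skip a stage."""

    def __init__(self, script_name):
        self.script_name = script_name
        self.stage = Stage.WRITTEN

    def advance(self, target):
        # Testing on CBLOAD may be repeated any number of times.
        repeat_test = (
            target is Stage.TESTED_ON_CBLOAD
            and self.stage is Stage.TESTED_ON_CBLOAD
        )
        next_in_order = target.value == self.stage.value + 1
        if not (repeat_test or next_in_order):
            raise ValueError(
                f"{self.script_name}: cannot go from "
                f"{self.stage.name} to {target.name}"
            )
        self.stage = target
```

For example, calling `advance(Stage.RUN_ON_CBPROD)` on a script that has only reached `REPORT_APPROVED` raises `ValueError`, which makes a skipped approval or a missing backup visible immediately.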