Full Report

See attached file for full report content.

Executive Summary

The CBIIT caDSR repository is based on Edition 2 of the ISO/IEC 11179 standard. The standard has evolved through several iterations and is now at Edition 3, which has been submitted for review at the standards committee. Planning the evolution of the CBIIT metadata repository requires an assessment of the differences between the current model used in the caDSR and the potential target model. This document provides an analysis intended to determine, in sufficient detail, the extent of the changes required to migrate the current caDSR domain model to the latest ISO/IEC 11179 Standard model. The analysis identifies and reports on the differences between the current implementation of the caDSR 4.0 Domain Model (Current State) and the ISO/IEC 11179 Edition 3 Conceptual Model (Desired or End State). The caDSR 4.0 Domain Model is a logical model used by the community and implemented in the service of a read API for the caDSR. It is NOT identical to the physical implementation database schema of the repository.

After careful study of the Second Committee Draft (CD2) of Part 3 of the ISO/IEC 11179 Edition 3 Domain Model, which documents the latest changes in the Standard, the class diagrams of the current caDSR model were mapped to the Edition 3 model. The analysis found five main areas of discrepancy:

  1. The handling of administered objects differs as follows:
    • caDSR defines a single class type, AdministeredComponent, for all registry items that require administration. This central class duplicates the identification, designation (names), definition and registration information from the specific classes of all 14 administered component types defined in the registry.
    • The ISO/IEC 11179 Standard model also includes a class type, Administered_Item, for all registry items that require administration. However, while caDSR represents designation, definition and identification information under the single administered component type, the Standard defines multiple metadata item class types to capture this information: Identified_Item for metadata items that need to be identified, Designatable_Item for metadata items that require naming (designation) and definition, and Classifiable_Item for metadata items that require classification. Although the Identified_Item type is required for the representation of an Administered_Item, the Designatable_Item and Classifiable_Item class types may or may not be represented as Administered_Items.
    • An implementation option for migrating caDSR administered components to a conformant implementation of the Edition 3 model would be to extract the identification, definition and designation attributes of the caDSR AdministeredComponent class into the Identified_Item and Designatable_Item class types defined in the Standard. Both of these classes would then be included as inherited types of the Edition 3 Administered_Item class.
    • The Identified_Item and Designatable_Item types could be implemented either as extended types of each specific administered item type, or as separate class types with an association to the specific administered item type.
  2. The representation of concepts is another area of discrepancy between the two models:
    • caDSR imports concept information from EVS and stores only the actual identification, designation and definition of concepts, while the Standard provides a more extensive metamodel for the complete representation of terminology vocabularies.
    • caDSR assigns semantic meaning to a number of administered component types with the use of concepts and derivation rule associations. The Standard represents concepts as a higher level category with defined subtypes.
    • The concept derivation association, which relates a metadata item to multiple concepts joined by a derivation rule, can be mapped to the Edition 3 model by creating linked concept groupings, using concept relations and relation roles as the derivation rules.
    • Concepts are defined as Administered Components in caDSR, while the Standard does not specify that they be classified as registered items.
  3. Classification functions are also represented differently in the two models:
    • caDSR defines two classification categories (Classification Schemes and Classification Scheme items) that can be used uniquely or in combination to classify all metadata items of type administered component.
    • In the Standard, classification is essentially the result of associating metadata items to concepts within the context of predefined classification schemes or concept systems.
    • The Concept and Concept_System classes in the Standard are used in place of Classification_Scheme_Item and Classification_Scheme, respectively. The Classification_Schemes in caDSR could therefore be registered as Concept_Systems under Edition 3, and their constituent Classification_Scheme_Items as Concepts.
    • Classification categories are administered component items in caDSR, while the Standard does not specify that they be identified as administered items.
  4. The concept of contexts is also represented differently in the two models:
    • While context usage in caDSR is synonymous with that of a classification category for all administered components, the Standard defines the Context class only in relation to the naming and definition of metadata items.
    • An implementation option for modeling caDSR contexts would be to follow the Standard's guidelines and make contexts relevant to names and definitions only.
  5. Finally, two important areas of the current caDSR domain model will have to remain as extended components if it is migrated to the Edition 3 domain model:
    • Although UML model components can be stored in the original structures of both models, extensions will continue to be needed in order to facilitate data retrieval in the Edition 3 domain model.
    • Protocols and Forms are the other CBIIT domain-specific components of the current caDSR model that will also remain as extensions to the Edition 3 domain model.
    • A number of classes were added in caDSR to support specific functionality, such as the pairing of types of metadata items. None of these extended classes exists in the new Standard, so they will have to be addressed separately.
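
As an illustration of the implementation option described under item 1, the following is a minimal Python sketch of extracting identification and designation data from a single flat record into separate types. It follows the "separate class types with an association" variant rather than the inheritance variant. All attribute names, the migrate function and the sample values are hypothetical and are not taken from the caDSR model or from the Standard; only the class names echo AdministeredComponent, Identified_Item, Designatable_Item and Administered_Item. Attaching a context to each name and definition reflects the option described under item 4.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class AdministeredComponent:
    """Simplified stand-in for the caDSR class (attribute names are assumed)."""
    public_id: str
    version: str
    long_name: str
    definition: str
    registration_status: str
    context: Optional[str] = None


@dataclass
class IdentifiedItem:
    """Identification only, modeled loosely on Identified_Item."""
    identifier: str
    version: str


@dataclass
class Designation:
    """A name, optionally scoped to a context."""
    name: str
    context: Optional[str] = None


@dataclass
class Definition:
    """A definition text, optionally scoped to a context."""
    text: str
    context: Optional[str] = None


@dataclass
class DesignatableItem:
    """Names and definitions, modeled loosely on Designatable_Item."""
    designations: List[Designation] = field(default_factory=list)
    definitions: List[Definition] = field(default_factory=list)


@dataclass
class AdministeredItem:
    """Registration data plus associations to the two extracted types."""
    registration_status: str
    identified: IdentifiedItem
    designatable: DesignatableItem


def migrate(component: AdministeredComponent) -> AdministeredItem:
    """Split one flat caDSR-style record into three associated parts."""
    identified = IdentifiedItem(component.public_id, component.version)
    designatable = DesignatableItem(
        designations=[Designation(component.long_name, component.context)],
        definitions=[Definition(component.definition, component.context)],
    )
    return AdministeredItem(component.registration_status, identified, designatable)


# Hypothetical record, for illustration only.
legacy = AdministeredComponent(
    public_id="12345",
    version="1.0",
    long_name="Patient Age",
    definition="Age of the patient.",
    registration_status="Standard",
    context="Example Context",
)
item = migrate(legacy)
print(item.identified.identifier, item.designatable.designations[0].name)
```

With the association variant, the identification and designation parts can be attached to items that are not administered, which matches the Standard's allowance that a Designatable_Item may or may not be an Administered_Item. The inheritance variant would instead declare each specific administered item type as a subtype of these classes.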

For full details on these findings, please see the attached report file caDSR_Domain_Model_Gap_Analysis_v1.doc


