NIH | National Cancer Institute | NCI Wiki  

...

Domain Description: In many cases, data elements can be reused but the allowable values need to be extended or restricted.  For example, one researcher may want to capture diseases of the nervous system while another may want to capture diseases of the circulatory system.  Both can be captured in the same data element (disease) using the same controlled terminology (ICD-9).  However, the lists of allowable values are quite different.  Furthermore, yet another researcher may want to focus only on certain circulatory diseases, such as those of the heart.  A metadata specialist can sit with a domain specialist to identify the appropriate ontologies and constrain or expand them as needed.
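
The constrain-or-expand step described above can be sketched as a small value-set type. This is a minimal illustration, not the repository's API: the `ValueSet` class and its `restrict`, `extend`, and `allows` methods are hypothetical names. The ICD-9 three-digit category ranges (320-389 for the nervous system and sense organs, 390-459 for the circulatory system, 410-414 and 420-429 as one possible heart-focused subset) are used only for illustration and should be checked against the official code list.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class ValueSet:
    """Allowable values for one data element, drawn from one terminology (hypothetical type)."""
    data_element: str
    terminology: str
    codes: FrozenSet[int]

    def restrict(self, keep: Iterable[int]) -> "ValueSet":
        # Intersection: a restriction can never introduce codes the parent lacks.
        return ValueSet(self.data_element, self.terminology, self.codes & frozenset(keep))

    def extend(self, extra: Iterable[int]) -> "ValueSet":
        # Union: an extension adds codes from the same terminology.
        return ValueSet(self.data_element, self.terminology, self.codes | frozenset(extra))

    def allows(self, code: int) -> bool:
        return code in self.codes


# Same data element and terminology, different allowable values.
# The ranges are illustrative ICD-9 three-digit categories.
nervous = ValueSet("disease", "ICD-9", frozenset(range(320, 390)))
circulatory = ValueSet("disease", "ICD-9", frozenset(range(390, 460)))

# A narrower, heart-focused subset of the circulatory value set (assumed ranges).
heart = circulatory.restrict(list(range(410, 415)) + list(range(420, 430)))
```

Because `restrict` is an intersection, `heart` is guaranteed to stay a subset of `circulatory`, which is the property a repository would need in order to treat the narrower list as a constraint on the reused data element rather than as a new one.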

...

https://wiki.nci.nih.gov/x/qxJyAQ

Supporting interoperability standards (e.g. Healthcare Datatypes)

Domain Description: ISO 21090, otherwise known as HL7 Healthcare Datatypes, provides a basic representation of common chunks of data exchanged in the healthcare community, such as Address, Document, and Coded List.  A metadata specialist has been tasked with exposing some clinical research data in a standards-based way.  She sits down at her modeling tool and, as a first step, imports the healthcare datatypes from the caBIG metadata repository.  She begins replacing what were complex sets of classes and attributes in her existing model with these standard datatypes.  The resulting system is not only simpler but also interoperable by virtue of using ISO 21090.

...

Support data transformations in order to allow different flow cytometry tools to work together

Domain Description: Flow cytometry (FCM) is a technique for counting and examining microscopic particles.  It is routinely used in the diagnosis of health disorders, especially blood cancers, and has many other applications in both research and clinical practice.  Automated identification systems could help find rare and hidden populations.  An informatics specialist is working on objectively comparing many of the FCM analytical methods available in the community for automated population identification using computational methods.  Even semantically harmonized tooling may use data in different formats, and the primary barrier to this evaluation is the wide variety of data standards used by the tooling, which include MIFlowCyt, ACS, NetCDF, Gating-ML, FuGEFlow, and OBI.  When exchanging data between these systems, it is important to be able to describe the relationships between the standards, data elements, and value sets.  The structure of the data, the naming of the data elements, and the actual values used to encode the same data may all need to be transformed before the systems can interoperate.  Relationships between the standards can be described manually, but computable metadata enables automated transformation.  The informaticist decides to define semantic relationships and transformation services.  The result is a system in which FCM analytical workflows can discover and perform translations as needed during analytical comparisons.

Technical Description: Semantic relationships and rules between data elements can be formed, stored, and shared in the metadata repository.  Furthermore, these relationships can be reasoned over using inference engines and workflow engines.  Translation services can be defined and identified as such, which allows them to be discovered and applied as needed.
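
One way to picture discoverable translation services is a registry that records which format-to-format translators exist and searches for a chain between two formats on request. The sketch below is illustrative only: `TranslationRegistry`, the placeholder format names (`format_a`, `format_b`, `format_c`), and the field names and codes are all hypothetical and do not reflect the actual structure of MIFlowCyt, Gating-ML, or any other real standard.

```python
from collections import deque
from typing import Callable, Dict, List, Optional

Record = Dict[str, object]
Translator = Callable[[Record], Record]


class TranslationRegistry:
    """Hypothetical registry of translation services keyed by (source, target) format."""

    def __init__(self) -> None:
        self._edges: Dict[str, Dict[str, Translator]] = {}

    def register(self, source: str, target: str, fn: Translator) -> None:
        self._edges.setdefault(source, {})[target] = fn

    def find_path(self, source: str, target: str) -> Optional[List[str]]:
        # Breadth-first search: returns the shortest chain of formats, or None.
        queue = deque([[source]])
        seen = {source}
        while queue:
            path = queue.popleft()
            if path[-1] == target:
                return path
            for nxt in self._edges.get(path[-1], {}):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(path + [nxt])
        return None

    def translate(self, record: Record, source: str, target: str) -> Record:
        path = self.find_path(source, target)
        if path is None:
            raise LookupError(f"no translation from {source} to {target}")
        # Apply each registered translator along the discovered chain.
        for a, b in zip(path, path[1:]):
            record = self._edges[a][b](record)
        return record


def a_to_b(rec: Record) -> Record:
    # Naming change: the same data element is called something else.
    return {"specimen_type": rec["sample"]}


def b_to_c(rec: Record) -> Record:
    # Value-set change: the same meaning is encoded with different values.
    codes = {"blood": "BLD", "marrow": "BM"}
    return {"specimen_type": codes[str(rec["specimen_type"])]}


registry = TranslationRegistry()
registry.register("format_a", "format_b", a_to_b)
registry.register("format_b", "format_c", b_to_c)
result = registry.translate({"sample": "blood"}, "format_a", "format_c")
```

No direct `format_a` to `format_c` translator is registered here; the registry finds the two-step chain on its own. In a real repository the edges would come from stored semantic relationships rather than hand-written functions.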

D. Developer Stories

Iterative development and management of information models

Domain Description: Iterative and incremental development is a cyclic software development process developed in response to the weaknesses of the waterfall model. It starts with initial planning and ends with deployment, with cyclic interactions in between. The basic idea behind iterative enhancement is to develop a software system incrementally, allowing the developer to take advantage of what was learned during the development of earlier, incremental, deliverable versions of the system. Learning comes from both the development and the use of the system. Where possible, the key steps in the process are to start with a simple implementation of a subset of the software requirements and iteratively enhance the evolving sequence of versions until the full system is implemented. At each iteration, design modifications are made and new functional capabilities are added.  To support an iterative development process, the metadata itself must be developed iteratively.  The information model is enhanced, with semantics added and removed, on a monthly basis.

Technical Description: The metadata repository itself must support software engineers and metadata specialists in creating, modifying, and removing metadata on an ongoing basis.

Cross Reference:

...

Support standardized processes for software development and conformance

...

Domain Description: caEHR is the flagship project that is applying the ECCF process, which, when applied effectively, should produce specifications that can be used to evaluate how and at what levels various information systems are interoperable. This is important for enabling coordination of IT resources across the community of NCI stakeholders. The caEHR project is currently creating and managing various artifacts (CFSS, PIM, PSM) manually. Significant challenges include: 1) managing traceability and change; 2) formulating conformance assertions so that they can be evaluated; 3) collaborating on model elements (i.e. distributed model authoring).
The Knowledge Repository project should facilitate the application of the ECCF process by providing a formal model of ECCF artifacts that can be queried.  As an example, this supports traceability among artifacts, the ability to generate other artifacts, and the synchronization of artifacts.

...


Technical Description: ECCF artifacts can be defined fully within UML, which can be stored in the metadata repository.  This would allow the artifacts to be queried, manipulated, compared, and exported.

...