Page History

Versions Compared

Old Version 32

Changes made by Unknown User (mcconnellp)

Saved on Apr 23, 2010

compared with

New Version 33

Changes made by Unknown User (deshpans)

Saved on May 03, 2010

...

Support caB2B Services to integrate data on grid
- Forum posting: https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=50&t=247&p=801
- Requirements statement: https://wiki.nci.nih.gov/x/UAhyAQ
- Use Case: https://wiki.nci.nih.gov/x/Y2RyAQ

...

Support development of workflows:
- Forum posting: https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=43&t=146
- Requirements statement: https://wiki.nci.nih.gov/x/VAhyAQ
- Use Case: https://wiki.nci.nih.gov/x/FxRlAQ
ICR IRWG Requirements
https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=43&t=146
ICR ICRi Use Cases
https://gforge.nci.nih.gov/plugins/wiki/index.php?Use%20Cases&id=512&type=g

B. Forms Stories

Create and reuse forms

Domain Description: Forms provide a convenient, paper-like electronic mechanism for capturing data in a structured way. For example, when a patient is placed on a clinical trial, data about the patient's demographics and eligibility for the trial need to be captured. The trial investigator sits with the forms curator to generate this case report form. The forms curator searches for existing demographics forms and form modules, and the investigator reviews them. They identify an appropriate set of questions and include them in the case report form. They then move on to the eligibility checklist, which the investigator drafted and the IRB has approved. The forms curator begins keying in the questions, some of which are identified as existing questions and reused, while others are created from scratch. The form is marked complete and is made available to the clinical research staff for gathering data and enrolling new patients.

Technical Description: Forms are a collection of data elements annotated and grouped within the metadata repository. The forms curator can search for existing forms and form modules (portions of a form) by question text, annotations, etc. These can be reused by reference, or imported and modified. When new data elements are being curated, the form curator can search the federated set of all metadata repositories to identify data elements for reuse. This can happen automatically within the curation tooling or explicitly through the metadata web interface. The final CRF is saved and annotated within the local metadata repository.

Cross Reference:

...

Statistical computing environment and sharable metadata for statistical practice.

Domain description: A team of biostatisticians is analyzing the massive amount of clinical research data generated during the various phases of clinical trials. During this process the statisticians generate various artifacts from the highly normalized data, such as programs for data manipulation and statistical analysis, the analysis data sets, and the results of the analysis. In addition, under the guiding principles for a clinical trial, the data must comply with FDA regulations and data standards. The dilemma facing the team is therefore how to carry out the statistical analysis according to good statistical practice, so that the credibility of the results is maintained and data integrity is assured.

Technical description: A Statistical Computing Environment (SCE) provides a foundation for documenting rigor in the analysis and reporting of clinical trial results while increasing productivity and quality. The best way to ensure credibility, reliability and data integrity is to work in an environment that tracks all of the objects. The objects can be tracked by developing a table of contents of the objects to be created; the table of contents itself becomes part of the study metadata. The environment would typically include standard programs and algorithms for producing common reports of trial data. Above all, the statistical computing environment produces electronic documentation of the entire process.

Cross reference

Requirement statements: https://wiki.nci.nih.gov/x/KDxyAQ

Use cases: https://wiki.nci.nih.gov/display/seminfra/Init1SD60-Metadata+for+statistical+practice

caGrid should support interoperability from non-grid platforms.

Domain description: A cancer researcher who is not familiar with the grid wants to collaborate with his peers in the cancer research community to identify tissue specimens, microarray data, clinical trials and images of interest to him. Being a total stranger to the grid, he does not know the data standards or the models he needs to invoke to support his search.

Technical description: "One person's object is another person's attribute". Depending on one's world view, a real-life entity can be modeled in UML as an Object, an Attribute or a value set. In current caBIG models, some model Race (and Ethnicity) as an Object while others model Race (and Ethnicity) as an Attribute. This is problematic, because Race data that is modeled differently cannot be "seamlessly integrated" on caGrid (there needs to be a transform). One can start to use the SAIF language in terms of Conceptual, Platform Independent (logical) and Platform Specific (implemented) models. The grid has CIMs, PIMs and PSMs for applications, as well as a BAM and a DAM, and other institutions may have their own DAMs, BAMs, CIMs, PIMs and PSMs. These elements need to be mapped to each other, at whatever level is needed, to achieve some semantic interoperability.
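The kind of transform mentioned above can be illustrated with a minimal Python sketch. It assumes two hypothetical models, one that represents Race as an object and one that represents it as a plain attribute; the class names, the field names and the ";"-separated encoding are illustrative assumptions and are not taken from any caBIG model.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Race:
    """Hypothetical model A: Race is an object in its own right."""
    code: str


@dataclass
class PersonA:
    name: str
    races: List[Race]


@dataclass
class PersonB:
    """Hypothetical model B: Race is a plain attribute of the person."""
    name: str
    race: str  # assumed encoding: codes joined with ';'


def a_to_b(person: PersonA) -> PersonB:
    """Flatten the Race objects into a single attribute value."""
    codes = sorted(r.code for r in person.races)
    return PersonB(person.name, ";".join(codes))


def b_to_a(person: PersonB) -> PersonA:
    """Expand the attribute value back into Race objects."""
    codes = [c for c in person.race.split(";") if c]
    return PersonA(person.name, [Race(c) for c in codes])
```

Data modeled one way can then be moved into the other representation, for example `a_to_b(PersonA("subject-1", [Race("white"), Race("asian")]))`. A real mapping would be registered at the conceptual, logical or implemented level rather than hard-coded like this.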

Cross reference

Requirement statement: https://wiki.nci.nih.gov/x/hAVyAQ

Forum post: https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=39&t=167

Use cases: https://wiki.nci.nih.gov/display/seminfra/Init6SD12SD12UML+modelling+in+different+layer

Semantic search on the cancer grid.

Domain description: A cancer researcher is looking for lung cancer specimens with a histologic picture of 'oat cell carcinoma' from males aged 45-55 years who have a history of smoking for at least 10 years. He invokes a data service and queries caTissue instances at DFCI, TJU and LLU for specimens. Rather than using an advanced query, he uses a semantic query such as "show me all specimens of lung cancer in males aged 45-55 years who have smoked for at least 10 years and whose histologic picture is that of an oat cell carcinoma". The query runs on the various instances of caTissue and comes back with the specimens that match the criteria. Being able to perform a semantic search on the grid would facilitate greater cohesiveness of the cancer research community.

Technical description: The Lexical Grid (LexGrid), coordinated by the Mayo Clinic Division of Biomedical Statistics and Informatics, provides a semantic foundation upon which multiple APIs can be developed that support consistent searching, navigation and cross-terminology traversal. These open-source tools are used in a variety of projects such as the NCI Cancer Biomedical Informatics Grid, the National Center for Biomedical Ontology, the Biomedical Grid Terminology project, and the World Health Organization International Classification of Diseases (ICD-11) development process. LexGrid hosts a wide variety of terminologies and ontologies including ICD-9-CM, the Gene Ontology, the HL7 Version 3 vocabulary, and SNOMED-CT. LexGrid can also represent the complete NLM Unified Medical Language System, which currently includes over 100 source terminologies. The LexRDF model maps the LexGrid model elements to corresponding constructs in W3C specifications such as RDF, OWL, and SKOS. With LexRDF, the terminological information represented in LexGrid can be translated to RDF triples, allowing LexGrid to leverage standard tools and technologies such as SPARQL and RDF triple stores.

Cross reference

Use case: https://wiki.nci.nih.gov/display/seminfra/Init4hm1.SD210-Triple+store+backend+for+LexEVS

Requirement statement: https://wiki.nci.nih.gov/x/3AJyAQ

Forum post: https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=37&t=127

Integration of radiology, pathology, molecular and genomic data to better predict patient outcome and support clinical decision.

Domain Description: A patient presents to a hospital with glioblastoma multiforme. The treating oncologist wants to find out the likely outcome for this patient. He initiates a search, based on the patient's presenting criteria in imaging, histopathology and genomic data, for a cohort with matching criteria and its survival rate, in order to better predict the outcome for his patient.

Technical description: A service is needed that can collate data from the national cancer imaging archive, caArray and the cancer central clinical database to pull out information for a patient on staging, grading, and other prognostic aspects of cancer. This service can run on multiple instances of various tools and pull out the corresponding data for the patient. The service can also be extended to support clinical decisions: for example, if a particular cohort reports better outcomes and survival rates with treatment A, then treatment A can be used as a standard line of treatment for patients with a similar picture.

Requirement statement: https://wiki.nci.nih.gov/x/HpN-AQ

Forum post: https://wiki.nci.nih.gov/display/Imaging/TCGA+Enterprise+Use+Case

B. Forms Stories

Create and reuse forms

Cross Reference:

CDEs from Man. curation, UML models and CRFs
- Forum posting: https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=43&t=122
- Requirements statement: https://wiki.nci.nih.gov/x/JgpyAQ
- Use Case: https://wiki.nci.nih.gov/x/SGxyAQ

Support of form annotations to enable form behavior
Domain Description: a forms curator is sitting down to create the case report forms for a new trial titled "Study of Ad.p53 DC Vaccine and 1-MT in Metastatic Invasive Breast Cancer." Her goal is to make the forms intuitive and as precise as possible, and to reduce human error when collecting data. When building the demographics form, she decides to make the age data element derived from the date of birth data element. Entering data that can simply be calculated from other data can only introduce errors, especially since date of birth is also captured in the hospital system and so can easily be validated. When building the medical history CRF, she realizes that fifteen of the questions only relate to women who have previously been pregnant. She promptly enters a skip pattern based on the gender question, as well as the pregnancy question. That should save significant time. Now that all the questions are entered, she goes back and edits them so that they have minimum lengths for required text questions, maximum lengths for numeric questions, pick-lists for those questions with a particular set of possible answers, and a data mask for the social security number question. Now the clinical data management system can render the forms via PDF using all of this handy information.

Technical Description: Forms provide a convenient paper-like electronic mechanism to capture data in a structured way. For example, when a patient is placed on a clinical trial, data about the patient's demographics and eligibility for the trial need to be captured. However, forms can also exhibit specific behavior that may or may not be reusable. These behaviors include skip patterns (if the answer to question 10 is "Yes" then skip to question 15), derived values ("what is your age" and "is your age less than, greater than, or equal to 65"), and composite answers ("check all" or "more than one of the above"). Furthermore, specific requirements about how a form is rendered can exist, such as the question description, help text, valid values, maximum and minimum answer length, and the format of a data mask (such as SSN). It is important to allow forms to be annotated with this behavior so that tools can appropriately render and act upon them. Furthermore, if appropriate, web- and paper-based collection instruments can be automatically generated from this metadata.
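The behaviors described above (skip patterns, derived values, data masks and length limits) can be sketched as annotations attached to each question. This is a minimal illustration only: the `Question` class, its field names, the question identifiers and the reference date are all hypothetical and do not reflect the schema of any existing metadata repository.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import Callable, List, Optional


@dataclass
class Question:
    """One form question plus hypothetical behavior annotations."""
    qid: str
    text: str
    # Skip pattern: the question is shown only when this predicate holds.
    show_if: Optional[Callable[[dict], bool]] = None
    # Derived value: computed from other answers instead of being keyed in.
    derive: Optional[Callable[[dict], object]] = None
    # Data mask: a regular expression the whole answer must match.
    mask: Optional[str] = None
    max_length: Optional[int] = None


def age_on(today: date) -> Callable[[dict], int]:
    """Build a derivation that computes age in whole years from 'dob'."""
    def derive(answers: dict) -> int:
        dob = answers["dob"]
        before_birthday = (today.month, today.day) < (dob.month, dob.day)
        return today.year - dob.year - int(before_birthday)
    return derive


def visible_questions(form: List[Question], answers: dict) -> List[Question]:
    """Apply skip patterns: keep questions whose condition is met."""
    return [q for q in form if q.show_if is None or q.show_if(answers)]


def validate(question: Question, value: str) -> List[str]:
    """Check a text answer against the rendering annotations."""
    errors = []
    if question.max_length is not None and len(value) > question.max_length:
        errors.append("too long")
    if question.mask is not None and not re.fullmatch(question.mask, value):
        errors.append("does not match mask")
    return errors


# Illustrative form; the reference date is an arbitrary assumption.
form = [
    Question("dob", "Date of birth"),
    Question("age", "Age", derive=age_on(date(2010, 5, 3))),
    Question("gender", "Gender"),
    Question("pregnant_before", "Previously pregnant?",
             show_if=lambda a: a.get("gender") == "F"),
    Question("ssn", "Social security number",
             mask=r"\d{3}-\d{2}-\d{4}", max_length=11),
]
```

A rendering tool could call `visible_questions(form, answers)` after each answer to decide what to show, and `validate` before accepting a keyed-in value. Keeping the behavior as data on the question is what would allow web and paper instruments to be generated from the same metadata.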

Extend allowable answers with additional permitted values

Domain Description: In many cases, data elements can be reused but the allowable values need to be extended or restricted. For example, one researcher may want to capture diseases of the nervous system while another may want to capture diseases of the circulatory system. Both can be captured in the same data element (disease) using the same controlled terminology (ICD-9). However, the list of allowable values is quite different. Furthermore, yet another researcher may want to focus only on certain circulatory diseases, such as those of the heart. A metadata specialist can sit with a domain specialist to identify the appropriate ontologies and constrain or expand them as needed.

Technical Description: the metadata repository allows a data element to have a value domain that references an external terminology. Furthermore, those terminologies can be constrained or expanded as needed in the local repository.
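Constraining and expanding a value domain can be pictured as set operations on the permitted values. The sketch below is an assumption-laden illustration: `ValueDomain` is a hypothetical class, and the codes are made-up placeholders, not real ICD-9 codes.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable


@dataclass(frozen=True)
class ValueDomain:
    """Hypothetical value domain referencing an external terminology."""
    terminology: str
    permitted: FrozenSet[str]

    def restrict(self, keep: Iterable[str]) -> "ValueDomain":
        """Constrain the domain to a subset of its permitted values."""
        return ValueDomain(self.terminology, self.permitted & frozenset(keep))

    def extend(self, extra: Iterable[str]) -> "ValueDomain":
        """Expand the domain with additional locally permitted values."""
        return ValueDomain(self.terminology, self.permitted | frozenset(extra))

    def allows(self, code: str) -> bool:
        return code in self.permitted


# Placeholder codes only; a real domain would hold terminology codes.
disease = ValueDomain(
    "ICD-9",
    frozenset({"NERV-01", "NERV-02", "CIRC-01", "CIRC-02", "HEART-01"}),
)
circulatory = disease.restrict({"CIRC-01", "CIRC-02", "HEART-01"})
heart_only = circulatory.restrict({"HEART-01"})
local_heart = heart_only.extend({"LOCAL-99"})
```

Each derived domain is a new immutable object, so the shared data element (disease) and its original value domain remain unchanged while each researcher works with a narrower or wider list.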

C. Metadata Specialist Stories

Creation of metadata and management of information models through modeling and web tools

Domain Description: the imaging center at a cancer center has just purchased a magnetic resonance spectroscopy (MRS) machine to add to their numerous magnetic resonance imaging (MRI) machines. MRS is used to measure the levels of different metabolites in body tissues. The MR signal produces a spectrum of resonances that correspond to different molecular arrangements of the isotope being "excited". Magnetic resonance spectroscopic imaging (MRSI) combines both spectroscopic and imaging methods to produce spatially localized spectra from within the sample or patient. A metadata specialist has been assigned to enhance their imaging repository to handle this new type of data. He opens his modeling tool and begins to add classes related to metabolic signatures. As the metadata specialist types the class name "Metabolite" into the modeling tool, a number of existing classes and concepts are suggested to him automatically. One of these piques his interest, and he clicks on the link for more information. His web browser pops up showing him the data element from a system focused on drug discovery and pharmacokinetics. This is the perfect term to reuse, and this type of linkage should provide a convenient way to match potential drugs with MRS results. He imports the class into his modeling tool, bringing with it a number of associated classes and attributes that may be of use.

Technical Description: all data elements and referenced concepts in the metadata repository are indexed and easily accessible by type-ahead and other integrated tooling solutions. The model browser is a convenient interface for exploring the metadata in a UML or data element centric way. Furthermore, the repository supports the import and export of modeling standards, such as XMI, which facilitates direct reuse.
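The type-ahead behavior can be approximated with a sorted index and a binary search for the typed prefix. This is only a sketch of one possible approach; the `TypeAheadIndex` class and the sample class names are hypothetical, and a production repository would more likely rely on a dedicated search index.

```python
from bisect import bisect_left
from typing import Iterable, List


class TypeAheadIndex:
    """Case-insensitive prefix lookup over data element or class names."""

    def __init__(self, names: Iterable[str]) -> None:
        # Sort by lowercased name so that matches for a prefix are contiguous.
        self._entries = sorted((n.lower(), n) for n in set(names))
        self._keys = [key for key, _ in self._entries]

    def suggest(self, prefix: str, limit: int = 5) -> List[str]:
        """Return up to `limit` names that start with `prefix`."""
        p = prefix.lower()
        start = bisect_left(self._keys, p)
        suggestions: List[str] = []
        for key, name in self._entries[start:]:
            if not key.startswith(p) or len(suggestions) == limit:
                break
            suggestions.append(name)
        return suggestions


index = TypeAheadIndex(
    ["Metabolite", "MetabolicSignature", "Metastasis", "Patient"]
)
```

Typing "metab" into a modeling tool backed by such an index would call `index.suggest("metab")` and offer the two matching names, while "Metastasis" is excluded because it does not share the prefix.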

Managing semantic relationships in order to link and share data

Domain Description: a metadata specialist has been tasked with cross-linking the hospital system and the clinical systems in her organization. Fortunately, both systems have been modeled with well-defined metadata, which has been registered in a metadata repository. Unfortunately, the information models used by the systems are not harmonized, so data cannot easily be integrated. Therefore, the metadata specialist defines semantic relationships between the elements that she knows are related, even though they do not share the exact same common data elements. For example, she semantically links Patient Last Name in the hospital system to Subject Surname in the clinical system. Once all of the appropriate relationships are made, clinicians are able to navigate between the systems seamlessly. Furthermore, the antiquated data warehouse where all of this information is painstakingly transformed and poorly linked can be retired, and quality of care queries can now be carried out using semantic relationships.

Cross Reference:

ICR IRWG Requirements
- Forum posting: https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=43&t=146
- Requirements statement: https://wiki.nci.nih.gov/x/OARyAQ
- Use Case: https://wiki.nci.nih.gov/x/qxJyAQ

Supporting interoperability standards (e.g. Healthcare Datatypes)
Domain Description: ISO 21090, otherwise known as HL7 Healthcare Datatypes, provides a basic representation of common chunks of data exchanged in the healthcare community, such as Address, Document, and Coded List. A metadata specialist has been tasked with exposing some clinical research data using a standards-based approach. She sits down at her modeling tool and, as a first step, imports the healthcare datatypes from the caBIG metadata repository. She begins replacing what were complex sets of classes and attributes in her existing model with these standard datatypes. The resulting system is not only simplified, but is also interoperable by virtue of using ISO 21090.

Technical Description: the metadata repository allows for the representation of any standard as long as it can be encoded in UML. ISO 21090 is such a standard, and can easily be exported into XMI and imported into a modeling tool. In UML, these classes can be represented as complex types and applied to attributes rather than associations.

Cross Reference:

Mapping/transformation support for ISO21090 data types
- Requirements statement: https://wiki.nci.nih.gov/x/2gpyAQ
- Use Case: https://wiki.nci.nih.gov/x/IQhyAQ

Capturing data in a standard way using data element reuse

Is this one redundant with "Creation of metadata and management of information models through modeling and web tools" and "Finding touch points with other systems when building a population science application"?

Description: Core to interoperability is capturing data in a standard way using the same or similar data elements. Data elements individually can be reused, for example allowing patient data to be joined across systems using the Patient Medical Record Number. Forms in their entirety can be reused, such as eligibility forms for multi-site clinical trials. Data formats for encoding biomedical data can be shared, such as MAGE-ML for gene expression data. This allows data to be captured in a standard way and shared across platforms and systems, lets users search the encoded data using type-ahead, Google-like functionality, and lets users build new systems based on the standards that are already in use.


Cross Reference:

ICR IRWG Requirements
- Forum posting: https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=43&t=146
- Requirements statement: https://wiki.nci.nih.gov/x/OARyAQ
- Use Case: https://wiki.nci.nih.gov/x/qxJyAQ
CDEs from Man. curation, UML models and CRFs
- Forum posting: https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=43&t=122
- Requirements statement: https://wiki.nci.nih.gov/x/JgpyAQ
- Use Case: https://wiki.nci.nih.gov/x/SGxyAQ

Finding touch points with other systems when building a population science application

Domain Description: The mission of population science is to reduce the risk, incidence, and deaths from cancer as well as enhance the quality of life for cancer survivors. Genetic, epidemiologic, behavioral, applied, and surveillance cancer research are typical activities of population science researchers. The field brings together clinical, basic, and population scientists to further individual and population health. Patients are often followed for months or years after diagnosis and/or treatment. A cancer population sciences researcher is studying chemotherapy use in young and elderly patients with advanced lung cancer. For this type of cancer, physicians and patients often have to choose between platinum-based and non-platinum-based chemotherapy. Platinum-based treatment is generally considered to be more aggressive and effective, but it is also more toxic. It is unclear whether physicians are avoiding platinum-based treatments in the elderly because of concerns about frailty and toxicity. The cancer researcher consults with a metadata specialist to design an information model that will include patient, clinical, pathology, tissue, and imaging data. The metadata specialist selects a number of information models that are currently being used by other researchers, and overlays them to determine the data elements that are important for linking and capturing such diverse data. These are exported from the metadata repositories and imported into his modeling tool to be enhanced with the new fields for the population science research.

Technical Description: each information model has well defined metadata available in distributed metadata repositories. The nature of the metadata is such that simple queries can determine overlapping data elements. This can be visualized side-by-side in a tabular format, or graphically in a UML class model. The metadata repository can output data using UML standards, such as XMI, which can easily be aggregated and imported into a modeling tool.
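The "simple queries" for overlapping data elements amount to finding identifiers that appear in more than one model. The following sketch assumes each information model has already been reduced to a set of data element identifiers; the model names and identifiers are hypothetical.

```python
from typing import Dict, Iterable, Set


def overlapping_elements(
    models: Dict[str, Iterable[str]]
) -> Dict[str, Set[str]]:
    """Map each shared data element to the set of models that use it."""
    usage: Dict[str, Set[str]] = {}
    for model_name, element_ids in models.items():
        for element_id in element_ids:
            usage.setdefault(element_id, set()).add(model_name)
    # Only elements used by at least two models are touch points.
    return {eid: names for eid, names in usage.items() if len(names) > 1}


# Hypothetical models, each reduced to its data element identifiers.
models = {
    "clinical": {"patient_id", "diagnosis_date", "regimen"},
    "tissue": {"patient_id", "diagnosis_date", "specimen_type"},
    "imaging": {"patient_id", "modality"},
}
touch_points = overlapping_elements(models)
```

The resulting dictionary is the tabular, side-by-side view in miniature: each key is a candidate linking element and each value lists the models it would connect.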

Support data transformations in order to allow different flow cytometry tools to work together

Domain Description: Flow cytometry (FCM) is a technique for counting and examining microscopic particles, which is routinely used in the diagnosis of health disorders, especially blood cancers, but has many other applications in both research and clinical practice. Automated identification systems could potentially help find rare and hidden populations. An informatics specialist is working on objectively comparing many of the FCM analytical methods available in the community for use in automated population identification using computational methods. The primary barrier to this evaluation is the wide variety of data standards used by the tooling, which includes MIFlowCyt, ACS, NetCDF, Gating-ML, FuGEFlow, and OBI. The informaticist decides to take an approach of defining semantic relationships and transformation services. The result is a system in which FCM analytical workflows are able to discover and perform translations as needed during analytical comparisons.

Technical Description: semantic relationships and rules between data elements can be formed, stored, and shared in the metadata repository. Furthermore, these relationships can be reasoned on using inference engines and workflow engines. Translation services can be defined and identified as such, which would allow them to be discovered and applied as needed.
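Discovering and applying translation services can be sketched as a search over a registry of format-to-format translators. The example below is a hedged illustration: `TranslationRegistry` is a hypothetical class, the format names are generic placeholders rather than the FCM standards themselves, and a breadth-first search is only one way a workflow engine might chain translations.

```python
from collections import deque
from typing import Callable, Dict, List, Optional, Tuple

Step = Tuple[str, str]


class TranslationRegistry:
    """Hypothetical registry of translation services between formats."""

    def __init__(self) -> None:
        self._services: Dict[Step, Callable[[dict], dict]] = {}

    def register(self, source: str, target: str,
                 service: Callable[[dict], dict]) -> None:
        self._services[(source, target)] = service

    def find_path(self, source: str, target: str) -> Optional[List[Step]]:
        """Breadth-first search for the shortest chain of translations."""
        if source == target:
            return []
        queue = deque([(source, [])])
        seen = {source}
        while queue:
            current, path = queue.popleft()
            for src, dst in self._services:
                if src != current or dst in seen:
                    continue
                new_path = path + [(src, dst)]
                if dst == target:
                    return new_path
                seen.add(dst)
                queue.append((dst, new_path))
        return None

    def translate(self, data: dict, source: str, target: str) -> dict:
        """Apply each discovered translation in turn."""
        path = self.find_path(source, target)
        if path is None:
            raise LookupError(f"no translation from {source} to {target}")
        for step in path:
            data = self._services[step](data)
        return data


# Placeholder formats and trivial translators, for illustration only.
registry = TranslationRegistry()
registry.register("format_a", "format_b", lambda d: {**d, "format": "format_b"})
registry.register("format_b", "format_c", lambda d: {**d, "format": "format_c"})
```

With these registrations, a workflow that holds data in `format_a` and needs `format_c` can call `registry.translate(data, "format_a", "format_c")` without knowing that an intermediate format is involved.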

Content Driven browser
An informatics scientist modeling a new tool is browsing the CDE browser to find the CDEs of interest to him.

The CDE browser in its current form has some usability issues. Occasional users find the terminology very technical, and it requires training to understand. That may be acceptable for curators, whose job it is to work with these tools. However, if the tools are to be usable by outside researchers, the terminology should be replaced with less technical terms that those researchers are likely to use. The visual presentation of controls and actions is problematic, and the relationship between the browse tree and the search forms (Search for CDEs, Search for Forms) is not intuitive.

Given the numerous usability issues with the CDE browser, an alternative and more efficient search workflow is needed.

Cross reference

New CDE browser workflow:

https://wiki.nci.nih.gov/pages/viewpageattachments.action?pageId=24259415

Requirement statement: https://wiki.nci.nih.gov/x/agRyAQ

Forum post: https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=43&t=109

D. Developer Stories

Iterative development and management of information models

...
