NIH | National Cancer Institute | NCI Wiki  

Comment: Migration of unmigrated content due to installation of a new plugin

...

Technical Description: Biospecimen repositories are deployed locally, as well as at Washington University, Thomas Jefferson University, and Fox Chase Cancer Center. Each has its information model registered in a metadata repository and exposes standardized APIs. The local instance of caTissue discovers services with compatible metadata and APIs, and performs the query. The data returned is aggregated based on standardized metadata and presented to the user. caTissue uses CDE names, descriptions, and standard value sets to display data, help the user build the query, and issue the query.
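The discover, query, and aggregate sequence described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the services are in-memory stand-ins rather than real caTissue endpoints, and the site names, CDE names, and the federated_query function are all hypothetical.

```python
# Hypothetical in-memory stand-ins for remote biospecimen services. Each one
# advertises the CDE names it supports and holds records keyed by those names.
SERVICES = [
    {"site": "local", "cdes": {"diagnosis", "specimen_type"},
     "records": [{"diagnosis": "pre-cancerous", "specimen_type": "tissue"}]},
    {"site": "site_b", "cdes": {"diagnosis", "specimen_type"},
     "records": [{"diagnosis": "malignant", "specimen_type": "tissue"},
                 {"diagnosis": "pre-cancerous", "specimen_type": "serum"}]},
    {"site": "site_c", "cdes": {"specimen_type"},
     "records": [{"specimen_type": "tissue"}]},
]


def federated_query(criteria, services=SERVICES):
    """Query every service whose metadata covers the criteria; merge the hits."""
    needed = set(criteria)
    results = []
    for service in services:
        # Discovery step: skip services whose metadata lacks a required CDE.
        if not needed <= service["cdes"]:
            continue
        for record in service["records"]:
            if all(record.get(cde) == value for cde, value in criteria.items()):
                # Aggregation step: tag each hit with the site it came from.
                results.append({"site": service["site"], **record})
    return results


hits = federated_query({"diagnosis": "pre-cancerous"})
```

Because every compatible service describes its records with the same CDE names, the hits can be merged into one list without per-site translation. A service that does not register the needed CDE (site_c here) is never queried.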

Cross Reference:

Identify samples obtained for glioblastoma multiforme (GBM) and the corresponding CT image information.

...

Technical Description: A number of organizations have exposed pathology and image services with standardized metadata. caB2B uses CDE names, descriptions, and value sets to allow the user to construct a query across all of these services. The user selects the CDEs to filter on, which includes a join across information models (caTissue annotations to imaging annotations). A semantic relationship between the two models, based on biospecimen identifier, has previously been established. A distributed query is formulated and executed. The resulting data is aggregated based on semantic relationships and presented to the user using CDE names and descriptions.
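The join across information models can be pictured as an inner join on the shared biospecimen identifier. The sketch below is a simplified, assumption-laden illustration: the field names (biospecimen_id, diagnosis, modality), the sample rows, and the join_on_identifier function are hypothetical and do not reflect the caB2B API.

```python
def join_on_identifier(pathology_rows, imaging_rows, key="biospecimen_id"):
    """Inner-join two result sets on a shared identifier field."""
    # Index the imaging rows by identifier so each sample lookup is O(1).
    images_by_id = {}
    for row in imaging_rows:
        images_by_id.setdefault(row[key], []).append(row)

    joined = []
    for sample in pathology_rows:
        for image in images_by_id.get(sample[key], []):
            # Merge the two records; the shared key appears once.
            joined.append({**sample, **image})
    return joined


# Hypothetical rows returned by a pathology service and an imaging service.
pathology = [
    {"biospecimen_id": "S1", "diagnosis": "GBM"},
    {"biospecimen_id": "S2", "diagnosis": "GBM"},
]
imaging = [
    {"biospecimen_id": "S1", "modality": "CT"},
    {"biospecimen_id": "S3", "modality": "CT"},
]

matched = join_on_identifier(pathology, imaging)
```

Only sample S1 has both a pathology record and an image record, so it is the only row in the joined result.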

Cross Reference:

Determine if each sample used in an expression profiling experiment is available for a SNP analysis experiment.

...

Technical Description: The discovery of analytical steps utilizes inference over semantic annotations of input and output parameters. The researcher selects the metadata types that will be input to the pipeline and those that will be output from the pipeline. The inference engine performs discovery steps, chaining inputs to outputs in an expanding set until all options are exhausted or the resulting type matches. Furthermore, when specific analytical steps are queried for, full-text and concept-based metadata searches are performed in conjunction with output/input matching to provide the best possible results. Workflows are saved as a set of steps or as a set of constraints upon which workflows are dynamically generated to meet scientific goals.
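One way to picture the chaining of inputs to outputs is a breadth-first search over steps annotated with the type they consume and the type they produce. The sketch below makes that assumption explicit; it is not the actual inference engine, and the step names, type names, and discover_pipeline function are hypothetical.

```python
from collections import deque

# Hypothetical catalog: each analytical step is annotated with the semantic
# type it consumes and the semantic type it produces.
STEPS = {
    "normalize": ("RawIntensity", "NormalizedIntensity"),
    "call_genotypes": ("NormalizedIntensity", "GenotypeCalls"),
    "qc_filter": ("GenotypeCalls", "FilteredGenotypes"),
    "summarize": ("NormalizedIntensity", "ExpressionSummary"),
}


def discover_pipeline(input_type, output_type, steps=STEPS):
    """Return a shortest chain of step names from input_type to output_type.

    Returns None when every option is exhausted without reaching output_type.
    """
    queue = deque([(input_type, [])])
    seen = {input_type}
    while queue:
        current, path = queue.popleft()
        if current == output_type:
            return path
        # Expand the frontier with every step that accepts the current type.
        for name, (consumes, produces) in steps.items():
            if consumes == current and produces not in seen:
                seen.add(produces)
                queue.append((produces, path + [name]))
    return None


pipeline = discover_pipeline("RawIntensity", "FilteredGenotypes")
```

The search stops on the first match, which mirrors the "until all options are exhausted or the resulting type matches" behavior described above. Exact type equality is used here for brevity; a real engine would presumably also match on subsumption between concepts.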

Cross Reference:

Support patient to trial matching through the use of computable eligibility criteria

...

Technical Description: The cancer center is running caTissue with a local metadata repository. When a new annotation is added to caTissue, the dynamic extensions module is invoked. The caTissue information model is extended to include the necessary additional classes and attributes, which in turn are propagated as new data elements in the metadata repository. These data elements represent well-formed metadata that is automatically discoverable and shareable through the public interfaces. When another organization wishes to extend its caTissue model to include this type of data, it will be able to discover the metadata already created and instantiate a reference to it rather than creating it afresh.
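The reuse behavior described above, where a second organization references an existing data element rather than creating a duplicate, can be sketched with a toy registry. Everything here is an assumption made for illustration: the MetadataRegistry class, its methods, and the example class and attribute names are hypothetical and are not part of caTissue or its dynamic extensions module.

```python
class MetadataRegistry:
    """Toy metadata repository keyed by (class name, attribute name)."""

    def __init__(self):
        self._elements = {}

    def find(self, class_name, attribute):
        """Return the registered data element, or None if it does not exist."""
        return self._elements.get((class_name, attribute))

    def register(self, class_name, attribute, definition):
        """Return (element, created); reuse the existing element if present."""
        key = (class_name, attribute)
        existing = self._elements.get(key)
        if existing is not None:
            return existing, False
        element = {
            "class": class_name,
            "attribute": attribute,
            "definition": definition,
        }
        self._elements[key] = element
        return element, True


registry = MetadataRegistry()

# First organization extends its model; a new data element is created.
first, created_first = registry.register(
    "SpecimenAnnotation", "tumor_grade", "Histologic grade of the tumor"
)

# Second organization asks for the same element and receives a reference to it.
second, created_second = registry.register(
    "SpecimenAnnotation", "tumor_grade", "Grade"
)
```

The second call returns the element created by the first call, so both organizations end up pointing at one shared definition.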

Cross Reference:

When defining new datasets for caIntegrator's data-warehouse for biomedical data collection and analysis, automatically record these new datatypes in a well-defined and federated manner so that data can be shared.

...

Discover and orchestrate services to achieve LS research goals; e.g. start with a hypothesis, identify relevant services that provide the necessary analysis and data, create the workflow/pipeline, report findings.

[Baris] This use case overlaps with the "Search for all "pre-cancerous" biospecimens.." and "Automatically discover analytical steps for Illumina.." examples above.

Domain Description [Revised From ICRi Use Cases]: A scientist is trying to identify a new genetic biomarker for HER2/neu negative stage I breast cancer patients. The scientist queries for HER2/neu negative tissue specimens of Stage I breast cancer patients using services at his/her cancer center that also have corresponding microarray experiments. Analysis of the microarray experiments identifies genes that are significantly over-expressed and under-expressed in a number of cases. The scientist decides that these results are significant, and related literature suggests a hypothesis that gene A may serve as a biomarker in HER2/neu negative Stage I breast cancer. To validate this hypothesis in a significant number of cases the scientist needs a larger data set, so s/he queries for all the HER2/neu negative specimens of Stage I breast cancer patients with corresponding microarray data and also for appropriate control data from other cancer centers. After retrieving the microarray experiments the scientist analyzes the data for over-expression of gene A.

Technical Description: The scientist in this case is trying to develop a workflow that will assist biomarker discovery research. S/he first needs to discover the services that provide biospecimen information with the phenotype s/he is looking for (for example, HER2/neu negative stage I breast cancer) and then the microarray experiment information. Then s/he needs to create a workflow (orchestrate services) where the input is a phenotype for biospecimens and the output is a set of genes of interest. These steps require support for standard terminologies (and services) and syntaxes to best describe the services' behavior and static data. Furthermore, they require inference engines that relate the semantic and syntactic metadata for the inputs/outputs of the services to "assist" the scientist in identifying which services can be part of the workflow.

Cross References:

Statistical computing environment and sharable metadata for statistical practice.

...

Patrick: This is a very interesting one that is different from the rest. However, I am not sure the last sentence in the domain description adequately captures it. To me, the issue is how a statistician finds the appropriate standards and then integrates them into their own statistical computing environment. All of the other use cases we have focus on the metadata specialist or cancer researcher - this one focuses on the statistician who is performing the analysis and generating the data. We need to drive home his issues and how they are to be resolved. Also, I am not sure I understand the table of contents of objects.

caGrid should support interoperability from non-grid platforms.

...

Patrick: This one is very specific to a technology (caGrid). It needs to be generalized. I am not sure how the title matches the descriptions (since non-grid platforms are not mentioned). Also, I am not sure how the domain description matches the technical description. The technical description seems to focus on semantic relationships and transforms, whereas the domain description seems to focus on accessibility to non-technical users.

Semantic search on the cancer grid.

...

Patrick: I think this one is very similar to some of the search use cases earlier.  Also, the technical description seems to focus on terminologies, whereas they are not really mentioned in the domain description.

Integration of radiology, pathology, molecular and genomic data to better predict patient outcome and support clinical decision.

...

Patrick: I am not sure which requirements in this are not captured in the other use cases.

B. Forms Stories

Create and reuse forms

...

Technical Description: Semantic relationships and rules between data elements can be formed, stored, and shared in the metadata repository. Furthermore, these relationships can be reasoned on using inference engines and query systems.

Cross Reference:

...

D. Developer Stories

Iterative development and management of information models

...