NIH | National Cancer Institute | NCI Wiki  


...


Semantic Infrastructure 2.0 needs to address metadata- and terminology-related requirements from the life sciences domain. This will enable interoperability both between different sub-domains within life sciences and between life sciences and other domains in caBIG®, such as clinical trials and electronic health records.

...

This section highlights some key use cases that depend on data semantics. These use cases are a representative set intended to capture the requirements of the life sciences domain. A comprehensive set of all life sciences use cases can be found on the ICRi WG GForge wiki archive. This section includes the following:

Table of Contents

Note

However, part of the Infrastructure Inception activities includes Prototyping Orchestrations and/or Choreographies (including Life Science workflows), as well as outreach to communities to address other major use cases and requirements. The life sciences communities are now engaged with the Roadmap inception efforts on the following:

Refer to the pages listed for the use cases and requirements gathering activities. These will be moved to the relevant sections of the Roadmaps once they mature into use cases, requirements, and the resulting architecture design.

...

Version A. Sequencing of selected genes via Maxam-Gilbert capillary (“First Generation”) sequencing. Nature. 2008 Sep 4 - Epub ahead of print (posted on GForge for the ICRi workgroup).

  1. Develop a list of 2000 to 3000 genes thought to be likely targets for cancer causing mutations.
  2. As a preliminary (lower cost) test, pick the most promising 600 genes from this list.
  3. Develop a gene model for each of these genes.
  4. Hand modify that gene model, for example, to merge small exons into a single amplicon.
  5. Design primers for PCR amplification for each of these genes.
  6. Order primers for each exon of each of the genes.
  7. Test primers.
  8. In parallel with steps 1-7, identify matched pairs of tumor samples and normal tissue from the same individual for the tumors of interest.
  9. Have pathologists confirm that the tumor samples are what they claim to be and that they consist of a high percentage of tumor tissue.
  10. Make DNA from the tumor samples, confirming for each tumor that the quantity and quality of the DNA are adequate.
  11. PCR amplify each of the genes.
  12. Sequence each of the exons of each of the genes for each tumor and normal pair of DNA samples.
  13. Find all the differences between the tumor sequence and normal sequence.
  14. Confirm that these differences are real using custom arrays, Sequenom (mass spectrometry) technology, Biotage, or a combination of these. (Biotage is a pyrosequencing-based technology directed specifically at looking for SNP-like changes.)
  15. Identify changes that are seen at a higher frequency than what would occur by chance.
  16. Relate the genes in which these changes are seen to known signaling pathways.
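
Steps 13 and 15 above can be illustrated with a minimal Python sketch. It is not the method used in the cited study. It assumes the tumor and normal sequences are already aligned to the same length, and it scores recurrence with a simple binomial tail under a single background mutation probability per tumor. The function names and the background probability are hypothetical; real analyses also account for gene length, sequence context, and multiple testing.

```python
from math import comb


def somatic_differences(tumor: str, normal: str) -> list[tuple[int, str, str]]:
    """Return (position, normal_base, tumor_base) for every mismatch.

    Assumes both sequences are already aligned and of equal length.
    """
    if len(tumor) != len(normal):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, n_base, t_base)
        for pos, (n_base, t_base) in enumerate(zip(normal, tumor))
        if n_base != t_base
    ]


def binomial_tail(k: int, n: int, p: float) -> float:
    """P(X >= k) for X ~ Binomial(n, p).

    Here n is the number of tumors screened, k the number in which the
    gene is mutated, and p an assumed background probability that the
    gene is mutated in any one tumor by chance.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Example with made-up sequences: one substitution at position 4.
diffs = somatic_differences(tumor="ACGTTA", normal="ACGTAA")
# diffs == [(4, "A", "T")]
```

A small tail probability from `binomial_tail` suggests that a gene is mutated more often than the assumed background rate would predict, which is the kind of signal step 15 looks for.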

...

Version B. As above, except globally sequence all genes. Science 321: 1807-1812 (2008) (posted on GForge for the ICRi workgroup). Delete steps 1 and 2, and replace step 3 with: 3) Develop a gene model for each of the genes in the human genome.

...

Version C. Whole genome sequencing using second generation sequencers. Hypothetical (posted on GForge for the ICRi workgroup).

  1. Identify matched pairs of tumor samples and normal tissue from the same individual for the tumors of interest.
  2. Have pathologists confirm that the tumor samples are what they claim to be and that they consist of a high percentage of tumor tissue.
  3. Make DNA from the tumor samples, confirming for each tumor that the quantity and quality of the DNA are adequate.
  4. Sequence each of the sample pairs to the required fold coverage (7.5 to 35-fold, depending on the technology and read length).
  5. Map the individual reads to the canonical human genome sequence.
  6. Find all the differences between the tumor sequence and normal sequence.
  7. Confirm that these differences are real using custom arrays, Sequenom (mass spectrometry) technology, Biotage, or a combination of these. (Biotage is a pyrosequencing-based technology directed specifically at looking for SNP-like changes.)
  8. Identify changes that are seen at a higher frequency than what would occur by chance.
  9. Relate the genes in which these changes are seen to known signaling pathways.
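
The fold coverage in step 4 relates read count, read length, and genome size through the usual approximation: coverage ≈ (number of reads × read length) / genome size. The sketch below applies that relation. The function names are hypothetical, and the genome size and read length in the example are round illustrative numbers rather than values taken from this use case.

```python
import math


def mean_coverage(n_reads: int, read_length_bp: int, genome_size_bp: int) -> float:
    """Average fold coverage, assuming reads are spread evenly over the genome."""
    return n_reads * read_length_bp / genome_size_bp


def reads_required(fold_coverage: float, genome_size_bp: int, read_length_bp: int) -> int:
    """Smallest number of reads that reaches the requested average coverage."""
    return math.ceil(fold_coverage * genome_size_bp / read_length_bp)


# Illustrative numbers only: a 3-billion-base genome and 100-base reads.
n = reads_required(fold_coverage=30, genome_size_bp=3_000_000_000, read_length_bp=100)
# n == 900_000_000
```

Shorter reads require proportionally more reads for the same coverage, which is one reason the required fold coverage and run cost depend on the sequencing technology.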

...

The scientist submits a protocol to the institutional review board (IRB) and begins work upon approval. Libraries of surface-modified nanoparticles with appropriate pharmacokinetic and toxicity profiles are selected and screened for cell binding in vitro using cell cultures of “background” and “target” cell types or classes. The apparent concentration of each nanoparticle bound to or taken up by the different cell classes is measured. Metrics for differential binding to target versus background cells are calculated, and statistical significance is assessed by permutation. (These calculations employ analysis modules available through GenePattern, posted on GForge for the ICRi workgroup.)
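
The differential-binding metric and permutation test described above can be sketched as follows. This is a plain-Python stand-in, not the GenePattern modules the use case refers to. It assumes the metric is the difference in mean binding between target and background cells and uses a one-sided test; the function names and parameters are hypothetical.

```python
import random
from statistics import mean


def differential_binding(target: list[float], background: list[float]) -> float:
    """Difference in mean measured binding: target minus background."""
    return mean(target) - mean(background)


def permutation_p_value(
    target: list[float],
    background: list[float],
    n_permutations: int = 10_000,
    seed: int = 0,
) -> float:
    """One-sided permutation p-value for the observed differential binding.

    Cell-class labels are shuffled repeatedly; the p-value is the share of
    shuffles whose statistic is at least as large as the observed one.
    The +1 terms keep the estimate away from exactly zero.
    """
    rng = random.Random(seed)
    observed = differential_binding(target, background)
    pooled = list(target) + list(background)
    n_target = len(target)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = differential_binding(pooled[:n_target], pooled[n_target:])
        if stat >= observed:
            hits += 1
    return (hits + 1) / (n_permutations + 1)
```

In a library screen, this calculation would be repeated for each nanoparticle, so the resulting p-values would normally be corrected for multiple comparisons before candidates are ranked.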

To validate the increased specificity for binding target cells, the nanoparticles that provide the best discrimination are further tested ex vivo. Under IRB approval, anatomically intact human tissue specimens containing target and background cells are collected. The tissues are incubated with nanoparticles and evaluated for nanoparticle localization using microscopy. Further validation is conducted in vivo using an animal model. Animals are injected with the nanoparticle and another tissue-specific probe, and intravital microscopy is used to determine the extent of co-localization. The scientist contacts the tech transfer office to pursue next steps.

...

This is a scenario based on evaluating and enriching the NanoParticle Ontology (NPO) (posted on GForge for the ICRi workgroup), an ontology being developed at Washington University in St. Louis to serve as a reference source of controlled vocabularies and terminologies in cancer nanotechnology research. Concepts in the NPO have their instances in the data represented in a database or in the literature. In a database, these instances include field names, field entries, or both for the data model. The NPO represents the knowledge supporting unambiguous annotation and semantic interpretation of data in a database or in the literature. To expedite the development of the NPO, object models must be developed to capture the concepts and inter-concept relationships from the literature. Minimum information standards should provide guidelines for developing these object models, so that the minimum information is also captured for representation in the NPO.

Nanotechnology is being applied to clinical therapeutics, but this use case could be extended to the development of any specialized therapeutics. Various pre-existing databases hold experimental data that need to be accessible across the entire community to facilitate rational nanomaterial design. Two strategies are being employed. The first is to establish semantic interoperability by finding areas of semantic overlap in the current database models based on controlled vocabularies (NCI Thesaurus, NCI Metathesaurus, NanoParticle Ontology). The second is to develop a data submission standard based on the extension of standardized models (Biomedical Research Integrated Domain Group (BRIDG), Life Sciences Domain Analysis Model (LS-DAM)), where extensions are supported by controlled vocabularies. New vocabulary is needed to support both of these strategies. New concepts are curated in the controlled vocabularies as appropriate, and term definitions are reviewed by the community.
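
The first strategy, finding semantic overlap between database models, can be illustrated with a small sketch. It assumes each database model has been annotated as a mapping from field name to a controlled-vocabulary concept identifier. The field names, concept identifiers, and function name below are hypothetical placeholders, not real NCI Thesaurus or NPO codes.

```python
def semantic_overlap(
    model_a: dict[str, str], model_b: dict[str, str]
) -> dict[str, tuple[list[str], list[str]]]:
    """Map each concept used by both models to the fields annotated with it.

    Each model is a dict of field name -> concept identifier.
    """

    def fields_by_concept(model: dict[str, str]) -> dict[str, list[str]]:
        index: dict[str, list[str]] = {}
        for field, concept in model.items():
            index.setdefault(concept, []).append(field)
        return index

    a = fields_by_concept(model_a)
    b = fields_by_concept(model_b)
    shared = sorted(a.keys() & b.keys())
    return {concept: (sorted(a[concept]), sorted(b[concept])) for concept in shared}


# Hypothetical annotations for two nanomaterial databases.
db_one = {"particle_diameter_nm": "CONCEPT:size", "coating": "CONCEPT:surface_modification"}
db_two = {"size": "CONCEPT:size", "zeta_mv": "CONCEPT:zeta_potential"}

overlap = semantic_overlap(db_one, db_two)
# overlap == {"CONCEPT:size": (["particle_diameter_nm"], ["size"])}
```

Concepts that appear in only one model, or fields with no suitable concept at all, are the places where new vocabulary would have to be proposed and curated.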

...
