NIH | National Cancer Institute | NCI Wiki  

...

This section highlights some key use cases that depend on data semantics. These use cases provide a representative set to capture the requirements of the life sciences domain. A comprehensive set of all life sciences use cases can be found on the ICRi WG GForge wiki archive.

Discovering a Biomarker

A scientist is trying to identify a new genetic biomarker for HER2/neu negative Stage I breast cancer patients. Using a caGrid-aware client, the scientist queries for HER2/neu negative tissue specimens of Stage I breast cancer patients at LCCC that also have corresponding microarray experiments. Analysis of the microarray experiments identifies genes that are significantly over-expressed and under-expressed in a number of cases. The scientist decides that these results are significant, and related literature suggests a hypothesis that gene A may serve as a biomarker in HER2/neu negative Stage I breast cancer. To validate this hypothesis in a significant number of cases, the scientist needs a larger data set, so the scientist queries for all the HER2/neu negative specimens of Stage I breast cancer patients with corresponding microarray data, and also for appropriate control data, from other cancer centers. After retrieving the microarray experiments, the scientist analyzes the data for over-expression of gene A.
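The two queries in this use case can be sketched as a filter over specimen records. This is only an illustration: it does not use the caGrid client API, and the `Specimen` record, its field names, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Specimen:
    """Hypothetical specimen record; field names are assumptions."""
    specimen_id: str
    site: str            # cancer center holding the specimen
    diagnosis: str
    stage: str
    her2_status: str     # "negative" or "positive"
    has_microarray: bool


def select_specimens(specimens, site=None):
    """Return HER2/neu negative Stage I breast cancer specimens that
    have microarray data, optionally restricted to one site."""
    return [
        s for s in specimens
        if s.diagnosis == "breast cancer"
        and s.stage == "I"
        and s.her2_status == "negative"
        and s.has_microarray
        and (site is None or s.site == site)
    ]


# Made-up records for illustration only.
specimens = [
    Specimen("S1", "LCCC", "breast cancer", "I", "negative", True),
    Specimen("S2", "LCCC", "breast cancer", "I", "positive", True),
    Specimen("S3", "LCCC", "breast cancer", "I", "negative", False),
    Specimen("S4", "OtherCenter", "breast cancer", "I", "negative", True),
    Specimen("S5", "OtherCenter", "breast cancer", "II", "negative", True),
]

local_hits = select_specimens(specimens, site="LCCC")   # first query
all_hits = select_specimens(specimens)                  # widened query
```

The first call mirrors the initial query limited to one center; the second drops the site restriction to obtain the larger data set needed for validation.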

...

Version A. Sequencing of selected genes via Maxam-Gilbert capillary (“first generation”) sequencing. Nature. 2008 Sep 4, Epub ahead of print (posted for the workgroup).

  1. Develop a list of 2000 to 3000 genes thought to be likely targets for cancer causing mutations.
  2. As a preliminary (lower cost) test, pick the most promising 600 genes from this list.
  3. Develop a gene model for each of these genes.
  4. Hand modify that gene model, for example, to merge small exons into a single amplicon.
  5. Design primers for PCR amplification for each of these genes.
  6. Order primers for each exon of each of the genes.
  7. Test primers.
  8. In parallel with steps 1-7, identify matched pairs of tumor samples and normal tissue from the same individual for the tumors of interest.
  9. Have pathologists confirm that the tumor samples are what they claim to be and that they consist of a high percentage of tumor tissue.
  10. Make DNA from the tumor samples, confirming for each tumor that quantity and quality of the DNA are adequate.
  11. PCR amplify each of the genes.
  12. Sequence each of the exons of each of the genes for each tumor and normal pair of DNA samples.
  13. Find all the differences between the tumor sequence and normal sequence.
  14. Confirm that these differences are real using custom arrays, the Sequenom (mass spectrometry) technology, Biotage, or a combination of these. (Biotage is a pyrosequencing-based technology directed specifically at looking for SNP-like changes.)
  15. Identify changes that are seen at a higher frequency than what would occur by chance.
  16. Relate the genes in which these changes are seen to known signaling pathways.
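Steps 13 and 15 can be illustrated with a minimal sketch. It assumes the tumor and normal sequences are already aligned and of equal length, so only substitutions are reported. It also assumes a simple binomial model with a hypothetical background mutation rate to judge whether a gene is altered more often than chance. Neither assumption comes from the cited paper.

```python
from math import comb


def somatic_differences(normal, tumor):
    """List (1-based position, normal base, tumor base) for every
    position where two pre-aligned sequences disagree."""
    if len(normal) != len(tumor):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, n, t)
        for i, (n, t) in enumerate(zip(normal, tumor))
        if n != t
    ]


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p): the chance of seeing at least
    k altered tumors out of n if each is altered with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Toy sequences: positions 5 and 8 differ.
diffs = somatic_differences("ACGTACGT", "ACGTTCGA")

# Hypothetical numbers: a gene altered in 6 of 50 tumors, against an
# assumed 2% chance of a background alteration per tumor.
p_value = binomial_tail(6, 50, 0.02)
```

A small `p_value` would suggest that the gene is altered more often than the assumed background rate predicts. A real analysis would also account for gene length, sequence context, and multiple testing.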

...

Version B. As above, except globally sequence all genes. Science 321: 1807-1812 (2008) (posted for the workgroup). Delete steps 1 and 2 and replace step 3 with: 3) Develop a gene model for each of the genes in the human genome.

...

Version C. Whole genome sequencing using second generation sequencers. Hypothetical (posted for the workgroup).

  1. Identify matched pairs of tumor samples and normal tissue from the same individual for the tumors of interest.
  2. Have pathologists confirm that the tumor samples are what they claim to be and that they consist of a high percentage of tumor tissue.
  3. Make DNA from the tumor samples, confirming for each tumor that the quantity and quality of the DNA are adequate.
  4. Sequence each of the sample pairs to the required fold coverage (7.5 to 35-fold, depending on the technology and read length).
  5. Map the individual reads to the canonical human genome sequence.
  6. Find all the differences between the tumor sequence and normal sequence.
  7. Confirm that these differences are real using custom arrays, the Sequenom (mass spectrometry) technology, Biotage, or a combination of these. (Biotage is a pyrosequencing-based technology directed specifically at looking for SNP-like changes.)
  8. Identify changes that are seen at a higher frequency than what would occur by chance.
  9. Relate the genes in which these changes are seen to known signaling pathways.
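The fold coverage in step 4 follows the usual relation: coverage = (number of reads × read length) / genome size. The sketch below turns a target coverage into a read count. The default genome size of about 3.1 billion bases and the example read length are assumptions for illustration, not values from the workflow.

```python
from math import ceil

# Assumed approximate size of the haploid human genome, in bases.
HUMAN_GENOME_BASES = 3_100_000_000


def fold_coverage(n_reads, read_length, genome_size=HUMAN_GENOME_BASES):
    """Average number of times each base is covered."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    return n_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size=HUMAN_GENOME_BASES):
    """Smallest number of reads giving at least the target coverage."""
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    return ceil(target_coverage * genome_size / read_length)


# Both ends of the range quoted in step 4, with an assumed 100-base read.
low = reads_needed(7.5, read_length=100)
high = reads_needed(35, read_length=100)
```

Shorter reads raise the required read count proportionally, which is one reason the required coverage depends on the technology and read length.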

...

The scientist submits a protocol to the IRB and begins work upon approval. Libraries of surface-modified nanoparticles with appropriate pharmacokinetic and toxicity profiles are selected and screened for cell binding in vitro using cell cultures of “background” and “target” cell types/classes. The apparent concentration of binding or uptake of each nanoparticle to the different cell classes is measured. Metrics for differential binding to target versus background cells are calculated, and statistical significance is calculated by permutation. These calculations employ analysis modules available through GenePattern (posted for the workgroup).
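As a rough illustration of significance by permutation, the sketch below uses the difference in mean binding between target and background cells as the metric and runs a one-sided permutation test. The choice of metric, the function names, and the measurements are assumptions; this is not the GenePattern module itself.

```python
import random
from statistics import mean


def differential_binding(target, background):
    """Assumed metric: mean binding to target minus mean to background."""
    return mean(target) - mean(background)


def permutation_p_value(target, background, n_permutations=10_000, seed=0):
    """One-sided p-value: how often a random relabeling of the pooled
    measurements gives a metric at least as large as the observed one."""
    rng = random.Random(seed)
    observed = differential_binding(target, background)
    pooled = list(target) + list(background)
    n_target = len(target)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        stat = differential_binding(pooled[:n_target], pooled[n_target:])
        if stat >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Made-up binding measurements for one nanoparticle.
target_cells = [9.1, 8.7, 9.4, 8.9, 9.0]
background_cells = [2.1, 1.9, 2.4, 2.0, 2.2]
p = permutation_p_value(target_cells, background_cells)
```

Because a library of nanoparticles is screened, the resulting p-values would normally need a multiple-testing correction before candidates are ranked.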

To validate the increased specificity for binding target cells, the nanoparticles that provide the best discrimination are further tested ex vivo. Under IRB approval, anatomically intact human tissue specimens containing target and background cells are collected. The tissues are incubated with nanoparticles and evaluated for nanoparticle localization using microscopy. Further validation is conducted in vivo using an animal model. Animals are injected with the nanoparticle and another tissue-specific probe, and intravital microscopy is used to determine the extent of co-localization. The scientist contacts the tech transfer office to pursue next steps.

...

This scenario is based on evaluating and enriching the NanoParticle Ontology (NPO) (posted for the workgroup). The NPO is an ontology being developed at Washington University in St. Louis to serve as a reference source of controlled vocabularies and terminologies in cancer nanotechnology research. Concepts in the NPO have their instances in the data represented in a database or in the literature. In a database, these instances include field names, field entries, or both for the data model. The NPO represents the knowledge supporting unambiguous annotation and semantic interpretation of data in a database or in the literature. To expedite the development of the NPO, object models must be developed to capture the concepts and inter-concept relationships from the literature. Minimum information standards should provide guidelines for developing these object models, so that the minimum information is also captured for representation in the NPO.
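One way to picture such an object model is a small structure holding concepts, typed relationships between them, and annotations that tie database field names to concepts. The class names, concept identifiers, labels, and field names below are hypothetical and are not taken from the NPO.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    concept_id: str   # hypothetical identifier, not a real NPO code
    label: str


@dataclass(frozen=True)
class Relationship:
    subject: Concept
    predicate: str    # e.g. "is_a", "has_component"
    obj: Concept


@dataclass
class ObjectModel:
    concepts: dict = field(default_factory=dict)
    relationships: list = field(default_factory=list)
    field_annotations: dict = field(default_factory=dict)

    def add_concept(self, concept):
        self.concepts[concept.concept_id] = concept

    def relate(self, subject_id, predicate, object_id):
        """Record a relationship between two known concepts."""
        rel = Relationship(
            self.concepts[subject_id], predicate, self.concepts[object_id]
        )
        self.relationships.append(rel)
        return rel

    def annotate_field(self, field_name, concept_id):
        """Tie a database field name to the concept it instantiates."""
        self.field_annotations[field_name] = self.concepts[concept_id]


# Illustrative content only.
model = ObjectModel()
model.add_concept(Concept("EX:0001", "nanoparticle"))
model.add_concept(Concept("EX:0002", "surface-modified nanoparticle"))
model.relate("EX:0002", "is_a", "EX:0001")
model.annotate_field("particle_type", "EX:0001")
```

Looking up a field name in `field_annotations` then gives the concept used to interpret that field's entries.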

...