NIH | National Cancer Institute | NCI Wiki  



Introduction to CTIIP

Imaging-based cancer research is in the beginning phase of an integrative-biology revolution. It is now feasible to extract large sets of quantitative image features relevant to cancer prognosis or treatment across three complementary research domains: clinical imaging, pre-clinical imaging, and digital pathology. These high-dimensional image feature sets can be used to infer clinical phenotypes or correlate with gene–protein signatures. This type of analysis, however, requires large volumes of data.

To serve the need for research across domains, the National Cancer Institute Clinical and Translational Imaging Informatics Project (NCI CTIIP) team is creating a set of open-source software tools that support a comprehensive and reusable exploration and fusion of clinical imaging, pre-clinical imaging, and digital pathology data. The Cancer Genome Atlas (TCGA) and The Cancer Imaging Archive (TCIA) projects, with molecular metadata and image-derived information, respectively, have created a rich multi-domain data set. This data set, however, is in an infrastructure that provides limited query capability for identifying cases based on all of the available data types. Moreover, this infrastructure is incapable of integrating data from other research domains due to a lack of common data standards.

To address these limitations, the CTIIP team is developing a unified query interface to make it easier to analyze data from different research domains. This interface, plus related open-source software and data standards, would then be applied to small animal model data, and provide a common platform and data engine for the hosting of “pilot challenges.” These pilot challenges will proactively facilitate biological and clinical research across the clinical, pre-clinical, and digital pathology imaging research domains. The algorithms used in the pilot challenges will be shared with the community via an open-source software clearinghouse.

The development approach in this project emphasizes modular semantic interoperability and open-source tooling. This makes the work immediately valuable to scientists in NCI-funded research networks across the three research domains, as well as to the national and international research communities. It also provides a framework for wider adoption of these methods by biologists in the larger genomics and proteomics communities.

Most importantly, the common informatics infrastructure will provide researchers with analysis tools they can use to directly mine data from multiple high-volume information repositories, creating a foundation for research and decision support systems to better diagnose and treat patients with cancer.

The following table presents the data that the CTIIP team is integrating through various means. This integration relies on the expansion of software features and on the application of data standards, as described in subsequent sections of this document.

 

Domain | Data Set
Clinical Imaging | The Cancer Genome Atlas (TCGA) clinical and molecular data
Clinical Imaging | The Cancer Imaging Archive (TCIA) in-vivo imaging data
Pre-clinical | Small animal models
Digital Pathology | caMicroscope

The sub-projects, along with the solutions they provide, are discussed in this guide and listed below.

Sub-Project Name | Solution It Provides
Digital Pathology and Integrated Query System | Address the interoperability of digital pathology data, improve integration and analytic capabilities between TCIA and TCGA, and raise the level of interoperability to create the foundation required for pilot demonstration projects in each of the targeted research domains: clinical imaging, pre-clinical imaging, and digital pathology imaging.
DICOM Standards for Small Animal Imaging; Use of Informatics for Co-clinical Trials | Address the need for standards in pre-clinical imaging and test the informatics created in the Digital Pathology and Integrated Query System sub-project for decision support in co-clinical trials.
Pilot Challenges | Challenges will be designed to develop knowledge extraction tools and compare decision support systems for the three research domains, which will now be represented as a set of integrated data from TCIA and TCGA. The intent is not to implement a rigorous “Grand Challenge,” but rather to develop pilot challenge projects. These would use limited data sets for proof of concept and test the informatics infrastructure needed for such “Grand Challenges,” which would later be scaled up and supported by extramural initiatives.

The Importance of Data Standards

The common infrastructure that will result from CTIIP and its sub-projects depends on data interoperability, which is greatly aided by adherence to data standards. While standards such as Annotation and Image Markup (AIM) and Digital Imaging and Communications in Medicine (DICOM) exist to support images, vendors of data viewers and other tools required for the analysis of imaging data have not widely adopted them. The lack of standards in pre-clinical imaging and digital pathology prevents researchers from sharing and leveraging data across studies and institutions.

Furthermore, because each pathology-imaging vendor produces its own image management system, these systems are proprietary and not standardized. The result is that images produced on different systems cannot be analyzed via the same mechanisms. In addition, no standard currently exists for markup and annotations on pathology images, which is the gap that microAIM (µAIM) is intended to fill.

 

• DICOM for small animal research
  – Long-term: generate DICOM-compliant images vs. non-DICOM-compliant images
• µAIM
  – Developing the model
  – Harmonization with AIM
  → Standardized annotations and markup for radiology and pathology images
  → Imaging and BRIDG (beyond the scope of this project at this point)
• Improvements to the EVS vocabularies

Within these three research domains, only one, clinical imaging, has made some progress in terms of establishing a framework and standards for informatics solutions. For pre-clinical imaging and digital pathology, there are no such standards that allow for the seamless viewing, integration, and analysis of disparate data sets to produce integrated views of the data, quantitative analysis, data integration, and research or clinical decision support systems.

CBIIT has worked extensively for several years in the area of data standards for both clinical research and healthcare, working with the community and Standards Development Organizations (SDOs), such as the Clinical Data Interchange Standards Consortium (CDISC), Health Level 7 (HL7) and the International Organization for Standardization (ISO). From that work, EVS and caDSR are harmonized with the BRIDG, SDTM, and HL7 RIM models. Standardized Case Report Forms (CRFs), including those for imaging, have also been created. The CBIIT project work provides the bioinformatics foundation for semantic interoperability in digital pathology and co-clinical trials integrated with clinical and patient demographic data and data contained in TCIA / TCGA.

Digital Pathology and Integrated Query System

Digital pathology, unlike its more mature radiographic counterpart, has yet to standardize on a single storage and transport medium. As a result, outside a given laboratory or small collaborative group, the integration of pathology data with radiographic, genomic, and proteomic data is all but impossible.

This sub-project addresses the lack of uniformly accepted standards within digital pathology and the simultaneous need for integration of pathology data with radiographic, genomic, and proteomic data. Its mission is to create an open-source digital pathology image server that can host and serve digital pathology images for any of the major vendors without recoding, facilitating data integration. This image server would establish an informatics and IT infrastructure to implement pilot challenges for clinical and pre-clinical studies that integrate the genomics, diagnostic imaging, and digital pathology domains.

 

Goals: data exploration, data connection, data mashup, make data available for analysis, make data accessible for image algorithms

Project 1: Integrated Query System for Existing TCGA Data

1)      Aim 1 - Integrated query system for existing TCGA data (including improved pathology systems)

a)      Histopathology

i)       Incorporate OpenSlide into caMicroscope, enabling caMicroscope to directly serve whole-slide pathology images from the majority of digital pathology vendors.

ii)      Incorporate support for basic image analysis algorithms into caMicroscope.

iii)     Provide standards-based image annotation using the Annotation and Image Markup (AIM) standard.

b)      Integrative Queries

i)       Provide programmatic access to TCGA-related image data.

ii)      Extend software to support data mashups between image-derived information from TCIA and clinical and molecular metadata from TCGA.
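
As a rough illustration of what a data mashup between image-derived information and clinical metadata could look like, the following sketch joins two small in-memory record sets on a shared patient identifier and then filters the result. The record layouts, field names (patient_id, er_status, tumor_volume_cm3), and values are hypothetical and are not the actual TCIA or TCGA schemas.

```python
from collections import defaultdict

# Hypothetical records; field names and values are illustrative only,
# not the actual TCGA or TCIA schemas.
tcga_clinical = [
    {"patient_id": "P-001", "er_status": "Negative", "stage": "II"},
    {"patient_id": "P-002", "er_status": "Positive", "stage": "I"},
    {"patient_id": "P-003", "er_status": "Negative", "stage": "III"},
]
tcia_features = [
    {"patient_id": "P-001", "modality": "MR", "tumor_volume_cm3": 12.4},
    {"patient_id": "P-001", "modality": "CT", "tumor_volume_cm3": 11.9},
    {"patient_id": "P-003", "modality": "MR", "tumor_volume_cm3": 30.1},
]


def mashup(clinical, features, key="patient_id"):
    """Attach every image-feature record to the clinical record sharing its key."""
    by_patient = defaultdict(list)
    for row in features:
        by_patient[row[key]].append(row)
    return [
        {**record, "image_features": by_patient.get(record[key], [])}
        for record in clinical
    ]


def query(merged, **criteria):
    """Keep merged records whose clinical fields equal every given criterion."""
    return [r for r in merged if all(r.get(k) == v for k, v in criteria.items())]


merged = mashup(tcga_clinical, tcia_features)
er_negative = query(merged, er_status="Negative")
for record in er_negative:
    print(record["patient_id"], len(record["image_features"]))
```

Keeping the clinical record as the outer element means that patients without any imaging features still appear in the merged result, which makes gaps in coverage visible rather than silently dropping cases.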

 


What the data is used for

Relate data from TCIA, caMicroscope, and animal models.

genomics, animal

How do we make a decision on a firm diagnosis?

Run queries and relate the results to the human data, and vice versa.

The system should integrate clinical data (from TCGA) and pre-clinical data (from UC Davis).

Use case: Breast cancer has biomarkers (estrogen receptor status, progesterone receptor status, etc.). One question to ask is, "If the estrogen receptor (ER) status is negative in humans, what does the pathology look like?" Then compare this to mice: is the model we have a good model for the human condition?

If you treat a mouse model that has an ER-negative status with a certain drug, what is the outcome? Then look for the same outcome in humans.

We are setting up the data structure so that, when that is done, we will be able to see what use cases are possible.

To make data comparable, we must collect it in a structured fashion, using common data elements such as those defined for TCGA.

We are pulling data out of caDSR (ER-negative and ER-positive status, other common data elements), and we are asking Bob Cardiff's team to ask the same questions so that we can compare human and mouse data.
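
The idea of asking the same structured question of human and mouse data can be sketched as a small harmonization step: each source's local vocabulary is mapped onto one shared set of permissible values, after which the records can be grouped together. The value maps, field names, and subject identifiers below are hypothetical and do not reflect actual caDSR content.

```python
# Hypothetical value maps: each source's local vocabulary is mapped onto one
# shared set of permissible values, in the spirit of a common data element.
# None of these codes are taken from caDSR.
ER_STATUS_MAP = {
    "human": {"Negative": "ER_NEGATIVE", "Positive": "ER_POSITIVE"},
    "mouse": {"ER-": "ER_NEGATIVE", "ER+": "ER_POSITIVE"},
}


def harmonize(records, species, field="er_status"):
    """Rewrite a field to the shared vocabulary; unmapped values become UNKNOWN."""
    value_map = ER_STATUS_MAP[species]
    return [
        {**rec, "species": species, field: value_map.get(rec.get(field), "UNKNOWN")}
        for rec in records
    ]


def group_by_status(records, field="er_status"):
    """Group subject identifiers by their harmonized status value."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[field], []).append(rec["subject_id"])
    return groups


human = [
    {"subject_id": "H1", "er_status": "Negative"},
    {"subject_id": "H2", "er_status": "Positive"},
]
mouse = [
    {"subject_id": "M1", "er_status": "ER-"},
    {"subject_id": "M2", "er_status": "n/a"},
]

combined = harmonize(human, "human") + harmonize(mouse, "mouse")
groups = group_by_status(combined)
print(groups)
```

Mapping unrecognized values to an explicit UNKNOWN bucket, rather than discarding them, keeps records that were not collected in a structured fashion visible for later cleanup.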

We are exploring the standardization of informatics: using all the tools we have to create standard informatics for comparing patient data to animal data. We are using the available standards (DICOM, AIM, and microAIM), which are fundamental to integrative queries.

If you did an integrative query, how would you do it? What data calls would support the different integrative queries? How would you use sufficiently standardized data to come out with information that allows you to make a decision? Pilot challenges will compare the decision support systems for the three domains.

We need a clear explanation of how to do this.

Data mashups that allow us to

Explain our complicated project in a simple manner so that readers understand what we are doing and why we are doing it.

Pathology problems:

  1. Proprietary data formats cannot be displayed and manipulated in the same tools. The solution is to integrate caMicroscope with OpenSlide, which allows us to read proprietary formats without converting images and makes a large number of image formats accessible.
  2. No standard exists for markups and annotations, so we are creating microAIM.

Challenges

Solutions

The first step towards the goal of image data integration is the creation of an image server that can host and serve digital pathology images for any of the major vendors without recoding, which often introduces additional compression artifacts. This image server will be caMicroscope, with its functionality expanded by the OpenSlide library.

  • caMicroscope is a digital pathology viewer that provides researchers with an HTML5-based web client for viewing a digitized pathology image at full resolution. While it is standards-based, implementing both the Annotation and Image Markup (AIM) and Digital Imaging and Communications in Medicine (DICOM) standards, it supports only a limited number of the formats adopted by whole-slide vendors.
  • OpenSlide is a C library that can read whole-slide images in many common formats adopted by whole-slide vendors.
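
As a minimal sketch of how an image server might serve a whole-slide image in tiles without converting it, the example below computes a tile grid in plain Python and reads one tile through the openslide-python bindings. This is an illustration only, not the caMicroscope implementation: the helper names tile_grid and read_tile, the tile size, and the slide path are assumptions. The OpenSlide import is deferred so that the grid logic can be run without the library or a slide file.

```python
import math


def tile_grid(width, height, tile_size=256):
    """Return (x, y, w, h) boxes that cover a width x height image without overlap."""
    boxes = []
    for row in range(math.ceil(height / tile_size)):
        for col in range(math.ceil(width / tile_size)):
            x, y = col * tile_size, row * tile_size
            boxes.append((x, y, min(tile_size, width - x), min(tile_size, height - y)))
    return boxes


def read_tile(slide_path, box, level=0):
    """Read one tile from a vendor slide file (hypothetical helper).

    Requires the third-party openslide-python package and a real slide file.
    The box is expressed in the pixel coordinates of the requested level.
    """
    import openslide  # imported lazily so tile_grid works without it

    x, y, w, h = box
    slide = openslide.OpenSlide(slide_path)
    try:
        # read_region expects the top-left corner in level-0 coordinates
        # and returns an RGBA image of the requested size.
        scale = slide.level_downsamples[level]
        region = slide.read_region((int(x * scale), int(y * scale)), level, (w, h))
        return region.convert("RGB")
    finally:
        slide.close()


boxes = tile_grid(600, 300)
print(len(boxes), boxes[0], boxes[-1])
```

Edge tiles are clipped to the image bounds so that a viewer never requests pixels outside the slide.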

This project is also proposing a standard for markup and annotations called microAIM.

This infrastructure can be expanded to include more data types and additional integration, which will provide analytic and decision support to researchers, who can then pursue a broader set of novel community research projects.

Small Animal/Co-clinical Improved DICOM Compliance and Data Integration

The impact on integrative research projects such as co-clinical trials would be to give researchers the ability to directly compare data from pre-clinical animal models with real-time clinical data.

This sub-project develops DICOM standards for small animal imaging and identifies co-clinical data sets to test the integration of TCIA and TCGA for this data.

TCGA Infrastructure Applied to Co-Clinical

1)      Aim 2 - TCGA infrastructure ported to and applied to the co-clinical setting

a)       Pilot: improve small-animal DICOM compliance.

b)       Identify a co-clinical pilot data set and populate the integrated ’omics/imaging infrastructure.

Co-clinical and animal model images: most imaging machines used for animal models do not follow the DICOM standard. We developed a supplement to the DICOM standard to accommodate small animal imaging; the supplement is currently out for balloting. We want to include co-clinical and animal model data in the integrative queries. For this new standard to be used, equipment manufacturers would need to incorporate it when they develop machines and software.
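
One simple way to picture a DICOM compliance check is to test whether image headers carry a set of expected attributes. The sketch below works on plain dictionaries keyed by standard DICOM attribute keywords; the particular set of required keywords is an assumption chosen for illustration and is not the content of the small animal imaging supplement. The scan names and UID values are made up.

```python
# Assumed minimal set of standard DICOM attribute keywords, for illustration.
# The small animal imaging supplement defines its own requirements.
REQUIRED_KEYWORDS = (
    "PatientID",
    "StudyInstanceUID",
    "SeriesInstanceUID",
    "SOPInstanceUID",
    "Modality",
)


def missing_attributes(header, required=REQUIRED_KEYWORDS):
    """Return the required keywords that are absent or empty in a header dict."""
    missing = []
    for keyword in required:
        value = header.get(keyword)
        if value is None or not str(value).strip():
            missing.append(keyword)
    return missing


def compliance_report(headers):
    """Map each image name to the list of attributes it is missing."""
    return {name: missing_attributes(header) for name, header in headers.items()}


# Hypothetical headers exported from two scanners.
headers = {
    "scan_a": {
        "PatientID": "MOUSE-17",
        "StudyInstanceUID": "1.2.3.4",
        "SeriesInstanceUID": "1.2.3.4.1",
        "SOPInstanceUID": "1.2.3.4.1.1",
        "Modality": "MR",
    },
    "scan_b": {
        "PatientID": "MOUSE-18",
        "StudyInstanceUID": "1.2.3.5",
        "SOPInstanceUID": "1.2.3.5.1.1",
        "Modality": " ",
    },
}

report = compliance_report(headers)
print(report)
```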


Challenges

Solutions

Pilot Challenges

1)      Aim 3 - “Pilot Challenges” to compare the decision support systems for three imaging research domains: clinical imaging, pre-clinical imaging, and digital pathology.

a)       Leverage and extend the above platform and data systems to validate and share algorithms and to support precision medicine and clinical decision-making tools, including correlation of imaging phenotypes with genomic signatures. The aims are fashioned as four complementary “Pilot Challenges”.

i)       Clinical Imaging: QIN image data for several modalities and organ systems are already hosted on TCIA. Pilot challenge projects are being explored for X-ray CT, DWI MRI, and PET CT, similar to the HUBzero pilot CT challenge project.

ii)      Pre-clinical / Co-clinical Imaging: Leverage the Mouse Models of Human Cancers Consortium (MMHCC) glioblastoma co-clinical trials with associated ’omics data sets from the Human Brain Consortium. This proof of concept will focus on bringing together ’omics and imaging data into a single platform.

iii)     Digital Pathology Clinical Support: Leveraging Aims 1-3, develop open-source image analysis algorithms that complement ’omics data sets and provide additional decision support.

iv)     Community Sharing: Enable community sharing of algorithms on a software clearinghouse platform such as HUBzero.

Three pilot challenges: pathology, radiology, and co-clinical.

Medical Image Computing and Computer Assisted Intervention (MICCAI)

Image-based interventions in tumors, cardiology, etc.

Mass General will guide the pilots.

Ground truth: find the compatibility of the informatics that we need to run pilots. Take images out of TCIA, TCGA, and clinical data and compare them.

Jayashree is running the MICCAI Challenge in Munich: segmentation of nuclei in pathology images, and combined radiology and pathology classification.

We want to be able to say that these informatics allow us to compare the pathology, radiology, and co-clinical findings.

Document the approach, technology, and application needed to do a MICCAI challenge the way Jayashree does it. See their order of march.

Challenges: read the one-page document. We want to use pathology images in the challenges. The tool used to display the markup and annotations for the pathology images is caMicroscope. There will be a challenge in which animal model data will be used. Participants are given images they have never seen before and develop algorithms (for example, to circle all the nuclei). Ground truth is decided by a pathologist and a radiologist. The algorithm that comes closest to ground truth is the winner.


Challenge Management System, MedICI

Jayashree's program: Medical Imaging Challenge Infrastructure (MedICI)

  1. Based on open-source CodaLab
  2. ePAD (created by Daniel Rubin's group at Stanford): a tool for annotating images that creates AIM annotations
  3. caMicroscope

http://miccai.cloudapp.net:8000/competitions/28

  1. Competition #1: The MICCAI challenge has a training phase, in which participants train their algorithms, and a test phase, in which they run their algorithms on images they have never seen before. The results are compared to ground truth that is determined beforehand. caMicroscope is used to view the images beforehand and to visualize the results. The overlap/completeness match determines the winner.
  2. Competition #2: Participants are given slides.
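
The overlap comparison described for Competition #1 could be scored in many ways. The sketch below uses the Dice coefficient, a common overlap measure for segmentations, purely as an assumed example; the source does not state which metric the challenge uses. Masks are represented as sets of pixel coordinates, and the team names and masks are made up.

```python
def dice(predicted, truth):
    """Dice coefficient between two sets of pixel coordinates (1.0 = identical)."""
    if not predicted and not truth:
        return 1.0
    return 2 * len(predicted & truth) / (len(predicted) + len(truth))


def rank_submissions(submissions, truth):
    """Score every submission against the ground truth, best score first."""
    scores = {name: dice(mask, truth) for name, mask in submissions.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Ground truth: a 4 x 4 block of pixels (16 pixels).
truth = {(x, y) for x in range(4) for y in range(4)}

# Hypothetical submissions.
submissions = {
    # 8 pixels, all inside the ground truth.
    "team_a": {(x, y) for x in range(4) for y in range(2)},
    # 16 pixels shifted right by one column, 12 of which overlap the ground truth.
    "team_b": {(x, y) for x in range(1, 5) for y in range(4)},
}

ranking = rank_submissions(submissions, truth)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Because Dice penalizes both missed pixels and extra pixels, a submission that covers only part of the region can rank below one that is slightly misplaced but complete.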

From PPT: Use titles of slides

Setting up a competition: the organizer creates a competition bundle.

Shared lists can be created at cancerimagingarchive.net. The shared lists are pulled into CodaLab, which is how the test and training data are obtained.

The next step is to create the ground truth.

Regions of interest in a tumor for annotations are necrosis, edema, and active cancer. Radiologists create the ground truth.

Once participants upload their results, they can see them in ePAD.

Challenges

Solutions

Scenarios

We need to generate the proper therapy for a patient. Look at in vivo imaging, radiology, and pathology, and run a gene panel to look for abnormalities. Look at co-clinical trials (a model of a tumor in a mouse that is similar to the tumor in a human, used to experiment with therapies on mice). Run an integrative query to develop a sophisticated diagnosis. Search big data.

Visual pathology integrative queries: Ashish at Emory. Imaging consistent with ground truth.

We need to explain how the challenge management system and the integrative query system play together in a scientific scenario.

Three TOCs: one for challenge steps, one for the integrative query system. How well does it integrate; what are the common–how do we annotate the tumor in MedICI so that it is compatible with the annotations in the components of the integrative query system? What relationships can we find in the informatics between the animal and patient findings?

Describe each section separately and then see if we can merge the two to answer the scientific question.

Informatics helps us communicate. It can help us better treat our patients.
