NIH | National Cancer Institute | NCI Wiki  

...

Digital Pathology and Integrated Query System

The goal of this sub-project is to create a digital pathology image server that can accept images from multiple domains and run integrative queries on that data. This will create a type of data mashup whereby data can be selected from distinct imaging sources and made accessible for image algorithms.

Digital Pathology

Digital pathology, unlike its more mature radiographic counterpart, has yet to standardize on a single storage and transport format. In addition, each pathology-imaging vendor produces its own image management system, which makes image analysis systems proprietary and non-standard. As a result, images produced on different systems cannot be analyzed by the same mechanisms. This lack of standards and the dominance of proprietary formats not only hampers digital pathology itself but also prevents digital pathology data from being integrated with radiographic, genomic, and proteomic data.

This component of the sub-project proposes to leverage several open source tools to provide an open source digital pathology image server that can host and serve digital pathology images from any of the major vendors. It will incorporate the vendor-neutral C library OpenSlide with caMicroscope to directly serve whole slide pathology images from the majority of digital pathology vendors. This will be accomplished without re-encoding the images, a step that often introduces additional compression artifacts. A single digital pathology server would allow NCI to include digital pathology images within TCIA / NBIA and provide a logical bridge from proprietary pathology formats to DICOM standards. Specifically, the team will expand the functionality of the caMicroscope digital pathology platform to include support for some of the common formats adopted by whole slide vendors. The OpenSlide library would make this functionality possible.

• Incorporate support for basic image analysis algorithms into caMicroscope.
• Standards-based image annotation utilizing the Annotation and Image Markup (AIM) standard.

Data federation, a process whereby data is collected from different databases without ever copying or transferring the original data, is part of the solution as well. It requires a shared semantic scheme and a supporting software framework to link the databases. Most importantly for the success of CTIIP, data federation will make it possible to create integrative queries using data from TCIA and TCGA. The software used to accomplish this data federation is Bindaas, a middleware platform that is also used to build the backend infrastructure of caMicroscope. The team is extending Bindaas with a data federation capability that makes it possible to query data from TCIA and TCGA.
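The idea of a federated, integrative query can be illustrated with a minimal Python sketch. It is not Bindaas code and does not reflect the Bindaas API. The two in-memory lists stand in for independent repositories (one imaging, one clinical), and every name, field, and record in it (IMAGING_SOURCE, CLINICAL_SOURCE, patient_id, the sample rows) is a made-up assumption for illustration. The shared patient_id field plays the role of the shared semantic scheme: each source is queried where it lives, and only the matching result rows are joined.

```python
# Hypothetical stand-ins for two independent repositories.
# In a real deployment each would be a remote service, not a list.
IMAGING_SOURCE = [
    {"patient_id": "P-001", "modality": "MR", "series_count": 4},
    {"patient_id": "P-002", "modality": "CT", "series_count": 2},
    {"patient_id": "P-003", "modality": "MR", "series_count": 1},
]
CLINICAL_SOURCE = [
    {"patient_id": "P-001", "diagnosis": "glioblastoma"},
    {"patient_id": "P-003", "diagnosis": "lower grade glioma"},
    {"patient_id": "P-004", "diagnosis": "glioblastoma"},
]


def query_imaging(modality):
    """Ask the imaging source for rows with the given modality."""
    return [row for row in IMAGING_SOURCE if row["modality"] == modality]


def query_clinical(patient_ids):
    """Ask the clinical source for rows belonging to the given patients."""
    wanted = set(patient_ids)
    return [row for row in CLINICAL_SOURCE if row["patient_id"] in wanted]


def federated_query(modality):
    """Join results from both sources on the shared patient_id key.

    Neither source is copied or modified; only the query results
    are combined into new dictionaries.
    """
    imaging_rows = query_imaging(modality)
    ids = [row["patient_id"] for row in imaging_rows]
    clinical_by_patient = {c["patient_id"]: c for c in query_clinical(ids)}

    merged = []
    for row in imaging_rows:
        clinical = clinical_by_patient.get(row["patient_id"])
        if clinical is not None:  # keep only patients known to both sources
            merged.append({**row, **clinical})
    return merged


if __name__ == "__main__":
    for record in federated_query("MR"):
        print(record)
```

The join is done on the caller's side here for simplicity. A middleware layer would typically push the filtering down to each source and merge only the returned rows, which is the property that lets the original data stay in place.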

...

All three research domains will clearly need an imaging archive that can be leveraged for integration across multiple data types and sources. For example, the TCGA program has the goal of producing a comprehensive genomic characterization and analysis of 200 types of cancer and providing this information to the research community. TCIA and the underlying National Biomedical Image Archive (NBIA) software stack were created to manage well-curated, publicly available collections of medical image data, including diagnostic images associated with the tissue samples sequenced by TCGA. TCIA currently supports over 40 active research groups, including researchers who are exploiting the existing linkages between TCGA and TCIA. TCIA has recently released an Application Program Interface (API) that provides REST access to TCIA metadata and image collections. This API is built using a middleware platform called Bindaas, and it is being designed to support federation of multiple information repositories using the concept of data mashups. This infrastructure can be expanded to include more data types and additional integration, and to provide analytic and decision support, which will act as a foundation for a broader set of novel community research projects.
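As a rough illustration of what programmatic REST access looks like from a client's side, the sketch below only builds a query URL and sends nothing over the network. The base URL is a placeholder, and the endpoint name (getSeries) and parameter names (Collection, format) are assumptions used for illustration; the actual paths and parameters should be taken from the TCIA API documentation.

```python
from urllib.parse import urlencode

# Placeholder base URL (an assumption); replace it with the service
# root given in the TCIA API documentation.
BASE_URL = "https://example.org/tcia-api/query"


def build_query_url(endpoint, **params):
    """Build a GET URL for a REST query endpoint.

    Parameters set to None are dropped, and the remaining ones are
    sorted by name so the resulting URL is deterministic.
    """
    filtered = {key: value for key, value in params.items() if value is not None}
    query = urlencode(sorted(filtered.items()))
    url = f"{BASE_URL}/{endpoint}"
    return f"{url}?{query}" if query else url


if __name__ == "__main__":
    # Endpoint and parameter names are illustrative assumptions.
    print(build_query_url("getSeries", Collection="TCGA-GBM", format="json"))
```

A client would pass the resulting URL to an HTTP library and parse the returned metadata, which can then be combined with records from other repositories.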

Goals: data exploration, data connection, data mashup, making data available for analysis, and making data accessible to image algorithms

Project 1: Integrated Query System for Existing TCGA Data

...

ii) Extend software to support data mashups between image-derived information from TCIA and clinical and molecular metadata from TCGA.

Histopathology

• Incorporate OpenSlide with caMicroscope, enabling caMicroscope to directly serve whole slide pathology images from the majority of digital pathology vendors.
• Incorporate support for basic image analysis algorithms into caMicroscope.
• Standards-based image annotation utilizing the Annotation and Image Markup (AIM) standard.

Integrative Queries

• Programmatic access to TCGA-related image data.

...