NIH | National Cancer Institute | NCI Wiki  

...

To address these limitations, the CTIIP team is developing an Integrated Query System to make it easier to analyze data from different research disciplines represented by TCGA, TCIA, and co-clinical/small animal model data. The lack of common data standards will not be a hindrance to data analysis, since the server that hosts the Integrated Query System will accept whole slides without recoding. The Integrated Query System will also provide a common platform and data engine for the hosting of “pilot challenges,” which are described in more detail below. Pilot challenges will advance biological and clinical research in a way that also integrates the clinical, co-clinical/small animal model, and digital pathology imaging disciplines.

...

The purpose of the digital pathology component of CTIIP is to support data mashups between image-derived information from TCIA and clinical and molecular metadata from TCGA. The team is using OpenSlide, a vendor-neutral C library, to extend caMicroscope, a digital pathology server, so that it provides the infrastructure for these data mashups. The extended software will support some of the common formats adopted by whole slide vendors as well as basic image analysis algorithms. With support for common whole slide formats, caMicroscope will be able to read whole slides without recoding them, a step that often introduces additional compression artifacts, and will provide a logical bridge from proprietary pathology formats to DICOM standards. These features will make it possible to integrate digital pathology images within TCIA and NBIA. With caMicroscope's support for basic image analysis algorithms, researchers can use the tool for analytic and decision support based on those images.

Data federation, a process whereby data is collected from different databases without ever copying or transferring the original data, is part of the new infrastructure as well. It will make it possible to create integrative queries using data from TCIA and TCGA. The software used to accomplish this is Bindaas, the middleware that also forms the backend infrastructure of caMicroscope. The team is extending Bindaas with a data federation capability so that a single query can draw on both TCIA and TCGA.
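
As a rough illustration of the federation idea described above, the following Python sketch queries two independent sources in place and combines the matching rows at query time, without copying either source into a shared store. It is not Bindaas code. The `Source` class, the `federated_query` function, the field names, and the sample records are all hypothetical and exist only for this example.

```python
from typing import Callable, Dict, Iterator, List

Record = Dict[str, str]


class Source:
    """A stand-in for a repository that can be queried but is never copied."""

    def __init__(self, name: str, rows: List[Record]) -> None:
        self.name = name
        self._rows = rows

    def query(self, predicate: Callable[[Record], bool]) -> Iterator[Record]:
        # Yield only the matching rows; the caller never receives the full table.
        return (row for row in self._rows if predicate(row))


def federated_query(imaging: Source, clinical: Source, patient_id: str) -> List[Record]:
    """Join imaging and clinical rows for one patient at query time."""

    def matches(row: Record) -> bool:
        return row["patient_id"] == patient_id

    clinical_rows = list(clinical.query(matches))
    merged = []
    for image_row in imaging.query(matches):
        for clinical_row in clinical_rows:
            merged.append({**clinical_row, **image_row})
    return merged


# Hypothetical sample data, invented for illustration only.
imaging_source = Source("imaging", [
    {"patient_id": "P-001", "modality": "MR"},
    {"patient_id": "P-002", "modality": "CT"},
])
clinical_source = Source("clinical", [
    {"patient_id": "P-001", "tumor_site": "brain"},
])

result = federated_query(imaging_source, clinical_source, "P-001")
print(result)
```

Each source answers only the question it is asked, and the join happens in the caller. A real federation layer would issue these queries over the network, but the division of responsibility would be similar.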

...

To make data comparable, it must first be collected in a structured fashion. For example, TCGA relies on Common Data Elements, which are the standard elements used to validate TCGA clinical data. Second, data comparisons require common data standards. For example, when a tumor is described in a human or an animal, a data standard would require that the type of tumor match one of a discrete number of options using approved vocabulary, such as "brain".
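
The vocabulary requirement described above can be sketched as a simple validation step. This is only an assumption about how such a check might look: the `tumor_site` field name, the `validate_record` function, and the three-term vocabulary are hypothetical and do not reproduce any actual Common Data Element definition.

```python
from typing import Dict, List

# Hypothetical controlled vocabulary; real value sets are not reproduced here.
APPROVED_TUMOR_SITES = frozenset({"brain", "breast", "lung"})


def validate_record(record: Dict[str, str]) -> List[str]:
    """Return a list of validation problems; an empty list means the record passes."""
    problems = []
    site = record.get("tumor_site")
    if site is None:
        problems.append("missing required field: tumor_site")
    elif site.strip().lower() not in APPROVED_TUMOR_SITES:
        problems.append(f"tumor_site {site!r} is not in the approved vocabulary")
    return problems
```

Because human and animal records would be checked against the same list, a value such as "brain" means the same thing in both, which is what makes the records comparable. The sketch normalizes case and whitespace before comparing; a stricter standard might reject anything that is not an exact match.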

The Integrated Query System currently in development will serve as an archive of images from multiple imaging disciplines, shown below.

...

Given the technical challenges inherent in such a system, technical solutions are being developed. One of the most fundamental to the success of the Integrated Query System is an application programming interface (API), exposed over Representational State Transfer (REST), that provides access to TCIA metadata and image collections. This API is built using a middleware platform called Bindaas that also forms the backend infrastructure of caMicroscope. Bindaas is open source and extensible, so it can be expanded to include more data types and additional integration. The API is being designed to support federation of multiple information repositories using the concepts of data mashups. Data mashups have become popular in recent years for visualizing and analyzing data from distinct databases, such as those that will make up the Integrated Query System. The data mashups will provide analytic and decision support, which will act as a foundation for a broader set of novel community research projects.
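
To give a sense of what a REST-style query against such an API could look like, the sketch below builds a GET URL from an endpoint name and query parameters. Everything specific in it is an assumption made for illustration: the base URL is a placeholder, and the `getSeries` endpoint and the `Collection` and `Modality` parameters are hypothetical names rather than a documented interface. No network request is made.

```python
from urllib.parse import urlencode

# Placeholder base URL; the real service address is not given in this document.
BASE_URL = "https://example.org/api/query"


def build_query_url(endpoint: str, **params: str) -> str:
    """Build a GET URL for a hypothetical REST endpoint with query parameters."""
    # Sort parameters so the same query always produces the same URL.
    query = urlencode(sorted(params.items()))
    if query:
        return f"{BASE_URL}/{endpoint}?{query}"
    return f"{BASE_URL}/{endpoint}"


# Hypothetical endpoint and parameter names, for illustration only.
url = build_query_url("getSeries", Collection="TCGA-GBM", Modality="MR")
print(url)
```

In a REST design each resource is addressed by a URL and filtered through query parameters, so a client in any language can retrieve metadata with an ordinary HTTP GET request.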

...