NIH | National Cancer Institute | NCI Wiki  

...

Challenges are increasingly viewed as a mechanism to foster advances in a number of domains, including healthcare and medicine. The US federal government, as part of the open government initiative, has underscored the role of challenges as a way to "promote innovation through collaboration and (to) harness the ingenuity of the American Public." Large quantities of publicly available data and cultural changes in the openness of science have now made it possible to use these challenges, as well as crowdsourcing efforts (enlisting the services of people via the Internet), to propel the field forward.

The infrastructure requirements to both host and participate in some of the "big data" efforts can be monumental. Medical imaging data can be large, historically requiring the shipping of disks to participants. The computing resources needed to process these large datasets may be beyond what is available to individual participants. For the organizers, creating infrastructure that is secure, robust, and scalable can require resources, such as IT staff support, compute capability, and domain knowledge, that are beyond the reach of many researchers.

The medical imaging community has conducted a host of challenges at conferences such as MICCAI (Medical Image Computing and Computer Assisted Intervention) and SPIE (the international society for optics and photonics). However, these have typically been modest in scope, both in terms of data size and number of participants. Medical imaging data poses additional challenges to both participants and organizers. For organizers, ensuring that the data is free of protected health information (PHI) is both critical and non-trivial. Medical data is typically acquired in DICOM format, and ensuring that a DICOM file is free of PHI requires domain knowledge and specialized software tools. Multimodal imaging data can be extremely large. Imaging formats for pathology images can be proprietary, and interoperability between formats can require additional software development efforts. Encouraging non-imaging researchers (for example, machine learning scientists) to participate in imaging challenges can be difficult due to the domain knowledge required to convert medical imaging into a set of feature vectors. For participants, access to large compute clusters with computing power, storage space, and bandwidth can prove difficult. Medical imaging data is challenging for non-imaging researchers.
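To make the PHI concern above more concrete, the following is a minimal sketch of a header screening step. It is an illustration under stated assumptions, not the tooling used by any particular challenge: the header is modelled as a plain Python dictionary rather than read from a DICOM file, the function names `find_phi` and `scrub` are hypothetical, and the keyword list is a small, incomplete subset of the DICOM attributes that can carry PHI.

```python
# Hypothetical sketch: header attributes are modelled as a plain dict
# (attribute keyword -> value). A real workflow would read them from the
# file with a DICOM toolkit and follow a complete de-identification profile.

# Illustrative subset of DICOM attribute keywords that commonly carry PHI.
# This list is an assumption made for the example and is NOT complete.
PHI_KEYWORDS = (
    "PatientName",
    "PatientID",
    "PatientBirthDate",
    "PatientAddress",
    "ReferringPhysicianName",
    "InstitutionName",
    "AccessionNumber",
)


def find_phi(header):
    """Return the sorted listed keywords that still hold a non-empty value."""
    flagged = []
    for keyword in PHI_KEYWORDS:
        value = header.get(keyword)
        if value is not None and str(value).strip():
            flagged.append(keyword)
    return sorted(flagged)


def scrub(header):
    """Return a copy of the header with the listed keywords blanked."""
    cleaned = dict(header)  # leave the caller's dictionary untouched
    for keyword in PHI_KEYWORDS:
        if keyword in cleaned:
            cleaned[keyword] = ""
    return cleaned
```

A check of this kind covers only the listed header fields. Identifying text can also appear in private tags, free-text fields, or be burned into the pixel data itself, which is part of why de-identification calls for domain knowledge and specialized tools.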

Some of the key advantages of challenges over conventional methods include 1) scientific rigor (sequestering the test data), 2) comparing methods on the same datasets with the same, agreed-upon metrics, 3) allowing computer scientists without access to medical data to test their methods on large clinical datasets, 4) making resources available, such as source code, and 5) bringing together diverse communities (that may traditionally not work together) of imaging and computer scientists, machine learning algorithm developers, software developers, clinicians, and biologists.  

As explained in the Challenge Management System Evaluation Report, challenge hosts and participants cannot do it alone. Hosts must ensure that the data used in challenges are free of Protected Health Information (PHI), which is both critical and non-trivial, and require access to large compute clusters with computing power, storage space, and bandwidth. Additionally, imaging formats for pathology images can be proprietary, and interoperability between formats can require additional software development efforts. For participants,

However, it is imperative that the imaging community develop the tools and infrastructure necessary to host these challenges and potentially enlarge the pool of methods by making it more feasible for non-imaging researchers to participate. Resources such as The Cancer Imaging Archive (TCIA) have greatly reduced the burden of sharing medical imaging data within the cancer community and of making these data available for use in challenges.

Medical Imaging Challenge Infrastructure (MedICI), a system to support medical imaging challenges.

 

by developing knowledge extraction tools and comparing the decision support systems for clinical imaging, co-clinical imaging, and digital pathology, which will now be represented as a set of integrated data from TCIA and TCGA. The intent is not to specifically implement a rigorous “Grand Challenge”, but rather to develop “Pilot Challenge” projects. These would use limited data sets for proof of concept and test the informatics infrastructure needed for such “Grand Challenges”, which would be scaled up and supported by extramural initiatives later in 2014 and beyond.

...