NIH | National Cancer Institute | NCI Wiki  


...

Author: Jayashree Kalpathy-Cramer and Karl Helmer


...

Some of the key advantages of challenges over conventional methods include 1) scientific rigor (sequestering the test data), 2) comparing methods on the same datasets with the same, agreed-upon metrics, 3) allowing computer scientists without access to medical data to test their methods on large clinical datasets, 4) making resources, such as source code, available, and 5) bringing together diverse communities (that may not traditionally work together) of imaging and computer scientists, machine learning algorithm developers, software developers, clinicians and biologists. CKK: Reviewer comment "Review wording of # 4 and 5". CKK: This paragraph also ended with a stray (4) which I just deleted. See the Word document to know what I mean.

However, despite this potential, there are a number of obstacles. Medical data is usually governed by privacy and security policies such as HIPAA that make it difficult to share patient data. Patient health records can be very difficult to completely de-identify. Medical imaging data, especially brain MRIs, can be particularly challenging, as one could easily reconstruct a recognizable 3D model of the subject.

...

Challenges have been popular in a number of scientific communities since the 1990s. In the text retrieval community, the Text REtrieval Conference (TREC), co-sponsored by NIST, is an early example of an evaluation campaign in which participants work on a common task using data provided by the organizers and are evaluated with a common set of metrics. ChaLearn has organized challenges in machine learning since 2013. CKK: Word document originally said "20013". Changed to "2013".

We begin with a brief review of a few medical imaging challenges held in the last decade and of their organization and infrastructure requirements. Medical imaging challenges are now a routine aspect of the highly regarded MICCAI annual meeting. Challenges at MICCAI began in 2007 with liver segmentation and caudate segmentation challenges.

...

The MICCAI-BraTS challenge highlighted a number of findings that mirrored experiences from other domains. CKK: Word document had the lone word "These" following the period of the previous sentence. Did you intend to add another sentence here? 

  • The agreement between experts is not perfect (~0.8 Dice score).
  • The agreement (between experts and between algorithms) is highest for the whole tumor and relatively poor for areas of necrosis and non-enhancing tumor.
  • Combining segmentations created by the "best" algorithms produces a segmentation whose overlap with the consensus "expert" labels approaches the inter-rater overlap.
  • This approach can be used to automatically create large labeled datasets.
  • However, there are cases where this does not work and we still need to validate a subset of images with human experts.
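The Dice score and the idea of combining algorithm outputs can be illustrated with a small sketch. It assumes binary masks stored as flat 0/1 lists and uses a simple majority vote as the fusion rule; the fusion method actually used in the challenge is not specified here, and the function names and toy masks are hypothetical.

```python
from typing import List, Sequence


def dice_score(a: Sequence[int], b: Sequence[int]) -> float:
    """Dice overlap between two binary masks given as flat 0/1 sequences."""
    if len(a) != len(b):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(1 for x in a if x) + sum(1 for y in b if y)
    # Convention assumed here: two empty masks agree perfectly.
    return 1.0 if total == 0 else 2.0 * intersection / total


def majority_vote(masks: Sequence[Sequence[int]]) -> List[int]:
    """Fuse binary masks: a voxel is foreground if more than half of the masks mark it."""
    if not masks:
        raise ValueError("need at least one mask")
    n = len(masks)
    return [1 if 2 * sum(votes) > n else 0 for votes in zip(*masks)]


# Toy, made-up masks (8 voxels each); not data from any challenge.
expert = [0, 1, 1, 1, 1, 0, 0, 0]
algo_a = [0, 1, 1, 1, 0, 0, 0, 0]
algo_b = [0, 0, 1, 1, 1, 1, 0, 0]
algo_c = [1, 1, 1, 0, 1, 0, 0, 0]

fused = majority_vote([algo_a, algo_b, algo_c])

for name, mask in [("algo_a", algo_a), ("algo_b", algo_b),
                   ("algo_c", algo_c), ("fused", fused)]:
    print(f"{name}: Dice vs expert = {dice_score(mask, expert):.3f}")
```

In this toy case each algorithm makes a different mistake, so the vote cancels the errors and the fused mask matches the expert label exactly. When algorithms share the same error the vote cannot correct it, which is one reason a subset of images still needs validation by human experts.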

...

These platforms typically charge a hosting fee, and monetary rewards are common. They have large communities (hundreds of thousands) of registered users and coders, and can be a way to introduce the problem to communities beyond the core academic domain experts and to obtain solutions that are novel in the domain. CKK: Word document had the lone word "The" following the period of the previous sentence. Did you intend to add another sentence here?

Kaggle is a very popular platform for data science competitions. It is a commercial platform used by companies to pose problems for monetary rewards, jobs and knowledge advancement. There are public and private leaderboards, with the test data withheld from the participants. Typical hosting costs are reported to be $15,000-20,000 plus additional costs for prizes. However, Kaggle does have a free hosting option, Kaggle In Class, for organizing challenges for educational purposes. CKK: Reviewer comment: "Questions: How are educational challenges defined? Would NCI-run challenges fall under this category? What limitations do educational challenges vs. paid challenge hosting have?" This option is primarily meant to be used by instructors as part of the class curriculum. Kaggle does not provide any support for organizers of Kaggle In Class. There is a 100GB limit on file size. There also appear to be only very simple options for scoring. Almost all challenges hosted here appear to be prediction-type challenges in which results are submitted as a CSV file and the "truth" is also a CSV file. It does not appear that imaging-based challenges (such as segmentation challenges) would lend themselves to being hosted on Kaggle In Class without significant effort.
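To make the CSV-based scoring model concrete, the following is a minimal sketch of how a prediction-type submission might be scored against a withheld truth file. It is not Kaggle's actual scoring code or API: the column names (`id`, `label`), the accuracy metric, and the sample data are all assumptions made for illustration.

```python
import csv
import io
from typing import Dict


def read_labels(csv_text: str) -> Dict[str, str]:
    """Parse CSV text with 'id' and 'label' columns into an id -> label mapping."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["id"]: row["label"] for row in reader}


def accuracy(truth: Dict[str, str], submission: Dict[str, str]) -> float:
    """Fraction of truth cases predicted correctly; missing predictions count as wrong."""
    if not truth:
        raise ValueError("truth file is empty")
    correct = sum(1 for case_id, label in truth.items()
                  if submission.get(case_id) == label)
    return correct / len(truth)


# Hypothetical truth file held by the organizers (never shown to participants).
TRUTH_CSV = """id,label
case1,benign
case2,malignant
case3,benign
case4,malignant
"""

# Hypothetical participant submission in the same format.
SUBMISSION_CSV = """id,label
case1,benign
case2,benign
case3,benign
case4,malignant
"""

truth = read_labels(TRUTH_CSV)
submission = read_labels(SUBMISSION_CSV)
score = accuracy(truth, submission)
print(f"accuracy = {score:.2f}")
```

A segmentation challenge does not fit this pattern directly, since each result is an image volume rather than a row in a table, and the metric (for example, an overlap measure) has to be computed per image.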

...

Challenge Post has been used to organize hackathons, online challenges and other collaborative software activities. In-person hackathons are free, while online challenges cost $1,500/month (plus other optional charges). CKK: Reviewer comment: "Where does the section for commercial challenge management systems end?" and separate comment: "Could the suitability of each platform for imaging challenges be discussed?"

...

Matrix of Features and Frameworks (1-5)

...

CKK: Reviewer comment: "Some of the systems mentioned earlier are not listed in the matrix: Innocentive, TopCoder, ChallengePost, Midas/COVALIC"

Below is a table that rates the relative merits of the most relevant frameworks that we evaluated. The scale is 1-5, where 1 indicates excellent support for the feature and 5 indicates that the feature is not currently part of the system or that there is limited support.

CKK: Reviewer comment located in the CodaLab ease of setting up new challenge cell: "Explain the scoring system" 

| Feature | Kaggle | Synapse | HubZero (challenges/projects) | COMIC | VISCERAL | CodaLab |
| --- | --- | --- | --- | --- | --- | --- |
| Ease of setting up new challenge | 2/4 (if new metrics need to be used) | 2 | 2/5 | 2 | 3 | 1 |
| Cost (own server/hosting options) | $10-$25k/challenge (free for class) | Free/hosted | Free/hosted | Free/hosted | Free/Azure costs | Free/hosted |
| License | Commercial | OS | OS | OS | OS | OS |
| Ease of extensibility | 5 | 4 | 4 | 2 | 3 | 2 |
| Cloud support for algorithms | 4 | 3 | 3 | 4 | 1 | 3 |
| Maturity | 1 | 1 | 1/5 | 3 | 4 | 3 |
| Flexibility | | | | | | |
| Number of users | 1 | 1 | 1/5 | 3 | 3 | 3 |
| Types of challenges | 1 | 1 | 1 | 3 | 1 | 1 |
| Native imaging support | No | No | No | Yes | Limited | No |
| API to access data, code | 5 | 1 | 3 | 4 | 4 | 4 |

...

Although Kaggle, Innocentive and Topcoder all have platforms that have been used extensively for a wide range of challenges, these were excluded from further consideration (and from the above table) since the platforms are not open source and cannot be modified.

...