
July 20, 2021

WebEx recording of 7/20/2021 meeting

...

WebEx recording of the 10/12/2021 meeting

Discussion of this document:

  • Not practical for a human to review all of the images.
  • TCIA built a tool called Kaleidoscope that flattens images and saves time.
  • Radiology techs can also do this work, but sensitivity goes down as you view more images.
  • What is the cost of a data breach in terms of manpower? 
  • As screening goes up, breaches go up.

Discussion of the de-identification process:

  • Did you have a formal QC process that involved you verifying the quality of the de-identification process after it was done?
  • John Perry: developed a process and a test to make sure it worked, but didn't look at all of the images to confirm it was done without breaches.
  • Monitor logs to make sure nothing slips through without automation applied to it. Grab a random 1% and look through the headers.
  • Need a more medical model that understands the variability in what we're trying to do
  • Partial vs. complete success: measured per field or per header?
  • Catch-22 that you can't crowd-source because there could be PHI
  • Build synthetic datasets that have real street addresses in real places that don't match the actual data
  • Train a model and release that but not the dataset
  • Would need a statistician
  • Judy: We are encountering issues that the black box models do not understand. Running experiments on adversarial networks. Surprising findings.
  • Amalgamate clinical and imaging data. 
  • Models have already learned enough to infer age, sex, and race. We don't understand how this happens, and they might also pick up other identifying data.
  • We are not trying to hide age, sex, and race. We're trying to prevent the re-identification of a person.
  • Increasing the uniqueness of the image data is a threat for re-identification. But if you don't have a database of everyone's fingerprints, for example, it's useless.
  • At some point we have to be clear about what we are trying to re-identify and what the practical limits are.
  • Clearview.ai
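
The "grab a random 1%" spot check mentioned above could be sketched roughly as follows. This is only an illustration, not the process John Perry or TCIA actually uses: the function name `sample_for_review`, the `fraction` and `seed` parameters, and the file names are all hypothetical, and the sketch only selects files for a reviewer; it does not read or inspect DICOM headers itself.

```python
import math
import random


def sample_for_review(paths, fraction=0.01, seed=None):
    """Pick a random subset of de-identified files for manual header review.

    Hypothetical helper: `paths` is any iterable of file identifiers,
    `fraction` is the share to review (0.01 = 1%), and `seed` makes the
    selection reproducible for audit purposes.
    """
    paths = list(paths)
    if not paths:
        return []
    # Always review at least one file, even for very small batches.
    k = max(1, math.ceil(len(paths) * fraction))
    # Sample without replacement so no file is picked twice.
    return random.Random(seed).sample(paths, k)


if __name__ == "__main__":
    # Assumed example batch of 1,000 de-identified files.
    batch = [f"img_{i:04d}.dcm" for i in range(1000)]
    for path in sample_for_review(batch, seed=2021):
        print(path)
```

Fixing the seed lets a second reviewer reproduce the same selection. A 1% sample can only estimate the residual PHI rate; it cannot prove a batch is clean, which is presumably where the statistician mentioned above would come in.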

Tasking

  • Justin Kirby: What TCIA encounters as part of its human review processes
  • David Clunie: Organize report topics in an outline
  • Judy: Write up some content (not the overview) on defacing
  • TJ and Ying: Can help with defacing