NIH | National Cancer Institute | NCI Wiki  


  • Introduction of a potential new task group member, David Brundage, a professor at Cornell.
  • Whole-slide images (WSI) are not usually in DICOM format and must be converted.
  • Dave Gutman shared slides about protected health information (PHI) in WSI.
  • Most PHI lives in the slide label. Sometimes only a partial label is scanned, so a human might not realize PHI is there, but a machine can detect that PHI.
  • The primary image is unlikely to contain PHI. 
  • Luke Geneslaw also shared some slides about detecting label presence in tissue image scans.
  • There is a trade-off between missing tissue that contains cancer and keeping bigger files that could have more PHI in the data.
  • Leaving data in the slide label causes problems for de-id. No software is currently available to redact pixel data from the tissue sample. Dave Gutman is working on an NCI project to develop it.
  • JPEG stores data in 8x8 blocks, so it's possible to remove individual blocks from an image.
  • Metadata extraction is specific to each file format.
  • TCIA has a dictionary of private data elements from DICOM.
  • Python package:
  • Dates can appear in TIFF date/time fields and other data elements defined in the XML, or be included as an annotation.
  • It's our job to identify which areas need to be mitigated.
  • Not all slides are standard formats, like prostate whole mounts.
  • David Clunie would like to create a sub-group of people with a special interest and experience in WSI so that they can create content on that subject for the report. Fred Prior, Dave Gutman, and David Brundage will join.
  • We need a person with significant statistical knowledge who could adapt their knowledge to defacing. Justin suggested someone and will talk to David about it.
  • David shared the new version of the task group's de-id report. Please review the tracked changes offline and send him any comments.
  • Common stratification based on type: the task group determined that this is not sufficient and that we should look at everything regardless. A sentence to that effect was added to the report.
  • We have not yet defined a best practice on how to score risk. Further research is needed.
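
To illustrate the note above about JPEG storing data in 8x8 blocks, the following is a minimal sketch of how a pixel region flagged for redaction could be mapped onto the grid of blocks that cover it. It is not the tool under development and does not touch any real image file. The function names `blocks_covering` and `snapped_bounds` are hypothetical, and the block size is a parameter because images with chroma subsampling may need a larger unit (for example 16x16) than the 8x8 assumed here.

```python
def blocks_covering(region, image_size, block=8):
    """Return (row, col) indices of the grid blocks that overlap a region.

    region     -- (left, top, right, bottom) in pixels, right/bottom exclusive
    image_size -- (width, height) in pixels
    block      -- block edge length in pixels (assumed 8 here)
    """
    left, top, right, bottom = region
    width, height = image_size

    # Clamp the region to the image so out-of-range requests are harmless.
    left, top = max(left, 0), max(top, 0)
    right, bottom = min(right, width), min(bottom, height)
    if left >= right or top >= bottom:
        return []

    first_col, last_col = left // block, (right - 1) // block
    first_row, last_row = top // block, (bottom - 1) // block
    return [
        (row, col)
        for row in range(first_row, last_row + 1)
        for col in range(first_col, last_col + 1)
    ]


def snapped_bounds(region, image_size, block=8):
    """Expand a region outward to block boundaries, or return None if empty."""
    cells = blocks_covering(region, image_size, block)
    if not cells:
        return None
    rows = [row for row, _ in cells]
    cols = [col for _, col in cells]
    width, height = image_size
    return (
        min(cols) * block,
        min(rows) * block,
        min((max(cols) + 1) * block, width),
        min((max(rows) + 1) * block, height),
    )


if __name__ == "__main__":
    flagged = (10, 3, 20, 9)  # hypothetical area where PHI was detected
    print(blocks_covering(flagged, (64, 64)))
    print(snapped_bounds(flagged, (64, 64)))
```

Snapping outward means the redacted area is always at least as large as the flagged one, which errs on the side of removing PHI at the cost of losing a few extra pixels of tissue.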
