NIH | National Cancer Institute | NCI Wiki  


  • Think about what we should say about this topic in the report.

December 7, 2021 Meeting

WebEx recording of the 12/07/2021 meeting

Dr. Fred Prior's slides

David Clunie's slides

Discussion of de-facing with Dr. Fred Prior:

The need for de-facing and the risks of faces in images.

Presentation by Fred Prior, who has led the investigation of this issue for a long time.

  • A person's photograph is PII and PHI.
  • This question started in 2009 during the caBIG initiative.
  • Can average humans match a photograph to an MRI facial reconstruction?
  • It was determined in the neuroimaging field that this was a problem.
  • Today you can get free MR data. You can get free tools to do the 3D reconstructions. You can use readily available facial recognition tools to match the reconstructions against face data on social media.
  • The scientific community has concluded that a person's photograph can be matched to a face reconstructed from imaging data.
  • However, the law has not drawn the same conclusion. HIPAA does not require that images be de-faced.
  • The Office for Civil Rights (OCR), which oversees HIPAA, has a rule that if the covered entity does not have evidence that the recipient of the de-ID'd data has the ability to reidentify it, the data does not have to be de-faced. It is still a concern for legal staff. It's just a matter of time. It will probably change in the next version of the HIPAA regulations.
  • TCIA has decided to compile a complete list of its collections that contain faces: all brain cancers, most head/neck, and radiologic examinations of the head.
  • De-facing can hide the data of scientific interest. 
  • We are focused on the brain tumors we can fix.
  • We're not worried about the collections that contain occasional faces. We'll need to develop a search algorithm to find all of these and clean them up: reject that part of the study/series if it's new, delete or de-face it if it already exists. We are keeping them as restricted access until they can be de-faced. Future research should give us a method to do this.
  • We updated our data use agreement. It says you will not re-identify subjects, and you must state the reason why you want the data. We are not judging these reasons, just recording them.
  • We created Masker, our own de-facing software. We have prototyped the pieces and used them on two collections, but they're not ready for production. We need to integrate this with POSDA. The curation process is a work in progress. Sometimes Masker doesn't find the face and curators have to manually add the bounding box. Masker is reversible, so we are open to criticism.
  • This came about because of a project from this past summer to explore the de-facing algorithms that are available. We implemented them and they didn't work, so we created our own.
  • We focused on FSL_deface and MRI_reface from Mayo. MRI_reface replaces the real face with a generic face.
  • All of the existing tools used NIfTI, which is a problem.
  • Technical issues impede the full implementation of this, such as edge cases where we can't find the face.
  • Adding de-facing to the TCIA process takes a lot of time.
  • Collections that must be restricted access have been communicated to NCI.
  • The goal is to send a communication to our community tomorrow about the data switchover.
  • Longer term, the complete TCIA database will be de-faced.
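
To make the idea of a reversible bounding-box mask concrete, here is a minimal Python sketch. It is not Masker's implementation, which was not described in the meeting. The function names `mask_box` and `unmask_box`, the nested-list volume layout, and the fill value are all assumptions chosen for illustration.

```python
from copy import deepcopy

def mask_box(volume, box, fill=0):
    """Blank out a box in a 3D volume stored as nested lists [z][y][x].

    box is ((z0, z1), (y0, y1), (x0, x1)) with half-open ranges and is
    assumed to lie entirely inside the volume (hypothetical convention).
    Returns (masked_volume, saved); saved holds the original voxels.
    """
    (z0, z1), (y0, y1), (x0, x1) = box
    masked = deepcopy(volume)          # leave the input untouched
    saved = []
    for z in range(z0, z1):
        for y in range(y0, y1):
            saved.append(masked[z][y][x0:x1])         # copy of the original row segment
            masked[z][y][x0:x1] = [fill] * (x1 - x0)  # overwrite with the fill value
    return masked, saved

def unmask_box(masked, box, saved):
    """Put the saved voxels back, undoing mask_box for the same box."""
    (z0, z1), (y0, y1), (x0, x1) = box
    restored = deepcopy(masked)
    rows = iter(saved)
    for z in range(z0, z1):
        for y in range(y0, y1):
            restored[z][y][x0:x1] = next(rows)
    return restored

# Tiny synthetic volume: 2 slices of 4x4 voxels (made-up values, not image data).
volume = [[[z * 100 + y * 10 + x for x in range(4)] for y in range(4)] for z in range(2)]
box = ((0, 2), (0, 2), (2, 4))         # e.g. a box a curator drew by hand

masked, saved = mask_box(volume, box)
restored = unmask_box(masked, box, saved)
```

In a scheme like this the saved voxels are as sensitive as the original face, so reversibility only protects subjects if that saved data is kept under restricted access.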

David's slides and discussion:

  • Insider attack
  • Outsider attack
  • The Schwartz et al. experiment is difficult to extrapolate from, but it has had a lot of impact on the common understanding of the capabilities of AI.
  • Real-world numbers: how many people in the US with gliomas are there to compare against? 100,000 over a 5-year period, with a median age of 65.
  • If we train on reconstructions, how can you quantify reconstructions?
  • Literature is needed to inform the HIPAA regulation writers.
  • Neuroimaging bias in this context. The wrong conclusions can be reached quickly.
  • HIPAA has the statistical (Expert Determination) arm and the 18-elements (Safe Harbor) arm. People's faces may not be useful for identification in a specific context, and that can be shown statistically.
  • Do we need to write a paper or do an experiment? Can we do experiments with data that could risk its status?
  • We need a statistical expert who is familiar with quantifying reidentification risk.
  • Judy would love to run experiments.
  • Could we do experiments with TCIA data? License, data use agreements...
  • Judy said she was able to get her IRB to approve experiments with head-neck data.
  • Create a sub-group of this task group to plan these experiments.
  • Can we apply Facebook's re-id algorithm to Mayo data? This would be a federated experiment that aggregates findings to avoid risking re-identification of any individual institution's data. We could get approval for something like this.
  • Brian Bialecki's idea to apply to experiment design: An institution may be the only group that knows the identity of some data. It would then also know whether your guess as to the identity of that data is wrong. It could tell you if you're in the ballpark as far as location. This experiment could be done with fewer approvals.
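
The federated experiment discussed above could pool results along these lines. This is only a sketch under stated assumptions: each site is assumed to report just two counts (match attempts and correct matches), the field names and the numbers are hypothetical, and the Wilson score interval is one common choice for a binomial proportion, not a method the group selected.

```python
import math

def wilson_interval(successes, trials, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    p = successes / trials
    denom = 1 + z * z / trials
    center = (p + z * z / (2 * trials)) / denom
    half = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials)) / denom
    return center - half, center + half

def pool_site_reports(reports):
    """Combine per-site counts; no subject-level data leaves any site."""
    attempts = sum(r["attempts"] for r in reports)
    correct = sum(r["correct"] for r in reports)
    low, high = wilson_interval(correct, attempts)
    return {"attempts": attempts, "correct": correct,
            "rate": correct / attempts, "ci95": (low, high)}

# Hypothetical counts from three sites, for illustration only.
reports = [
    {"site": "A", "attempts": 50, "correct": 2},
    {"site": "B", "attempts": 30, "correct": 1},
    {"site": "C", "attempts": 20, "correct": 0},
]
summary = pool_site_reports(reports)
low, high = summary["ci95"]
```

Reporting only the pooled rate and its interval is one way to keep any single institution's results from being singled out, though very small per-site counts could still need suppression.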