NIH | National Cancer Institute | NCI Wiki  


Shared Bibliography

Please accept the Mendeley invitation to the MIDI private group.

Be sure to create an Elsevier account with the email address David used to invite you. If you want to use an existing Elsevier account, please send David the email address associated with it, so that he can send the invitation to the private group to that address.

Please add relevant publications to the shared bibliography. David will continue to maintain this bibliography.

Meeting Notes by Year

2021 MIDI Task Group Meeting Notes

2022 MIDI Task Group Meeting Notes

July 20, 2021

WebEx recording of 7/20/2021 meeting

  • Introduction: Medical Image De-Identification Initiative (MIDI)
  • Task Group goals
  • Steering Committee
  • Timeline
  • Discussion

August 10, 2021

WebEx recording of 8/10/2021 meeting

  • Instructions to access the MIDI Task Group wiki page
  • Accept Mendeley invitation to access private group for literature review/annotated bibliography
  • Outline of approach
    • metadata vs. pixel data
    • structured (strongly typed) vs. text
    • burned-in text ("printed" and hand-written)
    • identifiable features (e.g., faces, iris, retina)
    • with or without "public" data to compare with
  • Challenging topics
    • evaluation of success of de-identification
    • quantitative comparison of performance
    • quantifying re-identification risk
    • creating test data sets
    • faces (etc.) reconstructed from cross-sections
    • burned-in text - detection, removal, cleaning
    • cleaning text descriptors (metadata or burned in)
    • buried metadata (e.g., EXIF, geotags in JPEG inside DICOM)
    • dates (incl. preserving temporal relationships)
    • pseudonym consistency across separate submissions
    • risks of hashing to create pseudonymous identifiers
    • uniqueness of images limits statistical approaches
    • loss allowable during de-identification (e.g., age fuzzing, pixels)
    • private data element preservation to retain utility
    • ultrasound - still frames and cine loops, lossy compressed
    • photographs and video
    • gross pathology and whole slide images (incl. labels)
    • IRB/ethics committee messaging wrt. de-identification decisions
    • IT security approval/audits of de-identification
    • regulatory requirements: HIPAA Privacy Rule, GDPR, CCPA, others?
    • sufficiency of standards, e.g., DICOM PS3.15 Annex E
    • risk of not following a standard (home-grown decisions)
    • threat of image "signatures", private set intersection methods
    • policy versus the technical details of recompression/decompression artifacts for JPEG
    • data minimization
  • Inventory of tools
    • user interface vs. scripted (bulk, service)
    • configurable - user vs. installer vs. hard-coded
    • platform, language
    • open source, free, commercial, service
    • on-site vs. outside (e.g., [IP]II needs to leave walls for AI on cloud)
  • Roadmap and deliverables
    • interim report
    • full report
    • "primer" on medical image de-identification for newbies/execs
    • confirm what is out of scope (non-goals) - consent, data use agreements, ...
  • Tasking: Members to think about which task they would like to contribute to.
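
Three of the challenging topics above (dates with preserved temporal relationships, pseudonym consistency across separate submissions, and the risks of hashing to create pseudonymous identifiers) can be illustrated with a minimal Python sketch. It is not a method endorsed by the Task Group. The key value, the "ANON-" prefix, the MRN format, and all function names are hypothetical and chosen only for illustration.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Hypothetical project secret (assumption); a real key would be stored securely.
SECRET_KEY = b"example-project-secret"


def pseudonym(patient_id: str, key: bytes = SECRET_KEY) -> str:
    """Keyed pseudonym: the same ID and key give the same result in every submission."""
    digest = hmac.new(key, patient_id.encode("utf-8"), hashlib.sha256).hexdigest()
    return "ANON-" + digest[:16]


def date_offset_days(patient_id: str, key: bytes = SECRET_KEY, max_days: int = 365) -> int:
    """Per-patient shift between 1 and max_days, derived from the key."""
    digest = hmac.new(key, b"date:" + patient_id.encode("utf-8"), hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big") % max_days + 1


def shift_date(patient_id: str, d: date, key: bytes = SECRET_KEY) -> date:
    """Move a date back by the patient's offset; intervals within a patient are kept."""
    return d - timedelta(days=date_offset_days(patient_id, key))


def unkeyed_hash(patient_id: str) -> str:
    """Plain SHA-256 with no key, shown only to illustrate the hashing risk."""
    return hashlib.sha256(patient_id.encode("utf-8")).hexdigest()


def recover_by_enumeration(target_hash: str, candidates):
    """Return the candidate ID whose unkeyed hash matches, or None if none does."""
    for candidate in candidates:
        if unkeyed_hash(candidate) == target_hash:
            return candidate
    return None


if __name__ == "__main__":
    baseline, followup = date(2021, 7, 20), date(2021, 8, 10)
    shifted = (shift_date("MRN000123", baseline), shift_date("MRN000123", followup))
    print(pseudonym("MRN000123"), shifted[1] - shifted[0])
    # A small, guessable ID space makes an unkeyed hash reversible by brute force.
    guesses = (f"MRN{n:06d}" for n in range(1000))
    print(recover_by_enumeration(unkeyed_hash("MRN000123"), guesses))
```

The keyed construction keeps pseudonyms stable only while the same key is reused, so key custody becomes part of the de-identification policy. The enumeration function shows why an unkeyed hash of a short identifier gives little protection.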

September 14, 2021

WebEx recording of the 9/14/2021 meeting

  • Role of AI in de-identification - demand for data, opportunities, threats
    • Google has a de-id tool
    • Amazon Comprehend
    • Identifying images at risk: which images are more likely than others to contain burned-in information?
    • Problem with scalability in terms of building the ruleset. Better to identify selectively.
    • Barcodes, pacemaker serial numbers, implanted devices
    • There is the potential of identifying objects but not the raw data.
    • Action: Describe the steps involved in imaging and the evolution of data in different levels of processing
      • Case-based data
    • Is raw data in our purview?
    • Raw data is often in proprietary format and can lack a header.
    • Post-processed data like 3D reconstructions
    • What is the harm of reidentification? High-resolution 3D image of the face
    • Penetration testing as applied to de-ID
    • How to evaluate the success of de-facing?
    • Newman, L. H. (2016). AI Can Recognize Your Face Even If You’re Pixelated. Wired. https://www.wired.com/2016/09/machine-learning-can-identify-pixelated-faces-researchers-show/
    • When is it okay to release information that you know is identifiable? Example of boy in NYT.
    • Sometimes reidentification does not provide any new data.
    • What do you now know that you didn't know before?
    • Expectations of doing better deidentification and the threats of better reidentification. What can we do now and what in the future with AI?
    • Do you expect that one day a machine will replace your manual deidentification process? Can a robot replace human review?
    • Can you accept the risk of AI/machines/code? Get to the level of risk that is tolerable.
    • Main topic for the next call: the need for human QC.
    • When will you stop using humans or a targeted subset?
    • What would increase your comfort level to help you stop using human QC?

October 12, 2021

WebEx recording of the 10/12/2021 meeting