Definition of project
The goal of this project is to survey publicly available in vivo medical imaging archives and their underlying software capabilities. It is generally agreed that there is a need for public medical imaging archives to provide the biomedical research community, industry, and academia with access to images that support:
- Lesion detection and classification
- Accelerated diagnostic imaging decisions
- Quantitative imaging assessment of drug response
The purpose of this project is to provide a practical guide that allows the community to:
- assess existing software and instantiations that are appropriate to their research or clinical needs
- locate relevant publicly available data for research
Publicly Hosted Biomedical Imaging Archives
| | | | | | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Sponsor(s) | Cancer Imaging Program, caBIG | NIAMS, caBIG | BIRN | Lab of NeuroImaging UCLA (LONI) | BIRN | BIRN | Lung Cancer Alliance | Optical Society of America | NDAR | | | | |
| Content Type | In Vivo Cancer Imaging | Osteoarthritis | Osteoarthritis | ADNI (Alzheimers), | Alzheimer's Medical Imaging | ELUDE (Elderly Depression), | Patient-contributed Medical scans | Autism Research | Autism | | | | |
| Archive Software | NBIA | XNAT | Image Data Archive | custom | XNAT | MIDAS | MIDAS | | | | | | |
| Central curation/review | Yes | | | | | | | | | | | | |
| Explicit data sharing policy | | | | | | | | | | | | | |
| Availability/Uptime | ~99% | | | | | | | | | | | | |
| Submission Technology | | | | | | | | | | | | | |
| Standard of De-Identification | Supplement 142 w/ Keep Description, Retain Device Info options enabled | | | | | | | | | | | | |
| Support for multi-site submissions | Yes | | | | | | | | | | | | |
| Project- or Collection- based groupings? | Yes | | | | | | | | | | | | |
| Size of Current Volume | ~3TB | | | | | | | | | | | | |
| Federated implementation | Yes | | | | | | | | | | | | |
| number of DICOM Tags query-able | ~90? | | | | | | | | | | | | |
| DICOM transfer protocol | For submission, but not download | | | | | | | | | | | | |
| methods to download DICOM data | Web, FTP, Java Webstart client | | | | | | | | | | | | |
| Affiliation with Journal | No | | | | | | | | | | | | |
| Support New Collection? | Yes | | | | | | | | | | | | |
caBIG
NBIA
- Open source, federated model for grid-based image sharing.
- Allows for "nodes" to be deployed in other institutions, which can then be connected via the grid so they are query-able from any NBIA server.
- Includes both public data access without login and the option to build secure, role-based Collections that require logging in for access
- Simple search query options include node selection, modality, contrast enhancement, anatomical site, image slice thickness, Collections (pre-defined sets of related images), the date the images were made available, and whether or not they have associated annotations
- Advanced and Dynamic search options available to search upwards of 70 additional DICOM tag attributes
- Images can be viewed on the web site via JPG thumbnails or using the Cine functionality
- Allows for storage of annotation files and metadata, though there are plans to move most of this functionality to AIM data services in the near future
- Downloads are available through 3 methods: a Java Web Start client (no size limit, images are not zipped), HTTP (zipped, for downloads <3GB in size), or an emailed link to an FTP site (zipped, for downloads >3GB in size)
- Currently 2 major nodes hosted at NCI, one for CIP/CBIIT and another for NIAMS.
- CIP/CBIIT
- http://imaging.nci.nih.gov
- Contains ~25 purpose-built image collections, ~15 of which are publicly accessible to anyone. More information on the contents of each collection can be found at https://wiki.nci.nih.gov/x/lRWy
- Collections range in size from as small as a handful of patients up to larger collections such as LIDC-IDRI (1010 patients) or the colonoscopy collections (over 800 patients each)
- Approximately 2.5 TB of image data as of August 2010
- NIAMS
- https://niams-imaging.nci.nih.gov/
- Contains data on nearly 5,000 osteoarthritis patients as of August 2010
- Modalities include MRI and CR
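The size-based download rule described in the NBIA list above can be sketched as a small helper. This is only an illustration and not part of NBIA: the function name `choose_download_method`, its parameters, and the handling of a download of exactly 3 GB (which the description above does not specify) are assumptions.

```python
def choose_download_method(size_gb, use_webstart_client=False):
    """Pick a download route following the rule described above.

    Hypothetical helper for illustration only; it is not an NBIA API.
    """
    if size_gb < 0:
        raise ValueError("size_gb must be non-negative")
    if use_webstart_client:
        # The Java Web Start client has no size limit and does not zip images.
        return "Java Web Start client (unzipped)"
    if size_gb < 3:
        # Smaller requests are zipped and served directly over HTTP.
        return "HTTP (zipped)"
    # Assumption: a download of exactly 3 GB is treated like the larger case.
    return "emailed FTP link (zipped)"


if __name__ == "__main__":
    for size in (0.5, 2.9, 10):
        print(size, "GB ->", choose_download_method(size))
```

The Web Start client is modeled as an explicit choice because, per the list above, it works for any download size, while the HTTP and FTP routes are split by the 3 GB threshold.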
LONI
LONI homepage: http://www.loni.ucla.edu/
Image Data Archive - http://www.loni.ucla.edu/Research/Databases/
- Uses the LONI Image Data Archive. It is unclear whether this software can be deployed and hosted elsewhere for other archives; I did not immediately notice any place to download the code.
- Requires secure login and has role-based access to various Projects in their archive.
- Allows search based on project (prior to the simple query page) via Simple or Advanced query. The Simple query covers subject ID, modality, research group, series description, weighting, sex, age, slice thickness, and acquisition plane. The Advanced query offers 27 different search criteria pertaining to the patient, project, clinical info, study info, and image attributes. (Note: the Advanced query only seems to be an option with the ADNI data set.)
- Viewing images on the web site launches the "LONI Image Viewer", a Java-based viewer that lets you scroll through image slices in axial/sagittal/coronal orientations as well as zoom, pan, flip (horizontal/vertical), and adjust brightness and contrast.
- Specific series/studies can be added to custom collections, which can then be downloaded or shared. Downloads can be in native DICOM format or in NIfTI, ANALYZE, or MINC format (all three are neuroimaging file formats). The download is managed through a Java applet and includes the images plus a couple of XML files containing some patient and equipment metadata.
- Data collections include:
- Alzheimer's Disease Neuroimaging Initiative (ADNI) - Consists of both image data (MRI and PET), image meta-data, and additional clinical/genetic/numeric summary data for 895 patients
- BRIN - When logged in with my "ADNI (GUEST)" access, this Project appeared to have only one patient, a 3-year-old. No description of this data set is on the "Available Data" page
- Cryosection Imaging (CRYO) - 3 patients, histology modality
- Public Anonymized Dataset (PAD) - 3 "normal control" patients, MRI modality
- International Consortium for Brain Mapping (ICBM) - 851 "normal control" patients, modalities including MRI, fMRI, MRA, DTI, and PET. I do not have access to this data set currently so did not review its contents.
- Australian Imaging Biomarkers & Lifestyle Flagship Study of Ageing (AIBL) - 285 patients including controls, MCI, and AD subject groups. Modalities include MRI and PET. I do not have access to this data set currently so did not review its contents.
BIRN
BIRN data portal: http://www.birncommunity.org/resources/data/
BIRN is connected with multiple institutions which host multiple archives using different software and containing different data sets.
- Function BIRN Data Repository - http://fbirnbdr.nbirn.net:8080/BDR/
- Contains 5 data sets according to the BIRN data page (but it's actually 4).
- The home page displays a description of each data set and then presents pre-packaged data files in a drill-down, tree-style menu that lists what appear to be individual patients in some cases and individual image studies in others.
- Querying across the data is possible using metadata. It does not appear that querying against DICOM tags is possible.
- The site itself seems to be a custom web site, not something that could be easily used for other archives.
- Downloads happen in the browser. You choose what you want to download, the request is processed on the server, and an alert notifies you when your "job" is ready for download. When I tested a 100 MB patient download, I was notified quickly that my job was done, but the job page took about a minute to load before presenting a small table of information and buttons to actually download the file. Pressing the download button saved a .TAR file to my computer. The TAR structure contained many subdirectories sorted by study, then series, and so on. Image files had the DCM file extension, and a couple of XML files with some metadata were also included.
- The data sets include the following (the last one is not yet posted and may never be, since this was done in 2007):
- BrainScape Resting State fMRI Dataset 1 - This dataset includes seventeen healthy subjects with four resting state fixation scans plus one T1 scan and one T2 scan.
- BrainScape Resting State fMRI Dataset 2 - This dataset includes ten healthy subjects scanned 3 times with 3 conditions: eyes open, eyes closed, and fixating in addition to two anatomical scans (T1 and T2).
- Neuroimaging Calibration Study (Phase II) - The FBIRN multi-site dataset of subjects with schizophrenia and controls includes functional MRI images, behavioral data, demographic, and clinical assessments on 253 subjects from around the US.
- Traveling subjects study (Phase I) - This dataset includes five healthy subjects imaged twice at each of ten FBIRN MRI scanners on successive days.
- NIRL Imaging Database - http://nirlarc.duhs.duke.edu/nirle/
- Uses XNAT for the archive software (unsure what version or if any major customizations were made)
- Contains both image data and meta-data
- Includes the following data sets:
- Efficient Longitudinal Upload of Depression in the Elderly (ELUDE) - The ELUDE dataset is a longitudinal study of late-life depression at Duke University. There are 281 depressed subjects and 154 controls included (435 total patients). An MR scan of each subject was obtained every 2 years for up to 8 years (total of 1093 scans). Clinical assessments occurred more frequently and consisted of a battery of psychiatric tests including several depression-specific tests.
- Multisite Imaging Research In the Analysis of Depression (MIRIAD) - A multiple institution study of structural MRI, including raw PD and T2 MRIs and derived measures of white matter changes, basal ganglia and other regions. Demographic and extensive clinical assessment data is available for each case. (100 patients)
- Central XNAT - http://central.xnat.org/
- As of Aug 10, 2010 the site contains 169 projects and 2567 patients. However, the quality and accessibility of these appear to vary greatly, ranging from well-curated data sets to throwaway sets created by people testing how Central XNAT works.
- A subset of the collections mentioned on the BIRN data web page (http://www.birncommunity.org/resources/data/) are hosted here including the following:
- mBIRN Calib - http://central.xnat.org/app/template/XDATScreen_report_xnat_projectData.vm/search_element/xnat:projectData/search_field/xnat:projectData.ID/search_value/Calib
- This data set consists of spoiled gradient-recalled echo magnetic resonance imaging data from 5 healthy volunteers (four males and one female) scanned twice at four sites having 1.5T systems from different vendors (Siemens, GE, Marconi Medical Systems).
- OASIS - http://central.xnat.org/app/template/XDATScreen_report_xnat_projectData.vm/search_element/xnat:projectData/search_field/xnat:projectData.ID/search_value/CENTRAL_OASIS_CS
- This set consists of a cross-sectional collection of 416 human subjects aged 18 to 96, including individuals with Alzheimer's disease; T1-weighted MRI scans obtained in single scan sessions are included. Additionally, a reliability data set is included containing 20 nondemented subjects imaged on a subsequent visit within 90 days of their initial session.
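The Function BIRN download described above arrives as a TAR file with nested subdirectories holding DCM image files and a few XML metadata files. A minimal sketch of how such an archive could be inventoried with Python's standard `tarfile` module follows. The directory names, file names, and the helper functions `group_dicom_by_directory` and `build_example_tar` are made up for illustration; the actual layout of a BIRN download may differ.

```python
import io
import posixpath
import tarfile
from collections import defaultdict


def group_dicom_by_directory(fileobj):
    """Map each directory in a TAR archive to the .dcm file names it holds."""
    groups = defaultdict(list)
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        for member in archive.getmembers():
            # Skip directories and non-image files such as the XML metadata.
            if not member.isfile() or not member.name.lower().endswith(".dcm"):
                continue
            directory, filename = posixpath.split(member.name)
            groups[directory].append(filename)
    return {directory: sorted(names) for directory, names in groups.items()}


def build_example_tar():
    """Build a tiny in-memory TAR with a made-up study/series layout."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name in ("study1/series1/img001.dcm",
                     "study1/series1/img002.dcm",
                     "study1/series2/img001.dcm",
                     "study1/metadata.xml"):
            payload = b"placeholder"
            info = tarfile.TarInfo(name=name)
            info.size = len(payload)
            archive.addfile(info, io.BytesIO(payload))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    for directory, names in group_dicom_by_directory(build_example_tar()).items():
        print(directory, names)
```

For a real download, the same function could be given a file opened with `open(path, "rb")` instead of the in-memory example.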
Lung Cancer Alliance: Give A Scan
From their homepage:
*Give A Scan* is the world's first patient-powered, publicly available archive of images and clinical data on lung cancer patients. All the data has been donated by patients in order to encourage more researchers to focus on lung cancer and to accelerate progress in the early detection, diagnosis and treatment of lung cancer which is now the leading cause of cancer death worldwide.
As of August 2010 the archive contains 9 "communities", which appear to be 9 patients with lung cancer, totaling approximately 1GB of data. The site provides some metadata about the images and clinical info about the subjects. Images are hosted in DICOM format. The archive can be browsed by Community/patient/study/series or searched by modality and other image metadata.
The archive is hosted by the Kitware image archive solution called MIDAS: http://www.kitware.com/products/midas.html
Optical Society of America (OSA)
http://midas.osa.org/midaspub/
This archive is a collection of optical images. Like the Lung Cancer Alliance archive it is also hosted using Kitware's MIDAS archive software. This archive hosts 6 top level "communities" which contain anywhere from 4 to 242 items within them. In this case it seems not all of the data is DICOM but some of it is. Three of the Communities appear to be unrelated demonstration collections not tied to OSA as they contain lesion sizing data sets of lung images.
WebMIRS
http://archive.nlm.nih.gov/proj/webmirs/
WebMIRS is the National Library of Medicine's (NLM) tool for hosting two related datasets and the associated spine x-ray images, which are part of the National Health and Nutrition Examination Survey (NHANES). The site requires registration and login through a Java web client to view the data. Most of the data is text based, but there are spine x-rays for some of the patients. The client allows searching but not really browsing: the user must enter a Boolean search query in order to retrieve any patient results. It does not appear to be possible to download the images; they can only be viewed in the WebMIRS client.
NA-MIC
National Alliance for Medical Image Computing (NA-MIC) Image Gallery: http://www.na-mic.org/publications/gallery
As of August 2010 this consisted of ~376 images. All images appear to be JPGs or similar compressed file types rather than actual DICOM. The purpose of this gallery seems to be to create a repository for images, charts, and figures referenced in publications submitted to NA-MIC's publication database (http://www.na-mic.org/publications). Images can be browsed by patient/study/series or searched by modality and a number of other image based features.
Pediatric MRI Data Repository
Part of the National Database for Autism Research (NDAR) program.
According to http://ndar.nih.gov/ndarpublicweb/aboutNDAR.go#federation:
The Pediatric MRI Data Repository will be the first in this series to be made available to ASD researchers, in the summer of 2010. At that time, investigators will be able to perform a single query in the NDAR portal to view results across multiple datasets.
The original Pediatric MRI Data Repository is located at https://nihpd.crbs.ucsd.edu/nihpd/info/index.html. Access to the data requires filling out multiple forms and faxing them to an office at NIH to receive permission. I have not yet requested access to find out exactly what is in the archive. However, the description of their quality control processes reveals a little about the image protocols: https://nihpd.crbs.ucsd.edu/nihpd/info/quality_control.html
NIH Image Bank
The NIH Image Bank is located at http://media.nih.gov/imagebank/index.aspx
According to http://media.nih.gov/imagebank/about.aspx:
The NIH Image Bank contains images from the collections of the 27 institutes and centers that comprise the National Institutes of Health. Contents include general biomedical and science-related images, clinicians, computers, patient care-related images, microscopy images, and various exterior images.
The image bank appears to be intended mainly for promotional and marketing images. I did not notice any high quality medical images of actual patients or DICOM files which might be usable for research purposes.
European Organization for Research and Treatment of Cancer (EORTC)
http://www.eortc.be/services/forms/erp/default.aspx
According to their site, investigators can request access to data collected as part of EORTC trials after the primary endpoint has been published. I did not see any place that outlined which trials are being conducted, which trials have been completed and had their primary endpoint published, or exactly what types of data are collected. However, the PDF on this page, which outlines their data sharing policy in more detail, states in section "4.3 Data Transfer" that "data will preferentially transferred in the form of an ASCII file (with .dat extension), with associated SAS programs to load the data into SAS." I saw no mention of how they handle images in this section and thus assume they may not collect or distribute any.
Image archive software solutions
| | NBIA | XNAT | MIDAS | | | |
|---|---|---|---|---|---|---|
| Interface/GUI | Web | Web | | | | |
| Query flexibility | Simple (9 parameters), Advanced (10 more parameters), Dynamic (boolean query of up to 90 DICOM tags) | | | | | |
| Role Based Security | Yes | | | | | |
| Public access option | Yes | | | | | |
| Active Development | Yes | | | | | |
| Supports Federated Implementation | Yes | | | | | |
| API available | caGrid | | | | | |
| Supported image formats | DICOM | | | | | |
| Supported metadata formats | XML, Zip | | | | | |
| Helpdesk support | Yes | | | | | |
| Transfer protocols (import/export) | DICOM, HTTPS | | | | | |
| Controlled Vocabulary | caBIG | | | | | |
| Deployment Support | Yes | | | | | |
| Supported Operating Systems | Linux, Mac, Windows | | | | | |
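To make the "Dynamic" query level listed for NBIA more concrete, here is a minimal sketch of a boolean (AND/OR) match over DICOM tag values. It does not use or reproduce the NBIA query interface: the record layout, the function `matches`, and the sample values are assumptions for illustration. The keys are standard DICOM attribute keywords.

```python
def matches(record, criteria, combine="AND"):
    """Return True if a record's tag values satisfy the criteria.

    record   -- dict of DICOM attribute keyword -> value (hypothetical layout)
    criteria -- list of (keyword, expected_value) pairs
    combine  -- "AND" requires every pair to match, "OR" requires at least one
    """
    results = [record.get(keyword) == expected for keyword, expected in criteria]
    if combine == "AND":
        return all(results)
    if combine == "OR":
        return any(results)
    raise ValueError("combine must be 'AND' or 'OR'")


# Made-up series records, for illustration only.
SERIES = [
    {"Modality": "CT", "BodyPartExamined": "CHEST", "SliceThickness": "1.25"},
    {"Modality": "MR", "BodyPartExamined": "KNEE", "SliceThickness": "3.0"},
    {"Modality": "CT", "BodyPartExamined": "COLON", "SliceThickness": "2.5"},
]

if __name__ == "__main__":
    query = [("Modality", "CT"), ("BodyPartExamined", "CHEST")]
    print([s for s in SERIES if matches(s, query, "AND")])
    print(len([s for s in SERIES if matches(s, query, "OR")]))
```

A real archive would evaluate such criteria in its database rather than in memory; the sketch only shows how combining tag conditions narrows or widens a result set.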
NBIA
https://gforge.nci.nih.gov/frs/?group_id=312
XNAT
http://www.xnat.org/
http://www.xnat.org/2010+XNAT+Workshop+Agenda