NIH | National Cancer Institute | NCI Wiki

Welcome to the CBIIT Speaker Series Wiki


The NCI Center for Biomedical Informatics and Information Technology (CBIIT) Speaker Series is a bi-weekly knowledge-sharing forum featuring speakers on topics of interest to the biomedical informatics and research communities. Topics include, but are not limited to, novel experimental approaches in basic research that require innovative informatics solutions; general informatics methodologies for specific tasks such as natural language processing and data exchange/integration; novel software applications (proprietary or open source); standards; ontologies; open-source development projects; human/computer interactions; future trends in biomedical informatics research and development; and CBIIT/NCIP partnerships inside and outside NCI/NIH.

Speaker Series Guidelines for Speakers: Download Word document

Please refer to the Speaker Calendar below for upcoming speakers.

Presentations: Please visit the NCI CBIIT Speaker Series YouTube playlist to view past speakers' presentations on video.

Location: 9609 Medical Center Drive, Rockville, Maryland 20850

Questions? Please email Eve Shalley.



An invitation: If you are interested in presenting your work to our diverse audience of informaticists; basic, translational, and clinical researchers; software developers; and others interested in exploring the uses of informatics in cancer research, contact Eve Shalley by email or at 240-276-5194.


Upcoming Speakers:

May 11: Helen Berman, Rutgers University

May 25: Ewa Deelman, University of Southern California Information Sciences Institute

June 22: CTIIP Team

July 6: CTIIP Team

July 20: Ed Helton, NCI CBIIT

September 14: Aviv Regev, MIT, Broad Institute

September 28: Funda Meric-Bernstam, University of Texas MD Anderson Cancer Center

October 12: Samir Courdy, University of Utah and Joyce Niland, City of Hope

October 26: Guoqin Yu, NCI

November 9: John Schnase, NASA

Helen Berman SYNOPSIS:

As the crystal structures of biological macromolecules were being determined, a new field of structural biology was born. Inspired by these new structures, the scientific community worked to establish a home to archive and share the data emerging from these experiments. The Protein Data Bank (PDB) was established in 1971 with seven structures. The PDB provides a repository for scientists who generate the data, and an access point for researchers and students to find the information needed to drive additional studies. Today, the PDB contains and supports online access to ~117,000 biomacromolecules that help researchers understand aspects of biology, including medicine, agriculture, and biological energy. The ways in which the interrelationships among science, technology, and community have driven the evolution of the PDB resource for more than forty years will be discussed. The PDB archive is managed by the Worldwide Protein Data Bank, whose members are the RCSB PDB, PDBe, PDBj and BMRB.

Session details...

Curtis Langlotz SYNOPSIS:

The imaging report is an essential source of clinical imaging information. It documents critical information about the patient's health and provides a professional interpretation of the images. However, the vast majority of report information remains narrative, a major obstacle to the rapid extraction and re-use of discrete imaging data. Structured reporting facilitates linking of imaging observations to clinical and genomic data, and is increasingly being adopted by clinical imaging practices. However, most imaging reports are used only once by the clinician who ordered the imaging study and are rarely used again for research, clinical care, or analytics. This presentation will describe the likely future of the imaging report, including efforts underway to standardize radiology report information, and the use of machine learning and natural language processing techniques to extract the semantic elements of the radiology report. These novel technologies enable connections between images and the electronic health record, and represent a vital part of the future of medical research.

Session details...

Martin Morgan SYNOPSIS:

Bioconductor is a widely used collection of R packages for the statistical analysis and comprehension of high-throughput genomic data. Bioconductor has strengths in sequence (RNA-seq, ChIP-seq, called variants, ...) and microarray (expression, methylation, copy number, ...) analysis, as well as significant facilities for flow cytometry, proteomics, and many other omics domains. The breadth of available facilities, coupled with principles of interoperability and reproducibility, makes Bioconductor an ideal platform for integrative approaches to cancer genomics. This presentation outlines technical aspects of recent and forthcoming facilities to enable integrative cancer genomic analysis in Bioconductor. We discuss our own work to enable routine integration of large-scale consortium (e.g., ENCODE, Ensembl) annotation into analysis workflows, development within Bioconductor of facilities to manage multiple-assay experiments, and approaches to scaling R's in-memory model to large-scale data sets. The presentation concludes with a brief overview of integrative approaches contributed to Bioconductor by our international contributors.

Session details...

David Hanauer SYNOPSIS:


With the continued adoption of electronic health record (EHR) systems, healthcare centers are developing large repositories of unstructured clinical notes that were created as part of routine care. These data contain rich details that are often found nowhere else in the EHR, and can be valuable for research tasks ranging from cohort identification and eligibility determination to extracting phenotypic details in support of clinical and translational research. However, access to the data "locked" within these documents has historically been challenging for research teams, many of whom lack the expertise to utilize natural language processing tools. To address this problem, we developed the Electronic Medical Record Search Engine (EMERSE), an information retrieval tool designed with the end user in mind. Careful attention has been paid to usability and to ensuring that EMERSE has the type of functionality needed by a majority of researchers who need access to the data found within the clinical notes. EMERSE has been used, and continues to be enhanced, at the University of Michigan for over 10 years, and has had a wide and highly satisfied user base. One of the largest collective user groups has been our Cancer Center's Clinical Trials Office. EMERSE is available at no cost for academic use, and we are actively seeking partners interested in adopting the tool. Additional information is available online. In this talk, we will provide a live demonstration of the tool, walking through its various features and capabilities to show the kinds of tasks it can be used for.

Session details...

Complete List of Update Posts

Speaker Calendar

