[Picture of Mark Musen]

When left to their own devices, scientists do a terrible job creating the metadata that describe the experimental datasets that make their way into online repositories. The lack of standardization makes it extremely difficult for other investigators to find relevant datasets, to perform secondary analyses, and to integrate those datasets with other data. At Stanford, we are leading the Center for Expanded Data Annotation and Retrieval (CEDAR), a center of excellence in the NIH Big Data to Knowledge Program, which has the goal of enhancing the authoring of experimental metadata to make online datasets more useful to the scientific community. CEDAR technology includes methods for managing a library of templates for representing metadata, and interoperability with a repository of biomedical ontologies that normalize the way in which the templates may be filled out. CEDAR uses a repository of previously authored metadata from which it learns patterns that drive predictive data entry, making it easier for metadata authors to perform their work. Ongoing collaborations with several major research projects are allowing us to explore how CEDAR may ease access to scientific datasets stored in public repositories and enhance the reuse of the data to drive new discoveries.

Session details...

 

 

BIO:

...