The field of nanomedicine faces many challenges in developing standards to support meaningful data submission and information exchange. Nanomaterial characterization requires numerous physico-chemical, in-vitro, and in-vivo assays, in which measurements mostly depend on non-standardized protocols and diverse technology types. Information describing the nanomaterial, including functionalizing entities and three-dimensional (3D) structure, is often represented inconsistently. In addition, there has been no standard way to associate this information with the data and metadata from characterization studies. This lack of standardization has been a significant deterrent to meaningful data sharing across the nanotechnology community; few publications contain sufficient information to enable adequate interpretation of results or to reproduce experiments. Furthermore, there has been very limited success in using non-standardized data to represent or derive structure-activity relationships (SARs), which are critical for understanding the effects of nanomaterial structure on biological activity in nanomedicine.
The nano-TAB specification is intended to facilitate the submission and exchange of nanomaterial descriptions and characterization data (metadata and summary data), along with other files (raw/derived data files, image files, protocol documents, etc.), among individual researchers and to/from nanotechnology resources like the NCI’s cancer Nanotechnology Laboratory (caNanoLab) portal \[[https://cananolab.nci.nih.gov/caNanoLab]\] and the Nanomaterial-Biological Interactions (NBI) knowledgebase \[[http://nbi.oregonstate.edu/]\]. Nano-TAB also serves to empower organizations to adopt standard methods for representing data in nanotechnology publications, and to provide researchers with guidelines for representing nanomaterials and characterizations to achieve cross-material comparison.
The nano-TAB project is an effort of the National Cancer Institute (NCI) Cancer Biomedical Informatics Grid (caBIG®) Nanotechnology Informatics Working Group (Nano WG).
The nano-TAB format specification is based on an existing specification developed by the European Bioinformatics Institute (EBI), namely, the investigation/study/assay (ISA-TAB) format specification. The ISA-TAB format (http://isatab.sourceforge.net/) is used by the ‘omics’ (proteomics, genomics, metabolomics, and transcriptomics) communities to share data and metadata associated with different assays and technology types in their experiments. The ISA-TAB file structure relies on three primary files: the investigation, study, and assay (ISA) files. Raw/derived data files and any other files (e.g., image files, protocol documents) specific to each assay are shared along with the three primary ISA-TAB files if the data files are referenced in the primary ISA-TAB files. ISA-TAB does not provide a format specification for files other than the investigation, study, and assay files. The ISA-TAB investigation file is used for three purposes: (1) to record all declarative information referenced in other files; (2) to relate assay files to study files; and (3) to group multiple study files that are part of the same investigation. The ISA-TAB study file is used to record information about the source, sampling methodology, treatment, preparation, and characteristics of the subjects (biospecimens) studied using one or more assays under an investigation.
The NCI Enterprise Vocabulary Services (EVS) is a project of the NCI Center for Biomedical Informatics and Information Technology (CBIIT). EVS provides controlled terminologies and ontologies in support of the biomedical research and informatics activities of the NCI and its partners, including the caBIG® community. The activities of the EVS include development of terminologies, development of terminology-related software, and operations support to address the broad spectrum of terminology needs in the cancer research enterprise. Among the vocabularies that EVS supports is the NanoParticle Ontology (NPO), for which EVS provides terminology development facilities and terminology servers that are available both via the web and programmatically through EVS server APIs. Additionally, EVS presents the NPO to the public both in a standalone format and as a component of the NCI Metathesaurus, where its concepts are mapped to the concepts of other vocabularies used by the NCI community. The EVS-managed NCI Thesaurus (NCIt) also includes nanotechnology concepts that have been used in the development of the NCI caNanoLab. Data from caNanoLab has been used in the nano-TAB example files.
The caBIG® Life Sciences Domain Analysis Model (LS DAM) \[[https://wiki.nci.nih.gov/x/cxRlAQ]\] provides a shared view of the semantics of the life sciences domains that are represented by the different workspaces in the caBIG® infrastructure. It has a nanotechnology subdomain, which was developed based on the caNanoLab object model and NPO terms. LS DAM makes a distinction between biospecimens (for example, cell lines, tissue samples, body fluid samples, organ parts) and materials that are not derived from a cell, tissue, organ, or body (for example, nanoparticle formulations, drug formulations, solvents, and so forth). This motivated the use of the term “material sample” in the nano-TAB material file. Weekly Nano WG web conferences were used to ensure the alignment of nano-TAB with the LS DAM.
Like ISA-TAB, nano-TAB provides fields for entering and referencing terms selected from ontologies and standard terminologies. The ontologies are available at BioPortal (http://www.bioontology.org), which is maintained by the National Center for Biomedical Ontology. Although investigators may use alternative ontology and vocabulary sources, evaluating and sharing data requires that all parties have access to the sources being used. All terms and fields used in this standard utilize the NCI EVS (http://evs.nci.nih.gov) and NanoParticle Ontology elements.
The NanoParticle Ontology (NPO) \[[http://www.nano-ontology.org]\] is an ontology that is designed and developed within the framework of the Basic Formal Ontology (BFO) \[2\] and implemented in the Web Ontology Language (OWL) \[3\]. It is being developed to represent the knowledge underlying the description, preparation, and characterization of nanomaterials. NPO development began with the representation of knowledge underlying the chemical composition, preparation, and physico-chemical and functional/biological characterization of nanoparticles that are formulated and tested for applications in cancer diagnostics and therapeutics. The NPO provided the knowledge framework for developing the nano-TAB material file format, and it provides a subset of the terms and relationships for the description and characterization of nanomaterials in the nano-TAB file format. The NPO is being further developed for the following purposes: (1) to provide terms for annotating nanotechnology research data; (2) to provide the knowledge framework required for developing data-sharing models and standards in nanomedicine; (3) to enable semantic integration of data; (4) to enable unambiguous interpretation of the description and characterization of nanomaterials; and (5) to enable knowledge-based searching and comparison of nanomaterial descriptions and characterization results.
Nano-TAB extension to ISA-TAB: While nano-TAB leverages the three primary ISA-TAB files, it extends ISA-TAB by providing a specification for a fourth file (called the material file) for representing the composition and characteristics of nanoparticle formulations and small molecules. Raw/derived data files and any other files (e.g., image files, protocol documents) specific to each assay have to be shared along with the four primary nano-TAB files. Nano-TAB does not provide any specification for how to format files other than the four primary files: the investigation, study, assay, and material files. Although nano-TAB adopts ISA-TAB field names and their definitions in the investigation, study, and assay files, some of the definitions are modified and additional fields are introduced. These modifications and extensions are required to expand the scope of information captured from nanotechnology data sets into the nano-TAB files.
In nanotechnology, samples from biological and non-biological sources can be the primary subjects of a study. Therefore, in nano-TAB, samples derived from biological sources are called biological specimens or biospecimens (e.g., cell lines, body fluids, organs), whereas samples derived from non-biological sources are simply called material samples (e.g., nanomaterials, nanoparticle formulations, small molecules). For physico-chemical characterizations of nanomaterials, the sample is the nanomaterial. For in-vitro and in-vivo characterizations, the sample is the biological specimen (cell line, animal, and so forth). Hence, in nano-TAB, the concept of a sample (as used in the ISA-TAB specification) is redefined to include both biological specimens and material samples. The ISA-TAB study file can only be used to record the source and characteristics of biospecimens studied in an assay, and cannot support the representation of materials. Therefore, in nano-TAB, the material file is used to describe material samples, while the study file is used to describe biospecimens.
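The sample rule in this paragraph can be sketched as a small lookup. This is only an illustration: the assay category labels, the `SampleKind` enum, and the function name are hypothetical and are not part of the nano-TAB specification.

```python
from enum import Enum


class SampleKind(Enum):
    """The two kinds of sample that nano-TAB distinguishes."""
    BIOSPECIMEN = "biospecimen"
    MATERIAL_SAMPLE = "material sample"


# Hypothetical category labels; the text only names the three assay families.
SAMPLE_KIND_BY_ASSAY_CATEGORY = {
    "physico-chemical": SampleKind.MATERIAL_SAMPLE,  # the sample is the nanomaterial
    "in-vitro": SampleKind.BIOSPECIMEN,              # the sample is e.g. a cell line
    "in-vivo": SampleKind.BIOSPECIMEN,               # the sample is e.g. an animal
}

# Which primary nano-TAB file describes each kind of sample.
DESCRIBING_FILE = {
    SampleKind.MATERIAL_SAMPLE: "material file",
    SampleKind.BIOSPECIMEN: "study file",
}


def describing_file_for(assay_category):
    """Return the primary file that describes the sample of an assay category."""
    kind = SAMPLE_KIND_BY_ASSAY_CATEGORY[assay_category]
    return DESCRIBING_FILE[kind]
```

Under these assumptions, `describing_file_for("physico-chemical")` yields the material file, while the in-vitro and in-vivo categories yield the study file.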
ISA-TAB specifies that the names of the primary files end with the .txt extension. Nano-TAB file names may end with either the .txt or the .xls extension. The nano-TAB files used as examples in this document were prepared as Excel spreadsheets, and so their filenames have the .xls extension.
When sharing primary nano-TAB files, other files referenced in these files have to be shared along with the primary files. It is anticipated that content management systems will become available to facilitate the sharing and exchange of files. Until then, these files could be bundled together in a folder and shared as a zip file.
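As an interim approach, the bundling step could look like the following minimal sketch, which uses Python's standard `zipfile` module. The function name and the folder layout are assumptions made for illustration; nano-TAB itself does not prescribe a packaging tool.

```python
import zipfile
from pathlib import Path


def bundle_nanotab(folder, archive_path):
    """Zip every file under `folder`; return the sorted archive member names."""
    folder = Path(folder)
    archive_path = Path(archive_path)
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(folder.rglob("*")):
            # Skip directories, and the archive itself if it lives in `folder`.
            if path.is_file() and path.resolve() != archive_path.resolve():
                # Store paths relative to the folder so that file names
                # referenced in the primary files still resolve after unzipping.
                archive.write(path, arcname=path.relative_to(folder).as_posix())
    with zipfile.ZipFile(archive_path) as archive:
        return sorted(archive.namelist())
```

Storing members relative to the bundle folder keeps the primary files and the data files they reference side by side when the recipient extracts the archive.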
Nano-TAB uses the three primary files of ISA-TAB (the investigation file, study file, and assay file) and introduces a fourth file called the material file (FIG 1). Other files referenced in the nano-TAB files, such as raw/derived data files, image files, and protocol documents, have to be shared along with the nano-TAB files.
FIG 1. Nano-TAB File Structure
FIG 2 describes the nano-TAB file development process. Typically, the investigation file is developed first and describes the overall investigation and its associated studies and assays. The investigation file is named “i_xxx.txt” or “i_xxx.xls,” in which xxx can be any name provided by the investigator. Once the investigation file has been completed, one or more study files (following the convention “s_xxx.txt” or “s_xxx.xls”) can be created. Similarly, one or more material files can be created. The material file describes the nanomaterial (or small molecule) and its components, including structural information, and follows the naming convention “m_xxx.txt” or “m_xxx.xls”. Assay files (following the convention “a_xxx.txt” or “a_xxx.xls”) are created for all assays performed. Each assay is defined by the endpoint measured and the technique used to measure that endpoint. Data files (raw or derived) specific to each type of assay can be associated with the respective assay files by referencing the names of the data files in the assay files.
FIG 2. Nano-TAB File Development Process
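The file naming conventions from the development process can be checked with a short helper. This is a sketch under stated assumptions: the function name is hypothetical, matching is case-sensitive, and any filename that does not fit the pattern (for example a data or image file) is simply reported as not being a primary file.

```python
import re

# Prefix-to-role mapping taken from the naming conventions in the text.
ROLE_BY_PREFIX = {
    "i": "investigation",
    "s": "study",
    "m": "material",
    "a": "assay",
}

# <prefix>_<name chosen by the investigator>.<txt or xls>
_NAME_PATTERN = re.compile(r"^([isma])_(.+)\.(txt|xls)$")


def classify_nanotab_file(filename):
    """Return the primary-file role for `filename`, or None if it is not one."""
    match = _NAME_PATTERN.match(filename)
    if match is None:
        return None
    return ROLE_BY_PREFIX[match.group(1)]
```

For example, `classify_nanotab_file("m_gold.xls")` returns `"material"`, and a name such as `"image.png"` returns `None`.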
Once the nano-TAB files have been created, they can be validated and submitted to nanotechnology resources that support the nano-TAB specification. It is anticipated that validation of the files may occur via a validation service that leverages a modified version of the ISA-TAB validator \[[http://isatab.sourceforge.net/validator.html]\]. It is also anticipated that nanotechnology resources like caNanoLab ([https://cananolab.nci.nih.gov/caNanoLab/]), the Nanomaterial-Biological Interactions (NBI) knowledgebase ([http://nbi.oregonstate.edu/]), and other resources will provide facilities for importing/exporting nano-TAB files as the nano-TAB specification evolves.