Introduction

Informatics, the use of systematic methods for storing, communicating, and analyzing research data, has had an impact on many fields of study, particularly in the biological research community (Hey and Trefethen 2003). Informatics has also recently gained traction in the nanomaterial research community as a method for addressing the need to (1) handle large and complex datasets; (2) organize and link information from disparate sources; and (3) mine and process data generated from across diverse research silos, including biological, environmental, ecological, and risk assessment. Scientists engaged in nanoinformatics efforts are currently focused on several foundational areas including the use of standardized or controlled vocabularies (International Organization for Standardization 2012), standardized data formatting (American Society for Testing and Materials 2013), and minimal information standards for data interoperability (Hackley et al. 2009), as in Fig. 1. The adoption of standards in these key areas would increase the number of discoveries regarding nanomaterial implications using data visualization tools and computational, predictive modeling.

Fig. 1

Frameworks key to the nanoinformatics effort

Minimal information standards (MIS) for nanomaterial characterization, also called minimal information about nanomaterials (MIAN), have been proposed by several institutions (Boverhof and David 2010; Sayes and Berg 2011; Hull and Bowman 2009), but none has yet received community-wide adoption. To date, these efforts have centered on defining a list of physico-chemical characteristics (PCC) that should be known in order for a nanomaterial to be considered well characterized. The organizations calling for these sets of MIAN were pioneering the guidelines for nanomaterial characterization. However, because many nanomaterial measurands are sensitive to associated process parameters and measurement protocols (for example, Sayes et al. (2007) on the effects of media type on aggregate size and surface charge), a MIAN, in practice, should include not only a list of PCC but also important metadata that could impact the nanomaterial measurands. Furthermore, a nanomaterial should have a well-characterized PCC profile based on MIAN specifications in order for its contribution to collective analyses to be as valuable as possible.

The Nanomaterial Registry is a growing, flexible tool that is positioned to meet the community’s need for a centralized and publicly available data source. The data that go into the Registry are made authoritative through their curation by nanomaterial scientists, review by subject matter experts, structure through the MIAN, and their ability to be traced back to their original source (database and publication) through metadata collection. Via the Registry, researchers can access and utilize integrated data from sources such as reference material investigations, manufacturer specification sheets, and raw research data. The Registry has leveraged nanoinformatics community efforts to define a comprehensive MIAN for nanomaterials that includes 12 PCC, their measurands, and vital metadata associated with each PCC. The MIAN is used for systematically curating publicly available nanomaterial data in the Registry. Data from disparate sources have been brought into the Registry through curation and are available together for downstream analysis to researchers, manufacturers, regulatory agencies, and standards organizations in the nanomaterial community.

The minimal information about nanomaterials (MIAN)

Overview

Several recent efforts in MIS creation were leveraged to develop the Registry’s minimal characterization standards for nanomaterials (The Minimum Information for Nanomaterial Characterization Initiative 2008; Organization for Economic Co-operation and Development 2010). The Registry is the first online data source to promote an applied MIAN for data curation and sharing. This MIAN is the foundation of the Registry data model and was used to guide the construction of the Registry website (www.nanomaterialregistry.org). Information curated for any data record of the Registry is separated as much as possible into individual fields. This data parsing aspect of the curation process makes the data more accessible for programmatic analysis. The MIAN was also used to establish rules for the Registry website functionalities, including the compliance ratings and data matching system algorithms (Ostraat et al. 2013). The MIAN and Registry data model are flexible, meant to be evergreen, and can accommodate new information and insights regarding the importance of nanomaterial characterization as scientific discovery regarding nanomaterials continues. It is also important to note that the scope of the Registry data is MIAN specific: resource data that are not a part of the MIAN, such as magnetic or optical properties, are not currently curated. If a researcher is interested in additional information, the Registry provides sufficient metadata to allow the user to easily access the original source study record through web links to the references.

It has been observed that the values of critical measurands of nanomaterials may change over time due to such circumstances as the method and time of storage (Abdelwahed et al. 2006), the processing of nanomaterials, or even the process of characterizing the nanomaterials (Sayes et al. 2007). For this reason, information related to the nanomaterial’s instance of characterization (IOC) (Sayes and Warheit 2009; Sayes and Berg 2011) is captured in the MIAN. IOC is used to describe the time and conditions in which a nanomaterial was characterized and studied. Each nanomaterial record may include multiple characterizations and studies, each classified with its own IOC. The three categories of IOC are:

  • As Synthesized—native state of the nanomaterial when synthesized

  • As Received—native state of the nanomaterial when purchased or received

  • As Processed—non-native state of the nanomaterial once modified (and how modified)

Each IOC category has specified metadata, including, when applicable, the processing details (e.g., suspended, aerosolized, sonicated, stored, heated, etc.), the manufacturer or synthesis laboratory name, the product name and lot number, the synthesis method, the digital object identifier (DOI) citation of the synthesis procedure, and the nanomaterial’s physical state.
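As an illustration of how the IOC categories and their metadata could be held as separate fields, the following Python sketch models one IOC entry. It is not the Registry's actual data model: the class and field names are hypothetical, and the rule that an As Processed entry must state how the material was modified is an assumption drawn from the category description above.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class IOC(Enum):
    """The three IOC categories named in the text."""
    AS_SYNTHESIZED = "As Synthesized"
    AS_RECEIVED = "As Received"
    AS_PROCESSED = "As Processed"


@dataclass
class InstanceOfCharacterization:
    # Hypothetical field names mirroring the metadata listed in the text.
    category: IOC
    physical_state: Optional[str] = None
    manufacturer_or_lab: Optional[str] = None
    product_name: Optional[str] = None
    lot_number: Optional[str] = None
    synthesis_method: Optional[str] = None
    synthesis_doi: Optional[str] = None
    processing_details: List[str] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed rule: a processed material must record how it was modified.
        if self.category is IOC.AS_PROCESSED and not self.processing_details:
            raise ValueError("As Processed requires processing details")


# Illustrative values only.
example = InstanceOfCharacterization(
    category=IOC.AS_PROCESSED,
    physical_state="suspension",
    processing_details=["suspended", "sonicated"],
)
print(example.category.value)  # As Processed
```

Because a nanomaterial record may include several characterizations, a record in this sketch would hold a list of such entries, one per IOC.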

Components of the PCC MIAN

General information

Within each nanomaterial entry of the Registry, general information is recorded, including such items as the uniform resource locator (URL) of the original data source, as well as the URL or DOI for the original publication of the data. This information is vital to the Registry because it ensures that the user can verify research findings and in some cases directs the user to additional information that is outside the scope of the MIAN (Hey and Trefethen 2003; Bell et al. 2009). General information data are also vital for linking disparate studies through specific commonalities for downstream analysis. For example, a researcher may want to interrogate all material records that have a common synthesis method or manufacturing pedigree.

Physico-chemical characteristics

The twelve PCC that are currently included in the Registry MIAN are shown in Figs. 2 and 3 along with example data. The MIAN was designed with recognition that PCC are frequently reported using different measurands. Therefore, many different measurands can be captured in the Registry. The descriptors for the MIAN PCC and their measurands are also provided in an online glossary (https://www.nanomaterialregistry.org/resources/Glossary.aspx). Many of these descriptors are based upon ISO definitions, while the remainder will be updated as standard vocabularies continue to be developed. The characterization of a PCC always corresponds to some measurement technique. The value of a characterization can be found in the reporting of this technique and its protocol. As detailed in Tables 1 and 2, the MIAN includes in-depth information on the parameters and instrumentation used to measure each PCC, and also best practice information to determine the scientific quality of the measurements. The best practice questions used during curation are:

  • Raw data provided?

  • Proper controls used?

  • Instrument within calibration?

  • Replicates used?

  • Experimental protocol reported?

Fig. 2

Six PCC within the MIAN of the Registry are considered independent of a nanomaterial’s system: (a) composition, (b) purity, (c) shape, (d) size, (e) size distribution, and (f) surface chemistry. Each PCC has a description and examples of data that may be reported by a researcher and curated into the Registry

Fig. 3

The remaining six PCC within the MIAN of the Registry can be dependent on a nanomaterial’s system: (a) aggregation/agglomeration state, (b) solubility, (c) stability, (d) surface area, (e) surface charge, and (f) surface reactivity. Each PCC has a description and examples of data that may be reported by a researcher and curated into the Registry

Table 1 Some relevant techniques for use in characterizing each of the MIAN PCC
Table 2 Each technique curated into the Registry must also have a defined list of parameters because the parameters associated with a technique’s protocol are vital to the replication of that technique. This table summarizes many of the parameter lists defined thus far for the Registry

In summary, in order to be compliant with the MIAN and be useful as a secondary data source, a nanomaterial characterization should include: the synthesis method or processing details, the DOI citation of the synthesis procedure, the manufacturer or synthesis laboratory name, the product name and lot number, and the nanomaterial’s physical state. For each measurand reported, the following categories of information should also be reported: technique, technique protocols and parameters, and best practice information.
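The checklist in this summary can be expressed as a simple completeness check. The Python sketch below is an illustration under stated assumptions, not the Registry's curation logic: the dictionary keys, the function name missing_mian_items, and all example values (including the placeholder DOI) are hypothetical.

```python
from typing import Any, Dict, List

# Hypothetical keys for the record-level items named in the summary.
RECORD_FIELDS = ["synthesis_doi", "manufacturer_or_lab", "product_name",
                 "lot_number", "physical_state"]
# Hypothetical keys for the items expected with each reported measurand.
MEASURAND_FIELDS = ["technique", "protocol_parameters", "best_practice"]


def missing_mian_items(record: Dict[str, Any]) -> List[str]:
    """Return the checklist items that are absent or empty in a record."""
    missing: List[str] = []
    # Either the synthesis method or the processing details must be present.
    if not (record.get("synthesis_method") or record.get("processing_details")):
        missing.append("synthesis_method or processing_details")
    missing += [name for name in RECORD_FIELDS if not record.get(name)]
    for i, measurand in enumerate(record.get("measurands", [])):
        label = measurand.get("name", f"measurand[{i}]")
        missing += [f"{label}: {name}" for name in MEASURAND_FIELDS
                    if not measurand.get(name)]
    return missing


# Illustrative record; every value is made up for the example.
record = {
    "synthesis_method": "example method",
    "synthesis_doi": "10.0000/placeholder",
    "manufacturer_or_lab": "Example Lab",
    "product_name": "Example product",
    "lot_number": "",
    "physical_state": "powder",
    "measurands": [{
        "name": "mean diameter",
        "technique": "TEM",
        "protocol_parameters": {"accelerating_voltage_kV": 200},
        "best_practice": {},
    }],
}
print(missing_mian_items(record))
# ['lot_number', 'mean diameter: best_practice']
```

An empty list from this check would indicate that every item in the summary has been reported; it says nothing about the scientific quality of the values themselves.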

Nanomaterial Registry infrastructure and the MIAN

The value of the MIAN can be found in the functionality of the Registry. The MIAN is integrated with the functionality and the infrastructure of the Registry database and website. It is the framework used to browse, search, match, and compare the curated data. Also note that the Registry uses compliance scoring to measure the compliance of the curated data with the MIAN criteria (Ostraat et al. 2013). The algorithms used to calculate compliance scores utilize all curated data and metadata information. Thus, characterizations that were done using well-regarded techniques and cited methods receive higher compliance scores.

An overall compliance score communicates the quality of the overall PCC characterization done on a nanomaterial. Each PCC is assigned a weighting factor in the overall compliance algorithm. For example, composition has a higher weighting factor than shape. It is worth noting that stakeholder groups have varying values associated with each PCC characterization in the MIAN based on their needs and fields of interest. The Registry has designed a flexible interface for updating the MIAN when community consensus requires it.

All compliance weighting factors were determined in collaboration with a variety of stakeholders in the Nanomaterial Registry’s advisory board. As such, these weighting factors represent the best current knowledge of subject matter experts in nanomaterials. They will likely be updated over time as new discoveries are made and sufficient data are archived to inform these factors. Compliance scores, which are the numerical output of the algorithm, are grouped into levels of Merit, Bronze, Silver, and Gold and are visible throughout the Registry site, including the search and browse features and the individual nanomaterial record details page.
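To make the idea of a weighted overall score concrete, the following Python sketch computes a weighted mean of per-PCC scores and maps it to a level. The actual Registry algorithm is described in Ostraat et al. (2013) and is not reproduced here. Everything numeric is an assumption for illustration: the weights (only the relation that composition outweighs shape comes from the text), the 0 to 100 scale, the thresholds, the treatment of Merit as the lowest level, and the restriction to three of the twelve PCC.

```python
from typing import Dict

# Hypothetical weights; the text only states that composition > shape.
WEIGHTS: Dict[str, float] = {"composition": 3.0, "size": 2.0, "shape": 1.0}

# Hypothetical thresholds on an assumed 0-100 scale, highest level first.
LEVELS = [(75.0, "Gold"), (50.0, "Silver"), (25.0, "Bronze"), (0.0, "Merit")]


def overall_score(pcc_scores: Dict[str, float],
                  weights: Dict[str, float] = WEIGHTS) -> float:
    """Weighted mean of per-PCC scores; a PCC with no score counts as 0."""
    total_weight = sum(weights.values())
    weighted = sum(w * pcc_scores.get(pcc, 0.0) for pcc, w in weights.items())
    return weighted / total_weight


def level(score: float) -> str:
    """Map a numeric score to the first level whose threshold it meets."""
    for threshold, name in LEVELS:
        if score >= threshold:
            return name
    return "Merit"


score = overall_score({"composition": 100.0, "size": 50.0, "shape": 0.0})
print(round(score, 1), level(score))  # 66.7 Silver
```

With these assumed weights, a record characterized only for composition scores higher than one characterized only for shape, which is the behavior the weighting factors are meant to produce.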

The Registry also uses the MIAN to facilitate similarity matching of nanomaterial entries. The Registry currently uses a simple rule-based matching system that measures how alike any two nanomaterials are based on specific areas of the MIAN, including shape, IOC, size, the technique used to characterize size, or the isoelectric point. The Registry project will develop more robust rules as more data are curated and as semantic features are incorporated into the Registry.
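A rule-based comparison of this kind can be sketched as a set of pass/fail rules whose pass rate is reported as the similarity. The Python example below is a minimal sketch and not the Registry's matching algorithm: the dictionary keys, the 10 % relative size tolerance, and the equal weighting of rules are assumptions, and the isoelectric point is omitted for brevity.

```python
from typing import Any, Dict


def match_score(a: Dict[str, Any], b: Dict[str, Any],
                size_tolerance: float = 0.10) -> float:
    """Fraction of matching rules satisfied by two nanomaterial records."""

    def same(key: str) -> bool:
        # A categorical rule passes only if both values exist and are equal.
        return a.get(key) is not None and a.get(key) == b.get(key)

    def close(key: str) -> bool:
        # A numeric rule passes if the values differ by at most the
        # relative tolerance (assumed 10 % of the larger magnitude).
        x, y = a.get(key), b.get(key)
        if x is None or y is None:
            return False
        return abs(x - y) <= size_tolerance * max(abs(x), abs(y))

    rules = [same("shape"), same("ioc"), same("size_technique"),
             close("size_nm")]
    return sum(rules) / len(rules)


# Illustrative records; all values are made up for the example.
a = {"shape": "spherical", "ioc": "As Received",
     "size_technique": "TEM", "size_nm": 20.0}
b = {"shape": "spherical", "ioc": "As Processed",
     "size_technique": "TEM", "size_nm": 21.0}
print(match_score(a, b))  # 0.75
```

Here the two records agree on shape, size technique, and size within tolerance but differ in IOC, so three of the four rules pass.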

Nanomaterial data are curated into the Registry using a web-based curation tool that was developed to provide a strictly controlled curation process, involving curation, quality assurance (QA), and quality control (QC) of the data. The curation tool was built using the MIAN’s controlled vocabulary, and the majority of the data fields are populated using values from dropdown menus instead of free text. This reduces the burden in the QA and QC tasks of data curation, and ensures the use of a controlled vocabulary.

While most web-based nanomaterial data repositories contain data relevant to a very specific subject matter, the Registry curates broad nanomaterial information. This broad domain necessitated a broad, flexible data model. The data model is sufficiently flexible to accommodate changes as new research developments show a need for a different or expanded MIAN.

Conclusions

The Registry MIAN is positioned to provide the nanomaterial community with several advantages. The adoption of the MIAN will help to guide the collection of a regularized set of nanomaterial characterization data, thereby improving the quality of the data available. In turn, identification of structure–property relationships will be supported through higher quality characterization.

Researchers who use the MIAN for their study design will be prepared to report thorough data in publications as well as to share their data in a highly structured way via the Registry for broader use and global significance.

The MIAN-based data model will be used to inform the content of data formatting templates for the community, such as ISA-TAB-Nano. This will stimulate an increase in the interoperability of data not only from different sources but also across silos of research. The interoperability of these data will be valuable not only in direct comparisons but also in predictive modeling tools.

The Registry is being developed as an authoritative resource and as such will continue to reflect new research discoveries and new standards, as they become available. This will be reflected in updated versions of the MIAN. Examples of potential areas for expansion of the MIAN are the optical and nanomechanical properties of nanomaterials. The MIAN and the Registry are publicly available at www.nanomaterialregistry.org. The website also includes user feedback features, as community input is a central aspect of the Registry project.