FAIR Big Data in the Materials Design Domain

Encyclopedia of Big Data Technologies

Definitions

To speed up the progress in the field of materials design, a number of challenges related to big data need to be addressed. This entry discusses these challenges and shows the semantic technologies that alleviate the problems related to Variety, Variability, Veracity and FAIRness.

Overview

Materials design and materials informatics are central to technological progress, not least in the green engineering domain. Many traditional materials contain toxic or critical raw materials, whose use should be avoided or eliminated. There is also an urgent need to develop new, environmentally friendly energy technology. Current materials design challenges include energy storage, solar cells, thermoelectrics, and magnetic transport (Ceder and Persson 2013; Jain et al. 2013; Curtarolo et al. 2013).

The space of potentially useful materials yet to be discovered—the so-called “chemical white space”—is immense. The possible combinations of, say, up to six...

References

  • Agrawal A, Choudhary A (2016) Perspective: materials informatics and big data: realization of the Fourth paradigm of science in materials science. APL Materials 4:053208:1–10. https://doi.org/10.1063/1.4946894

  • Armiento R (2020) Database-driven high-throughput calculations and machine learning models for materials design. In: Schütt KT, Chmiela S, von Lilienfeld OA, Tkatchenko A, Tsuda K, Müller KR (eds) Machine learning meets quantum physics. Springer International Publishing, Cham, pp 377–395. https://doi.org/10.1007/978-3-030-40245-7_17

  • Ashino T (2010) Materials ontology: An infrastructure for exchanging materials information and knowledge. Data Sci J 9:54–61. https://doi.org/10.2481/dsj.008-041

  • Austin T (2016) Towards a digital infrastructure for engineering materials data. Materials Discovery 3:1–12. https://doi.org/10.1016/j.md.2015.12.003

  • Belsky A, Hellenbrandt M, Karen VL, Luksch P (2002) New developments in the Inorganic Crystal Structure Database (ICSD): Accessibility in support of materials research and design. Acta Crystallogr Sect B Struct Sci 58(3):364–369. https://doi.org/10.1107/S0108768102006948

  • Bergerhoff G, Hundt R, Sievers R, Brown ID (1983) The inorganic crystal structure data base. J Chem Inf Comput Sci 23(2):66–69. https://doi.org/10.1021/ci00038a003

  • Bernstein HJ, Bollinger JC, Brown ID, Grazulis S, Hester JR, McMahon B, Spadaccini N, Westbrook JD, Westrip SP (2016) Specification of the crystallographic information file format, version 2.0. J Appl Crystallogr 49:277–284. https://doi.org/10.1107/S1600576715021871

  • Campbell CE, Kattner UR, Liu ZK (2014) File and data repositories for Next Generation CALPHAD. Scripta Materialia 70:7–11. https://doi.org/10.1016/j.scriptamat.2013.06.013

  • Ceder G, Persson KA (2013) How Supercomputers will yield a golden age of materials science. Scientific American 309

  • CEN (2010) A guide to the development and use of standards compliant data formats for engineering materials test data. European Committee for Standardization

  • Cheung K, Drennan J, Hunter J (2008) Towards an ontology for data-driven discovery of new materials. In: McGuinness D, Fox P, Brodaric B (eds) Semantic scientific knowledge integration AAAI/SSS workshop, pp 9–14

  • Curtarolo S, Setyawan W, Wang S, Xue J, Yang K, Taylor R, Nelson L, Hart G, Sanvito S, Buongiorno-Nardelli M, Mingo N, Levy O (2012) AFLOWLIB.ORG: A distributed materials properties repository from high-throughput ab initio calculations. Comput Mater Sci 58:227–235. https://doi.org/10.1016/j.commatsci.2012.02.002

  • Curtarolo S, Hart G, Buongiorno-Nardelli M, Mingo N, Sanvito S, Levy O (2013) The high-throughput highway to computational materials design. Nature Materials 12(3):191. https://doi.org/10.1038/nmat3568

  • Draxl C, Scheffler M (2018) NOMAD: The FAIR concept for big data-driven materials science. MRS Bulletin 43(9):676–682. https://doi.org/10.1557/mrs.2018.208

  • Euzenat J, Shvaiko P (2007) Ontology matching. Springer

  • Faber F, Lindmaa A, von Lilienfeld A, Armiento R (2016) Machine learning energies of 2 million elpasolite $(ABC_{2}D_{6})$ crystals. Phys Rev Lett 117(13):135502. https://doi.org/10.1103/PhysRevLett.117.135502

  • Frenkel M, Chirico RD, Diky V, Dong Q, Marsh KN, Dymond JH, Wakeham WA, Stein SE, Königsberger E, Goodwin ARH (2006) XML-based IUPAC standard for experimental, predicted, and critically evaluated thermodynamic property data storage and capture (ThermoML) (IUPAC Recommendations 2006). Pure Appl Chem 78:541–612. https://doi.org/10.1351/pac200678030541

  • Frenkel M, Chirico RD, Diky V, Brown PL, Dymond JH, Goldberg RN, Goodwin ARH, Heerklotz H, Königsberger E, Ladbury JE, Marsh KN, Remeta DP, Stein SE, Wakeham WA, Williams PA (2011) Extension of ThermoML: The IUPAC standard for thermodynamic data communications (IUPAC Recommendations 2011). Pure Appl Chem 83:1937–1969. https://doi.org/10.1351/PAC-REC-11-05-01

  • Gaultois MW, Oliynyk AO, Mar A, Sparks TD, Mulholland GJ, Meredig B (2016) Perspective: Web-based machine learning models for real-time screening of thermoelectric materials properties. APL Materials 4(5):053213. https://doi.org/10.1063/1.4952607

  • Ghiringhelli LM, Carbogno C, Levchenko S, Mohamed F, Huhs G, Lueders M, Oliveira M, Scheffler M (2016) Towards a common format for computational materials science data. PSI-K Scientific Highlights July

  • Glasser L (2016) Crystallographic information resources. J Chem Educ 93(3):542–549. https://doi.org/10.1021/acs.jchemed.5b00253

  • Grazulis S, Dazkevic A, Merkys A, Chateigner D, Lutterotti L, Quiros M, Serebryanaya NR, Moeck P, Downs RT, Le Bail A (2012) Crystallography Open Database (COD): An open-access collection of crystal structures and platform for world-wide collaboration. Nucl Acids Res 40(Database issue):D420–D427. https://doi.org/10.1093/nar/gkr900

  • Hastings J, Jeliazkova N, Owen G, Tsiliki G, Munteanu CR, Steinbeck C, Willighagen E (2015) eNanoMapper: harnessing ontologies to enable data integration for nanomaterial risk assessment. J Biomed Semant 6(1):10. https://doi.org/10.1186/s13326-015-0005-5

  • Ivanova V, Lambrix P (2013) A unified approach for debugging is-a structure and mappings in networked taxonomies. J Biomed Semant 4:10:1–10:19. https://doi.org/10.1186/2041-1480-4-10

  • Jain A, Ong SP, Hautier G, Chen W, Richards WD, Dacek S, Cholia S, Gunter D, Skinner D, Ceder G, Persson KA (2013) Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL Materials 1(1):011002. https://doi.org/10.1063/1.4812323

  • Kaufman JG, Begley EF (2003) MatML: A data interchange markup language. Adv Mater Process 161:35–36

  • Lambrix P, Strömbäck L, Tan H (2009) Information Integration in Bioinformatics with Ontologies and Standards. In: Bry F, Maluszynski J (eds) Semantic techniques for the web. Springer, Berlin, Heidelberg, pp 343–376. https://doi.org/10.1007/978-3-642-04581-3_8

  • Larsen AH, Mortensen JJ, Blomqvist J, Castelli IE, Christensen R, Dulak M, Friis J, Groves MN, Hammer B, Hargus C, Hermes ED, Jennings PC, Jensen PB, Kermode J, Kitchin JR, Kolsbjerg EL, Kubal J, Kaasbjerg K, Lysgaard S, Maronsson JB, Maxson T, Olsen T, Pastewka L, Peterson A, Rostgaard C, Schiøtz J, Schütt O, Strange M, Thygesen KS, Vegge T, Vilhelmsen L, Walter M, Zeng Z, Jacobsen KW (2017) The atomic simulation environment - a Python library for working with atoms. J Phys Condens Matter 29(27):273002. https://doi.org/10.1088/1361-648X/aa680e

  • Lejaeghere K, Bihlmayer G, Björkman T, Blaha P, Blügel S, Blum V, Caliste D, Castelli IE, Clark SJ, Dal Corso A, de Gironcoli S, Deutsch T, Dewhurst JK, Di Marco I, Draxl C, Dulak M, Eriksson O, Flores-Livas JA, Garrity KF, Genovese L, Giannozzi P, Giantomassi M, Goedecker S, Gonze X, Grånas O, Gross EKU, Gulans A, Gygi F, Hamann DR, Hasnip PJ, Holzwarth NAW, Iuşan D, Jochym DB, Jollet F, Jones D, Kresse G, Koepernik K, Küçükbenli E, Kvashnin YO, Locht ILM, Lubeck S, Marsman M, Marzari N, Nitzsche U, Nordström L, Ozaki T, Paulatto L, Pickard CJ, Poelmans W, Probert MIJ, Refson K, Richter M, Rignanese GM, Saha S, Scheffler M, Schlipf M, Schwarz K, Sharma S, Tavazza F, Thunström P, Tkatchenko A, Torrent M, Vanderbilt D, van Setten MJ, Van Speybroeck V, Wills JM, Yates JR, Zhang GX, Cottenier S (2016) Reproducibility in density functional theory calculations of solids. Science 351(6280):aad3000. https://doi.org/10.1126/science.aad3000

  • Li H, Armiento R, Lambrix P (2019) A method for extending ontologies with application to the materials science domain. Data Sci J 18(1). https://doi.org/10.5334/dsj-2019-050

  • Li H, Armiento R, Lambrix P (2020) An ontology for the materials design domain. In: Pan J, Tamma V, d’Amato C, Janowicz K, Fu B, Polleres A, Seneviratne O, Kagal L (eds) The Semantic Web - ISWC 2020. 19th International Semantic Web Conference, Athens, Greece, November 2–6, 2020, Proceedings, Part II. Lecture Notes in Computer Science, vol 12507. Springer, Cham, pp 212–227. https://doi.org/10.1007/978-3-030-62466-8_14 arXiv:2006.07712

  • Moruzzi VL, Janak JF, Williams AR (2013) Calculated electronic properties of metals. Pergamon Press, New York

  • Mulholland GJ, Paradiso SP (2016) Perspective: Materials informatics across the product lifecycle: Selection, manufacturing, and certification. APL Materials 4(5):053207. https://doi.org/10.1063/1.4945422

  • Murray-Rust P, Rzepa HS (2011) CML: Evolution and design. J Cheminf 3:44:1–44:15. https://doi.org/10.1186/1758-2946-3-44

  • Murray-Rust P, Townsend JA, Adams SE, Phadungsukanan W, Thomas J (2011) The semantics of Chemical Markup Language (CML): dictionaries and conventions. J Cheminf 3:43. https://doi.org/10.1186/1758-2946-3-43

  • Pizzi G, Cepellotti A, Sabatini R, Marzari N, Kozinsky B (2016) AiiDA: automated interactive infrastructure and database for computational science. Comput Mater Sci 111:218–230. https://doi.org/10.1016/j.commatsci.2015.09.013

  • Rajan K (2015) Materials informatics: The materials “Gene” and Big data. Annu Rev Mater Res 45:153–169. https://doi.org/10.1146/annurev-matsci-070214-021132

  • Saal JE, Kirklin S, Aykol M, Meredig B, Wolverton C (2013) Materials design and discovery with high-throughput density functional theory: The open quantum materials database (OQMD). JOM 65(11):1501–1509. https://doi.org/10.1007/s11837-013-0755-4

  • Swindells N (2009) The representation and exchange of material and other engineering properties. Data Sci J 8:190–200. https://doi.org/10.2481/dsj.008-007

  • Thomas DG, Pappu RV, Baker NA (2011) Nanoparticle ontology for cancer nanotechnology research. J Biomed Inf 44(1):59–74. https://doi.org/10.1016/j.jbi.2010.03.001

  • Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten JW, da Silva Santos LB, Bourne PE, Bouwman J, Brookes AJ, Clark T, Crosas M, Dillo I, Dumon O, Edmunds S, Evelo CT, Finkers R, Gonzalez-Beltran A, Gray AJ, Groth P, Goble C, Grethe JS, Heringa J, ’t Hoen PA, Hooft R, Kuhn T, Kok R, Kok J, Lusher SJ, Martone ME, Mons A, Packer AL, Persson B, Rocca-Serra P, Roos M, van Schaik R, Sansone SA, Schultes E, Sengstag T, Slater T, Strawn G, Swertz MA, Thompson M, van der Lei J, van Mulligen E, Velterop J, Waagmeester A, Wittenburg P, Wolstencroft K, Zhao J, Mons B (2016) The FAIR guiding principles for scientific data management and stewardship. Scientific Data 3:160018:1–9. https://doi.org/10.1038/sdata.2016.18

  • Zhang X, Hu C, Li H (2009) Semantic query on materials data based on mapping MatML to an OWL ontology. Data Sci J 8:1–17. https://doi.org/10.2481/dsj.8.1

  • Zhang X, Pan D, Zhao C, Li K (2016) MMOY: Towards deriving a metallic materials ontology from Yago. Adv Eng Inf 30:687–702. https://doi.org/10.1016/j.aei.2016.09.002

Corresponding author

Correspondence to Patrick Lambrix.

Copyright information

© 2022 Springer Nature Switzerland AG

About this entry

Cite this entry

Lambrix, P., Armiento, R., Delin, A., Li, H. (2022). FAIR Big Data in the Materials Design Domain. In: Zomaya, A., Taheri, J., Sakr, S. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_293-2

  • DOI: https://doi.org/10.1007/978-3-319-63962-8_293-2

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63962-8

  • Online ISBN: 978-3-319-63962-8

  • eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering

Chapter history

  1. Latest

    FAIR Big Data in the Materials Design Domain
    Published: 24 February 2022

    DOI: https://doi.org/10.1007/978-3-319-63962-8_293-2

  2. Original

    Big Semantic Data Processing in the Materials Design Domain
    Published: 22 March 2018

    DOI: https://doi.org/10.1007/978-3-319-63962-8_293-1