Definitions
To speed up progress in the field of materials design, a number of big data challenges need to be addressed. This entry discusses these challenges and presents semantic technologies that alleviate the problems related to Variety, Variability, Veracity, and FAIRness.
Overview
Materials design and materials informatics are central to technological progress, not least in the green engineering domain. Many traditional materials contain toxic or critical raw materials, whose use should be avoided or eliminated. There is also an urgent need to develop new environmentally friendly energy technologies. Current examples of materials design challenges include energy storage, solar cells, thermoelectrics, and magnetic transport (Ceder and Persson 2013; Jain et al. 2013; Curtarolo et al. 2013).
The space of potentially useful materials yet to be discovered—the so-called “chemical white space”—is immense. The possible combinations of, say, up to six...
References
Agrawal A, Choudhary A (2016) Perspective: Materials informatics and big data: Realization of the "fourth paradigm" of science in materials science. APL Materials 4:053208:1–10. https://doi.org/10.1063/1.4946894
Armiento R (2020) Database-driven high-throughput calculations and machine learning models for materials design. In: Schütt KT, Chmiela S, von Lilienfeld OA, Tkatchenko A, Tsuda K, Müller KR (eds) Machine learning meets quantum physics. Springer International Publishing, Cham, pp 377–395. https://doi.org/10.1007/978-3-030-40245-7_17
Ashino T (2010) Materials ontology: An infrastructure for exchanging materials information and knowledge. Data Sci J 9:54–61. https://doi.org/10.2481/dsj.008-041
Austin T (2016) Towards a digital infrastructure for engineering materials data. Materials Discovery 3:1–12. https://doi.org/10.1016/j.md.2015.12.003
Belsky A, Hellenbrandt M, Karen VL, Luksch P (2002) New developments in the Inorganic Crystal Structure Database (ICSD): Accessibility in support of materials research and design. Acta Crystallogr Sect B Struct Sci 58(3):364–369. https://doi.org/10.1107/S0108768102006948
Bergerhoff G, Hundt R, Sievers R, Brown ID (1983) The inorganic crystal structure data base. J Chem Inf Comput Sci 23(2):66–69. https://doi.org/10.1021/ci00038a003
Bernstein HJ, Bollinger JC, Brown ID, Grazulis S, Hester JR, McMahon B, Spadaccini N, Westbrook JD, Westrip SP (2016) Specification of the crystallographic information file format, version 2.0. J Appl Crystallogr 49:277–284. https://doi.org/10.1107/S1600576715021871
Campbell CE, Kattner UR, Liu ZK (2014) File and data repositories for Next Generation CALPHAD. Scripta Materialia 70(Supplement C):7–11. https://doi.org/10.1016/j.scriptamat.2013.06.013
Ceder G, Persson KA (2013) How Supercomputers will yield a golden age of materials science. Scientific American 309
CEN (2010) A guide to the development and use of standards compliant data formats for engineering materials test data. European Committee for Standardization
Cheung K, Drennan J, Hunter J (2008) Towards an ontology for data-driven discovery of new materials. In: McGuinness D, Fox P, Brodaric B (eds) Semantic scientific knowledge integration AAAI/SSS workshop, pp 9–14
Curtarolo S, Setyawan W, Wang S, Xue J, Yang K, Taylor R, Nelson L, Hart G, Sanvito S, Buongiorno-Nardelli M, Mingo N, Levy O (2012) AFLOWLIB.ORG: A distributed materials properties repository from high-throughput ab initio calculations. Comput Mater Sci 58(Supplement C):227–235. https://doi.org/10.1016/j.commatsci.2012.02.002
Curtarolo S, Hart G, Buongiorno-Nardelli M, Mingo N, Sanvito S, Levy O (2013) The high-throughput highway to computational materials design. Nature Materials 12(3):191. https://doi.org/10.1038/nmat3568
Draxl C, Scheffler M (2018) NOMAD: The FAIR concept for big data-driven materials science. MRS Bulletin 43(9):676–682. https://doi.org/10.1557/mrs.2018.208
Euzenat J, Shvaiko P (2007) Ontology matching. Springer
Faber F, Lindmaa A, von Lilienfeld A, Armiento R (2016) Machine learning energies of 2 million elpasolite $(ABC_{2}D_{6})$ crystals. Phys Rev Lett 117(13):135502. https://doi.org/10.1103/PhysRevLett.117.135502
Frenkel M, Chirico RD, Diky V, Dong Q, Marsh KN, Dymond JH, Wakeham WA, Stein SE, Königsberger E, Goodwin ARH (2006) XML-based IUPAC standard for experimental, predicted, and critically evaluated thermodynamic property data storage and capture (ThermoML) (IUPAC Recommendations 2006). Pure Appl Chem 78:541–612. https://doi.org/10.1351/pac200678030541
Frenkel M, Chirico RD, Diky V, Brown PL, Dymond JH, Goldberg RN, Goodwin ARH, Heerklotz H, Königsberger E, Ladbury JE, Marsh KN, Remeta DP, Stein SE, Wakeham WA, Williams PA (2011) Extension of ThermoML: The IUPAC standard for thermodynamic data communications (IUPAC Recommendations 2011). Pure Appl Chem 83:1937–1969. https://doi.org/10.1351/PAC-REC-11-05-01
Gaultois MW, Oliynyk AO, Mar A, Sparks TD, Mulholland GJ, Meredig B (2016) Perspective: Web-based machine learning models for real-time screening of thermoelectric materials properties. APL Materials 4(5):053213. https://doi.org/10.1063/1.4952607
Ghiringhelli LM, Carbogno C, Levchenko S, Mohamed F, Huhs G, Lueders M, Oliveira M, Scheffler M (2016) Towards a common format for computational materials science data. PSI-K Scientific Highlights July
Glasser L (2016) Crystallographic information resources. J Chem Educ 93(3):542–549. https://doi.org/10.1021/acs.jchemed.5b00253
Grazulis S, Dazkevic A, Merkys A, Chateigner D, Lutterotti L, Quiros M, Serebryanaya NR, Moeck P, Downs RT, Le Bail A (2012) Crystallography Open Database (COD): An open-access collection of crystal structures and platform for world-wide collaboration. Nucl Acids Res 40(Database issue):D420–D427. https://doi.org/10.1093/nar/gkr900
Hastings J, Jeliazkova N, Owen G, Tsiliki G, Munteanu CR, Steinbeck C, Willighagen E (2015) eNanoMapper: harnessing ontologies to enable data integration for nanomaterial risk assessment. J Biomed Semant 6(1):10. https://doi.org/10.1186/s13326-015-0005-5
Ivanova V, Lambrix P (2013) A unified approach for debugging is-a structure and mappings in networked taxonomies. J Biomed Semant 4:10:1–10:19. https://doi.org/10.1186/2041-1480-4-10
Jain A, Ong SP, Hautier G, Chen W, Richards WD, Dacek S, Cholia S, Gunter D, Skinner D, Ceder G, Persson KA (2013) Commentary: The materials project: A materials genome approach to accelerating materials innovation. APL Materials 1(1):011002. https://doi.org/10.1063/1.4812323
Kaufman JG, Begley EF (2003) MatML: A data interchange markup language. Adv Mater Process 161:35–36
Lambrix P, Strömbäck L, Tan H (2009) Information Integration in Bioinformatics with Ontologies and Standards. In: Bry F, Maluszynski J (eds) Semantic techniques for the web. Springer, Berlin, Heidelberg, pp 343–376. https://doi.org/10.1007/978-3-642-04581-3_8
Larsen AH, Mortensen JJ, Blomqvist J, Castelli IE, Christensen R, Dulak M, Friis J, Groves MN, Hammer B, Hargus C, Hermes ED, Jennings PC, Jensen PB, Kermode J, Kitchin JR, Kolsbjerg EL, Kubal J, Kaasbjerg K, Lysgaard S, Maronsson JB, Maxson T, Olsen T, Pastewka L, Peterson A, Rostgaard C, Schiøtz J, Schütt O, Strange M, Thygesen KS, Vegge T, Vilhelmsen L, Walter M, Zeng Z, Jacobsen KW (2017) The atomic simulation environment - a Python library for working with atoms. J Phys Condens Matter 29(27):273002. https://doi.org/10.1088/1361-648X/aa680e
Lejaeghere K, Bihlmayer G, Björkman T, Blaha P, Blügel S, Blum V, Caliste D, Castelli IE, Clark SJ, Corso AD, Gironcoli Sd, Deutsch T, Dewhurst JK, Marco ID, Draxl C, Dulak M, Eriksson O, Flores-Livas JA, Garrity KF, Genovese L, Giannozzi P, Giantomassi M, Goedecker S, Gonze X, Grånas O, Gross EKU, Gulans A, Gygi F, Hamann DR, Hasnip PJ, Holzwarth NaW, Iuşan D, Jochym DB, Jollet F, Jones D, Kresse G, Koepernik K, Küçükbenli E, Kvashnin YO, Locht ILM, Lubeck S, Marsman M, Marzari N, Nitzsche U, Nordström L, Ozaki T, Paulatto L, Pickard CJ, Poelmans W, Probert MIJ, Refson K, Richter M, Rignanese GM, Saha S, Scheffler M, Schlipf M, Schwarz K, Sharma S, Tavazza F, Thunström P, Tkatchenko A, Torrent M, Vanderbilt D, van Setten MJ, Speybroeck VV, Wills JM, Yates JR, Zhang GX, Cottenier S (2016) Reproducibility in density functional theory calculations of solids. Science 351(6280):aad3000. https://doi.org/10.1126/science.aad3000
Li H, Armiento R, Lambrix P (2019) A method for extending ontologies with application to the materials science domain. Data Sci J 18(1). https://doi.org/10.5334/dsj-2019-050
Li H, Armiento R, Lambrix P (2020) An ontology for the materials design domain. In: Pan J, Tamma V, d’Amato C, Janowicz K, Fu B, Polleres A, Seneviratne O, Kagal L (eds) The Semantic Web - ISWC 2020. 19th International Semantic Web Conference, Athens, Greece, November 2–6, 2020, Proceedings, Part II. Lecture Notes in Computer Science, vol 12507. Springer, Cham, pp 212–227. https://doi.org/10.1007/978-3-030-62466-8_14. arXiv:2006.07712
Moruzzi VL, Janak JF, Williams AR (2013) Calculated electronic properties of metals. Pergamon Press, New York
Mulholland GJ, Paradiso SP (2016) Perspective: Materials informatics across the product lifecycle: Selection, manufacturing, and certification. APL Materials 4(5):053207. https://doi.org/10.1063/1.4945422
Murray-Rust P, Rzepa HS (2011) CML: Evolution and design. J Cheminf 3:44:1–44:15. https://doi.org/10.1186/1758-2946-3-44
Murray-Rust P, Townsend JA, Adams SE, Phadungsukanan W, Thomas J (2011) The semantics of Chemical Markup Language (CML): dictionaries and conventions. J Cheminf 3:43. https://doi.org/10.1186/1758-2946-3-43
Pizzi G, Cepellotti A, Sabatini R, Marzari N, Kozinsky B (2016) AiiDA: automated interactive infrastructure and database for computational science. Comput Mater Sci 111(Supplement C):218–230. https://doi.org/10.1016/j.commatsci.2015.09.013
Rajan K (2015) Materials informatics: The materials “Gene” and Big data. Annu Rev Mater Res 45:153–169. https://doi.org/10.1146/annurev-matsci-070214-021132
Saal JE, Kirklin S, Aykol M, Meredig B, Wolverton C (2013) Materials design and discovery with high-throughput density functional theory: The open quantum materials database (OQMD). JOM 65(11):1501–1509. https://doi.org/10.1007/s11837-013-0755-4
Swindells N (2009) The representation and exchange of material and other engineering properties. Data Sci J 8:190–200. https://doi.org/10.2481/dsj.008-007
Thomas DG, Pappu RV, Baker NA (2011) Nanoparticle ontology for cancer nanotechnology research. J Biomed Inf 44(1):59–74. https://doi.org/10.1016/j.jbi.2010.03.001
Wilkinson MD, Dumontier M, Aalbersberg IJ, Appleton G, Axton M, Baak A, Blomberg N, Boiten JW, da Silva Santos LB, Bourne PE, Bouwman J, Brookes AJ, Clark T, Crosas M, Dillo I, Dumon O, Edmunds S, Evelo CT, Finkers R, Gonzalez-Beltran A, Gray AJ, Groth P, Goble C, Grethe JS, Heringa J, ’t Hoen PA, Hooft R, Kuhn T, Kok R, Kok J, Lusher SJ, Martone ME, Mons A, Packer AL, Persson B, Rocca-Serra P, Roos M, van Schaik R, Sansone SA, Schultes E, Sengstag T, Slater T, Strawn G, Swertz MA, Thompson M, van der Lei J, van Mulligen E, Velterop J, Waagmeester A, Wittenburg P, Wolstencroft K, Zhao J, Mons B (2016) The FAIR guiding principles for scientific data management and stewardship. Scientific Data 3:160018:1–9. https://doi.org/10.1038/sdata.2016.18
Zhang X, Hu C, Li H (2009) Semantic query on materials data based on mapping MatML to an OWL ontology. Data Sci J 8:1–17. https://doi.org/10.2481/dsj.8.1
Zhang X, Pan D, Zhao C, Li K (2016) MMOY: Towards deriving a metallic materials ontology from Yago. Adv Eng Inf 30:687–702. https://doi.org/10.1016/j.aei.2016.09.002
Copyright information
© 2022 Springer Nature Switzerland AG
About this entry
Cite this entry
Lambrix, P., Armiento, R., Delin, A., Li, H. (2022). FAIR Big Data in the Materials Design Domain. In: Zomaya, A., Taheri, J., Sakr, S. (eds) Encyclopedia of Big Data Technologies. Springer, Cham. https://doi.org/10.1007/978-3-319-63962-8_293-2
DOI: https://doi.org/10.1007/978-3-319-63962-8_293-2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-63962-8
Online ISBN: 978-3-319-63962-8
eBook Packages: Springer Reference Mathematics; Reference Module Computer Science and Engineering
Chapter history
Latest: Big Semantic Data Processing in the Materials Design Domain
Published: 22 March 2018
DOI: https://doi.org/10.1007/978-3-319-63962-8_293-1
Original: FAIR Big Data in the Materials Design Domain
Published: 24 February 2012
DOI: https://doi.org/10.1007/978-3-319-63962-8_293-2