Abstract
Machine learning-based approaches to material discovery are reviewed with the aim of providing a perspective on the current state of the art and its potential. Various models used to represent molecules and crystals are introduced, along with the ways such representations can be used within neural networks to generate materials that satisfy specified physical features and properties. For problems where a large database for the structure-property map cannot be created, active learning approaches based on Bayesian optimization, which maximize the efficiency of a search, are reviewed. Successful applications of these machine learning-based material discovery approaches are beginning to appear, and some of the notable ones are reviewed.
Acknowledgements
This work was supported by the National Research Foundation of Korea, funded by the Ministry of Science, ICT, & Future Planning, under grant nos. 2021R1A2C2003583 and 2021R1A2C2006083.
Author information
Jihan Kim obtained his B.S. degree in Electrical Engineering and Computer Sciences from UC Berkeley in 2001. He received his M.S. and Ph.D. degrees in Electrical and Computer Engineering from the University of Illinois at Urbana-Champaign in 2004 and 2009, respectively. From 2009 to 2013, he was a postdoctoral researcher at Lawrence Berkeley National Laboratory. He joined KAIST in 2013 and is currently an associate professor in the Department of Chemical and Biomolecular Engineering. He has published more than 90 papers.
Jay Hyung Lee is currently a KEPCO Chair Professor at the Korea Advanced Institute of Science and Technology (KAIST). He is also the director of the Saudi Aramco-KAIST CO2 Management Center. He received the AIChE CAST Computing in Chemical Engineering Award and was elected an IEEE Fellow, an IFAC Fellow, and an AIChE Fellow. He was the 29th Roger Sargent Lecturer in 2016. He has published over 200 manuscripts in SCI journals, with more than 17,000 Google Scholar citations. His research interests are in the areas of state estimation, model predictive control, planning/scheduling, and reinforcement learning, with applications to energy systems and carbon management systems.
Cite this article
Lee, S., Byun, H., Cheon, M. et al. Machine learning-based discovery of molecules, crystals, and composites: A perspective review. Korean J. Chem. Eng. 38, 1971–1982 (2021). https://doi.org/10.1007/s11814-021-0869-2
DOI: https://doi.org/10.1007/s11814-021-0869-2