Abstract
In the last decade, connectionist models have been proposed that can process structured information directly. These methods, which are based on the use of graphs for the representation of the data and the relationships within the data, are particularly suitable for handling relational learning tasks. In this paper, two recently proposed architectures of this kind, i.e. Graph Neural Networks (GNNs) and Relational Neural Networks (RelNNs), are compared and discussed, along with their corresponding learning schemes. The goal is to evaluate the performance of these methods on benchmarks that are commonly used by the relational learning community. Moreover, we also aim at reporting differences in the behavior of the two models, in order to gain insights on possible extensions of the approaches. Since RelNNs have been developed with the specific task of learning aggregate functions in mind, some experiments are run considering that particular task. In addition, we carry out more general experiments on the mutagenesis and the biodegradability datasets, on which several other relational learners have been evaluated. The experimental results are promising and suggest that RelNNs and GNNs can be a viable approach for learning on relational data.
Article PDF
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
References
Almeida, L. B. (1990). A learning rule for asynchronous perceptrons with feedback in a combinatorial environment. Piscataway: IEEE Press. pp. 102–111.
Back, A., & Tsoi, A. C. (1994). Locally recurrent globally feedforward networks, a critical review of architectures. IEEE Transactions on Neural Networks, 5(3), 229–239.
Baldi, P., & Pollastri, G. (2004). The principled design of large-scale recursive neural network architectures-DAG-RNNs and the protein structure prediction problem. Journal of Machine Learning Research, 4, 575–602.
Bengio, Y., Frasconi, P., & Simard, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5, 157–166. Special Issue on Recurrent Neural Networks.
Bianchini, M., Gori, M., & Scarselli, F. (2001). Processing directed acyclic graphs with recursive neural networks. IEEE Transactions on Neural Networks, 12(6), 1464–1470.
Bianchini, M., Mazzoni, P., Sarti, L., & Scarselli, F. (2003). Face localization with recursive neural networks. In B. Apolloni, M. Marinaro, & R. Tagliaferri (Eds.), Lecture notes in computer science : Vol. 2859. Proceedings of WIRN03 (pp. 99–105). Berlin: Springer.
Bianchini, M., Maggini, M., Sarti, L., & Scarselli, F. (2005). Recursive neural networks for processing graphs with labelled edges: Theory and applications. Neural Networks—Special Issue on Neural Networks and Kernel Methods for Structured Domains, 18(8), 1040–1050.
Bianchini, M., Gori, M., Sarti, L., & Scarselli, F. (2006). Recursive processing of cyclic graphs. IEEE Transactions on Neural Networks, 17(1), 10–18.
Blockeel, H., & Bruynooghe, M. (2003). Aggregation versus selection bias, and relational neural networks. In IJCAI-2003 workshop on learning statistical models from relational data, SRL-2003 (Vol. 11).
Bloehdorn, S., & Blohm, S. (2006). A self organizing map for relation extraction from wikipedia using structured data representations. In Proceedings of the international workshop on intelligent information access, IIIA-2006. Berlin: Springer.
Brin, S., & Page, L. (1998). The anatomy of a large–scale hypertextual Web search engine. Computer Networks and ISDN Systems, 30(1–7), 107–117.
Chang, H., Cohn, D., & McCallum, A. (2000). Learning to create customized authority lists. In Proceedings of the 17th international conference on machine learning, ICML ’00 (pp. 127–134). San Mateo: Morgan Kaufmann.
Chapelle, O., Schölkopf, B., & Zien, A. (Eds.) (2006). Semi-supervised learning. Cambridge: MIT Press.
De Raedt, L., & Blockeel, H. (1997). Using logical decision trees for clustering. In Lecture notes in artificial intelligence : Vol. 1297. Proceedings of the 7th international workshop on inductive logic programming ILP-97 (pp. 133–141). Berlin: Springer.
Debnath, A., Lopex de Compandre, R., Debnath, G., Schusterman, A., & Hansch, C. (1991). Structure-activity relationship of mutagenic aromatic and heteroaromatic nitro compounds. Correlation with molecular orbital energies and hydrophobicity. Journal of Medicinal Chemistry, 34(2), 786–797.
Di Massa, V., Monfardini, G., Sarti, L., Scarselli, F., Maggini, M., & Gori, M. (2006). A comparison between recursive neural networks and graph neural networks. In International joint conference on neural networks (pp. 778–785).
Dzeroski, S. (1999). Biodegradability dataset, info at: http://www-ai.ijs.si/ilpnet2/apps/pbiodeg.html.
Dzeroski, S., Blockeel, H., Kompare, B., Kramer, S., Pfahringer, B., & Van Laer, W. (1999). Experiments in predicting biodegradability. In Lecture notes in computer science : Vol. 1634. Proceedings of the 9th international workshop on inductive logic programming, ILP-99 (pp. 80–91).
Francesconi, E., Frasconi, P., Gori, M., Marinai, S., Sheng, J., Soda, G., & Sperduti, A. (1998). Logo recognition by recursive neural networks. In K. Tombre & A. K. Chhabra (Eds.), GREC ’97: Selected papers from the second international workshop on graphics recognition, algorithms and systems (pp. 104–117). Berlin: Springer.
Frasconi, P., Gori, M., & Sperduti, A. (1998). A general framework for adaptive processing of data structures. IEEE Transactions on Neural Networks, 9(5), 768–786.
Gärtner, T. (2003). Kernel-based learning in multi-relational data mining. ACM SIGKDD Explorations, 5(1), 49–58.
Gärtner, T., Lloyd, J., & Flach, P. (2004). Kernels and distances for structured data. Machine Learning, 57(3), 205–232.
Getoor, L., & Taskar, B. (2007). Introduction to statistical relational learning. Cambridge: MIT Press.
Goller, C. (1997). A connectionist approach for learning search control heuristics for automated deduction systems. PhD thesis, Technische Universität München.
Goller, C., & Küchler, A. (1996). Learning task-dependent distributed representations by backpropagation through structure. IEEE Transactions on Neural Networks, 1, 347–352.
Gori, M., Scarselli, F., & Tsoi, A. C. (1998). On the closure of set of functions that can be realized by a multilayer perceptron. IEEE Transactions on Neural Networks, 9(6), 1086–1098.
Gori, M., Monfardini, G., & Scarselli, F. (2005). A new model for learning in graph domains. In Proceedings international joint conference on neural networks (IJCNN2005) (Vol. 2, pp. 729–734).
Goulon-Sigwalt-Abram, A., Duprat, A., & Dreyfus, G. (2005). From hopfield nets to recursive networks to graph machines: numerical machine learning for structured data. Theoretical Computer Science, 344(2–3), 298–334.
Hagenbuchner, M., Sperduti, A., & Tsoi, A. C. (2003). A self-organizing map for adaptive processing of structured data. IEEE Transactions on Neural Networks, 14(3), 491–505.
Hagenbuchner, M., Sperduti, A., & Tsoi, A. (2005). Contextual self-organizing maps for structured domains. In Lecture notes in computer science : Vol. 3720. Proceedings of the workshop on relational machine learning, 16th European conference on machine learning ECML 2005 (pp. 46–55). Berlin: Springer.
Hagenbuchner, M., Sperduti, A., Tsoi, A. C., Trentini, F., Scarselli, F., & Gori, M. (2006). Clustering XML documents using self-organizing maps for structures. In Lecture notes in computer science : Vol. 3977. Advances in XML information retrieval and evaluation (pp. 481–496). Berlin: Springer.
Hagenbuchner, M., Sperduti, A., & Tsoi, A. C. (2009). Graph self-organizing maps for cyclic and unbounded graphs. Neurocomputation, 72(7–9), 1419–1430.
Hammer, B. (1999). Approximation capabilities of folding networks. In M. Caudill & C. Butler (Eds.), Proceedings of the 7th European symposium on artificial neural networks, ESANN 1999 (pp. 33–38).
Hammer, B., & Jain, J. (2004). Neural methods for non-standard data. In M. Verleysen (Ed.), Proceedings of the 12th European symposium on artificial neural networks, ESANN 2004 (pp. 281–292). D-side publications.
Hammer, B., Micheli, A., Sperduti, A., & Strickert, M. (2004a). A general framework for unsupervised processing of structured data. Neurocomputing, 57, 3–35.
Hammer, B., Micheli, A., Sperduti, A., & Strickert, M. (2004b). Recursive self-organizing network models. Neural Networks, 17(8–9), 1061–1085.
Haykin, S. (1994). Neural networks: A comprehensive foundation. New York: Prentice Hall.
Jensen, F. V. (1996). Introduction to Bayesian networks. Berlin: Springer.
Khamsi, M. A. (2001). An introduction to metric spaces and fixed point theory. New York: Wiley.
Kirsten, M. (2002). Multirelational distance-based clustering. PhD thesis, School of Computer Science, Otto-von-Guericke University, Magdeburg, Germany.
Kleinberg, J. (1999). Authoritative sources in a hyperlinked environment. Journal of the ACM, 46(5), 604–632.
Kondor, R., & Lafferty, J. (2002). Diffusion kernels on graphs and other discrete structures. In C. Sammut & A. G. Hoffmann (Eds.), Proceedings of the 19th international conference on machine learning, ICML 2002 (pp. 315–322). San Mateo: Morgan Kaufmann.
Krahmer, E., Erk, S., & Verleg, A. (2003). Graph-based generation of referring expressions. Computational Linguistics, 29(1), 53–72.
Kramer, S., & De Raedt, L. (2001). Feature construction with version spaces for biochemical applications. In Proceedings of the 18th international conference on machine learning, ICML 2001 (pp. 258–265). San Mateo: Morgan Kaufmann.
Krogel, M., Rawles, S., Zelezny, F., Flach, P., Lavrac, N., & Wrobel, S. (2003). Comparative evaluation of approaches to propositionalization. In Lecture notes in computer science : Vol. 2835. Proceedings of the 13th international conference on inductive logic programming, ILP 2003 (pp. 197–214). Berlin: Springer.
Lafferty, J., McCallum, A., & Pereira, F. (2001). Conditional random fields: Probabilistic models for segmenting and labeling sequence data. In Proceedings of 18th international conference on machine learning.
Lodhi, H., & Muggleton, S. H. (2005). Is Mutagenesis still challenging? In Proceedings of the 15th international conference on inductive logic programming, ILP 2005, late-breaking papers (pp. 35–40). Cambridge: MIT Press.
McClelland, J., Rumelhart, D., & the PDP Research Group (1986). Parallel distributed processing: explorations in the microstructure of cognition (Vol. 2). Cambridge: MIT Press.
Micheli, A. (2009). Neural network for graphs: A contextual constructive approach. IEEE Transactions on Neural Networks, 20(3), 498–511.
Micheli, A., Sperduti, A., Starita, A., & Bianucci, A. M. (2001). Analysis of the internal representations developed by neural networks for structures applied to quantitative structure-activity relationship studies of benzodiazepines. Journal of Chemical Information and Computer Sciences, 41(2), 202–218.
Micheli, A., Sona, D., & Sperduti, A. (2004). Contextual processing of structured data by recursive cascade correlation. IEEE Transactions on Neural Networks, 15(6), 1396–1410.
Money, C., & Pollastri, G. (2009). Beyond the twilight zone: Automated prediction of structural properties of protein by recursive neural networks and remote homology information. Proteins: Structure, Function, and Bioinformatics, 77(1), 181–190.
Monfardini, G., & Scarselli, F. (2004). The graph neural networks toolbox, freely available for download at http://airgroup.dii.unisi.it/projects/GraphNeuralNetwork/download.htm.
Mutagenesis (1991). Mutagenesis dataset, available for download at http://www.comlab.ox.ac.uk/activities/machinelearning/mutagenesis.html.
Newman, M. E. J. (2001). From the cover: The structure of scientific collaboration networks. Proceedings National Academy of Sciences, 98(2), 404–409.
Pineda, F. (1987). Generalization of back-propagation to recurrent neural networks. Physical Review Letters, 59, 2229–2232.
Quinlan, J. R. (1996). Boosting first-order learning. In S. Arikawa & A. Sharma (Eds.), Lecture notes in computer science : Vol. 1160. Proceedings of the 7th international workshop on algorithmic learning theory, ALT 1996 (p. 143). Berlin: Springer.
Quinlan, J., & Cameron-Jones, R. (1993). FOIL: A midterm report. In Proceedings of the European conference on machine learning (pp. 3–20).
Ramon, J. (2002). Clustering and instance based learning in first order logic. PhD thesis, K.U. Leuven, Belgium.
Riedmiller, M., & Braun, H. (1993). A direct adaptive method for faster backpropagation learning: the RPROP algorithm. In Proceedings of the IEEE international conference on neural networks (Vol. 1, pp. 586–591). San Francisco, CA, USA.
Rodriguez, P. (2001). Simple recurrent networks learn context-free and context-sensitive languages by counting. Neural Computation, 13, 2093–2118.
Scarselli, F., Yong, S., Gori, M., Hagenbuchner, M., Tsoi, A., & Maggini, M. (2005). Graph neural networks for ranking web pages. In Proceedings of the 2005 IEEE/WIC/ACM conference on web intelligence, WI2005 (pp. 666–672), Washington, DC, USA. Washington: IEEE Computer Society.
Scarselli, F., Gori, M., Monfardini, G., Tsoi, A. C., & Hagenbuchner, M. (2009a). Computational capabilities of graph neural networks. IEEE Transactions on Neural Networks, 20(1), 81–102.
Scarselli, F., Gori, M., Monfardini, G., Tsoi, A. C., & Hagenbuchner, M. (2009b). The graph neural network model. IEEE Transactions on Neural Networks, 20(1), 61–80.
Sperduti, A. (1994). Labelling RAAM. Connection Science, 6(4), 429–459.
Sperduti, A., & Starita, A. (1997). Supervised neural networks for the classification of structures. IEEE Transactions on Neural Networks, 8(3), 714–735.
Srinivasan, A., Muggleton, S., King, R., & Sternberg, M. (1994). Mutagenesis: ILP experiments in a non-determinate biological domain. In S. Wrobel (Ed.), GMD-Studien : Vol. 23. Proceedings of the 4th international workshop on inductive logic programming, ILP 1994 (pp. 217–232). Gesellschaft für Mathematik und Datenverarbeitung MBH.
Sturt, P., Costa, F., Lombardo, V., & Frasconi, V. (2003). Learning first-pass structural attachment preferences with dynamic grammars and recursive neural networks. Cognition, 88(2), 133–169.
Tsoi, A. C., Morini, G., Scarselli, F., Hagenbuchner, M., & Maggini, M. (2003). Adaptive ranking of web pages. In Proceedings of the 12th international conference on world wide web, WWW 2003, New York, NY, USA. New York: ACM.
Tsoi, A. C., Hagenbuchner, M., & Scarselli, F. (2006). Computing customized page ranks. ACM Transactions on Internet Technology, 6(4), 381–414.
Uwents, W., & Blockeel, H. (2005). Classifying relational data with neural networks. In Lecture notes in computer science : Vol. 3625. Proceedings of the 15th international conference on inductive logic programming, ILP 2005 (pp. 384–396). Berlin: Springer.
Vapnik, V. N. (1998). Statistical learning theory. New York: Wiley.
Wang, Z., Hagenbuchner, M., Tsoi, A., Cho, S. Y., & Chi, Z. (2002). Image classification with structured self-organization map. In Proceedings of the 2002 international joint conference on neural networks, IJCNN 2002 (Vol. 2, pp. 1918–1923), Piscataway, NJ, USA. New York: IEEE Press.
Werbos, P. (1990). Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10), 1550–1560.
Yong, S., Hagenbuchner, M., Scarselli, F., Tsoi, A. C., & Gori, M. (2006). Document mining using graph neural networks. In N. Fuhr, M. Lalmas, & A. Trotman (Eds.), Lecture notes in computer science : Vol. 4518. Proceedings of the 5th international workshop of the initiative for the evaluation of XML retrieval, INEX 2006, revised and selected papers (pp. 458–472). Berlin: Springer.
Author information
Authors and Affiliations
Corresponding author
Additional information
Editors: Thomas Gärtner and Gemma Garriga.
Rights and permissions
About this article
Cite this article
Uwents, W., Monfardini, G., Blockeel, H. et al. Neural networks for relational learning: an experimental comparison. Mach Learn 82, 315–349 (2011). https://doi.org/10.1007/s10994-010-5196-5
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10994-010-5196-5