Kernels and Distances for Structured Data

Gärtner, Thomas; Lloyd, John W.; Flach, Peter A.

doi:10.1023/B:MACH.0000039777.23772.30


Published: December 2004

Machine Learning, Volume 57, pages 205–232 (2004)


Abstract

This paper brings together two strands of machine learning of increasing importance: kernel methods and highly structured data. We propose a general method for constructing a kernel following the syntactic structure of the data, as defined by its type signature in a higher-order logic. Our main theoretical result is the positive definiteness of any kernel thus defined. We report encouraging experimental results on a range of real-world data sets. By converting our kernel to a distance pseudo-metric for 1-nearest neighbour, we were able to improve the best accuracy from the literature on the Diterpene data set by more than 10%.
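The conversion from a kernel to a distance mentioned in the abstract can be illustrated with the standard kernel-induced distance d(x, y) = sqrt(k(x, x) - 2 k(x, y) + k(y, y)). The following is a minimal sketch, not the authors' implementation: the matching kernel on atoms, the cross-product kernel on sets, and all function names and toy data are assumptions chosen for illustration only.

```python
import math


def matching_kernel(a, b):
    """Assumed kernel on atomic values: 1 if the values are equal, else 0."""
    return 1.0 if a == b else 0.0


def set_kernel(s, t, base=matching_kernel):
    """Assumed kernel on finite sets: sum of the base kernel over all pairs."""
    return sum(base(x, y) for x in s for y in t)


def kernel_distance(k, x, y):
    """Distance induced by a positive definite kernel k."""
    squared = k(x, x) - 2.0 * k(x, y) + k(y, y)
    return math.sqrt(max(squared, 0.0))  # clamp small negatives from rounding


def nearest_neighbour(k, train, query):
    """1-nearest neighbour: label of the training example closest to query."""
    _, label = min(train, key=lambda ex: kernel_distance(k, ex[0], query))
    return label


# Hypothetical toy data: each example is a (set of atoms, class label) pair.
train = [
    (frozenset({"a", "b"}), "pos"),
    (frozenset({"c", "d"}), "neg"),
]
prediction = nearest_neighbour(set_kernel, train, frozenset({"a", "b", "c"}))
```

In general the induced distance is only a pseudo-metric, because two distinct objects can map to the same point in feature space and so have distance zero. Any kernel can be passed in place of `set_kernel`, so a kernel built along the structure of the data can be reused with distance-based learners without changing them.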




Author information

Authors and Affiliations

Fraunhofer Institut Autonome Intelligente Systeme, Germany; Department of Computer Science, University of Bristol, United Kingdom
Thomas Gärtner
Department of Computer Science III, University of Bonn, Germany
Thomas Gärtner
Research School of Information Sciences and Engineering, The Australian National University, Australia
John W. Lloyd
Machine Learning, Department of Computer Science, University of Bristol, United Kingdom
Peter A. Flach


About this article

Cite this article

Gärtner, T., Lloyd, J.W. & Flach, P.A. Kernels and Distances for Structured Data. Machine Learning 57, 205–232 (2004). https://doi.org/10.1023/B:MACH.0000039777.23772.30


Issue Date: December 2004
DOI: https://doi.org/10.1023/B:MACH.0000039777.23772.30
