Abstract
Many learning algorithms rely on distance metrics to process their input data. Research has shown that the choice of metric can improve the performance of these algorithms. Over the years, the most popular choice has been the Euclidean distance. In this paper, we investigate a number of metrics proposed by different communities, including the Mahalanobis, Euclidean, Kullback-Leibler and Hamming distances. Overall, the best-performing method is the Mahalanobis distance metric.
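As a rough illustration of the four distances named in the abstract, the sketch below uses their standard textbook definitions, which may differ in detail from the formulations used in the chapter. All function names, the `inv_cov` parameter (an inverse covariance matrix supplied by the caller), and the `nearest_neighbor` helper are assumptions made for this example. Note that the Kullback-Leibler divergence is not symmetric, so it is not a metric in the strict sense.

```python
import math

def euclidean(x, y):
    # Square root of the summed squared coordinate differences.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def hamming(x, y):
    # Number of positions at which two equal-length sequences differ.
    return sum(a != b for a, b in zip(x, y))

def kl_divergence(p, q):
    # Assumes p and q are discrete probability distributions and that
    # q[i] > 0 wherever p[i] > 0. Terms with p[i] == 0 contribute nothing.
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def mahalanobis(x, y, inv_cov):
    # sqrt(d^T * inv_cov * d), where d = x - y and inv_cov is the inverse
    # covariance matrix given as a list of rows (assumed precomputed).
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    weighted = [sum(inv_cov[i][j] * d[j] for j in range(n)) for i in range(n)]
    return math.sqrt(sum(d[i] * weighted[i] for i in range(n)))

def nearest_neighbor(query, points, labels, dist):
    # 1-nearest-neighbor: return the label of the closest stored point
    # under the supplied distance function.
    best = min(range(len(points)), key=lambda i: dist(query, points[i]))
    return labels[best]
```

With the identity matrix as `inv_cov`, `mahalanobis` reduces to `euclidean`. Passing a different `dist` function to `nearest_neighbor` is one simple way to compare how the choice of distance changes a classification.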
Copyright information
© 2010 Springer Science+Business Media B.V.
About this paper
Cite this paper
Walters-Williams, J., Li, Y. (2010). Comparative Study of Distance Functions for Nearest Neighbors. In: Elleithy, K. (eds) Advanced Techniques in Computing Sciences and Software Engineering. Springer, Dordrecht. https://doi.org/10.1007/978-90-481-3660-5_14
DOI: https://doi.org/10.1007/978-90-481-3660-5_14
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-3659-9
Online ISBN: 978-90-481-3660-5
eBook Packages: Computer Science, Computer Science (R0)