Nonparametric Hierarchical Clustering of Functional Data

Boullé, Marc; Guigourès, Romain; Rossi, Fabrice

doi:10.1007/978-3-319-02999-3_2

Part of the book series: Studies in Computational Intelligence (SCI, volume 527)

Abstract

In this paper, we address the problem of curve clustering. We propose a nonparametric method that partitions the curves into clusters and discretizes the dimensions of the curve points into intervals. The cross-product of these partitions forms a data grid, which is obtained using a Bayesian model selection approach that makes no assumptions regarding the curves. Finally, we propose a post-processing technique that reduces the number of clusters in order to improve the interpretability of the clustering. It consists of optimally merging the clusters step by step, which corresponds to an agglomerative hierarchical classification whose dissimilarity measure is the variation of the criterion. Interestingly, this measure is none other than the sum of the Kullback-Leibler divergences between cluster distributions before and after the merges. The practical interest of the approach for exploratory analysis of functional data is demonstrated and compared with an alternative approach on an artificial and a real-world data set.
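The merging step described in the abstract can be illustrated with a rough sketch. This is not the chapter's criterion: the Bayesian model selection criterion itself is not given here, so the sketch assumes the data grid is already fixed, represents each cluster by its point counts per grid cell, and uses a cluster-size-weighted sum of Kullback-Leibler divergences between each cluster's distribution and the merged distribution as the merge cost. The function names (`kl`, `merge_cost`, `agglomerate`), the size weighting, and the toy counts are all assumptions made for illustration.

```python
import numpy as np

def kl(p, q):
    """KL(p || q) in nats; cells where p == 0 contribute nothing."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def merge_cost(counts_a, counts_b):
    """Size-weighted sum of KL divergences between each cluster's cell
    distribution and the distribution of the merged cluster.
    Assumption: both clusters are non-empty (positive total count)."""
    merged = counts_a + counts_b
    q = merged / merged.sum()
    cost = 0.0
    for counts in (counts_a, counts_b):
        n = counts.sum()
        cost += n * kl(counts / n, q)
    return cost

def agglomerate(cluster_counts, n_clusters):
    """Greedily merge the cheapest pair until n_clusters remain.
    cluster_counts: list of 1-D arrays holding point counts per grid cell."""
    clusters = [np.asarray(c, dtype=float) for c in cluster_counts]
    members = [[i] for i in range(len(clusters))]
    history = []  # (members of first cluster, members of second, cost)
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                cost = merge_cost(clusters[i], clusters[j])
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        cost, i, j = best
        history.append((tuple(members[i]), tuple(members[j]), cost))
        clusters[i] = clusters[i] + clusters[j]
        members[i] = members[i] + members[j]
        del clusters[j], members[j]
    return members, history

# Hypothetical toy data: three clusters over a grid flattened to four cells.
counts = [np.array([10, 0, 0, 0]),
          np.array([9, 1, 0, 0]),
          np.array([0, 0, 5, 5])]
members, history = agglomerate(counts, n_clusters=2)
print(members)
```

In this toy example the first two clusters have nearly identical cell distributions, so they are the cheapest pair to merge, while the third cluster occupies disjoint cells and stays separate. Recording the cost of each merge in `history` is what would let a dendrogram be drawn from the successive merges.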

Author information

Authors and Affiliations

Orange Labs, 2 av. Pierre Marzin, 22300, Lannion, France
Marc Boullé & Romain Guigourès
SAMM EA 4543, Université Paris 1, 90 rue de Tolbiac, 75013, Paris, France
Romain Guigourès & Fabrice Rossi

Corresponding author

Correspondence to Marc Boullé.

Editor information

Editors and Affiliations

LINA (CNRS UMR 6241), University of Nantes, Nantes Cedex 3, France
Fabrice Guillet
LaBRI, University of Bordeaux 1, Talence Cedex, France
Bruno Pinaud
Dpt Informatique, University François Rabelais of Tours, Tours, France
Gilles Venturini
Laboratoire ERIC, Lumière University Lyon 2, Bron, France
Djamel Abdelkader Zighed

Cite this chapter

Boullé, M., Guigourès, R., Rossi, F. (2014). Nonparametric Hierarchical Clustering of Functional Data. In: Guillet, F., Pinaud, B., Venturini, G., Zighed, D. (eds) Advances in Knowledge Discovery and Management. Studies in Computational Intelligence, vol 527. Springer, Cham. https://doi.org/10.1007/978-3-319-02999-3_2

DOI: https://doi.org/10.1007/978-3-319-02999-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-02998-6
Online ISBN: 978-3-319-02999-3
eBook Packages: Engineering, Engineering (R0)
