
1 Introduction

Spectral clustering selects the dominant eigenvectors of a parametrized Gaussian affinity matrix in order to build an embedding space in which the clustering is performed. Many interpretations have been proposed to explain why the clustering succeeds in the embedding space: graph theory with random walks (Meila and Shi 2001), matrix perturbation theory (Ng et al. 2002), operators on manifolds (Belkin and Niyogi 2003), physical models such as the inhomogeneous ferromagnetic Potts model (Blatt et al. 1996), or Diffusion Maps (Nadler et al. 2006). But all these analyses are asymptotic in the number of points and do not explain why the method works for a finite data set. Moreover, another problem arises: the affinity parameter influences the clustering results (Ng et al. 2002; Von Luxburg 2007). The difficulty of defining an adequate parameter seems to be tightly connected to the lack of a clustering property explaining how the grouping in this low-dimensional space correctly defines the partitioning of the original data.

In this paper, we propose a fully theoretical interpretation of spectral clustering whose first steps were introduced by Mouysset et al. (2010). From this, we define a new clustering property in the embedding space at each step of the study, together with new results showing the role of the Gaussian affinity parameter. After recalling the spectral clustering method and the role of the affinity parameter in Sect. 2.1, we propose a continuous version of spectral clustering based on Partial Differential Equations (PDE). To do so, we consider a sampling of connected components and, from this, draw back to the original shapes. This leads us to formulate spectral clustering as an eigenvalue problem in which data points correspond to nodes of some finite element discretization, the Gaussian affinity matrix A is a representation of the heat kernel, and the affinity parameter σ plays the role of the heat parameter t. Hence, the first step is to introduce an eigenvalue problem based on the heat equation with a Dirichlet boundary condition. From this, in Sect. 2.2, we deduce an “almost” eigenvalue problem which can be associated with the Gaussian values; identifying a connected component then appears to be linked to these eigenfunctions. Then, by introducing the finite element approximation and mass lumping, we prove in Sect. 2.3 that this property is preserved, under conditions on t, for the eigenvectors computed by the spectral clustering algorithm. Finally, in Sect. 3, we study numerically, on a geometrical example, the difference between the eigenvectors from the spectral clustering algorithm and their associated discretized eigenfunctions of the heat equation, as a function of the affinity parameter t.

2 Interpretation

In the following, spectral clustering and its inherent problem are presented. Then we propose a continuous version of this method.

2.1 Spectral Clustering: Role of the Gaussian Parameter

Consider a data set \(\mathcal{P} =\{ x_{i}\}_{i=1..N} \in {\mathbb{R}}^{p}\). Assume that the number k of targeted clusters is known. First, spectral clustering consists in constructing the parametrized affinity matrix based on the Gaussian affinity measure between points of the data set \(\mathcal{P}\). After a normalization step, the spectral embedding in \({\mathbb{R}}^{k}\) is created by stacking the k largest eigenvectors. Each row of this matrix represents a data point \(x_{i}\), which is plotted in this embedding space and then grouped into clusters via the K-means method. Finally, thanks to an equivalence relation, the final partition of the data set is directly defined from the clustering in the embedding space.
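The pipeline just described can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the normalization follows the usual \(D^{-1/2}AD^{-1/2}\) scheme, and a simple Lloyd iteration with farthest-point initialization stands in for K-means.

```python
import numpy as np

def spectral_clustering(X, k, sigma):
    """Minimal sketch of the spectral clustering scheme described above.

    Normalization uses the D^{-1/2} A D^{-1/2} scheme; a simple Lloyd
    iteration stands in for K-means. Illustrative only."""
    # Parametrized Gaussian affinity matrix, with zero diagonal
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    # Normalization step
    d = A.sum(axis=1)
    L = A / np.sqrt(np.outer(d, d))
    # Spectral embedding: stack the k largest eigenvectors, normalize rows
    U = np.linalg.eigh(L)[1][:, -k:]
    U /= np.linalg.norm(U, axis=1, keepdims=True)
    # K-means stand-in: farthest-point init + a few Lloyd iterations
    idx = [0]
    for _ in range(k - 1):
        dc = np.min(((U[:, None] - U[idx][None]) ** 2).sum(-1), axis=1)
        idx.append(int(np.argmax(dc)))
    C = U[idx]
    for _ in range(20):
        labels = np.argmin(((U[:, None] - C[None]) ** 2).sum(-1), axis=1)
        C = np.array([U[labels == j].mean(0) if np.any(labels == j) else C[j]
                      for j in range(k)])
    return labels
```

On two well-separated point clouds this recovers the natural partition; the quality of the result, however, hinges on the choice of σ.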

Fig. 1 Geometrical example: (a) clustering result for σ = 0.8, (b) percentage of clustering error as a function of σ, (c) spectral embedding space for σ = 0.8

So this unsupervised method mainly relies on the Gaussian affinity measure, its parameter σ and its spectral elements. Moreover, it is known that the Gaussian parameter conditions the separability between the clusters in the spectral embedding space and should be well chosen (Von Luxburg 2007). The difficulty of fixing this choice seems to be tightly connected to the lack of results explaining how the grouping in this low-dimensional space correctly defines the partitioning of the original data for a finite data set. Figure 1 summarizes these remarks via a clustering error percentage, which evaluates the proportion of mis-clustered points, on a geometrical example of two concentric rectangles (Fig. 1a). For σ = 0.8, a value which produces clustering errors (Fig. 1b), the two clusters defined by K-means are represented in the spectral embedding (Fig. 1c) by black and grey colors, respectively. The embedded points describe a piece of a circle in which no separation by a hyperplane is possible. Thus, in the original space, both rectangles are cut in two, defining a bad clustering as shown in Fig. 1a.
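The influence of σ on separability can be probed numerically. The sketch below uses a hypothetical helper `embedding_separation` (not from the paper) that measures the eigengap between the k-th and (k+1)-th largest eigenvalues of the normalized affinity, a common proxy for how cleanly k clusters separate in the embedding:

```python
import numpy as np

def embedding_separation(X, sigma, k=2):
    """Eigengap between the k-th and (k+1)-th largest eigenvalues of the
    normalized affinity: a rough proxy for cluster separability in the
    spectral embedding. Illustrative helper, not from the paper."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-d2 / (2 * sigma ** 2))
    np.fill_diagonal(A, 0.0)
    d = A.sum(axis=1)
    # Eigenvalues of D^{-1/2} A D^{-1/2}, in ascending order
    w = np.linalg.eigvalsh(A / np.sqrt(np.outer(d, d)))
    return w[-k] - w[-k - 1]
```

For two well-separated groups the gap is large for a moderate σ and collapses when σ is far too large, mirroring the sensitivity illustrated in Fig. 1b.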

2.2 Through an Interpretation with PDE Tools

As the spectral elements used in spectral clustering do not explicitly provide such a topological criterion for a discrete data set, we draw back to a continuous formulation in which clusters appear as disjoint subsets, as shown in Fig. 2. In that way, we first have to define a clustering compatibility which establishes the link between the continuous interpretation and the discrete case. So we consider an open set Ω subdivided into k disjoint connected components.

Fig. 2 Principle of the interpretation with PDE tools

Definition 17.1 (Clustering Compatibility).

Let Ω be a bounded open set in \({\mathbb{R}}^{p}\) made of k disjoint connected components Ω i ,  i ∈ { 1, . . , k}, such that \(\varOmega =\bigcup _{ i=1}^{k}\varOmega _{i}\). Let \(\mathcal{P}\) be a set of points \(\{x_{i}\}_{i=1}^{N}\) in the open set Ω. Denote by \(\mathcal{P}_{j}\), for j ∈ { 1, . . , k}, the non-empty set of points of \(\mathcal{P}\) in the connected component Ω j of Ω: \(\mathcal{P}_{j} =\varOmega _{j} \cap \mathcal{P},\forall j \in \{ 1,..,k\}\). Let \(\mathcal{C} =\{ C_{1},..,C_{k^{\prime}}\}\) be a partition of \(\mathcal{P}\). If k = k′, then \(\mathcal{C}\) is a compatible clustering if \(\forall j \in \{ 1,..,k^{\prime}\},\exists i \in \{ 1,..,k\},\ C_{j} = \mathcal{P}_{i}\).

To draw a parallel in the L 2(Ω) space, data points belonging to the same subset of Ω correspond to points belonging to the same connected component. In the following, we formulate spectral clustering as an eigenvalue problem by regarding data points as nodes of some finite element discretization and by considering the Gaussian affinity matrix as a representation of the heat kernel. But as the spectrum of the heat operator in free space is essential, we make a link with a problem defined on a bounded domain, in which the spectrum is discrete. Then, since we compare the discrete data given by the elements of the affinity matrix with L 2 functions which are solutions of the heat equation, we introduce an explicit discretization with finite element theory, together with mass lumping to remove all dependence on the mesh. We then feed this analysis back into the application of spectral clustering by defining clustering properties along the successive approximations. Finally, this study leads to a functional role for σ and a new formulation of a spectral clustering criterion.

2.2.1 Link Between Gaussian Affinity and Heat Kernel in \({\mathbb{R}}^{p}\)

Recall that the Gaussian affinity element A ij between two data points x i and x j is defined by \(A_{\mathit{ij}} =\exp \left (-{\left \|x_{i} - x_{j}\right \|}^{2}/{2\sigma }^{2}\right )\). A direct link between the affinity A ij and the heat kernel on \(\mathbb{R}_{+}^{{\ast}}\times {\mathbb{R}}^{p}\), defined by \(K_{H}(t,x) = {(4\pi t)}^{-\frac{p} {2} }\exp \left (-\|{x\|}^{2}/4t\right )\), can be established as follows:

$$\displaystyle{ A_{\mathit{ij}} = {({2\pi \sigma }^{2})}^{\frac{p} {2} }K_{H}\left ({\sigma }^{2}/2,x_{i} - x_{j}\right ),\ \forall i\neq j,\ \forall (i,j) \in \{ 1,..,N\}. }$$
(1)
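Identity (1) can be verified numerically for any dimension p and any σ. The following is a quick sanity check, not part of the original development:

```python
import numpy as np

# Heat kernel on R^p: K_H(t, x) = (4*pi*t)^(-p/2) * exp(-|x|^2 / (4t))
def K_H(t, x):
    p = x.shape[-1]
    return (4 * np.pi * t) ** (-p / 2) * np.exp(-(x ** 2).sum(-1) / (4 * t))

rng = np.random.default_rng(0)
p, sigma = 3, 0.7
xi, xj = rng.normal(size=p), rng.normal(size=p)
# Left-hand side of (1): Gaussian affinity A_ij
A_ij = np.exp(-((xi - xj) ** 2).sum() / (2 * sigma ** 2))
# Right-hand side of (1): scaled heat kernel evaluated at t = sigma^2 / 2
rhs = (2 * np.pi * sigma ** 2) ** (p / 2) * K_H(sigma ** 2 / 2, xi - xj)
assert np.isclose(A_ij, rhs)
```

The check goes through exactly: at t = σ²/2, the factor (4πt)^(−p/2) becomes (2πσ²)^(−p/2) and the exponent −‖x‖²/4t becomes −‖x‖²/2σ².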

Equation (1) permits interpreting the affinity measure through a limit operator: the Gaussian affinity is interpreted as the heat kernel of a parabolic problem, and its Gaussian parameter σ as a heat parameter t. Consider the following parabolic problem, called the heat equation, for \(f \in {L}^{2}({\mathbb{R}}^{p})\):

$$\displaystyle{(\mathcal{P}_{{\mathbb{R}}^{p}})\left \{\begin{array}{@{}l@{\quad }l@{}} \partial _{t}u -\varDelta u = 0\ \text{ for }\ (t,x) \in {\mathbb{R}}^{+} \times {\mathbb{R}}^{p},\quad \\ u(0,x) = f(x)\ \text{ for }x \in {\mathbb{R}}^{p}. \quad \end{array} \right.}$$

Due to the fact that the spectrum of the heat operator in free space, noted S H , is essential and its eigenfunctions are not localized in \({\mathbb{R}}^{p}\) without boundary conditions, we have to restrict the domain of definition and make a link with a problem on a bounded domain Ω in which the eigenfunctions can be studied.

2.2.2 Clustering Property with Heat Equation

Let us now introduce the initial value problem in L 2(Ω), for f ∈ L 2(Ω):

$$\displaystyle{(\mathcal{P}_{\varOmega })\left \{\begin{array}{@{}l@{\quad }l@{}} \partial _{t}u -\varDelta u = 0\text{ in }{\mathbb{R}}^{+} \times \varOmega,\quad \\ u(t = 0) = f,\text{ in }\varOmega, \quad \\ u = 0,\text{ on }{\mathbb{R}}^{+} \times \partial \varOmega. \quad \end{array} \right.}$$

Denote by K D the Green’s kernel of \((\mathcal{P}_{\varOmega })\). The solution operator in \({H}^{2}(\varOmega ) \cap H_{0}^{1}(\varOmega )\) associated with this problem is defined, for \(f \in {L}^{2}(\varOmega )\), by:

$$\displaystyle{S_{D}(t)f(x) =\int _{\varOmega }K_{D}(t,x,y)f(y)\mathit{dy},\ x \in \varOmega.}$$

Consider \(\{(\widetilde{v_{n,i}})_{n>0},\ i \in \{ 1,..,k\}\} \subset H_{0}^{1}(\varOmega )\) such that the \(\widetilde{v_{n,i}}\) are the solutions of \(-\varDelta \widetilde{v_{n,i}} =\lambda _{n,i}\widetilde{v_{n,i}}\) on Ω i for i ∈ { 1, . . , k} and n > 0, extended by \(\widetilde{v_{n,i}} = 0\) on Ω∖Ω i . These functions are eigenfunctions of \((\mathcal{P}_{\varOmega })\) and their union is a Hilbert basis of \(H_{0}^{1}(\varOmega )\). Moreover, as \(\varOmega =\bigcup _{ i=1}^{k}\varOmega _{i}\), for all i ∈ { 1, . . , k} and n > 0, these eigenfunctions satisfy \(S_{D}(t)\widetilde{v_{n,i}} = {e}^{-\lambda _{n,i}t}\widetilde{v_{n,i}}\). So the eigenfunctions of S D have a geometrical property: the support of each is included in only one connected component. Thus a clustering property in the spectral embedding space can be established.
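This support property is easy to observe in a discrete analogue: for a 1D Dirichlet Laplacian discretized on two disjoint intervals, the operator is block diagonal, so every eigenvector vanishes outside a single connected component. The following is a finite-difference sketch, not the finite element setting of Sect. 2.3:

```python
import numpy as np

def dirichlet_laplacian(n, h):
    """Finite-difference -d^2/dx^2 with Dirichlet conditions, n interior nodes."""
    return (2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h ** 2

# Omega = Omega_1 ∪ Omega_2: two disjoint intervals give a block-diagonal operator
n1, n2, h = 15, 24, 0.05
L = np.block([[dirichlet_laplacian(n1, h), np.zeros((n1, n2))],
              [np.zeros((n2, n1)), dirichlet_laplacian(n2, h)]])
w, V = np.linalg.eigh(L)
for j in range(n1 + n2):
    on_1 = np.linalg.norm(V[:n1, j]) > 1e-8
    on_2 = np.linalg.norm(V[n1:, j]) > 1e-8
    assert on_1 != on_2  # each eigenvector lives in exactly one component
```

The interval sizes n1 = 15 and n2 = 24 are chosen so that the two blocks share no eigenvalue; otherwise the solver could legitimately return mixtures within a degenerate eigenspace, exactly as the continuous theory allows.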

Proposition 17.1 (Clustering Property).

For every point x ∈ Ω and ε > 0, denote by \(\rho _{x}^{\varepsilon }\) a regularized Dirac function centred at x: \(\rho _{x}^{\varepsilon } \in {C}^{\infty }(\varOmega,[0,1]),\ \rho _{x}^{\varepsilon }(x) = 1\) and \(\mathit{supp}(\rho _{x}^{\varepsilon }) \subset \mathcal{B}(x,\varepsilon )\) . The eigenfunctions of S D , noted \(\widetilde{v_{n,i}}\) , for i ∈{ 1,..,k} and n > 0, are such that for all x ∈ Ω, all i ∈{ 1,..,k} and all t > 0, the following result holds:

$$\displaystyle{ \left [\exists \varepsilon _{0} > 0,\ \forall \varepsilon \in ]0,\varepsilon _{0}[,\ \exists n > 0,\ (S_{D}(t)\rho _{x}^{\varepsilon }\vert \widetilde{v_{ n,i}})_{{L}^{2}(\varOmega )}\neq 0\right ]\Longleftrightarrow x \in \varOmega _{i} }$$
(2)

where \((f\vert g)_{{L}^{2}(\varOmega )} =\int _{\varOmega }f(y)g(y)\mathit{dy},\forall (f,g) \in {L}^{2}(\varOmega )\) is the usual scalar product in L 2 .

Proof.

By contraposition, let i ∈ { 1, . . , k} and a point x ∈ Ω j with j ≠ i. Let \(d_{x} = d(x,\partial \varOmega _{j}) > 0\) be the distance of x from the boundary of Ω j . By the hypothesis on Ω, we have \(d_{0} = d(\varOmega _{i},\varOmega _{j}) > 0\). So for all \(\varepsilon \in ]0,\inf (d_{x},d_{0})[\), \(\mathcal{B}(x,\varepsilon ) \subset \varOmega _{j}\). Then for all t > 0, \(\mathit{supp}(S_{D}(t)\rho _{x}^{\varepsilon }) \subset \varOmega _{j}\) and so, for all n > 0, \((S_{D}(t)\rho _{x}^{\varepsilon }\vert \widetilde{v_{n,i}})_{{L}^{2}(\varOmega )} = 0\). So no \(\varepsilon _{0} > 0\) verifies the left-hand side of (2). Conversely, let x ∈ Ω i ; for \(\varepsilon \in ]0,\inf (d_{x},d_{0})[\), \(\mathcal{B}(x,\varepsilon ) \subset \varOmega _{i}\), so the support of \(\rho _{x}^{\varepsilon }\) is in Ω i . As \((\widetilde{v_{n,i}})_{n>0}\) is a Hilbert basis of \({L}^{2}(\varOmega _{i})\) and \(\rho _{x}^{\varepsilon }(x) = 1\neq 0\), there exists n > 0 such that \((\rho _{x}^{\varepsilon }\vert \widetilde{v_{n,i}})\neq 0\). In this case, \((S_{D}(t)\rho _{x}^{\varepsilon }\vert \widetilde{v_{n,i}})_{{L}^{2}(\varOmega )} = {e}^{-\lambda _{n,i}t}(\rho _{x}^{\varepsilon }\vert \widetilde{v_{n,i}})\neq 0\).

By considering an open subset \(\mathcal{O}\) which approximates the open set Ω from the interior, such that \(\mathit{Volume}(\varOmega \setminus \mathcal{O}) \leq \varepsilon\) for ε > 0, both heat operators of \((\mathcal{P}_{{\mathbb{R}}^{p}})\) and \((\mathcal{P}_{\varOmega })\) can be compared on \(\mathcal{O}\). Let δ be the distance from \(\mathcal{O}\) to the boundary of Ω, as shown in Fig. 2. Since the difference between the Green kernels K H and K D can be estimated on \(\mathcal{O}\) as a function of the heat parameter t, the geometrical property can thus be preserved for the heat operator in free space restricted to \(\mathcal{O}\). Let v n, i be the eigenfunction \(\widetilde{v_{n,i}}\) whose support is restricted to \(\mathcal{O}\), for all i ∈ { 1, . . , k} and n > 0. From this, we obtain, for 0 < t < δ 2:

$$\displaystyle{ S_{H}^{\mathcal{O}}(t)v_{ n,i} = {e}^{-\lambda _{n,i}t}v_{ n,i} +\eta (t,v_{n,i}), }$$
(3)

\(\text{with }\|\eta (t,v_{n,i})\|_{{L}^{2}(\mathcal{O})} \rightarrow 0\text{ when }t \rightarrow 0\text{ and }\delta \rightarrow 0.\)

So we can prove that on \(\mathcal{O}\), the eigenfunctions of the solution operator of the bounded heat equation are quasi-eigenfunctions of \(S_{H}^{\mathcal{O}}\), up to a residual (Mouysset et al. 2010). Adapting the clustering property to the restricted heat operator \(S_{H}^{\mathcal{O}}\) requires introducing a hypothesis on the heat parameter t. Moreover, (2) is modified to involve non-null values by introducing a gap between the scalar products with the eigenfunctions, such that for all x ∈ \(\mathcal{O}\) and all i ∈ { 1, . . , k}:

$$\displaystyle\begin{array}{rcl} \left [\begin{array}{l} \exists \varepsilon _{0} > 0,\ \exists \alpha > 0,\ \forall \varepsilon \in ]0,\varepsilon _{0}[,\ \exists n > 0,\ \forall t > 0\text{ small enough,} \\ v_{n,i} =\arg \max _{\{v_{m,j},m\in \mathbb{N},j\in [\vert 1,k\vert ]\}}\left \vert (S_{H}^{\mathcal{O}}(t)\rho _{x}^{\epsilon }\vert v_{m,j})_{{L}^{2}(\mathcal{O})}\right \vert \\ \text{ and }\left \vert (S_{H}^{\mathcal{O}}(t)\rho _{x}^{\varepsilon }\vert v_{n,i})_{{L}^{2}(\mathcal{O})}\right \vert >\alpha \end{array} \right ]\Longleftrightarrow x \in \mathcal{O}_{i}.& &{}\end{array}$$
(4)

These previous results prove that, in infinite dimension, a clustering can be realized in the spectral embedding space because the eigenfunctions have a geometrical property. This study leads to the following question: do the eigenvectors of the affinity matrix behave like the eigenfunctions of \((\mathcal{P}_{\varOmega })\)?

2.3 Discretization with Finite Elements

From this, we look for a similar behaviour in the eigenvectors of A by introducing a finite-dimensional representation matching the initial data set \(\mathcal{P}\), with the help of finite elements (Ciarlet 1978). So we consider the data points as a finite-dimensional approximation, and the elements of the affinity matrix built from the data points as nodal values of \(S_{H}^{\mathcal{O}}\).

2.3.1 Approximation in Finite Dimension

Let τ h be a triangulation of \(\bar{\mathcal{O}}\) such that \(h =\max _{K\in \tau _{h}}\ h_{K}\), h K being a characteristic length of the triangle K. Consider a finite decomposition of the domain, \(\bar{\mathcal{O}} = \cup _{K\in \tau _{h}}K\), in which \((K,P_{K},\varSigma _{K})\) satisfies the Lagrange finite element assumptions for all K ∈ τ h . We also define the finite-dimensional approximation space \(V _{h} =\{ w \in {\mathcal{C}}^{0}(\bar{\mathcal{O}});\forall K \in \tau _{h},w_{\vert K} \in P_{K}\}\), and denote by Π h the linear interpolation from \({C}^{0}(\bar{\mathcal{O}})\) to V h , with the usual scalar product \((\cdot \vert \cdot )_{{L}^{2}(V _{h})}\) (Ciarlet 1978). With these notations, for t > 0 and \({h}^{3p+2} < {t}^{2}\), the Π h -mapped operator \(S_{H}^{\mathcal{O}}\) applied to each shape function ϕ j satisfies, for all 1 ≤ j ≤ N:

$$\displaystyle{ {(4\pi t)}^{\frac{p} {2} }\varPi _{h}(S_{H}^{\mathcal{O}}(t)\phi _{j})(x) =\sum _{ k=1}^{N}\left ((A + \mathbb{I}_{N})M\right )_{kj}\phi _{k}(x) + O\left (\frac{{h}^{3p+2}} {{t}^{2}} \right ), }$$
(5)

where M stands for the mass matrix defined by \(M_{\mathit{ij}} = (\phi _{i}\vert \phi _{j})_{{L}^{2}(V _{h})}\). Equation (5) means that the affinity matrix defined in (1) and used in the spectral algorithm can be interpreted as the Π h -projection of the solution operator of \((\mathcal{P}_{{\mathbb{R}}^{p}})\), with M the mass matrix from finite element theory (Mouysset et al. 2010).

So we can formulate a finite element approximation of the continuous clustering result (3). From the eigenfunctions of S D restricted to \(\mathcal{O}\), their projections onto V h , noted W n, i , are defined by \(W_{n,i} =\varPi _{h}v_{n,i} \in V _{h},\forall i \in \{ 1,..,k\}\). So, for \({h}^{\frac{3p+2} {2} } < t {<\delta }^{2}\), the following result can be established:

$$\displaystyle{ {(4\pi t)}^{\frac{-p} {2} }(A + \mathbb{I}_{N})MW_{n,i} = {e}^{-\lambda _{n,i}t}W_{n,i} +\varPsi \left (t,h\right ), }$$
(6)

where \(\|\varPsi \left (t,h\right )\|_{{L}^{2}(V _{h})} \rightarrow 0\) when t → 0 and δ → 0. Equation (6) shows that the geometrical property is preserved in finite dimension on the eigenvectors of \((A + \mathbb{I}_{N})M\). Moreover, a lower bound for the heat parameter was defined. But all these previous results involve the mass matrix, which depends entirely on the finite elements. In order to remove this dependence, the mass lumping process is investigated.

2.3.2 Mass Lumping

The mass lumping method consists in using a quadrature formula whose integration points are the interpolation points of the finite element. So let \(\mathcal{I}_{K}\) be the list of indices of the points which are nodes of K ∈ τ h . Consider the quadrature scheme, exact for polynomials of degree ≤ 1:

$$\displaystyle{ \int _{K}\phi (x)\mathit{dx} \approx \sum _{i\in \mathcal{I}_{K}}\frac{\vert K\vert } {3} \phi (x_{i}) }$$
(7)

where | K | is the area of the finite element K. So, with an additional regularity condition on the mesh which bounds | K |, mass lumping permits considering the mass matrix M as proportional to the identity matrix. So (6) is modified so that ∃ α > 0 such that:
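For 1D linear (P1) elements on a uniform mesh, this lumping is explicit: summing each row of the consistent mass matrix onto the diagonal yields h times the identity at interior nodes, so M indeed acts as a multiple of the identity. This is an illustrative 1D sketch; the paper works with triangles in \({\mathbb{R}}^{p}\):

```python
import numpy as np

def p1_mass_matrix(n, h):
    """Consistent mass matrix of 1D P1 elements on a uniform mesh of size h
    (interior nodes only): tridiag(h/6, 2h/3, h/6)."""
    return (2 * h / 3) * np.eye(n) + (h / 6) * (np.eye(n, k=1) + np.eye(n, k=-1))

n, h = 10, 0.1
M = p1_mass_matrix(n, h)
# Mass lumping: replace M by the diagonal matrix of its row sums
M_lumped = np.diag(M.sum(axis=1))
# Interior nodes lump to exactly h (boundary rows to 5h/6, lacking one neighbor)
assert np.allclose(np.diag(M_lumped)[1:-1], h)
```

After lumping, M is a scalar multiple of the identity away from the boundary, which is precisely what removes the mesh dependence in (6).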

$$\displaystyle{ \alpha \left (A + \mathbb{I}_{N}\right )W_{n,i} = {e}^{-\lambda _{n,i}t}W_{ n,i} +\varPsi ^{\prime}(t,h), }$$
(8)

where \(\|\varPsi ^{\prime}(t,h)\|_{{L}^{2}(V _{h})} \rightarrow 0\) when t → 0 and δ → 0. The finite-dimensional approximation of the clustering property (4) is reformulated as follows, for all \(x_{r} \in \mathcal{P}\) and all i ∈ { 1, . . , k}:

$$\displaystyle{ \left [\begin{array}{l} \exists \alpha > 0,\ \exists n > 0,\ \forall t > 0,t,{h}^{2}/t\text{ and }{h}^{3p+2}/{t}^{2}\text{ small enough, } \\ W_{n,i} =\arg \max _{\{W_{m,j},m\in \mathbb{N},j\in [\vert 1,k\vert ]\}}\left \vert \left (\left (A + \mathbb{I}_{N}\right )_{.r}\vert W_{m,j}\right )_{{L}^{2}(V _{h})}\right \vert \\ \text{ and }\left \vert \left (\left (A + \mathbb{I}_{N}\right )_{.r}\vert W_{n,i}\right )_{{L}^{2}(V _{h})}\right \vert >\alpha \end{array} \right ]\Longleftrightarrow x_{r} \in \mathcal{O}_{i}, }$$
(9)

where \(\left ((A + \mathbb{I}_{N})\right )_{.r}\) is the rth column of the matrix \((A + \mathbb{I}_{N})\), for all r ∈ { 1, . . , N}. This leads to the same clustering for a set of data points whether we consider eigenfunctions in \({L}^{2}(\varOmega )\) or Π h -interpolated eigenfunctions in the approximation space V h . Under an asymptotic condition on the heat parameter t (or Gaussian parameter σ), points belonging to the same cluster have the maximum of their projection coefficients along the same eigenvector. So the clustering in the spectral embedding space provides the clustering in the data space.
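The max-projection assignment of criterion (9) can be sketched numerically. In this illustrative simplification, the two leading eigenvectors of \(A + \mathbb{I}_{N}\) stand in for the discretized eigenfunctions W n, i (which would require an actual mesh), and the mass matrix is taken as the identity, as mass lumping allows:

```python
import numpy as np

# Sketch of criterion (9): point x_r is assigned to the cluster whose
# eigenvector maximizes |((A + I)_{.r} | W)|.  The two leading eigenvectors
# of A + I stand in for the discretized eigenfunctions W_{n,i}, and the mass
# matrix is taken as the identity (mass lumping) -- an illustrative
# simplification, not the paper's exact construction.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.1, (15, 1)), rng.normal(4, 0.1, (15, 1))])
sigma = 0.5
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
A = np.exp(-d2 / (2 * sigma ** 2))
np.fill_diagonal(A, 0.0)
B = A + np.eye(len(X))
W = np.linalg.eigh(B)[1][:, -2:]   # stand-ins for the W_{n,i}
proj = np.abs(B.T @ W)             # |(B_{.r} | W_j)| for each point r
labels = np.argmax(proj, axis=1)   # maximal projection coefficient
```

With σ well below the inter-cluster distance, A is nearly block diagonal and each point's column of \(A + \mathbb{I}_{N}\) projects almost entirely onto the eigenvector supported on its own cluster, so the argmax recovers the natural partition.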

3 Gaussian Parameter: A Geometrical Example

The previous theoretical interpretation proves that the Gaussian parameter should be chosen within a specific interval in order to improve the separability between clusters in the spectral embedding space. In order to test the parallel between the continuous version and the approximate one, we consider a geometrical example with non-convex shapes, as shown in Fig. 3a. For each connected component (or each cluster) i ∈ { 1, 2}, the discretized eigenfunction, noted W 1, i , associated with the first eigenvalue of each connected component, and the eigenvector, noted Y i , which gives the maximum projection coefficient with W 1, i , are plotted in Fig. 3b, c and e, f, respectively. The correlation ω between W 1, i and Y i , defined by \(\omega = \vert (W_{1,i}\vert Y _{i})\vert {(\|W_{1,i}\|_{2}\|Y _{i}\|_{2})}^{-1}\), is represented as a function of the heat parameter t in Fig. 3d. The vertical black dash-dot lines indicate the lower and upper estimated bounds of the heat parameter. Within this interval, the correlation between the continuous version and the eigenvectors of the Gaussian affinity matrix is maximal, so the clusters are well separated in the spectral embedding space.

Fig. 3 (a) Data set (N = 669), (b) and (c) discretized eigenfunctions of S D , (d) correlation ω between the continuous version and its discrete approximation as a function of t, (e) and (f) eigenvectors of A which provide the maximum projection coefficient with the eigenfunctions of S D

4 Conclusion

In this paper, spectral clustering was formulated as an eigenvalue problem. From this interpretation, a clustering property on the eigenvectors and conditions on the Gaussian parameter have been defined. This leads to an understanding of how spectral clustering works and shows how the clustering results can be affected by a bad choice of the affinity parameter. The normalization step was not taken into account in this paper; however, its role, which is crucial for ordering the largest eigenvectors of each connected component among the first eigenvectors, should be studied.