1 Introduction

The main goal of this paper is to propose an adaptation of the Eikonal equation on weighted graphs, using the framework of Partial difference Equations [17], with the motivation of extending this equation’s applications to any discrete data that can be represented by graphs. This adaptation leads to a finite difference equation whose coefficients depend on the data geometry, and to an efficient algorithm, extending the Fast Marching algorithm, that propagates multiple fronts without any restriction on the direction of their propagation. We will show that the combination of these coefficients and the graph topology enables the resolution of many applications in image and high dimensional data processing.

Many applications involve data defined on topologically complex domains. These data can be defined on manifolds (e.g., a sphere) or irregularly shaped domains, defined on network-like structures (e.g., network communities), or defined as high dimensional point clouds such as collections of feature vectors. Such unorganized data can be conveniently represented as graphs, where the vertices represent initial data and the edges represent interactions between them. Moreover, the use of a graph representation for usual images also makes it possible to take into account local and non-local interactions, and leads to very powerful tools for non-local image processing [8, 19].

Processing and analyzing such structured types of data is a major challenge for both the image and machine learning communities. Hence, it is very important to transfer to graphs and networks the many tools that were initially developed on usual Euclidean spaces and have proven efficient for problems dealing with usual image and signal domains.

Classical approaches for graph processing mainly come from graph theory, and one can distinguish two main categories of methods. Methods of the first category are based on the minimization of an energy, with applications in semi-supervised segmentation: one can quote graph cuts [5], random walks [20] or, more recently, the power watershed [14]. The second category groups techniques based on spectral graph theory [12]. They have been successfully used for image filtering [24], image segmentation [42], data clustering [30], community extraction in complex networks [27], and so on.

There has also recently been much interest in transposing signal processing tools onto graphs. One can quote the generalization of wavelet approaches to graphs, with the work of Coifman et al. on diffusion wavelets [13], Jansen on multiscale methods [31], or more recently Hammond et al. on spectral transforms [23].

Similarly, there are recent works that aim to transpose Partial Differential Equations (PDEs) on graphs. These works exploit discrete calculus to perform such a transcription [25]. Discrete Calculus has been used in recent years to produce reformulations of continuous problems onto a graph in such a manner that the solution behaves analogously to the continuous formulation. See [22] and references therein for a complete overview of that subject, with applications in image processing and machine learning.

To transpose PDEs on graphs, one approach consists in exploiting Partial difference Equations (PdEs) on graphs. Conceptually, PdEs are methods that mimic PDEs on a general graph domain by replacing differential operators with difference operators on graphs. Historically, they were first introduced in the seminal paper of Courant, Friedrichs and Lewy [15]. The study of PdEs then appeared to be a subject of interest in its own right, dealing with existence and qualitative behavior problems [3, 32, 34]. The introduction of such methods in image processing started with the work of Chan et al. [10], who introduced the TV digital filter for image denoising, the discrete analogue of total variation on unweighted graphs. Zhou has also used TV on graphs for semi-supervised classification [51].

Following the line of research we developed in previous works, we base the contributions of this paper on difference operators on graphs [24]. The motivation is that these operators allow continuous formulations to be adapted to graphs simply by replacing continuous operators with their discrete analogues. In particular, most techniques based on the p-Laplacian and gradients can be handled with such operators on graphs in a very straightforward and similar manner [17, 47]. Such an approach yields an adaptation to graphs that is not necessarily consistent with the continuous formulation (see [22]). This point is however not a problem for this paper and will be investigated in future works.

In previous works, using the PdEs formalism, we have introduced non-local difference operators on graphs, and used the framework of PdEs to transcribe PDEs on graphs [4]. In particular, in [17], we have introduced a non-local discrete regularization on graphs of arbitrary topologies as a framework for image and data filtering and clustering. With the same ideas, we have proposed PdE morphological processes on graphs that are a transcription of continuous morphological PDEs [45]. Recently, we have also adapted a time-dependent version of the Eikonal equation with PdE morphological processes on graphs [46, 47].

Eikonal Equation Background

The Eikonal equation is a special case of the following general continuous static Hamilton-Jacobi equations:

$$ \begin{cases} H(x,f,{\nabla}f)=0,& x\in \varOmega\subset \mathbb {R}^m, \\[2pt] f(x) = \psi(x), & x\in \varGamma\subset\varOmega, \\ \end{cases} $$
(1)

where ψ is a positive function defined on a domain Ω and f(x) is the traveling time or distance from the source Γ. Then, the Eikonal equation can be expressed by using the following Hamiltonian:

$$ H(x,f,{\nabla}f)=\bigl\|{\nabla}f(x)\bigr\|-P(x), $$
(2)

where P is a given potential function. This equation can be linked to the level-set formulation for advancing fronts introduced by Sethian [40]:

$$ \frac{\partial \phi(x,t)}{\partial t}=F(x)\bigl\|{\nabla}\phi(x,t)\bigr\|, $$

(3)

where ϕ is the level-set representation of Γ and F=1/P is the speed of the front. The relation between this formulation and the Eikonal equation stems from the following change of variable: ϕ(x,t)=t−f(x), under the condition that F is positive on the whole domain Ω.
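Writing F=1/P for the speed of the front, this change of variable can be verified directly:

$$ \phi(x,t) = t - f(x) \quad\Longrightarrow\quad \frac{\partial\phi}{\partial t} = 1 \quad\mbox{and}\quad \nabla\phi = -\nabla f, $$

so that the level-set equation reduces to \(F(x)\|\nabla f(x)\| = 1\), i.e., \(\|\nabla f(x)\| = P(x)\), which is the Eikonal equation (2).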

Solutions of the static equation (2) are usually based on a discretization of the Hamiltonian, where the approximations are performed with Godunov methods [28] or Lax-Friedrichs schemes [40]. Numerous numerical schemes have been proposed and investigated for solving the nonlinear system described by (2). Among the existing ones, we can quote the following. (i) An iterative scheme proposed in [37], based on fixed point methods that solve a quadratic equation. (ii) The fast sweeping methods [50], which use Gauss-Seidel type iterations to update the distance function field; the key point of fast sweeping is to update the points in a certain order. (iii) Tsitsiklis [48] was the first to develop a Dijkstra-like method and proposed an optimal algorithm for solving the Eikonal equation. Based on the same idea, Sethian [33, 40] proposed the fast marching methods.

Another approach to solve Eq. (2) is to consider a time dependent version of the equation:

$$ \begin{cases} \displaystyle \frac{\partial f(x,t)}{\partial t}=-\bigl\|{\nabla}f(x)\bigr\|+P(x), &x\in \varOmega\subset \mathbb {R}^m, \\[4pt] f(x,t)=\psi(x), &x\in \varGamma\subset \mathbb {R}^m, \\[2pt] f(x,0)=\psi_0(x), &x\in \varOmega. \end{cases} $$
(4)

At steady state, the solution of system (4) satisfies the Eikonal equation (2). Recently, in a conference paper [46], we proposed an adaptation of the time-dependent formulation of the Eikonal equation over weighted graphs. Based on PdEs, the analogue of (4) on a weighted graph \(G=\left(V,E,w\right)\) is

$$ \begin{cases} \displaystyle \frac{\partial f(u,t)}{\partial t} =-\bigl\| \bigl(\nabla^-_wf\bigr)(u)\bigr\|_p+P(u), &u\in V,\\[4pt] f(u,t)=\psi(u), & u\in V_0\subset V, \\[2pt] f(u,0)=\psi_0(u), & u\in V, \end{cases} $$
(5)

where V corresponds to the set of vertices of the graph and V 0 is the initial set of seed vertices. The operator \(\nabla^{-}_{w}\) corresponds to the weighted internal morphological gradient on graphs (detailed in Sect. 2.2) and ∥.∥p denotes the \(\mathcal{L}_p\)-norm. One can see that formulation (5) needs numerous iterations to converge to the solution of the Eikonal equation, due to the finite propagation speed and CFL conditions.

Contributions

This work generalizes and significantly extends our previous works on the Eikonal equation. First, we propose an adaptation of the stationary version of the Eikonal equation over arbitrary weighted graphs. Based on PdEs, our adaptation of (2) is given by the following finite difference equation:

$$ \begin{cases} \bigl\|\bigl(\nabla^-_w f\bigr)(u)\bigr\|_p = P(u), & \forall u \in V,\\[2pt] f(u) = 0, & \forall u \in V_0\subset V. \end{cases} $$

(6)

Explicit solutions of this equation are given for the particular values p∈{1,2,∞}. An efficient algorithm to obtain such solutions, using the Fast Marching updating scheme, is proposed, and proofs of existence and uniqueness are also provided. This formulation generalizes front propagation methods on weighted graphs and recovers well-known schemes, such as the Osher-Sethian discretization or the Dijkstra shortest path algorithm, for specific graphs and values of p.

Such an adaptation of the Eikonal equation on graphs enables the transcription of efficient algorithms from the field of image processing to a huge variety of discrete data that can be represented by a weighted graph. Using this adaptation, we also propose a new fast algorithm for the propagation and tracking of many concurrent fronts on a weighted graph, whose complexity is independent of the number of fronts. Such an algorithm leads to several applications on weighted graphs, such as semi-supervised image segmentation or data clustering.

In these first two contributions, we only consider the case where the front evolves in the outward normal direction (with a speed that is non-negative everywhere), but it is also interesting to consider both inward and outward directions, i.e., both positive and negative speeds. In particular, in the case of semi-supervised clustering, such inward and outward evolutions enable errors due to wrong initialization to be reduced or overcome. Finally, we generalize the previous algorithm to the case where the speeds of the different fronts can be either positive or negative, which leads to a graph-based active contour model with many contours. This generalization provides a complete tool for multiple front propagation on arbitrary graphs, and offers a novel extension of the classical Fast Marching when applied to regular grid graphs.

Paper Organization

The rest of this paper is organized as follows. In Sect. 2, we provide definitions and notations used in this work. In Sect. 3, we present our new finite difference equation on weighted graphs, with proofs of existence and uniqueness; explicit solutions for p∈{1,2,∞} are also given. Section 4 introduces two efficient algorithms for label propagation, using the previous equation and an adaptation of the Fast Marching algorithm. Then, Sect. 5 presents several experiments that illustrate the behavior and efficiency of the proposed formulations and algorithms, such as geodesic distance computation, semi-supervised image segmentation (with non-local configurations), active contours and data clustering. Finally, Sect. 6 concludes this paper.

2 Operators on Graphs

In this section, we provide notations and basics on weighted graphs, the core structure of our approach. We recall our formulations of differences, morphological differences and gradients on weighted graphs [4, 17, 45, 46]. These formulations constitute the basis of our proposed numerical scheme for solving the Eikonal equation on weighted graphs.

2.1 Notations

We consider the general situation where any discrete domain can be viewed as a weighted graph. Let G=(V,E,w) be a weighted graph composed of two finite sets: vertices V and weighted edges \(E\subseteq V\times V\). An edge (u,v)∈E connects two adjacent (neighbor) vertices u and v. The neighborhood of a vertex u is noted \(N(u)=\{v\in V\setminus\{u\} : (u,v)\in E\}\). The weight w(u,v) of an edge (u,v) is defined with a function w:V×V→ℝ+ such that w(u,v)>0 if (u,v)∈E, and w(u,v)=0 otherwise. For the sake of simplicity, w(u,v) will be denoted by \(w_{uv}\). Graphs are assumed to be simple, connected and undirected, implying that the function w is symmetric.
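For concreteness, these notations can be sketched with a small container storing the symmetric weight function as an adjacency map. This is an illustrative Python sketch (names such as `Graph` and `add_edge` are ours, not from the paper):

```python
# Minimal weighted-graph container for the notations of Sect. 2.1.
# Names (Graph, add_edge, neighbors) are illustrative, not from the paper.

class Graph:
    def __init__(self):
        self.adj = {}                    # vertex -> {neighbor: weight}

    def add_vertex(self, u):
        self.adj.setdefault(u, {})

    def add_edge(self, u, v, w):
        # Graphs are simple and undirected, so w is stored symmetrically.
        assert u != v and w > 0
        self.add_vertex(u); self.add_vertex(v)
        self.adj[u][v] = w
        self.adj[v][u] = w

    def neighbors(self, u):
        # N(u) = {v in V \ {u} : (u,v) in E}
        return self.adj[u]

    def w(self, u, v):
        # w(u,v) > 0 if (u,v) in E, and 0 otherwise
        return self.adj.get(u, {}).get(v, 0.0)

g = Graph()
g.add_edge('a', 'b', 1.0)
g.add_edge('b', 'c', 0.5)
```

The dictionary-of-dictionaries layout is one of several reasonable choices; it gives constant-time access to N(u) and w(u,v), which the algorithms of Sects. 3 and 4 rely on.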

Let f:V→ℝ be a real-valued function that assigns a real value f(u) to each vertex u∈V. We denote by \(\mathcal{H}(V)\) the Hilbert space of such functions, and similarly by \(\mathcal{H}(E)\) the Hilbert space of functions that assign a real value to each edge of E. These two spaces are endowed with the following inner products:

$$ \langle f, g\rangle_{\mathcal{H}(V)} = \sum_{u\in V} f(u)g(u), $$

(7)

with \(f,g\in\mathcal{H}(V)\), and

$$ \langle F, G\rangle_{\mathcal{H}(E)} = \sum_{(u,v)\in E} F(u,v)G(u,v), $$

(8)

where \(F,G\in\mathcal{H}(E)\).

Given a function \(f\in\mathcal{H}(V)\), the integral of f is defined as

$$ \int\limits_{V} f = \sum\limits_{u\in V}f(u) $$
(9)

and its \(\mathcal{L}_p\)-norm is given by

$$ \everymath{\displaystyle} \begin{array}{l} \|f\|_p = \biggl(\sum\limits_{u \in V}\bigl|f(u)\bigr|^p\biggr)^{1/p} , \quad 1 \leq p < \infty,\\[12pt] \|f\|_\infty = \max\limits_{u \in V}\bigl(\bigl|f(u)\bigr|\bigr), \quad p = \infty. \end{array} $$
(10)

Let \(A\subset V\) be a set of connected vertices, i.e., such that for all \(u\in A\), there exists a vertex \(v\in A\) with (u,v)∈E. We denote by \(\partial^+A\) and \(\partial^-A\) the external and internal boundary sets of A, respectively:

$$ \begin{array}{l} \partial^+A = \bigl\{u\in A^c : \exists v\in A,\ (u,v)\in E\bigr\},\\[2pt] \partial^-A = \bigl\{u\in A : \exists v\in A^c,\ (u,v)\in E\bigr\}, \end{array} $$

(11)

where \(A^c = V\setminus A\) is the complement of A.

2.2 Operators and Gradients on Weighted Graphs

The weighted gradient operator, or weighted difference operator, of a function \(f\in\mathcal{H}(V)\), noted \(\stackrel{\rightarrow}{\nabla_w}\) or \(d_w\), is defined on an edge (u,v)∈E by

$$ \bigl( \stackrel{\rightarrow}{\nabla_w f}\bigr)(u,v) \stackrel{\mathit{def}.}{=} (d_w f)(u,v) \stackrel{\mathit{def}.}{=} \sqrt{w(u,v)}\bigl(f(v) - f(u) \bigr). $$
(12)

The adjoint of the weighted gradient operator, noted \(\stackrel{\rightarrow}{\nabla^*_w}:\mathcal{H}(E)\to\mathcal{H}(V)\), is defined by:

$$ \bigl\langle\stackrel{\rightarrow}{\nabla_w f}, H\bigr\rangle \stackrel{\mathit{def}.}{=} \bigl\langle f, \stackrel{\rightarrow}{\nabla^*_wH}\bigr\rangle, $$
(13)

with \(f\in\mathcal{H}(V)\) and \(H\in\mathcal{H}(E)\), and can be expressed as

$$ \bigl(\stackrel{\rightarrow}{\nabla^*_w H}\bigr)(u) \stackrel{\mathit{def}.}{=} \sum\limits_{v\sim u}\sqrt{w(u,v)}\bigl(H(v,u) - H(u,v)\bigr), $$
(14)

where vu means that v is adjacent to u.

This adjoint is linear and measures the flow of a function of \(\mathcal{H}(E)\) at each vertex of the graph. Similarly to the continuous case, the divergence of a function \(F\in\mathcal{H}(E)\) is defined by \(\mathrm{div}_{w}F = -\stackrel{\rightarrow}{\nabla^{*}_{w} F}\).

These two definitions of the weighted gradient operator and its adjoint allow a family of first and second order operators on graphs to be defined, such as the p-Laplace operator. In this paper, however, we only focus on the first order weighted gradient operator.

Based on the weighted gradient operator definition, two weighted directional gradient operators are defined. The weighted directional external and internal gradient operators are defined as \(\stackrel{\rightarrow}{\nabla_w^+}, \stackrel{\rightarrow}{\nabla_w^-}:\mathcal{H}(V)\to\mathcal{H}(E)\), with

$$ \begin{array}{l} \bigl(\stackrel{\rightarrow}{\nabla_w^+ f}\bigr)(u,v) \stackrel{\mathit{def}.}{=} \sqrt{w(u,v)}\bigl(f(v) - f(u) \bigr)^+,\\[3pt] \bigl(\stackrel{\rightarrow}{\nabla_w^- f}\bigr)(u,v) \stackrel{\mathit{def}.}{=} \sqrt{w(u,v)}\bigl(f(v) - f(u) \bigr)^- , \end{array} $$
(15)

with the following notations:

$$(x)^+ = \max(x,0)\quad \mbox{and}\quad (x)^- = -\min(x,0). $$
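As an illustration of definitions (12) and (15), the edge-wise gradients can be sketched as follows (a Python sketch with function names of our choosing):

```python
import math

def plus(x):  return max(x, 0.0)     # (x)^+ = max(x, 0)
def minus(x): return -min(x, 0.0)    # (x)^- = -min(x, 0)

def edge_grad(f, w, u, v):
    # (d_w f)(u,v) = sqrt(w_uv) (f(v) - f(u)), Eq. (12)
    return math.sqrt(w[(u, v)]) * (f[v] - f[u])

def edge_grad_plus(f, w, u, v):
    # directional external gradient, Eq. (15)
    return math.sqrt(w[(u, v)]) * plus(f[v] - f[u])

def edge_grad_minus(f, w, u, v):
    # directional internal gradient, Eq. (15)
    return math.sqrt(w[(u, v)]) * minus(f[v] - f[u])

# toy edge with w_uv = 4 and f(u) = 1, f(v) = 3
f = {'u': 1.0, 'v': 3.0}
w = {('u', 'v'): 4.0, ('v', 'u'): 4.0}
```

Since x = (x)^+ − (x)^-, one always has \((d_wf)(u,v) = (\stackrel{\rightarrow}{\nabla_w^+ f})(u,v) - (\stackrel{\rightarrow}{\nabla_w^- f})(u,v)\).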

The weighted gradient of a function \(f\in\mathcal{H}(V)\) at a vertex u is defined as the vector of all weighted gradients with respect to the set of edges (u,v)∈E:

$$ (\nabla_wf)(u)\stackrel{\mathit{def}.}{=} \bigl(\stackrel{\rightarrow}{\nabla_w f}(u,v)\bigr)_{v \in V}. $$
(16)

In the sequel, weighted gradient will refer to this gradient defined on vertices. Similarly, the weighted morphological internal and external gradients at a vertex u are expressed as

$$ \begin{aligned} \bigl(\nabla_w^+f\bigr)(u)=&\bigl(\stackrel{\rightarrow}{\nabla_w^+ f}\bigr)(u,v)_{v \in V}\quad \mbox{and}\\[3pt] \bigl(\nabla_w^-f\bigr)(u)=&\bigl(\stackrel{\rightarrow}{\nabla_w^- f}\bigr)(u,v)_{v \in V}. \end{aligned} $$
(17)

The corresponding \(\mathcal{L}_p\)-norms of the gradients (17) and (16) at a vertex u are

$$ \everymath{\displaystyle}\begin{array}{l} \bigl\| \bigl(\nabla_w^+f\bigr)(u)\bigr\|_p= \biggl\lbrack \sum\limits_{v \sim u} w_{uv}^{p/2}\bigl| \bigl(Df(u)\bigr)^+\bigr|^p \biggr\rbrack^{1/p},\\[12pt] \bigl\|\bigl(\nabla_w^-f\bigr)(u)\bigr\|_p= \biggl\lbrack \sum\limits_{v \sim u} w_{uv}^{p/2}\bigl|\bigl(Df(u)\bigr)^-\bigr|^p \biggr\rbrack^{1/p} \quad \mbox{and }\\[12pt] \bigl\|(\nabla_w f)(u)\bigr\|_p= \biggl\lbrack \sum\limits_{v \sim u} w_{uv}^{p/2}\bigl|Df(u)\bigr|^p \biggr\rbrack^{1/p} \end{array} $$
(18)

with 0<p<∞ and where \(Df(u)=f(v)-f(u)\). The relation between these directional gradient norms was given in [46] as

$$ \bigl\|(\nabla_wf)(u)\bigr\|_p^p = \bigl\| \bigl(\nabla_w^+f\bigr)(u)\bigr\|_p^p + \bigl\|\bigl(\nabla_w^-f\bigr)(u)\bigr\|_p^p. $$
(19)

For the \(\mathcal{L}_\infty\)-norm, we have

$$ \begin{array}{l} \bigl\|\bigl(\nabla_w^+f\bigr)(u)\bigr\|_\infty= \max\limits_{v\sim u}\bigl(\sqrt{w_{uv}}\bigl| \bigl(Df(u)\bigr)^+\bigr|\bigr), \\[5pt] \bigl\|\bigl(\nabla_w^-f\bigr)(u)\bigr\|_\infty= \max\limits_{v\sim u}\bigl(\sqrt{w_{uv}}\bigl|\bigl(Df(u)\bigr)^-\bigr|\bigr) \quad \mbox{and }\\[5pt] \bigl\|(\nabla_w f)(u)\bigr\|_\infty= \max\limits_{v\sim u}\bigl(\sqrt{w_{uv}}\bigl|Df(u)\bigr|\bigr), \end{array} $$
(20)

with the following property:

$$\bigl\|(\nabla_w f)(u)\bigr\|_\infty= \max \bigl(\bigl\| \bigl(\nabla_w^+f\bigr)(u)\bigr\|_\infty, \bigl\|\bigl(\nabla_w^-f\bigr)(u)\bigr\|_\infty\bigr). $$
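The norms (18) and (20), together with relation (19) and the max property above, can be checked numerically. The sketch below is illustrative (the helper `grad_norms` is ours):

```python
import math

def grad_norms(f, adj, u, p):
    # returns ||(∇_w^+ f)(u)||_p, ||(∇_w^- f)(u)||_p, ||(∇_w f)(u)||_p
    # following Eqs. (18) (finite p) and (20) (p = infinity)
    diffs = [(adj[u][v], f[v] - f[u]) for v in adj[u]]
    if p == float('inf'):
        ext  = max(math.sqrt(w) * max(d, 0.0) for w, d in diffs)
        intr = max(math.sqrt(w) * max(-d, 0.0) for w, d in diffs)
        full = max(math.sqrt(w) * abs(d) for w, d in diffs)
    else:
        ext  = sum(w ** (p / 2) * max(d, 0.0) ** p for w, d in diffs) ** (1 / p)
        intr = sum(w ** (p / 2) * max(-d, 0.0) ** p for w, d in diffs) ** (1 / p)
        full = sum(w ** (p / 2) * abs(d) ** p for w, d in diffs) ** (1 / p)
    return ext, intr, full

# star graph around u with three neighbors and arbitrary values
adj = {'u': {'a': 1.0, 'b': 4.0, 'c': 0.25},
       'a': {'u': 1.0}, 'b': {'u': 4.0}, 'c': {'u': 0.25}}
f = {'u': 2.0, 'a': 5.0, 'b': 1.0, 'c': 2.0}
```

On this example, for p=2 one gets ‖∇⁺‖=3, ‖∇⁻‖=2, and relation (19) reads 3²+2²=‖∇f‖².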

Properties of these gradients can be found in [45, 46]. One can note that the definitions presented in this section hold on graphs of arbitrary topology. They can be used to process any discrete regular or irregular data set that can be represented by a weighted graph. Moreover, local and non-local settings are directly handled in these definitions, and both are expressed through the graph topology in terms of neighborhood connectivity [17].

2.3 Morphological Evolution Equations

The time-dependent Hamilton-Jacobi equation formulated in (4) is linked to mathematical morphology and can be viewed as a morphological evolution equation. We have shown in [44, 45] that morphological gradients constitute numerical schemes for solving time-dependent morphological dilation and erosion processes over graphs, and therefore solve the time-dependent Eikonal equation.

Continuous scale morphology (see for instance [7, 38] and references therein) defines flat dilations δ:ℝm→ℝm and erosions ε:ℝm→ℝm of a function f 0:ℝm→ℝ by structuring sets \(B=\{x : \|x\|_p\leq 1\}\), with the general PDEs

$$ \partial_tf=+\|\nabla f\|_p \quad \mbox{and}\quad \partial_tf=-\|\nabla f\|_p $$
(21)

where f is a modified version of f 0, with the initial condition f(x,0)=f 0(x). With different values of p, one obtains different structuring elements: a rhombus for p=∞, a disc for p=2 and a square for p=1.

We have proposed in [45] graph-based versions of (21) by using the gradients defined in (17). Given a weighted graph G=(V,E,w) and a function \(f\in\mathcal{H}(V)\), the analogues of (21) on G are

$$ \begin{array}{l} \partial_tf(u)=+\bigl\|\bigl(\nabla_w^+f\bigr)(u)\bigr\|_p \quad \mbox{and}\quad \\[7pt] \partial_tf(u)=-\bigl\| \bigl(\nabla_w^-f\bigr)(u)\bigr\|_p. \end{array} $$
(22)

Intuitively, given a set of vertices \(A\subset V\) and using the external and internal graph boundaries (11), the dilation equation over A can be interpreted as a growth process that adds the vertices of \(\partial^+A\) to A. By duality, the erosion over A can be interpreted as a contraction process that removes the vertices of \(\partial^-A\) from A.

As mentioned in the introduction, one approach to solve the Eikonal equation (1) is to consider a time-dependent version (4) of this equation. The time-dependent equation can be viewed as an erosion process (regarding the minus sign), with an additional potential function P. Then, using the corresponding internal gradient \((\nabla^{-}_{w})\) involved in the discrete PdE-based erosion process, one can directly obtain the time-dependent adaptation of the Eikonal equation on graphs. Given a graph G=(V,E,w) and a function \(f\in\mathcal{H}(V)\), we have

$$ \begin{cases} \displaystyle \frac{\partial f(u,t)}{\partial t} =-\bigl\|\bigl(\nabla^-_wf\bigr)(u)\bigr\|_p +P(u), &u\in V,\\[4pt] f(u,t) = \psi(u), & u\in V_0\subset V,\\[2pt] f(u,0) = \psi_0(u), &u\in V, \end{cases} $$
(23)

where V 0 corresponds to the set of initial seed vertices. This equation can be solved by a steepest gradient descent method, using the following iterative numerical scheme for all u∈V, with f n(u)≈f(u,nΔt):

$$ f^{n+1}(u) = f^n(u) - \Delta t\bigl(\bigl\|\bigl(\nabla^-_wf^n\bigr)(u)\bigr\|_p - P(u)\bigr). $$
(24)

Complete and precise definitions, with numerical schemes for p∈{1,2,∞}, can be found in [46].
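As a hedged illustration of scheme (24), the following sketch runs the explicit iteration with p=∞ on a small path graph. The step size, iteration count and unit potential are our choices, not values prescribed by the paper; at steady state, f approximates the distance to the seed:

```python
# Explicit scheme (24) with p = inf on the path graph 0-1-2-3-4,
# unit weights, seed V0 = {0}, potential P = 1 and psi0 = 0.
import math

adj = {i: {} for i in range(5)}
for i in range(4):
    adj[i][i + 1] = 1.0
    adj[i + 1][i] = 1.0

def internal_grad_inf(f, u):
    # ||(nabla_w^- f)(u)||_inf = max_v sqrt(w_uv) * max(f(u) - f(v), 0)
    return max(math.sqrt(w) * max(f[u] - f[v], 0.0) for v, w in adj[u].items())

P = {u: 1.0 for u in adj}
f = {u: 0.0 for u in adj}               # f(u, 0) = psi0(u) = 0
dt = 0.1                                # our choice of time step
for _ in range(2000):
    f = {u: 0.0 if u == 0               # seed: f(u, t) = psi(u) = 0
         else f[u] - dt * (internal_grad_inf(f, u) - P[u])
         for u in adj}
```

After enough iterations, f(u) is close to the hop distance (0, 1, 2, 3, 4), which also illustrates the slow convergence pointed out for formulation (5): many small time steps are needed before the values far from the seed settle.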

3 Adaptation of the Eikonal Equation over Weighted Graphs

As mentioned in the introduction, the time-dependent version of the Eikonal equation on weighted graphs needs numerous iterations to converge to the solution, and the approach detailed in Sect. 2.3 is not adapted to large graphs. Therefore, we propose a new adaptation of the Eikonal equation and a new algorithm to solve the corresponding finite difference equation (6), which overcome these limitations.

We consider the Eikonal equation that describes the evolution of a propagation front Γ,

$$ \begin{cases} \bigl\|\nabla f(x)\bigr\| = P(x), & x\in\Omega,\\[2pt] f(x) = 0, & x\in\Gamma_0, \end{cases} $$

(25)

where f(x) is the arrival time of the front at x, Γ 0 is its initial position and P=1/F, with F the speed of the front. The associated level-set function ϕ satisfies

$$ \frac{\partial\phi(x,t)}{\partial t} = F(x)\bigl\|\nabla\phi(x,t)\bigr\|, $$

(26)

and we know that ϕ(x,t)=t−f(x), then

$$ F(x)\bigl\|\nabla f(x)\bigr\| = 1. $$

(27)

Transposed on a weighted graph G=(V,E,w), with functions \(\phi:V\times\mathbb{R}^+\to\mathbb{R}\) and \(F:V\to\mathbb{R}\), and using the morphological equations defined in (22), the level-set formulation (26) can be rewritten as

$$ \frac{\partial\phi(u,t)}{\partial t} = F(u)\bigl\|(\nabla_w\phi)(u,t)\bigr\|_p, $$

(28)

and can be expressed as a morphological process with the following sum of dilation and erosion:

$$ \frac{\partial\phi(u,t)}{\partial t} = F^+(u)\bigl\|\bigl(\nabla^+_w\phi\bigr)(u,t)\bigr\|_p - F^-(u)\bigl\|\bigl(\nabla^-_w\phi\bigr)(u,t)\bigr\|_p, $$

(29)

with F +=max(F,0) and F −=−min(F,0).

Only considering the case where F(u)>0 for all u∈V, and with ϕ(u,t)=t−f(u), one has

$$ F(u)\bigl\|\bigl(\nabla^-_w f\bigr)(u)\bigr\|_p = 1. $$

(30)
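The step from (29) to (30) can be checked edge by edge: with ϕ(u,t)=t−f(u), for every edge (u,v)∈E,

$$ \bigl(\phi(v,t)-\phi(u,t)\bigr)^+ = \bigl(f(u)-f(v)\bigr)^+ = \bigl(f(v)-f(u)\bigr)^-, $$

so that \(\|(\nabla^+_w\phi)(u,t)\|_p = \|(\nabla^-_wf)(u)\|_p\), while \(\partial\phi(u,t)/\partial t = 1\).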

Finally, with P(u)=1/F(u), we obtain a discrete adaptation of the Eikonal equation on weighted graphs, which describes a morphological erosion process and is defined as

$$ \begin{cases} \bigl\|\bigl(\nabla^-_w f\bigr)(u)\bigr\|_p = P(u), & \forall u \in V,\\[2pt] f(u) = 0, & \forall u \in V_0. \end{cases} $$
(31)

3.1 Existence and Uniqueness of the Solution

The proof of existence and uniqueness of the solution of Eq. (31) can be established by considering the equation

$$ \begin{cases} \bigl \|\bigl(\nabla_w^- f\bigr)(u)\bigr\|_p = P(u), & u \in A \subset V, \\[2pt] f(u) = 0, & u \in \partial A \end{cases} $$
(32)

which can be expressed as \(S_u[f]=0\) for all \(u\in A\), with

$$ S_u[f] \stackrel{\mathit{def}.}{=} \bigl\|\bigl(\nabla^-_w f\bigr)(u)\bigr\|_p - P(u). $$

(33)
It can be easily shown that this scheme satisfies the following properties for all f.

  • H 1. \(\partial S_u[f]/\partial f(v)\leq 0\), \(\forall v\sim u\)

  • H 2. \(S_u[f+M]=S_u[f]\), \(\forall M\in\mathbb{R}\), \(\forall u\)

  • H 3. \(S_u[\lambda f]=\lambda S_u[f]+(\lambda-1)P(u)\), \(\forall\lambda\geq 0\), \(\forall u\)

  • H 4. \(\lim_{f(u)\to+\infty} S_u[f]=+\infty\)

  • H 5. \(\lim_{f(u)\to-\infty} S_u[f]=-P(u)\)
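Properties H 2 and H 3 can be verified numerically on a toy graph. The sketch below implements the scheme \(S_u[f]\) of Eq. (33) for p=2, with values chosen arbitrarily by us:

```python
import math

def S(u, f, adj, P):
    # S_u[f] = ||(nabla_w^- f)(u)||_2 - P(u), the scheme of Eq. (33), p = 2
    s = sum(adj[u][v] * max(f[u] - f[v], 0.0) ** 2 for v in adj[u])
    return math.sqrt(s) - P[u]

adj = {'u': {'a': 1.0, 'b': 2.0}, 'a': {'u': 1.0}, 'b': {'u': 2.0}}
P = {'u': 1.5}
f = {'u': 3.0, 'a': 1.0, 'b': 2.5}

M, lam = 7.0, 2.0
fM   = {k: v + M for k, v in f.items()}      # H 2: translation invariance
flam = {k: lam * v for k, v in f.items()}    # H 3: S_u[lam f] = lam S_u[f] + (lam-1) P(u)
```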

Uniqueness of the Solution

Equation (31) has a unique solution.

Proof

Let f and g be two distinct solutions of (31) and assume, without loss of generality, that

$$ \max_{A}\bigl(f(u) - g(u)\bigr) > 0. $$
(34)

Then, for some λ>1 sufficiently close to 1, we also have

$$ M_H = \max_A \bigl(f(u) - \lambda g(u)\bigr) > 0. $$
(35)

We denote by u 0 a vertex for which M H=f(u 0)−λg(u 0). Then, with h(u)=M H+λg(u), we can deduce that:

$$ \begin{cases} f(u) \leq h(u)\quad \forall u \in A,\\[2pt] f(u_0) = h(u_0). \end{cases} $$
(36)

Using properties H 1, H 2 and H 3, and the fact that f and g are solutions of (31), this implies that

$$ 0 = S_{u_0}[f] \geq S_{u_0}[h] = S_{u_0}[\lambda g] = \lambda S_{u_0}[g] + (\lambda-1)P(u_0) = (\lambda-1)P(u_0) > 0. $$

(37)

This is a contradiction, hence f(u)=g(u) for all u. □

Existence of Solution

The proof of existence of a solution of Eq. (31) is straightforward. The scheme \(S_u[f]\) is continuous with respect to f(u). Due to properties H 4 and H 5, we can deduce that there exists a finite value −∞<f(u)<+∞ such that \(S_u[f]=0\).

3.2 Numerical Schemes and Algorithms

From (31), and using the norms defined in (18) and (20) with the property min(x,0)=−max(−x,0), we obtain the following equations for the \(\mathcal{L}_1\), \(\mathcal{L}_2\) and \(\mathcal{L}_\infty\) norms.

  • Case p∈{1,2}:

    $$ \biggl(\sum_{v\sim u} w_{uv}^{p/2}\max\bigl(0, f(u) - f(v)\bigr)^p \biggr)^{1/p} = P(u). $$
    (38)

    With a simple transformation of variables and some conventional notations, Eq. (38) can be rewritten as

    $$ \sum_{i=1}^{n} \biggl(\frac{\max(0, x - a_i)}{h_i}\biggr)^{p} = \widetilde{P}^{\,p}, $$
    (39)

    where x=f(u), \(h_{i} = \sqrt{1/w_{uv_i}}\), \(n = \operatorname{card}(N(u))\), \(a=\{f(v_i)\mid v_i\in N(u),\ i=1,\ldots,n\}\) and \(\widetilde{P}=P(u)\).

  • Case p=∞:

    $$ \max_{v\sim u}\bigl(\sqrt{w_{uv}}\max\bigl(0, f(u) - f(v)\bigr)\bigr) = P(u). $$
    (40)

    Using the same transformation of variables, we obtain

    $$ \max_{i=1,\ldots,n}\biggl(\frac{\max(0, x - a_i)}{h_i}\biggr) = \widetilde{P}. $$
    (41)

Local solutions (i.e., the solution at a vertex, assuming the other vertices are held fixed) of (31) over a weighted graph are given by (39) and (41). Both equations are clearly independent of the graph formulation, and can be applied to weighted graphs of arbitrary topology.

For p∈{1,2}, the local solution \(\overline{x}\) at a particular vertex can easily be obtained with an iterative procedure, as described in Algorithm 1. The algorithm is based on the fact that there exists a k with 1≤k≤n such that \(a_{k} \leq \overline{x} \leq a_{k+1}\), where \(\overline{x}\) is the unique solution of the equation. The algorithm then consists in sorting the values a i in increasing order and computing temporary solutions \(\hat{x}_{m}\) with the following equations (42) and (43), until the condition \(\hat{x}_{m} \leq a_{m+1}\) is satisfied.

Algorithm 1

\(\overline{x}\) computation (local solution)

For p=1, the temporary local solution is given by:

(42)

For p=2, one has:

$$ \hat{x}_m = \frac{b_m + \sqrt{b_m^2 - c_m\bigl(e_m - \widetilde{P}^{\,2}\bigr)}}{c_m}, \quad \mbox{with } c_m = \sum_{i=1}^m \frac{1}{h_i^2},\ b_m = \sum_{i=1}^m \frac{a_i}{h_i^2},\ e_m = \sum_{i=1}^m \frac{a_i^2}{h_i^2}. $$

(43)

The numerical scheme for the \(\mathcal{L}_\infty\)-norm is much simpler. Considering Eq. (41), the local solution \(\overline{x}\) at a particular node is directly given by

$$ \overline{x} = \min_{i=1,\ldots,n}\bigl(a_i + h_i\widetilde{P}\bigr), $$

(44)

where n corresponds to the number of neighbors. One can remark that this equation is equivalent to the Dijkstra-like local update equation.
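A possible Python sketch of the local solver follows (Algorithm 1 for p∈{1,2}, together with Eq. (44) for p=∞); the function name and the closed-form p=2 update are our transcription, not code from the paper:

```python
import math

def local_solution(a, h, P, p):
    """Local solution of (39)/(41) at one vertex.

    a: neighbor values f(v_i); h: h_i = 1/sqrt(w_uv_i); P: P(u).
    Sketch of Algorithm 1 for p in {1, 2}, and of Eq. (44) for p = inf.
    """
    if p == float('inf'):
        return min(ai + hi * P for ai, hi in zip(a, h))   # Eq. (44)
    pairs = sorted(zip(a, h))                             # sort a_i increasingly
    for m in range(1, len(pairs) + 1):
        aa = [x for x, _ in pairs[:m]]
        hh = [y for _, y in pairs[:m]]
        if p == 1:
            # solve sum_i (x - a_i)/h_i = P over the first m neighbors
            x = (P + sum(ai / hi for ai, hi in zip(aa, hh))) / sum(1 / hi for hi in hh)
        else:  # p == 2: solve sum_i ((x - a_i)/h_i)^2 = P^2
            c = sum(1 / hi ** 2 for hi in hh)
            b = sum(ai / hi ** 2 for ai, hi in zip(aa, hh))
            e = sum(ai ** 2 / hi ** 2 for ai, hi in zip(aa, hh))
            x = (b + math.sqrt(max(b * b - c * (e - P * P), 0.0))) / c
        if m == len(pairs) or x <= pairs[m][0]:           # stop when x_m <= a_{m+1}
            return x
```

For instance, with two neighbors at value 0, unit spacings and P=1, the p=2 solution is √2/2, while p=∞ gives the Dijkstra-like value 1.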

Many solvers can be adapted to compute the global solution (i.e., on the whole graph), for instance the Fast Iterative Method introduced by Jeong and Whitaker [49]. In this paper, we prefer the Fast Marching method, which has the advantage of being monotonic. On an arbitrary graph, the Fast Marching method maintains an active list (A) of vertices for which the solution is already known and fixed, and a narrow band (NB) of vertices that are not yet fixed but have at least one neighbor in the active list. Vertices that are neither active nor in the narrow band are said to be far away (FA). The narrow band is organized as a sorted heap; at each iteration, the first vertex u is removed from the narrow band and added to the active list, and its neighbors that are not yet fixed are updated (and added to the narrow band if they were far away). Each such neighbor v is updated by computing a new local solution \(\overline{x}_{v}\) from the new value of f(u) (using Algorithm 1) and setting \(f(v) \leftarrow \min(f(v), \overline{x}_{v})\), i.e., the new local solution is kept only if it is smaller than the previous one. This is iterated until the narrow band is empty. More details on the initial Fast Marching algorithm, and a proof that it constructs a viable solution, are given in [39].
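The Fast Marching procedure described above can be sketched as follows, here with the \(\mathcal{L}_\infty\) local update (44), which makes the graph solver behave like Dijkstra's algorithm; the names and the choice of update are ours:

```python
# Sketch of the Fast Marching solver of Sect. 3.2 on an arbitrary
# weighted graph, using the L_inf local update (44) with h = 1/sqrt(w).
import heapq
import math

def fast_marching_inf(adj, seeds, P):
    INF = float('inf')
    f = {u: INF for u in adj}
    state = {u: 'FAR' for u in adj}      # FAR / NARROW / ACTIVE
    heap = []
    for s in seeds:
        f[s] = 0.0
        state[s] = 'NARROW'
        heapq.heappush(heap, (0.0, s))
    while heap:
        fu, u = heapq.heappop(heap)
        if state[u] == 'ACTIVE':
            continue                     # stale heap entry, skip it
        state[u] = 'ACTIVE'
        for v, w in adj[u].items():
            if state[v] == 'ACTIVE':
                continue
            x = f[u] + P[v] / math.sqrt(w)     # candidate from Eq. (44)
            if x < f[v]:                        # keep the smaller solution
                f[v] = x
                state[v] = 'NARROW'
                heapq.heappush(heap, (x, v))
    return f

adj = {'a': {'b': 1.0, 'c': 4.0},
       'b': {'a': 1.0, 'c': 1.0},
       'c': {'a': 4.0, 'b': 1.0}}
P = {u: 1.0 for u in adj}
dist = fast_marching_inf(adj, ['a'], P)
```

Stale heap entries are skipped rather than removed, a standard pattern with Python's `heapq`, which has no decrease-key operation.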

Accuracy and Complexity of the Algorithm

Let G=(V,E,w) be a weighted graph. The cost C u,p(u) of updating a vertex u (i.e., computing a new value of f(u)) according to Algorithm 1 and Eqs. (41), (42) and (43) is given by

$$ \begin{array}{l@{\quad }l} C_{u,1}(u) = D_{u}(u)^2 & \mbox{case } p=1,\\[2pt] C_{u,2}(u) = D_{u}(u)^3 & \mbox{case } p=2,\\[2pt] C_{u,\infty}(u) = D_{u}(u) & \mbox{case } p=\infty, \end{array} $$
(45)

where the quantity D u(u) corresponds to the number of neighbors of u which are not far away.

In order to bound the cost functions C u,p, we can consider that the quantity D u(u) is bounded by the maximum degree of a vertex of V, defined as D=maxuV(card(N(u))). Thus, in the sequel we will use the following constants to denote the update costs.

$$ \begin{array}{l@{\quad }l} C_{u,1} = D^2 \geq C_{u,1}(u) & \forall u \in V,\\[2pt] C_{u,2} = D^3 \geq C_{u,2}(u) & \forall u \in V,\\[2pt] C_{u,\infty} = D \geq C_{u,\infty}(u) &\forall u \in V. \end{array} $$
(46)

Based on the previous constants, the activation cost C a,p of a vertex (i.e., the cost of changing the state of a vertex from far away to active, which implies updating each of its not-yet-activated neighbors) can be defined as

$$ C_{a,p} = DC_{u,p}, \quad p \in \{1, 2, \infty \}. $$
(47)

since card(N(u))≤D for all u∈V. We then have

$$ \begin{array}{l@{\quad }l} C_{a,1} = D^3 & \mbox{case } p=1,\\[2pt] C_{a,2} = D^4 & \mbox{case } p=2,\\[2pt] C_{a,\infty} = D^2 & \mbox{case } p=\infty. \end{array} $$
(48)

Finally, the total cost of the algorithm can be summarized by the following complexity:

$$ O\bigl(C_{a,p}\,N\log(N)\bigr), $$

(49)

where N is the number of vertices and the log(N) factor corresponds to the management cost of the heap. In practice, running the algorithm on densely connected graphs is very rare, and in most cases we have \(C_{a,p} \ll N\).

3.3 Grid Graph Example

In this section, we show that with an adapted graph topology and an appropriate weight function, the proposed formulation recovers the well-known Osher-Sethian scheme proposed in the literature to solve the Eikonal equation. Let G=(V,E,w) be a weighted graph. In the case where p=2, we have the following scheme:

$$ \sqrt{\sum_{v\sim u} w_{uv} \max\bigl(0, f(u) - f(v)\bigr)^2} = P(u). $$
(50)

If the graph represents an m-dimensional grid in ℝm, this numerical scheme recovers the Osher-Sethian discretization model on an m-dimensional grid.

Let u be a given vertex of V that corresponds to an m-dimensional spatial position u=(i 1 h 1,…,i m h m), where the h j represent the grid spacing and i j∈ℕ. The neighborhood of u can then be defined as N(u)={v | v=u±h j e j, j∈{1,…,m}}, where e j=(q k)k=1,…,m is the vector of ℝm such that q k=1 if j=k and q k=0 otherwise. Finally, with the notations

$$ \begin{array}{l} D^+_jf(u) = \bigl(f(u + h_je_j) - f(u)\bigr)/h_j,\\[2pt] D^-_jf(u) = \bigl(f(u) - f(u - h_je_j)\bigr)/h_j \end{array} $$
(51)

and with \(w_{uv_{j}} = 1/h_{j}^2\) for all v jN(u), (50) can be rewritten as

$$ \sqrt{ \sum_{j=1}^m \max\bigl(0, D^-_jf(u)\bigr)^2 + \min\bigl(0, D^+_jf(u)\bigr)^2 } = P(u) $$
(52)

since min(0,a−b)2=max(0,b−a)2. Equation (52) corresponds to the Osher-Sethian Hamiltonian discretization scheme on an m-dimensional grid, and can be solved by the initial Fast Marching algorithm [39].
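Assuming grid weights \(w_{uv_j}=1/h_j^2\) (so that \(h_i=\sqrt{1/w}\) matches the grid spacing), one can check numerically that the p=2 local solution coincides with the classical two-sided Osher-Sethian update. The sketch below uses values of our choosing:

```python
import math

def local_solution_p2(a, h, P):
    # p = 2 local solver in the spirit of Algorithm 1 (Sect. 3.2):
    # solve sum_i ((x - a_i)/h_i)^2 = P^2 over the smallest admissible subset.
    pairs = sorted(zip(a, h))
    for m in range(1, len(pairs) + 1):
        aa = [x for x, _ in pairs[:m]]
        hh = [y for _, y in pairs[:m]]
        c = sum(1 / hi ** 2 for hi in hh)
        b = sum(ai / hi ** 2 for ai, hi in zip(aa, hh))
        e = sum(ai ** 2 / hi ** 2 for ai, hi in zip(aa, hh))
        x = (b + math.sqrt(max(b * b - c * (e - P * P), 0.0))) / c
        if m == len(pairs) or x <= pairs[m][0]:
            return x

# 2D grid vertex with h1 = h2 = 1 and upwind neighbor values 0 in both axes.
# The classical update solves (x - a1)^2 + (x - a2)^2 = P^2, i.e.
# x = (a1 + a2 + sqrt(2 P^2 - (a1 - a2)^2)) / 2 when both terms are active.
x_graph   = local_solution_p2([0.0, 0.0], [1.0, 1.0], 1.0)
x_classic = (0.0 + 0.0 + math.sqrt(2 * 1.0 ** 2 - 0.0)) / 2
```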

4 A New Class of Fast Algorithms for Semi-supervised Graph Clustering

In Sect. 3, we introduced a new finite difference equation (31), an adaptation of the Eikonal equation over weighted graphs, and proposed an efficient algorithm based on the Fast Marching method to solve it. In this section, we extend the previous algorithm so as to define a new class of fast algorithms for diffusion processes, enabling the simultaneous propagation of many concurrent labels on a weighted graph. Efficient algorithms for concurrent label propagation on a weighted graph are of great interest for graph partitioning and have led to numerous applications in semi-supervised image segmentation and data clustering. We first present the algorithm when the speed is always non-negative (Sect. 4.1), then its extension to the case where the speed can be either positive or negative (Sect. 4.2).

4.1 Label Propagation with Non-negative Speed

Given a graph G=(V,E,w), let L={l 1,…,l n} be a set of labels and \(S^{0} = S^{0}_{1} \cup \cdots \cup S^{0}_{n}\) be the set of vertices initially marked by a label, where \(S^{0}_{i}\) is the set of vertices initially marked by l i. We call \(S^{0}_{i}\) the seed of l i.

The objective of label propagation for graph clustering is to mark each vertex u of V with a label l i, under the condition that u is closer to at least one vertex of \(S^{0}_{i}\) than to the other vertices of S 0 (according to the topology of G, the weight function w and a speed function \(F_{l_i}\), l iL), so as to obtain a partition of the graph into n consistent clusters. The propagation of each label l i is driven by a front Γ i, initialized as the boundary of \(S^{0}_{i}\), and the final partition S i is given by the set of vertices reached by Γ i before it is stopped by another front or by the boundary of the domain.

In practice, the propagation is performed at the same time as the computation of the arrival times (given by the resolution of (31)): each time a vertex u is reached by a front, the label of that front is propagated to u. The proximity condition is respected since, in the case of several propagating fronts, the front that arrives at a vertex is necessarily the front coming from the nearest source of that vertex (according to the weight and speed functions). This is a consequence of the Fast Marching algorithm, which activates vertices from the smallest to the greatest distance (or arrival time). Indeed, a vertex u is considered reached by a front when it is activated by the algorithm, and the propagated label necessarily comes from its already-activated neighbor v such that

$$ s(u) = \frac{f(v)}{w_{uv}} = \min_{z\sim u}\biggl(\frac{f(z)}{w_{uz}}\biggr). $$
(53)

The weight 1/w is used to penalize neighbors which are close to a source but have a very weak link with u. The whole algorithm for label propagation on weighted graphs is given in Algorithm 2.
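The activation loop just described can be sketched in a few lines for the case p=∞, which, as noted in the text, behaves like a Dijkstra algorithm. The graph encoding (adjacency dictionaries), the function name and the per-vertex potential below are illustrative assumptions, not the paper's implementation:

```python
import heapq

def propagate_labels(adj, seeds, potential):
    """Multi-label Fast Marching sketch for p = infinity (Dijkstra-like).

    adj:       dict u -> {v: w_uv} with weights w_uv > 0
    seeds:     dict u -> label for the initially marked vertices
    potential: dict u -> P(u) > 0 (could be made label-dependent
               to model one speed per front)
    Returns (arrival_time, label). Each vertex is activated exactly once,
    so the cost does not depend on the number of labels.
    """
    time = {u: 0.0 for u in seeds}
    label = dict(seeds)
    heap = [(0.0, u) for u in seeds]
    heapq.heapify(heap)
    done = set()
    while heap:
        t, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)                      # u is activated: time and label fixed
        for v, w in adj[u].items():
            if v in done:
                continue
            # p = infinity update: the front reaches v through a single edge
            cand = t + potential[v] / w
            if cand < time.get(v, float("inf")):
                time[v] = cand
                label[v] = label[u]      # v inherits the label of the nearest front
                heapq.heappush(heap, (cand, v))
    return time, label
```

On a path graph with seeds at both ends, each label claims the half of the graph nearest to its seed, which matches the proximity condition stated above.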

Algorithm 2
figure 2

Label propagation with non-negative speed

Remark 1

In the case where several neighbors have the same contribution and where these neighbors carry different labels, several labels could be assigned to node u. But, as this situation is very rare and happens only where fronts collide, we choose, in this work, to label the vertex with a single arbitrary label from the list. This choice is application dependent and is not meant as a general rule.

Remark 2

Due to the multitude of fronts, we have to define one speed per front, the speed of a front Γ i at a vertex u being given by the speed function associated with the label l i. Consequently, each time a node u is activated, its neighbors v that are not yet activated are updated by:

$$ \bigl\| \bigl(\nabla^-_w f\bigr)(v)\bigr\|_p = P_{l_i}(v), $$
(54)

where l i is the label of u and \(P_{l_i}\) is the corresponding potential function.

One can remark that the algorithm complexity does not depend on the number of labels (the complexity is the same as for the classical Fast Marching algorithm described in Sect. 3). This algorithm therefore opens the way to a new class of fast algorithms for semi-supervised graph clustering, which makes it possible to cluster data or segment images into many partitions without loss of efficiency.

As noticed in Sect. 3, the label propagation is performed with a speed which is always non-negative. In the following section, we propose an adapted version of the algorithm that can handle a front evolution (and thus label propagation) with positive or negative speed.

4.2 Label Propagation Without Restrictions on Speed Sign

The case where the fronts' speeds can be either positive or negative is very interesting because it enables labels to be propagated where the speed is positive and removed where the speed is negative. This also enables the tracking of multiple fronts subjected to changing-sign speeds.

In the continuous case, Carlini et al. [9] have proposed an approach, called the Generalized Fast Marching Method (GFMM), for tracking a hypersurface whose velocity is not always positive.

Let Γ be a front evolving on a continuous domain denoted Ω, and represented by a discontinuous characteristic function θ(x,t) defined as

$$ \begin{cases} \frac{\partial\theta}{\partial t}(x,t) = \mathcal{F}(x,t)\bigl\|\nabla\theta(x,t)\bigr\|,\\[2pt] \theta(x,0) = 1_{\varOmega_{0}} - 1_{\varOmega_{0}^{c}}, \end{cases} $$
(55)

where \(\mathcal{F}\) is a changing-sign speed and \(1_{\varOmega_{0}} - 1_{\varOmega_{0}^{c}}\) is equal to 1 on Ω 0 and −1 on its complement. The position of the front Γ is then given by the discontinuities of the function θ, and the time evolution is obtained by solving the stationary Eikonal equation (25).

The main idea of the Generalized Fast Marching Method is then to split the front into two fronts, \(\varGamma_{+}^{t} = \partial\{x | \theta(x,t) > 0\}\) and \(\varGamma_{-}^{t} = \partial\{x | \theta(x,t) < 0\}\), and to perform two Fast Marching propagations along two narrow bands, with a modified version of (55). Interested readers can refer to [9] for a detailed presentation.

We then propose an extension of such an approach for front tracking on weighted graphs, along with a generalization of the previous algorithm (Algorithm 2) to positive and negative speeds. This can be done by representing each front Γ i by two fronts \(\varGamma_{i}^{+}\) and \(\varGamma_{i}^{-}\) that describe the parts of the front Γ i subjected to a positive and to a negative speed, respectively. The propagation of a label l i is then performed through two evolving fronts: \(\varGamma^{+}_{i}\), which describes the growth of the set S i (where the speed is positive), and \(\varGamma^{-}_{i}\), which describes the decay of the set S i (where the speed is negative). We have \(\varGamma_{i} = \varGamma_{i}^{+}\cup \varGamma_{i}^{-}\).

On the one hand, the front \(\varGamma^{+}_{i}\) is initialized as the set of vertices of \(S^{0}_{i}\) which lie in the inner narrow band of \(S^{0}_{i}\) and where the speed of the front is positive. Neighbors of the front are added to the narrow band under an additional condition: the speed of the front has to be positive at their position. This additional condition is easily justified, since the front cannot grow where its speed is negative.

On the other hand, the front \(\varGamma^{-}_{i}\) is initialized symmetrically, from the vertices where the speed is negative. Neighbors are added to the narrow band under the condition that the speed is negative. Because this front describes the decay of S i, each time a vertex is reached by \(\varGamma^{-}_{i}\), the vertex is unmarked instead of being marked by l i.

Finally, in order to keep the distance (or arrival time) always positive, the signed version of the speed is replaced by its modulus in the computation of the potential.
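A minimal sketch of this front splitting, assuming the graph is given as adjacency lists and the speed as a per-vertex function (the names and encoding are illustrative, not the paper's):

```python
def split_front(adj, seed_set, speed):
    """Split the front of a seed set into its positive and negative parts.

    gamma_plus : vertices of seed_set on its inner boundary where speed > 0
                 (initial front for the growth of S_i)
    gamma_minus: vertices of seed_set on its inner boundary where speed < 0
                 (initial front for the decay of S_i)
    """
    # inner boundary: seed vertices with at least one neighbor outside the set
    inner = {u for u in seed_set
             if any(v not in seed_set for v in adj[u])}
    gamma_plus = {u for u in inner if speed(u) > 0}
    gamma_minus = {u for u in inner if speed(u) < 0}
    return gamma_plus, gamma_minus
```

On a path graph whose seed set occupies the middle, the boundary vertex on the positive-speed side goes to Γ+ and the one on the negative-speed side to Γ−, as in Fig. 1.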

Such an approach of label propagation with changing-sign speed for two fronts is illustrated in Figs. 1 and 2. The entire process is summarized in Algorithm 3.

Fig. 1
figure 3

Label propagation with changing-sign speed. The first graph is a 4-adjacency grid graph, and the second a non-Cartesian graph. Vertices inside the purple line represent \(S^{0}_{i}\). The speed is positive on the left side and negative on the right side. Green points and red points represent vertices in the initial configuration of the positive front \(\varGamma^{+}_{i}\) and of the negative front \(\varGamma^{-}_{i}\), respectively. Blue points represent vertices in the narrow band of the front \(\varGamma^{+}_{i}\) (surrounded in green) and of the front \(\varGamma^{-}_{i}\) (surrounded in red) (Color figure online)

Algorithm 3
figure 4

Label propagation with changing-sign speed

Fig. 2
figure 5

Illustration of label propagation with changing-sign speed, on a toy example. The set S i is represented in blue. The inner narrow band, which represents the position of the front \(\varGamma_{i}^{+}\), is superimposed in green, and the outer narrow band, which represents the front \(\varGamma_{i}^{-}\), is superimposed in red. The six images show different steps of the evolution of the set S i and of both fronts; the speed is positive on the gray background and negative on the white background (Color figure online)

Remark 3

In the case where p=∞, and according to Eq. (44), the arrival time of a vertex is computed with the contribution of exactly one edge. Geometrically, this means that front propagation follows the edges (one can see edges as pipes in which the fronts evolve), and the front that reaches a particular vertex comes from exactly one of its neighbors. The label assignment is then automatically deduced.

On the contrary, in the case where p∈{1,2}, and according to Eqs. (42), (43), the arrival time of a vertex is computed with the contribution of several edges. Thus, considering only the graph structure, the front propagation is interpreted as coming from several edges. In practice, we consider that the front comes from the edge with the highest contribution, and labels are propagated accordingly. If we consider a higher-dimensional domain in which the graph is embedded (as is the case for images, meshes or manifolds), the front passes between all the involved edges.

5 Experiments and Applications

5.1 Graph Construction

There exist several popular methods that transform a set of unorganized discrete data into a neighborhood graph. Considering a set of vertices V whose associated functions represent the data, the construction of such a graph consists in modeling the neighborhood relationships between the data through the definition of a set of edges E, using a pairwise distance measure μ:V×V→ℝ+. In the general case (where data are unorganized) one can quote:

  • ε-neighborhood graphs, where two data points v i, v j ∈ V are connected by an edge of E if μ(v i,v j)≤ε, with ε>0.

  • k-nearest neighbor graphs (k-NNG), where each vertex v i is connected to its k nearest neighbors according to μ. Such a construction yields a directed graph, as the neighborhood relationship is not symmetric. Nevertheless, an undirected graph can be obtained by adding an edge between two nodes v i and v j if v i is among the k nearest neighbors of v j or if v j is among the k nearest neighbors of v i.

In the case of structured data (i.e., images or meshes):

  • Grid graphs which are the natural structure to describe an image with a graph. Each pixel is connected by an edge to its adjacent pixels.

  • Region adjacency graphs (RAG), initially designed for images, where vertices correspond to image regions and the set of edges is obtained by considering an adjacency distance. A RAG can be built for any structured data represented by a graph, where a region R i is defined as a set of connected vertices such that ⋃ i R i=V and R i∩R j=∅ for i≠j. Two regions R i, R j are adjacent if there exist v i ∈ R i and v j ∈ R j such that v i ∼ v j.
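As an illustration of the unorganized-data case, a symmetrized k-NNG can be built as follows (brute-force distances, Euclidean metric by default; the function name and encoding are illustrative assumptions):

```python
import math

def build_knn_graph(points, k, mu=None):
    """Sketch of a symmetrized k-NN graph from unorganized data.

    points: list of feature vectors; mu: pairwise distance (Euclidean
    by default). An edge (i, j) is kept if i is among the k nearest
    neighbors of j or vice versa, which turns the directed k-NN relation
    into an undirected graph.
    """
    if mu is None:
        mu = lambda a, b: math.dist(a, b)
    edges = set()
    for i, p in enumerate(points):
        # sort all other points by distance to p and keep the k nearest
        dists = sorted((mu(p, q), j) for j, q in enumerate(points) if j != i)
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))  # store undirected edges once
    return edges
```

For large point clouds one would replace the O(n²) search with a spatial index (e.g., a k-d tree), but the symmetrization step is the same.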

In both cases, weights are computed according to a measure of similarity g:E→ℝ+, which satisfies:

$$ w(u,v) = \begin{cases} g(u,v) &\mbox{if } (u,v) \in E,\\[2pt] 0 & \mbox{otherwise}. \end{cases} $$
(56)

The similarity is usually based on a pairwise distance measure between data features, where each vertex u ∈ V is represented by a feature vector F u∈ℝm. For a given edge (u,v)∈E and a distance measure μ:V×V→ℝ+, we can have

$$ \begin{array}{l} g_0(u,v) = 1,\\[2pt] g_1(u,v) = \bigl(\mu(F_u, F_v) + \varepsilon\bigr)^{-1} \quad \mbox{with}\ \varepsilon > 0, \varepsilon \rightarrow 0,\\[2pt] g_2(u,v) = \exp \bigl(-\mu(F_u,F_v)/\sigma^2\bigr) \quad \mbox{with}\ \sigma > 0,\\ \end{array} $$
(57)

where σ depends on the variation of the function μ over the graph and controls the similarity scale. Several choices can be considered for the expression of the feature vectors, depending on the nature of the features to be used for the graph processing. In the context of image processing, one can quote the simple grayscale or color feature vector F u, or the patch feature vector (i.e., the set of values F v where v is in a square window of size (2τ+1)×(2τ+1) centered at pixel u), in order to incorporate non-local features.
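For illustration, the weight g 2 of (57) and a patch feature vector can be sketched as follows, assuming scalar pixel values, a squared Euclidean distance for μ and clamped image borders (all of which are illustrative choices, not prescribed by the text):

```python
import math

def gaussian_weight(Fu, Fv, sigma):
    """g2 of Eq. (57) with mu taken as the squared Euclidean distance."""
    d2 = sum((a - b) ** 2 for a, b in zip(Fu, Fv))
    return math.exp(-d2 / sigma ** 2)

def patch_feature(img, x, y, tau):
    """Patch feature vector: the (2*tau+1)^2 pixel values around (x, y).

    img is a list of rows; out-of-range indices are clamped to the border
    (a simple convention assumed here for self-containment).
    """
    h, w = len(img), len(img[0])
    return [img[min(max(x + i, 0), h - 1)][min(max(y + j, 0), w - 1)]
            for i in range(-tau, tau + 1)
            for j in range(-tau, tau + 1)]
```

Identical patches get weight 1, and the weight decays with patch dissimilarity at a rate set by σ.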

5.2 Weighted Geodesic Distances

In this section, we show the adaptivity of our framework and illustrate the behavior of proposed numerical schemes to compute weighted geodesic distances.

Figure 3 presents the application of the schemes formulated in (39) and (41). Results are obtained with different graph topologies and weight functions. They are all illustrated with color distance maps on which iso-levels are superimposed in white.

Fig. 3
figure 6

Adaptive weighted distance computation with different p values, weight functions w and graph topologies G. Results represent color distance maps with iso-level sets. At top: original image. First, second and third columns: results with p=2, 1 and ∞. First row: the weight w 1 is constant. Second row: the weight w 2 is designed to penalize vertical diffusion. Third row: the weight w 3 encodes original image information. In the first three rows, the graph is built on the whole image. In the last row, the distance is computed on a Region Adjacency Graph (RAG) built from a partition of the original image

The potential function P is constant and equal to 1, and the propagation is performed for a unique label whose initial seed is located either at the top left corner of the original image or at its center. The original image is a 256×256 grayscale image. The first rows show results obtained with a weighted 4-adjacency grid graph and the different weight functions w 1, w 2 and w 3. The first weight function (w 1=1) is a constant weight function. Using such a weight function, in the case where p=2 we obtain the same result as with the initial Fast Marching algorithm [39], which verifies that in such a configuration our formulation recovers the Osher-Sethian scheme. Similarly, in the case where p=∞, our algorithm is equivalent to a Dijkstra algorithm. The second weight function is designed to favor horizontal diffusion over vertical diffusion by penalizing the weights of vertical edges, and is given by

$$ w_2(u,v) = \begin{cases} 1 &\mbox{if\ } v_x = u_x,\\[2pt] 0.25 &\mbox{if\ } v_y = u_y. \end{cases} $$
(58)

In the latter case, the seed is placed at the center of the image. The third weight function (w 3=exp(−d 2/σ 2)) is designed to encode the similarity between connected pixels (based on pixel intensity). Shape information is naturally captured by the weights, which stop the front evolution at the boundaries. Clearly, in this case, one can obtain a segmentation of the different shapes simply by thresholding the obtained distance map. We provide results for p=1, p=2 and p=∞ in the first, second and third columns, respectively.

The last row illustrates distance computation on a Region Adjacency Graph (RAG) built from the original image. The first column shows the partition. The second shows the associated graph, where each node represents a region of the RAG and two nodes are linked if their regions are adjacent. Finally, the third column presents the distance map obtained from the top-left region. In that case, the weight function encodes the similarity between two adjacent regions in the sense of the mean intensity of each region, and the distance was computed with p=∞.

Figure 4 presents illustrations of geodesic distances computed on an irregular 3D mesh and on an irregular 2D graph with constant speed/potential. The two graphs are said to be irregular because each vertex has a variable number of edges. In these cases, our results do not necessarily coincide with solutions of the Eikonal equation obtained via Fast Marching algorithms. Indeed, the finite difference equation (31) used to compute these geodesics is not the Eikonal equation (though it is adapted from it) and only considers values of functions defined on graph vertices. There exist algorithms, using geometric information and local solvers designed for these particular cases, which better approximate the solution of the Eikonal equation on triangulated domains, such as the Fast Marching algorithm for triangulated domains proposed by Kimmel [26], or Fast Sweeping on triangulated meshes by Qian [35].

Fig. 4
figure 7

First row: geodesic distance on a 3D mesh from two distinct seeds, with the seeds at left and, at right, the rainbow distance map where red means near and blue means far. Second row: geodesic distance on an irregular graph in ℝ2 with one seed

In the following section, we will present the accuracy of the proposed fast algorithm to compute graph partitions.

5.3 Graph Partitioning

In this section, we present the accuracy of the proposed algorithm (Algorithm 2) for graph partitioning using a distance function (i.e., a metric), with some examples of data simplification. The use of a distance function for graph partitioning is not new, and we refer interested readers to [1, 2, 18] and references therein for a more complete review of this topic.

First, we recall a definition of metric-based graph partitioning. Let G=(V,E,w) be a weighted graph and C G(u,v) be the set of paths connecting two vertices u,v ∈ V. A path c(u,v) is a sequence of vertices (u 1,…,u j) such that u=u 1 and v=u j, with (u i,u i+1)∈E for i=1,…,j−1.

Let ρ:E→ℝ be a metric defined as

$$ \rho(u,v) = \min_{c \in C_G(u,v)}\sum_{i=1}^{j-1}\bigl(f(u_{i+1}) - f(u_i)\bigr). $$
(59)

Given a set of K labels L={l i}i=1,…,K and a set of K seeds \(S^{0}=\{s^{0}_{i}\}_{i=1,\ldots,K} \subseteq V\) (where \(s^{0}_{i}\) is the seed of l i), the energy ρ S:V→ℝ induced by the metric ρ for all seeds of S 0 is

$$ \rho_S(u) = \min_{s_i \in S}\rho(s_i, u) \quad \forall u \in V. $$
(60)

The region R i of a label l i is the set of vertices which are closer to \(s^{0}_{i}\) than to any other seed with respect to the metric ρ. R i can be defined as

$$ R_i = \bigl\{u \in V : \rho \bigl(s^0_i, u\bigr) \leq \rho \bigl(s^0_j, u\bigr), \forall j=1,\ldots,K\bigr\}. $$
(61)

A partition of G, for a given set of labels (with seeds S 0) and a metric ρ, is finally the set R(L,ρ)={R i, ∀l i ∈ L}. For a graph G, finding its partition amounts to seeking minimal cost paths over G, which can be easily computed with a simple Dijkstra-like algorithm.

This general formulation of graph partitioning according to a metric can be easily linked with our definition of label propagation for graph clustering (see Sect. 4.1). Indeed, such a partition can be computed using the new finite difference equation (31) and the new class of algorithms proposed in Algorithm 2 or Algorithm 3, since we know that the label propagated to each vertex u ∈ V is the label whose seed is the nearest to u (see Sect. 4 for more details). Given a vertex u ∈ V with a label l i, we have the following relation

$$ f(u) = \rho_S(u) = \rho \bigl(u,s^0_i\bigr) = \min_{s^0_j\in S^0}\bigl(\rho \bigl(u,s^0_j\bigr)\bigr). $$
(62)

Moreover, graph partitioning is intrinsic to our algorithm, because the label propagation according to the distance measure (or arrival time) naturally clusters the graph. The resulting partition corresponds to a generalized Voronoï diagram. In the next paragraph, we present a methodology, which we call supervertices, to compute graph partitions using our equations and algorithms.

From Super Pixels to Super Vertices

Initially developed by Ren and Malik [36], superpixels are an efficient way to reduce image complexity by grouping pixels into a region map while preserving contours. With TurboPixels [29], Levinshtein et al. proposed an implementation of superpixels in which an image partition is obtained by dilating a regular grid of seeds so as to adapt to local image structure, the dilation being performed using a level-set approach. In this paper, we propose to extend such an approach of image simplification to weighted graphs, using our fast algorithm for multi-label propagation on graphs. Because this adaptation groups vertices instead of pixels, we name it supervertices. As with TurboPixels, our approach uses a set of labels whose seeds are placed on the initial data, but the seed dilation becomes a label propagation controlled by our fast algorithm instead of iteratively evolving equations. Figure 5 presents two examples of supervertices on 3D meshes. A review of some other mesh partitioning methods can be found in [41] and references therein. In the case of image processing, the process is very similar to superpixels: the simplification is performed from a regular grid of distinct labels whose seed positions can be perturbed in the direction of the descending image gradient to avoid placing seeds close to strong boundaries. This is illustrated in Fig. 6 on two images from the Berkeley Database.Footnote 1 Such a partition can be easily transformed into a Region Adjacency Graph (RAG), which can be used as a simplified version of the initial image that preserves texture information and strong boundaries. It then becomes very interesting to perform any graph-based algorithm on such a reduced version of the image. This is illustrated in Sect. 5.4.4 with active contours and semi-supervised image segmentation using a RAG.
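The seed placement step can be sketched as follows, assuming a precomputed gradient-magnitude map and a single 3×3 descent step as a simple stand-in for the perturbation described above (names and the one-step descent are illustrative assumptions):

```python
def place_seeds(grad, step):
    """Seeds on a regular grid, each moved to the lowest-gradient pixel in
    its 3x3 neighborhood so that seeds avoid strong boundaries.

    grad: list of rows holding gradient magnitudes; step: grid spacing.
    Returns the list of (row, col) seed positions.
    """
    h, w = len(grad), len(grad[0])
    seeds = []
    for x in range(step // 2, h, step):
        for y in range(step // 2, w, step):
            # pick the lowest-gradient pixel in the clamped 3x3 window
            best = min(((grad[i][j], (i, j))
                        for i in range(max(x - 1, 0), min(x + 2, h))
                        for j in range(max(y - 1, 0), min(y + 2, w))),
                       key=lambda t: t[0])
            seeds.append(best[1])
    return seeds
```

The resulting seeds can then be fed directly to the multi-label propagation of Algorithm 2 to grow the supervertices.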

Fig. 5
figure 8

Supervertices on meshes. Graphs used in this example are face graphs, where each face of the mesh is represented by a vertex linked by an edge to any adjacent face. In both cases, the weight function is constant (w=1) and P=1. The first column presents the initial meshes with superimposed random seeds. The second column shows the computed distance maps, where red means near a seed and purple means far from a seed. Finally, the last column shows the resulting partition into supervertices

Fig. 6
figure 9

Image partition with supervertices. Both images were partitioned using a regular grid of seeds (perturbed according to the image gradient). See text for details

5.4 Semi-supervised Image Segmentation and Data Clustering

In this section, we present the behavior and efficiency of our algorithm for semi-supervised image segmentation and data clustering, using graph representations of these data. First we give a qualitative and quantitative comparison with a recent and efficient graph-based approach for semi-supervised image segmentation. Then, we illustrate the algorithm's behavior with local or non-local configurations, and with non-negative or changing-sign speeds, through several examples involving different kinds of data.

In the case of semi-supervised image segmentation, graph-based approaches have become very popular in recent years. Many graph-based algorithms for image segmentation have been proposed, such as graph cuts [5], random walks [20], shortest paths, watersheds, or frameworks that unify some of the previous methods (such as the power watershed) [14, 43]. Recently, these algorithms were all placed into a common framework [14] that allows them to be seen as special cases of a single general semi-supervised algorithm, and leads to the following general formulation: given a graph G=(V,E,w), the general algorithm consists in finding the function f which minimizes the following energy:

$$ E_{p,q}(f) = \sum_{{u\sim v}\in E}w^p_{uv}\bigl|f(u) - f(v)\bigr|^q $$
(63)

where f represents the target configuration. Using such a formulation in the case of two classes A and B such that V A∪V B=V and V A∩V B=∅, the graph clustering can be performed as follows:

  1.

    The first step consists in finding an optimum f for (63), which is obtained by solving the equation

    $$ \begin{cases} f(A) = 1,\\ f(B) = 0,\\ f = \mathop{\arg \min}\limits_f E_{p,q}(f). \end{cases} $$
    (64)

    Such a formulation can be interpreted using PdEs as \(\arg\min_{f}\sum_{u}\|(\nabla_{w} f)(u)\|\), with w depending on p and q, and with A and B representing the internal Dirichlet boundary conditions.

  2.

    Finally, the segmentation is performed via a threshold, and the labeling function L:V→{0,1} is given by

    $$ L(u) = \begin{cases} 0 & \mbox{if } f(u) < 1/2,\\[2pt] 1 & \mbox{if } f(u) \geq 1/2. \end{cases} $$
    (65)

In the case of more than two classes, the method needs to find as many optima f i as there are distinct labels, where the shape is given by the set of vertices of the label and the background by the set of vertices of all the other labels. Then, the labeling function is given by

$$ L(u) = \arg \max_i f_i(u). $$
(66)
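For the q=2 case of (63) with two classes, the minimizer can be obtained by solving a graph-Laplacian linear system on the unseeded vertices, in the spirit of the random walker. A sketch (the function name and graph encoding are illustrative assumptions):

```python
import numpy as np

def dirichlet_solve(n, edges, seeds):
    """Minimize sum_(u,v) w_uv (f(u)-f(v))^2 with f fixed on seed vertices.

    n:     number of vertices (labeled 0..n-1)
    edges: list of (u, v, w) undirected weighted edges
    seeds: dict vertex -> fixed value (1.0 for class A, 0.0 for class B)
    Solves L_UU f_U = -L_US f_S, where L is the graph Laplacian.
    """
    L = np.zeros((n, n))
    for u, v, w in edges:
        L[u, u] += w; L[v, v] += w
        L[u, v] -= w; L[v, u] -= w
    U = [i for i in range(n) if i not in seeds]   # unseeded vertices
    S = list(seeds)                               # seeded vertices
    fS = np.array([seeds[i] for i in S])
    fU = np.linalg.solve(L[np.ix_(U, U)], -L[np.ix_(U, S)] @ fS)
    f = np.zeros(n)
    f[S] = fS
    f[U] = fU
    return f
```

Thresholding f at 1/2, as in (65), then yields the two-class labeling; for more classes, one such solve per label followed by an argmax, as in (66).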

Another approach [2, 18] to perform graph clustering consists in computing a graph partition from the set of user seeds and a metric, as described in the previous section. In this paper, we consider this approach, which can be easily performed using Algorithms 2 and 3.

5.4.1 Comparison with Power Watershed

In this paragraph, we propose qualitative and quantitative comparisons between our method and the recent power watershed [14], which is one of the most efficient methods proposed in the previously cited framework. Results for the power watershed are obtained using the source code from Couprie's website.Footnote 2

Figure 7 presents a qualitative comparison between the two algorithms for semi-supervised image segmentation with two labels (shape and background). The comparison is performed on images from the Microsoft GrabCut database.Footnote 3 The initial set of seeds is replaced by an eroded version in order to penalize both algorithms (as all propagation algorithms are very efficient when the desired boundaries and the seeds are very close). Results are presented for values of p∈{1,2,∞} and for the power watershed (given by (63) with p=2, q=2). The mean computation time is given for each method (although it is quite difficult to compare runtimes of two different implementations), and one can remark that the runtime difference between the p-norms is negligible, while the case p=2 outperforms the three other methods (p=1, ∞ and power watershed), with smoother results.

Fig. 7
figure 10

Qualitative comparison between power watershed and the proposed approach, using two images from the grabcut database. See text for more details

In contrast to other methods, the proposed algorithm's complexity and computation time do not depend on the number of labels (nor of seeds). Indeed, whatever the number of labels, each vertex is activated exactly once by the algorithm (and we recall that the arrival time and the label of each vertex are fixed when the vertex is activated, see Sect. 4.1). This is illustrated in Figs. 8 and 9, which present the runtime variation of both methods as a function of an increasing number of labels. The label propagation is performed on a cytological slide photograph (of size 1280×2960) on which we consider one label for the background and one label per cell. For each experiment, an increasing number of label seeds are placed on the image, and the non-seeded cells are considered as background. Figures 8 and 9 present the whole set of labels and the slide segmentation with all labels for both methods. One can remark that where the power watershed runtime grows with the number of labels, the proposed method's runtime is relatively constant (the slight increase is implementation dependent and is not a consequence of the algorithm).

Fig. 8
figure 11

Runtime variation as a function of the number of labels. The variation is given for the power watershed (in red) and the proposed approach (in blue). The sets of seeds for the maximum number of labels, and segmentation results for both methods, are illustrated. See text for a comment (Color figure online)

Fig. 9
figure 12

Runtime variation as a function of the number of labels. The variation is given for the power watershed (in red) and the proposed approach (in blue). The graph illustrates the runtime variation while the number of labels increases from 20 to 240. See text for a comment (Color figure online)

5.4.2 Non-local Image Segmentation

Our graph-based approach has the advantage of naturally enabling local and non-local configurations within the same formulation. In this paragraph, we show the benefits of non-local schemes as compared to local ones for semi-supervised image segmentation, especially for noisy images or images that contain fine and repetitive structures. In order to handle non-local configurations, the graph is constructed as a k-grid graph (i.e., a pixel u is linked to every pixel in a k×k window centered on u), and each pixel is associated with a patch feature vector (where each vector can be seen as a patch of texture). Figure 10 presents several examples of textured image segmentation. The first column presents the initial image and seeds. The second column shows local results obtained with a 4-adjacency grid graph using color feature vectors. Finally, the last column shows non-local results with an 11×11 window and patches of 3×3. All results were computed with a potential function P=1 and p=2. These results demonstrate the benefits of non-local configurations, especially for textured images, where classical methods fail to correctly find the desired object. In non-local configurations, the graph weights better capture the fine and repetitive information contained in the image.

Fig. 10
figure 13

Semi-supervised image segmentation using the proposed algorithm. For each image, results are provided for two different graph topologies and weight functions. The first column presents the images to segment with superimposed initial labels (respectively 2 and 3). The second shows results for 4-adjacency grid graphs where each pixel is characterized by its color feature vector. Finally, the third column shows results for a larger neighborhood (each pixel u is linked with any pixel in an 11×11 window centered on u), where pixels are characterized by patches of 3×3

5.4.3 Active Contour

In this paragraph, we illustrate the behavior of our algorithm for performing active contours on graphs, using an adaptation of the Chan and Vese model [11]. Although many recent approaches outperform this model [6, 16, 21], our choice stems from the fact that we are not interested in presenting a new active contour model on graphs, but only in illustrating the adaptivity of our finite difference equation and algorithms. Given a front Γ represented by a level-set function ϕ, the Chan and Vese active contour model (without regularization term) can be summarized by the following PDE

$$ \frac{\partial\phi(x,t)}{\partial t} = \bigl[ \bigl(F_x - \mu_{1}(t)\bigr)^2 - \bigl(F_x - \mu_{2}(t)\bigr)^2 \bigr] \bigl\|\nabla\phi(x,t)\bigr\|, $$
(67)

where F x is the feature vector of x, and μ 1(t) and μ 2(t) are the means outside and inside the front Γ at time t, respectively. With such a speed, Eq. (67) is equivalent to the level-set formulation defined in (26), which is the continuous version of our finite difference equation. Then, transposed to weighted graphs, the active contour model for a front Γ i can be written as

$$ \bigl\| \bigl(\nabla_w^-f\bigr)(u)\bigr\|_p = P_{l_i}(u, t), $$
(68)

with the speed and potential defined accordingly. Finally, the front evolution is performed using Algorithm 3, with the advantages given by our formulation, such as the speed of the Fast Marching algorithm and the ability to perform several active contour evolutions simultaneously. This is illustrated in Fig. 13, where different steps of the contour evolutions are presented. The graph is a 4-adjacency grid graph, and the 4 initial sets of labels (in other words, the initial positions of the contours) were deliberately roughly placed, to illustrate the behavior of the algorithm in such a configuration. Figure 11 presents another example, which involves 9 distinct fronts/contours. Here again, the seeds are deliberately roughly placed to illustrate the robustness of the method to imprecise initialization.
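The data term of (67), transposed to a graph, can be sketched as follows for scalar features (the function name and graph encoding are illustrative assumptions):

```python
def chan_vese_speed(features, inside):
    """Chan-Vese data term on a graph: for each vertex u, the speed
    (F_u - mu1)^2 - (F_u - mu2)^2, where mu1 and mu2 are the mean features
    outside and inside the current front, respectively.

    features: dict vertex -> scalar feature; inside: set of vertices
    currently inside the contour. A positive speed means the contour should
    grow at u, a negative one that it should shrink, matching the
    changing-sign propagation of Algorithm 3.
    """
    outside = [u for u in features if u not in inside]
    mu2 = sum(features[u] for u in inside) / len(inside)    # mean inside
    mu1 = sum(features[u] for u in outside) / len(outside)  # mean outside
    return {u: (features[u] - mu1) ** 2 - (features[u] - mu2) ** 2
            for u in features}
```

Since μ 1 and μ 2 change as the front moves, this speed would be recomputed periodically during the propagation, as noted in Remark 4.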

Fig. 11
figure 14

Another example of active contours on a natural image. The evolution of the 9 contours is performed simultaneously without loss of efficiency

Remark 4

In Algorithm 3, in order to account for the time dependence of the speed, the speeds are periodically updated.

5.4.4 Region Adjacency Graph

Another advantage of our graph-based formulation is that the proposed algorithm can be applied to any graph, and therefore to any graph representing an image. To illustrate such adaptive behavior, we propose to use other image structures, such as region maps, instead of pixel grids to build the graph for image segmentation. In the following examples, we use two graphs. The first is a 4-adjacency grid graph used to build an image partition into a supervertex lattice. The obtained region map is then transformed into a second graph, a Region Adjacency Graph (RAG), on which the segmentation algorithm is performed. Figure 14 illustrates non-local semi-supervised image segmentation using a RAG. In order to let the labels grow beyond the local neighborhood, each vertex's neighborhood is extended by a k-nearest neighborhood based on mean color value. The final graph is a RAG ∪ k-NNG. One can remark that all cells are extracted, even those without an initial marker. Another example, presented in Fig. 12, illustrates active contours on a RAG. Finally, using region-based graphs has the following advantages.

  • Fast computation. By reducing the number of graph vertices (a reduction of approximately 98 % in the number of vertices as compared to the pixel-based graph), the computing time is reduced thanks to the smaller amount of data to consider.

  • Non-local segmentation. The first example shows non-local object segmentation, where only one object is initially marked and the others are found by our method even if the objects are not spatially close or connected.

  • Minimal number of initial seeds. With an appropriate graph structure, the segmentation requires a minimal number of initial seeds.

Fig. 12
figure 15

Active contours on a weighted RAG from one contour. (a) The image partition the RAG is built on. (b, c, d) Different steps of the contour evolution. Points represent vertices and white lines represent edges. The contour is given by the red points, and blue points represent the vertices which lie inside the contour. (e) The final contour transposed onto the initial image (Color figure online)

Fig. 13
figure 16

An application of the proposed algorithm to active contours. Different steps of the simultaneous evolution of 4 contours on the 4-grid graph representation of the given image. See text for details

Fig. 14
figure 17

Semi-supervised non-local region-based segmentation. First image: original image with two seeds. Second image: image partition with region boundaries superimposed. Final image: final segmentation (performed on the RAG). See text for details

5.4.5 Data Clustering

Finally, this last experiment illustrates the adaptivity and behavior of the proposed algorithm for clustering high-dimensional unorganized data. The clustering is performed similarly to the non-local image segmentation case, using a set of seeds and a weighted graph. The data consist of a set of 500 cells extracted from a cytological slide. Each cell c i is represented by a feature vector \(F_{c_{i}}\) and the graph is a 7-NNG. Figure 15 presents the graph with 4 initial sets of labels, and the final clustering.

Fig. 15
figure 18

Unorganized real data clustering. The graph is a 7-nn graph built from a set of 500 cells, and each cell c i is represented by a feature vector \(F_{c_{i}}\). At left, the graph with initial labels. At right, the final clustering

6 Conclusion

In this paper, we have considered a new finite difference equation, defined on weighted graphs, which is an adaptation of the Eikonal equation. We have shown that this equation has a unique solution and proposed numerical schemes for its resolution with different norms, p∈{1,2,∞}. Based on such an equation, we have proposed efficient algorithms for multiple label propagation on graphs of arbitrary topology, which enable numerous applications such as graph clustering, geodesic distance computation or front tracking. These can be used in many domains, such as images, meshes, data or any structure that can be represented by a weighted graph.