1 Introduction

Image segmentation is a crucial element of every image recognition system. Numerous methods of low-level segmentation exist. Most of them, however, have limitations when more complicated images are considered. Those limitations result from the fact that the information contained in the image is usually insufficient for proper understanding of its content, and additional knowledge-based constraints must be imposed during the segmentation process. Active contour techniques overcome this problem in a natural way, since knowledge can be encoded in an energy function that heuristically evaluates the whole labeling of the pixels. In this paper, a relationship between active contours and classification methods is described, its consequences are discussed and, on that basis, a new potential active contour technique is introduced.

The paper is organized as follows: in Sect. 2 the relationship between active contour techniques and the construction of classifiers is discussed. Section 3 describes the potential active contour method as a consequence of that relationship. Section 4 presents an example application of the introduced method to the segmentation of artificially generated images, whereas Sect. 5 is devoted to its application to heart ventricle segmentation. Finally, the last section summarizes the presented results.

2 Relationship between active contours and classifiers

Segmentation of a set O can be defined as its division into a finite family of \({{\mathsf{L}}\in{\mathbb{N}}}\) subsets \(O^1,\ldots,O^{\mathsf{L}}\) that are pairwise disjoint and together constitute the whole set O. Set O can be segmented in many different ways, and the choice of a proper segmentation is a very important task in pattern recognition, where semantic meaning is assigned to the segments. Segmentation can be supervised or unsupervised. In the first case, the correct labeling of segments is expected; in the second, only the division itself must be correct. The issues presented further in this paper are relevant to both supervised and unsupervised segmentation.

There are typical tasks where segmentation takes place, such as classification, which is usually independent of the type of elements in the set O, and image segmentation, where the set O is composed of image elements. Below, both tasks are briefly described, with special attention paid to active contour methods, and the relationship between them is presented.

2.1 Classification

Classifiers are pattern recognition methods for which the set \({{\mathbb{L}}(1,{\mathsf{L}})}\) of recognition results for objects O is finite. Naturally, none of the classification algorithms can operate on the objects directly. They operate on data extracted from objects (i.e. sensed, measured, etc.). Extraction can be defined as a function \(e:O \rightarrow X\) that transforms the problem of classifying objects from set O into the classification of their descriptors from set X. This allows classification methods to be prepared regardless of the type of classified objects. Thus, classifiers can be considered as functions \({k:X\rightarrow{\mathbb{L}}(1,{\mathsf{L}})}\) that assign proper labels to the object descriptors. It is worth mentioning that the choice of descriptors is not a trivial task, as it determines the ability to discriminate between objects, but this problem is beyond the scope of this paper.

The actual problem is the construction of classification functions. There are many classifiers \(k\in K\) that map X into \({{\mathbb{L}}(1,{\mathsf{L}})}\), where K denotes the set of all possible classifiers. In practice, some classifier model is assumed. It usually limits the space of possible classifiers to \({K}^{\rm model}\subseteq K\) and is parametrized, which reduces the problem of classifier construction to the search for the best parameters. To enable the selection of those parameters, additional knowledge must be used to evaluate the quality of the corresponding classifier:

$$ Q:K\rightarrow{{\mathbb{R}}}$$
(1)

This objective function makes it possible to find, in an optimization process, the best parameters and, consequently, the best classifier within the assumed model.

The scheme presented above requires a number of decisions to be made which, in general, are not obvious:

  • The choice of the model is crucial as it must possess proper discriminative power for the considered classification problem. If the model is wrong, even with the best quality function and optimization method, the classifier construction may fail, as the optimal classifier may not belong to \(K^{\rm model}\).

  • The choice of quality function Q determines how classifiers are compared and should express knowledge about the problem. Usually, that knowledge is limited to a set of objects with known labels, which, as will be shown further, does not always allow a proper classifier to be found.

  • The choice of optimization method specifies how the best classifier is sought. However, optimization problems are not trivial, especially when the objective function Q is complicated and there are additional constraints imposed by the model \(K^{\rm model}\), which can lead to finding local minima instead of global ones.

There are many practical methods of classifier construction that differ especially in classifier model definition and in the method of learning. In the supervised case one has: k-NN classifiers, potential function classifiers, Fisher discriminant analysis, Bayes classifiers, support vector machines, decision trees, rule-based classifiers, neural-based classifiers, etc. [2, 6, 23, 24, 26, 31, 38, 39, 40, 41], whereas in the unsupervised case: c-means clustering, single-link, complete-link, average-link clustering, neural-based clustering, etc. [2, 26, 32, 34, 40].

As mentioned above, classification is a method of segmentation of the set O, since for a given classifier k and extraction method e the segments can be defined as:

$${O}^l=\{o\in O:k(e(o))=l\}$$
(2)

where \({l\in{\mathbb{L}}(1,{\mathsf{L}})}\). The opposite relationship also holds, as each segmentation directly defines some classifier.
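For illustration, Eq. (2) can be read as a grouping operation: given an extraction function e and a classifier k, the induced segmentation is obtained by grouping objects by their predicted labels. The following Python sketch only illustrates that definition; the toy objects, extraction function and classifier are assumptions, not part of any method described in this paper.

```python
from collections import defaultdict

def segment(objects, e, k):
    """Eq. (2): O^l = {o in O : k(e(o)) = l}, grouped by label l."""
    segments = defaultdict(set)
    for o in objects:
        segments[k(e(o))].add(o)
    return dict(segments)

# Toy usage: objects are numbers, descriptors are the values themselves,
# and the classifier assigns label 1 to negatives and label 2 otherwise.
O = {-3, -1, 0, 2, 5}
print(segment(O, e=lambda o: o, k=lambda x: 1 if x < 0 else 2))
# -> {1: {-3, -1}, 2: {0, 2, 5}} (up to ordering)
```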

2.2 Image segmentation

In image analysis, segmentation means the division of the set of image pixels into \({\mathsf{L}}\) subsets representing visible logical entities. Those entities can have different semantic levels of interpretation, ranging from spots of uniform color or texture to objects of the real world. For convenience, it will be assumed that for image segmentation \({O={\mathbb{R}}^2}\). This is not a limitation, as each pixel is unambiguously defined by its coordinates and, if necessary, additional segments can be defined for points lying outside the image. Further, the image width will be denoted as \({{\mathsf{W}}\in{\mathbb{N}}}\) and the image height as \({{\mathsf{H}}\in{\mathbb{N}}}\).

Since, as presented in the previous section, segmentation can be performed using classifiers, image segmentation can be carried out in that way provided a proper extraction method e is defined. What differentiates image segmentation from other typical classification tasks is that proper classification of an object requires information about other objects. In practice, this means that extraction uses both information about a given pixel and information about the pixels in its neighborhood, as the coordinates and color of a single image element are usually insufficient.

There also exist methods other than classification that are specifically dedicated to segmentation, such as thresholding techniques, region-based techniques (merging and splitting regions with pixels possessing the same or different features, respectively) and edge-based techniques (detecting and reconstructing borders of regions with pixels possessing different features) [12, 15, 36]. A separate group of those methods are model-based techniques, among them the active contour methods presented in the next section. These segmentation methods usually perform a binary segmentation of the image (for \({\mathsf{L}}=2\)) where one segment represents the object and the other the rest of the image (background). Further, labels \(l^o\) and \(l^b\) will be used to distinguish those segments, respectively.

2.3 Active contours

The term ‘active contour’ was first proposed by Kass, Witkin and Terzopoulos in [19], where a contour, defined as a parametrized curve and called a snake, evolves in the image in order to localize the visible objects. The aim of the evolution was defined by an energy function capable of contour evaluation. As the contour was in this case represented as a function, the energy was in fact a functional and the calculus of variations had to be used to find the optimum. This led to the problem of solving a set of partial differential equations which, when solved numerically, produced the contour evolution. This basic concept has many modifications. Cohen in [7] introduced additional pressure forces squeezing or stretching the contour. That concept was further developed by Ivins and Porrill in [17], where region energy and forces were proposed. Both those modifications were introduced to avoid the problem of getting stuck in local minima of the energy function, which otherwise had to be solved by proper contour initialization. Other solutions of that problem were suggested by Cohen and Cohen in [8] (distance potential) and by Xu and Prince in [49] (gradient vector flow), where the image energy and force components were modified to give better values even for contours lying far from the sought object. There exist also numerous other modifications: Amini, Weymouth and Jain in [1] proposed a dynamic programming approach, whereas Williams and Shah in [48] used a greedy algorithm for finding the optimum contour, which in some problems made the optimization more efficient; McInerney and Terzopoulos in [28] as well as Delingette and Montagnat in [13] proposed algorithms allowing changes in the topology of the regions described by the contour, etc.

The second group of active contour methods comprises geometric active contours. They differ from snakes, which belong to the group of parametric active contours, in that they do not use information about the parametrization of curves while evaluating contours. Those methods were proposed simultaneously by Malladi, Sethian and Vemuri in [27] and by Caselles, Catte, Coll and Dibos in [5]. During optimization, those methods use the level-set approach proposed by Osher and Sethian in [30] as a solution of the front propagation problem. Such an implementation allows the topology of the regions described by the contour to be changed easily, without the additional steps needed in the case of snakes. An extension of those methods are geodesic active contours, which were introduced in two different ways by Caselles, Kimmel and Sapiro in [3, 4] and by Yezzi, Kichenassamy, Kumar, Olver and Tannenbaum in [20, 52]. In both cases, the motivation was to present the geometric contour evolution as a problem of energy functional optimization, as in the other active contour techniques. The relationship between geometric and parametric active contours was presented by Xu, Yezzi and Prince in [50, 51], which allowed energy formulation methods to be exchanged between those groups of active contours.

There are also other variants of active contours in the literature. The most important are:

  • Cootes and Taylor in [9–11] introduced active shape models, where the contour is described by a set of landmark points characterizing the object and where a point distribution model is trained using sample shapes to impose additional constraints on the contour during evolution.

  • Grzeszczuk and Levin in [16] proposed a linguistic description of the contour and suggested statistical training of the image energy component. In that work, simulated annealing was also used as the contour evolution method.

  • Jacob, Blu and Unser in [18] as well as Schnabel and Arridge in [35] used splines for contour definition.

  • Staib and Duncan in [37] as well as Leroy, Herlin and Cohen in [25] modeled the contour using Fourier descriptors.

  • Denzler and Niemann in [14] introduced active rays, where the contour is described using distances from a given center point.

Most of the methods described above also have extensions that allow them to be applied to the segmentation of image sequences: both video sequences (tracking) and 3D sequences (active surfaces).

All active contour methods use the term contour, though none of them defines it explicitly. To describe the relationship between active contours and classification techniques it will be useful, however, to define this term more formally. Intuitively, the contour is a set of lines and points in the image that unambiguously localizes the object in that image. A set of lines and points suffices for the description of all image parts, as each region can be circumscribed by a line. Yet, this set alone is not enough to point out the object in the image, since each part should be assigned a label indicating whether it is a part of the object or not. Consequently, a contour c can be defined as a set of lines and points in the image together with information about the labels of the image parts defined by that set. Those considerations can be described precisely, but for the purposes of this paper a simplified formalism will be used. The simplification reduces only the set of contours C that can be described.

Let \({C:{\mathbb{R}}^2\rightarrow{\mathbb{R}}}\) be an implicit contour function. If it is a continuous function with continuous first partial derivatives and if there exists only a finite number of singularity points \({(x,y)\in{\mathbb{R}}^2}\) such that C(x,y) = 0, then, as a result of the implicit function theorem, the set \({\{(x,y)\in{\mathbb{R}}^2:C(x,y)= 0\}}\) consists of a finite set of points and regular curves and, thus, defines the contour c, where the labels of the image parts can be assigned in the following way:

$$ O^{l^o}=\{(x,y)\in{{\mathbb{R}}}^2:C(x,y)\geq 0\}\subseteq{{{\mathbb{R}}}^2} $$
(3)

and

$$ O^{l^b}=\{(x,y)\in{{\mathbb{R}}}^2:C(x,y)< 0\}\subseteq{{{\mathbb{R}}}^2} $$
(4)
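To illustrate Eqs. (3) and (4), the sketch below evaluates an implicit contour function on a small grid and labels each point by the sign of C. The circular function used here is purely an assumption for illustration; it is not the contour model of this paper.

```python
import numpy as np

def C(x, y):
    # An assumed implicit contour function: positive inside the circle of
    # radius 1 centered at (0.5, 0.5), negative outside, zero on the contour.
    return 1.0 - ((x - 0.5) ** 2 + (y - 0.5) ** 2)

xs, ys = np.meshgrid(np.linspace(-1, 2, 7), np.linspace(-1, 2, 7))
labels = np.where(C(xs, ys) >= 0, "o", "b")  # Eq. (3): l^o where C >= 0
print(labels)
```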

Having defined the contour, it is worth noticing that all active contour methods follow the same scheme. All of them search for the optimal contour c, which, at the beginning, requires the assumption of some contour model. That model usually limits the space of possible contours to \(C^{\rm model}\subseteq C\) and is parametrized, which reduces the problem of segmentation to the search for the best parameters. To enable the selection of those parameters, additional knowledge must be used to evaluate the quality of the corresponding contour:

$$ E:C\rightarrow{{\mathbb{R}}} $$
(5)

This energy function makes it possible to find, in an evolution process, the best parameters and, consequently, the best contour within the assumed model.

This scheme requires a number of decisions to be made, which, in general, are not obvious:

  • The choice of the model is crucial, as it must allow contours that can describe the shapes of the sought objects to be found. If the model is wrong, even with the best energy function and evolution method, the search for the contour may fail, as the optimal one may not belong to \(C^{\rm model}\).

  • The choice of energy function E determines how contours are compared and must express knowledge about the problem.

  • The choice of evolution method specifies how the best contour is sought. Yet, optimization problems are not trivial, especially when the objective function E is complicated and there are additional constraints imposed by the model \(C^{\rm model}\). In those situations evolution can lead to finding local minima instead of global ones.

2.4 Relationship

The relationship between active contour methods and techniques of classifier construction was first described in [45]. It is quite evident from the considerations presented above, since both groups of methods use the same scheme: they define a parametrized model of the sought object, they choose a function able to evaluate objects within that model and, finally, they define an optimization method that looks for the optimum parameters and, thus, for the best object. Therefore, if the problem of pixel set segmentation is considered, contours can be treated as if they were classifiers. If so, the question arises why active contour techniques were introduced at all, since, as presented in Sect. 2.2, traditional classifiers can also be of use during image segmentation. The answer lies in the knowledge that is used to prepare the energy function. In traditional classification techniques this knowledge is usually limited to a training set of correctly labeled objects that, during the learning of the classifier, is used to evaluate the classification results of the objects separately. Nevertheless, as mentioned in Sect. 2.3, that approach to pixel classification is inadequate, since the labels of pixels should depend on the characteristics of other pixels in the image. This problem can be partially solved by a proper definition of the descriptor extraction function e, where spatial relations between pixels can be considered. That, however, does not solve the problem where the label of a pixel depends on the labels of other pixels. For example, to decide whether a pixel belongs to an object of a given shape, it must be known which pixels will be taken to evaluate the shape. Such a task can be accomplished only if the quality function evaluates not the label of each object separately but the whole labeling of objects. The energy functions defined for active contours usually fulfill that assumption (especially their components imposing additional constraints on object localization, shape, etc.).

The presented relationship has many interesting consequences. First of all, classifier models can be adapted to create new contour models able to classify pixels using the traditional energy formulations of the active contour approach. That consequence is exploited in this paper, where the model of the potential function classifier is used to introduce potential active contours. Secondly, contour models can be used to prepare new models of classifiers able to classify objects more complicated than pixels. Those classifiers can use the quality function in a way similar to other classification techniques. An example of that approach can be found in [43], where an extension of the potential contour approach is presented. The same can easily be done with the geometric active contour and Brownian string methods discussed in Sect. 2.3. Thirdly, the advantage of an active contour energy function that evaluates the whole labeling can be used while classifying objects other than pixels. This allows knowledge-based constraints to be introduced into the quality function. Examples of such an approach were presented in [46] for both supervised and unsupervised classification. In [44], image elements other than pixels were used to introduce a sample image understanding methodology. This idea is not entirely new, since knowledge-based classification techniques are currently being intensively developed.

Finally, the presented relationship allows new methods of image segmentation evaluation to be defined. Since a contour can be treated as any other classifier, it can be evaluated using techniques coming from classification tasks. In this paper, precision \(P\in[0,1]\) and recall \(R\in[0,1]\) are used for this purpose. If the reference segmentation of O is given by \(M^{l^o}\) and \(M^{l^b}\), they can be defined as:

$$ P=\frac{n(M^{l^o}\cap{O^{l^o}})}{n(M^{l^o}\cap{O^{l^o}})+n(M^{l^b}\cap{O^{l^o}})} $$
(6)

and

$$ R=\frac{n(M^{l^o}\cap{O^{l^o}})}{n(M^{l^o}\cap{O^{l^o}})+n(M^{l^o}\cap{O^{l^b}})} $$
(7)

where \({n:2^O\rightarrow{\mathbb{R}}}\) denotes the cardinality of a given set. Precision indicates how many pixels inside the current contour are in fact correctly classified as \(l^o\), and recall indicates how many pixels that are expected to be classified as \(l^o\) are in fact inside the contour. The closer P and R are to 1, the better the segmentation results. A smaller value of precision means that the evaluated contour sticks out of the reference contour drawn by an expert, and a smaller value of recall means that the reference contour sticks out of the evaluated contour. These measures are not always well suited for the segmentation problem, because they consider only the number of pixels (contour area) and not their structure (contour shape), but they are useful as rough quality measures.
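For binary masks, Eqs. (6) and (7) reduce to the usual counting of true positives, false positives and false negatives. The sketch below is a minimal illustration; the mask names and values are assumptions.

```python
import numpy as np

def precision_recall(M_o, O_o):
    """Eqs. (6)-(7): M_o is the reference object mask M^{l^o},
    O_o the object mask O^{l^o} produced by the evaluated contour."""
    tp = np.sum(M_o & O_o)    # n(M^{l^o} ∩ O^{l^o})
    fp = np.sum(~M_o & O_o)   # n(M^{l^b} ∩ O^{l^o})
    fn = np.sum(M_o & ~O_o)   # n(M^{l^o} ∩ O^{l^b})
    return tp / (tp + fp), tp / (tp + fn)

M = np.array([[1, 1, 0], [1, 1, 0]], dtype=bool)  # reference segmentation
S = np.array([[1, 0, 0], [1, 1, 1]], dtype=bool)  # evaluated segmentation
print(precision_recall(M, S))  # -> (0.75, 0.75)
```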

3 Adaptive potential active contours

In this section the concept of adaptive potential active contours is presented as a consequence of the relationship between active contours and classifier construction described above. The contour model is based on the potential function classifier model, while its evolution is controlled by general energy functions, as in other active contour techniques.

3.1 Potential function classifier

The potential function method [42] assumes that the label of a classified object should depend on the labels of currently known similar objects. The similarity measure considered in this case is a metric \({\rho:X \times X\rightarrow{\mathbb{R}}}\) and, consequently, the method can be applied in any metric space X.

Let \(D^l=\{\mathbf{x}_1^l,\ldots,\mathbf{x}_{\mathsf{N}^l}^l\}\) denote a subset of X containing \(\mathsf{N}^l\) source objects corresponding to the label \({l\in{\mathbb{L}}(1,{\mathsf{L}})}\) and let \({P:{\mathbb{R}}\rightarrow{\mathbb{R}}}\) denote the potential function. Examples of the latter are the exponential potential function:

$$ P_{\Uppsi,\mu}^{\rm exp}(d)={\Uppsi}{\text{e}}^{-\mu{d^2}} $$
(8)

and the inverse potential function:

$$ P_{\Uppsi,\mu}^{\rm inv}(d)=\frac{\Uppsi}{1+\mu{d^2}} $$
(9)

where \(\Uppsi\) and μ are parameters controlling the maximum strength of the potential field at its source and its distribution, respectively (Fig. 1). The potential function classifier \({k^{\rm pot}:X \rightarrow{\mathbb{L}}(1,{\mathsf{L}})}\) is defined in the following way:

$$ k^{\rm pot}({\mathbf{x}})=\mathop{\rm arg\,max}_{l\in{\mathbb{L}}(1,{\mathsf{L}})} S^{\rm pot}({\mathbf{x}},l) $$
(10)

where \({S^{\rm pot}:X\times{\mathbb{L}}(1,{\mathsf{L}})\rightarrow\mathbb{R}}\) and \(S^{\rm pot}(\mathbf{x},l)\) represents the resulting summary potential generated by the sources \(D^l\), calculated for \(\mathbf{x}\in X\):

$$ S^{\rm pot}({\mathbf{x}},l)=\sum_{i=1}^{{\mathsf{N}}^l} P_{\Uppsi_i^l,\mu_i^l}(\rho({\mathbf{x}}_i^l,{\mathbf{x}})) $$
(11)

In other words, label \({l\in{\mathbb{L}}(1,{\mathsf{L}})}\) is assigned to object \(\mathbf{x}\in X\) if the summary potential of the sources from \(D^l\) has the maximum value (Fig. 2). In the remainder of this paper, the inverse potential function is used.
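A minimal sketch of Eqs. (9)-(11) is given below. The sources, their strengths \(\Uppsi\) and distribution parameters μ are assumed toy values, and the metric is the Euclidean one.

```python
import numpy as np

def inv_potential(d, psi, mu):
    return psi / (1.0 + mu * d ** 2)              # Eq. (9)

def summary_potential(x, sources):
    # sources: list of (x_i, psi_i, mu_i) tuples for one label; Eq. (11)
    return sum(inv_potential(np.linalg.norm(x - xi), psi, mu)
               for xi, psi, mu in sources)

def k_pot(x, D):
    # D maps each label l to its source list D^l; Eq. (10): take the label
    # whose summary potential at x is maximal.
    return max(D, key=lambda l: summary_potential(x, D[l]))

D = {1: [(np.array([0.0, 0.0]), 1.0, 1.0)],
     2: [(np.array([3.0, 0.0]), 1.0, 1.0),
         (np.array([4.0, 1.0]), 1.0, 1.0)]}
print(k_pot(np.array([2.5, 0.0]), D))  # -> 2 (closer to the label-2 sources)
```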

Fig. 1

Potential functions for different values of μ and \(\Uppsi=1\) (the bigger the value of μ, the faster the potential value decreases as the distance d from the potential source increases): a exponential potential function, b inverse potential function

Fig. 2

The summary potential for sources in \({{\mathbb{R}}}\). There are 2 sources on the left and 3 sources on the right

3.2 Potential contour

Assuming, as before, that \(l^o\) and \(l^b\) represent the labels of object and background and that \(D^{l^o}=\{(x_1^{l^o},y_1^{l^o}),\ldots,(x_{\mathsf{N}^{l^o}}^{l^o},y_{\mathsf{N}^{l^o}}^{l^o})\}\) and \(D^{l^b}=\{(x_1^{l^b},y_1^{l^b}),\ldots,(x_{\mathsf{N}^{l^b}}^{l^b},y_{\mathsf{N}^{l^b}}^{l^b})\}\) represent object and background sources, respectively, the implicit contour function described in Sect. 2.3 can be defined as:

$$ C^{\rm pot}(x,y)=S^{\rm pot}((x,y),l^o)-S^{\rm pot}((x,y),l^b) $$
(12)

The shape of the contour depends on the parameters of the potential contour model, i.e. on \({(x_i^{l^o},y_i^{l^o})\in{\mathbb{R}}^2}\), \({\Uppsi_i^{l^o}\in{\mathbb{R}}}\), \({\mu_i^{l^o}\in{\mathbb{R}}}\) for \({i\in{\mathbb{L}}(1,{\mathsf{N}^{l^o}})}\) and \({(x_i^{l^b},y_i^{l^b})\in{\mathbb{R}}^2}\), \({\Uppsi_i^{l^b}\in{\mathbb{R}}}\), \({\mu_i^{l^b}\in{\mathbb{R}}}\) for \({i\in{\mathbb{L}}(1,{\mathsf{N}^{l^b}})}\) (Fig. 3). Those parameters are sought during the contour evolution and, consequently, to make that process effective, some additional, reasonable assumptions about their values should be made:

  • The potential sources should lie inside the image, which means that: \(x^{\rm min} = 0\), \(x^{\rm max} = {\mathsf{W}}\), \(y^{\rm min} = 0\), \(y^{\rm max} = {\mathsf{H}}\).

  • The strength of the potential source should be positive. Its maximum value, however, is irrelevant, since multiplying the contour function \(C^{\rm pot}\) by a positive constant does not change the contour shape. In this paper it is assumed that: \(\Uppsi^{\rm min}=0\), \(\Uppsi^{\rm max}=1\).

  • The value of the parameter controlling the distribution of the potential field should ensure that the potential source significantly influences the summary potential inside the image. Since the distance at which the potential asymptotically approaches 0 decreases as the value of this parameter increases, its maximum value can be estimated by assuming the minimum distance above which the potential is close to 0 (Fig. 1). Let \(f^{\rm psi}\in(0,1)\) denote a fraction of the potential source strength and \(f^{\rm dist}\in(0,1)\) denote a fraction of the maximum distance in the image; then the maximum value of the discussed parameter can be found from:

    $$ \frac{\Uppsi}{1+\mu^{\rm max}{f^{\rm dist}}^2 (W^2+H^2)}=f^{\rm psi}\Uppsi $$
    (13)

    which, after simple transformations, leads to:

    $$ \mu^{\rm max}=\frac{1-f^{\rm psi}}{f^{\rm psi}{f^{\rm dist}}^2 (W^2+H^2)} $$
    (14)

    In this paper it is assumed that \(f^{\rm dist} = 0.01\) and \(f^{\rm psi} = 0.01\), since those parameters control how far from the source the value of the potential becomes small enough to be considered insignificant. Of course, \(\mu^{\rm min} = 0\).

    Fig. 3

    Sample contours with \(N^{l^o}=1\) and \(N^{l^b}=2\) sources: a–c contours, d normalized radius-angle characteristic for the contour a

A separate problem is the choice of the number of background \(N^{l^b}\) and object \(N^{l^o}\) sources, as it determines the class of contour shapes that can be achieved. In real applications it is hard to predict the number necessary to describe the object sought in the image. The proposed solution to this problem is described in detail in Sect. 3.5.
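The sketch below evaluates the potential contour of Eq. (12) on a \({\mathsf{W}}\times{\mathsf{H}}\) pixel grid, with \(\mu^{\rm max}\) taken from Eq. (14). The image size, source positions and strengths are assumptions made purely for illustration.

```python
import numpy as np

W, H = 64, 64
f_dist, f_psi = 0.01, 0.01
mu_max = (1 - f_psi) / (f_psi * f_dist ** 2 * (W ** 2 + H ** 2))  # Eq. (14)

def S_pot(xs, ys, sources):
    # Summary potential of Eq. (11) with the inverse potential of Eq. (9).
    total = np.zeros(xs.shape)
    for (sx, sy), psi, mu in sources:
        total += psi / (1 + mu * ((xs - sx) ** 2 + (ys - sy) ** 2))
    return total

obj = [((32.0, 32.0), 1.0, mu_max / 2)]          # N^{l^o} = 1 object source
bgd = [((10.0, 10.0), 0.5, mu_max / 2),
       ((54.0, 50.0), 0.5, mu_max / 2)]          # N^{l^b} = 2 background sources
xs, ys = np.meshgrid(np.arange(W), np.arange(H))
C_pot = S_pot(xs, ys, obj) - S_pot(xs, ys, bgd)  # Eq. (12)
object_mask = C_pot >= 0                         # labeling of Eqs. (3)-(4)
print(object_mask.sum(), "pixels labeled l^o")
```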

3.3 Energy function

The energy is the objective function of the contour evolution. To show the properties of the described technique, a specific type of task is considered in this paper, in which information about the expected feature of pixels representing objects in the image, e.g. their color or texture, is available (Figs. 4, 9). The aim of those tasks is to find the image segment whose pixels possess that feature. Those tasks become complicated when there is noise in the image, making proper localization of the objects difficult. The choice of this type of task is not a limitation of the potential active contour method, since proper energy functions can be defined for other tasks as well.

Fig. 4

Sample results for two chosen images from the training set: a, d insufficient number of sources (\(N^{l^o}=1\), \(N^{l^b}=1\), \(E^{\text{example}}\)), b, e no shape constraints (\(N^{l^o}=1\), \(N^{l^b}=2\), \(E^{\text{image}}\)), c, f correct results (\(N^{l^o}=1\), \(N^{l^b}=2\), \(E^{\text{example}}\))

For the tasks described above, the following notation will be used:

  • Let \({F^{\text{feat}}:{\mathbb{R}}^2\rightarrow\{0,1\}}\) be a function describing the distribution of a given feature such that \(F^{\text{feat}}(x, y) = 1\) if \({(x,y)\in{\mathbb{R}}^2}\) possesses that feature and \(F^{\text{feat}}(x, y) = 0\) otherwise.

  • Let \({F^{\rm dist}_m:{\mathbb{R}}^2\rightarrow{\mathbb{R}}}\) for \(m\in\{0,1\}\) denote the distance function which, for a given metric, calculates the distance to the closest point \({(x,y)\in{\mathbb{R}}^2}\) such that \(F^{\text{feat}}(x,y) = m\).

  • Let \({F^{\rm pot}_m:{\mathbb{R}}^2\rightarrow[0,1]}\) where:

    $$ F^{\rm pot}_m(x,y) = \frac{1}{1+\frac{1}{W^2+H^2}{{F}^{\rm dist}_m}^2(x,y)} $$
    (15)

    for \(m\in\{0,1\}\). Function \(F^{\rm pot}_m\) plays a role similar to the distance potential function used in the snakes method mentioned in Sect. 2.3 and allows the contour to have proper energy values even when it is far from the sought object (a sketch of its computation follows this list).
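For raster images, \(F^{\rm pot}_m\) can be computed with a Euclidean distance transform. The sketch below assumes a toy binary feature map \(F^{\text{feat}}\); scipy's distance_transform_edt measures the distance to the nearest zero-valued pixel, so the mask is inverted accordingly.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

F_feat = np.zeros((8, 8), dtype=bool)   # assumed binary feature map
F_feat[2:5, 2:5] = True                 # pixels possessing the feature
H, W = F_feat.shape

def F_pot(m):
    # F^dist_m: distance to the closest pixel with F^feat = m. Pixels where
    # F_feat != m are nonzero, so the transform measures the distance to
    # the nearest pixel with F_feat == m.
    d = distance_transform_edt(F_feat != m)
    return 1.0 / (1.0 + d ** 2 / (W ** 2 + H ** 2))  # Eq. (15)

print(F_pot(1)[0, 0])  # below 1, since (0, 0) lies away from the feature pixels
```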

Sample energy functions based on this notation are presented in Sects. 4 and 5, where examples of tasks of the considered type are given. Those functions consider only image information, which is not sufficient if there is noise in the image. In such situations, additional constraints using external knowledge about the sought object must be added. An example of such an energy component is presented in Sect. 4.

3.4 Evolution method

The simulated annealing algorithm was chosen as the evolution method of the potential contour. This probabilistic optimization algorithm was first proposed by Kirkpatrick in [22]. That work was based on the earlier results of Metropolis in [29] and Pincus in [33]. A major advantage of this method is its ability to avoid getting stuck in local optima of the objective function, which, at the same time, reduces the problem of correct initialization near the global optimum.

The basic idea of this algorithm is to generate a sequence of solutions and accept or reject them depending on the values of the energy function. For a given iteration \({t\in{\mathbb{N}}}\), if \(E(c_{t+1})\leq E(c_{t})\), then the contour \(c_{t+1}\) is accepted; otherwise it is accepted with probability:

$$ p_{t}={\text{e}}^{-\frac{E(c_{t+1})-E(c_{t})}{T_t}} $$
(16)

where \(T_t\) represents an additional parameter, called temperature, which decreases as the algorithm progresses. The probability of accepting a worse solution is greater at the beginning of the algorithm (when the temperature has larger values) and when the energy difference \(E(c_{t+1})-E(c_{t})\) is small. In theory, if the annealing is sufficiently slow and the solution generator possesses specific properties, the algorithm guarantees localization of the global optimum. In practice, however, those assumptions cannot be fulfilled and heuristic implementations are used.

To utilize simulated annealing, the solution representation must be defined, the solution and temperature generators must be prepared, the initial temperature must be chosen and some stop criterion must be assumed. In this paper, the potential contour is represented by its \(4(N^{l^o}+N^{l^b})\) parameters. The temperature is decreased using the exponential annealing schedule proposed by Kirkpatrick:

$$ T_{t+1}=\alpha T_t $$
(17)

where α = 0.95, though the decrease is not performed at each iteration. For every temperature value, \(L^{\text{temp}}\) solutions are considered. That sequence of solutions is usually called a Markov chain. This heuristic is further modified in this paper: at the same temperature, \(N^{\text{mar}}\) Markov chains of length \(L^{\text{mar}}\) are allowed, and after each chain the best solution found so far is taken as the starting point for further calculations. In consequence, \(L^{\text{temp}}=N^{\text{mar}}L^{\text{mar}}\). The next element of the algorithm is the solution generator. In theory, it should introduce small modifications and should allow the whole space of solutions to be explored. For the purposes of potential active contours, the solution generator randomly performs one of the following modifications:

$$ (x_{t+1},y_{t+1})=(x_{t},y_{t})+(G (f^{\text{range}} (x^{\rm max}-x^{\rm min})),G (f^{\text{range}} (y^{\rm max}-y^{\rm min}))) $$
(18)

or:

$$ \Uppsi_{t+1}=\Uppsi_{t}+G (f^{\text{range}} (\Uppsi^{\rm max}-\Uppsi^{\rm min})) $$
(19)

or:

$$ \mu_{t+1}=\mu_{t}+G (f^{\text{range}} (\mu^{\rm max}-\mu^{\rm min})) $$
(20)

where \({G:{\mathbb{R}}\rightarrow{\mathbb{R}}}\) for argument \({d\in{\mathbb{R}}}\) generates a random value between −d and d using a normal distribution with zero mean, and \(f^{\text{range}}\) controls the range of those modifications as a fraction of the maximum range of values available for the given parameter. If the resulting value is out of the parameter range, it is clipped to the correct range. Having defined the solution generator, the length \(L^{\text{mar}}\) of the Markov chain can be estimated as follows:

$$ L^{\text{mar}}=4(N^{l^o}+N^{l^b})\frac{1}{f^{\text{range}}} $$
(21)

It depends on the number of optimized parameters and on the minimal number of modifications necessary to explore their ranges. Such a formulation guarantees, to a certain extent, that during one Markov chain a proper number of solutions will be generated to explore the solution space. In this paper \(f^{\text{range}} = 0.1\), since, on the one hand, the generated solutions should be chosen from the neighborhood of the current solution and, on the other hand, too small a value of this parameter would make the exploration of the contour space very slow. To choose the initial temperature, the method proposed by Kirkpatrick in [21] is used. It assumes that before the start of the algorithm a sequence of random solutions is generated, which allows an average absolute change of the energy function \(\Updelta E\) to be estimated. This estimation allows the initial temperature to be calculated, given the initial probability \(p^{\text{init}}\) of worse solution acceptance, from the following equation:

$$ p^{\text{init}}={\text{e}}^{-\frac{\Updelta E}{T^{\text{init}}}} $$
(22)

which after transformations gives:

$$ T^{\text{init}}=-\frac{\Updelta E}{\ln(p^{\text{init}})} $$
(23)

Here, \(p^{\text{init}} = 0.8\). Finally, as a stop criterion, the maximum number \(N^{\text{iter}}\) of iterations can be used. It can be estimated, since after \(N^{\text{temp}}\) changes of temperature the probability of worse solution acceptance is very low, which in practice makes escape from a local minimum impossible. Assuming the final probability \(p^{\text{final}}\) and since:

$$ T^{\text{final}}=\alpha^{N^{\text{temp}}} T^{\text{init}} $$
(24)

and

$$ p^{\text{final}}={\text{e}}^{-\frac{\Updelta E}{T^{\text{final}}}} $$
(25)

it can be found that:

$$ N^{\text{temp}}=\log_\alpha\left(\frac{\ln(p^{\text{init}})}{\ln(p^{\text{final}})}\right) $$
(26)

Here, it is assumed that \(p^{\text{final}} = 0.0001\), since for this probability only better solutions are accepted in practice and simulated annealing becomes a greedy algorithm. Of course, then \(N^{\text{iter}}=N^{\text{temp}}L^{\text{temp}}\).
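Putting the pieces of this section together, the sketch below is one possible reading of the described evolution method, with the settings used in this paper (α = 0.95, \(f^{\text{range}} = 0.1\), \(p^{\text{init}} = 0.8\), \(p^{\text{final}} = 0.0001\), \(N^{\text{mar}} = 2\)). The energy and the solution generator are abstracted into plain functions, a 1-D toy problem replaces the contour, and details such as the size of the sample used to estimate \(\Updelta E\) are assumptions.

```python
import math
import random

def anneal(x0, energy, perturb, n_obj=1, n_bgd=2, alpha=0.95,
           f_range=0.1, p_init=0.8, p_final=1e-4, n_mar=2):
    L_mar = int(4 * (n_obj + n_bgd) / f_range)                # Eq. (21)
    # Estimate the average absolute energy change from random moves
    # (sample size 100 is an assumption) and derive T^init from Eq. (23).
    dE = sum(abs(energy(perturb(x0)) - energy(x0)) for _ in range(100)) / 100
    T = -dE / math.log(p_init)                                # Eq. (23)
    n_temp = int(math.log(math.log(p_init) / math.log(p_final), alpha))  # Eq. (26)
    best = x = x0
    for _ in range(n_temp):
        for _ in range(n_mar):            # N^mar Markov chains per temperature
            for _ in range(L_mar):
                cand = perturb(x)
                d = energy(cand) - energy(x)
                # Eq. (16): accept improvements; accept worse solutions
                # with probability exp(-d/T).
                if d <= 0 or random.random() < math.exp(-d / T):
                    x = cand
                if energy(x) < energy(best):
                    best = x
            x = best          # restart each chain from the best solution so far
        T *= alpha                                            # Eq. (17)
    return best

# Toy usage: a 1-D quadratic "energy" stands in for the contour energy E.
f = lambda x: (x - 3.0) ** 2
g = lambda x: x + random.gauss(0.0, 0.5)
print(anneal(0.0, f, g))      # converges near 3.0
```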

The reasoning presented above, though heuristic, allows satisfactory results of evolution to be achieved and reduces the number of decisions that must be made while working with potential active contours, as only the value of \(N^{\text{mar}}\) must be chosen arbitrarily or experimentally. In this paper, it is set arbitrarily to \(N^{\text{mar}} = 2\), which is a compromise between the ability to explore the contour space and the length of the evolution. As mentioned at the beginning of this section, the initialization of the contour is not a problem, as it can be any feasible contour. In the experiments presented further, the initial potential sources lie at the center of the image (\({\mathsf{W}}/2\), \({\mathsf{H}}/2\)) and have the same values of the remaining parameters, equal to 0 and \(\mu^{\rm max}/2\), respectively. A sample evolution process is presented in Fig. 5.

Fig. 5

Sample evolution process for an image from the testing set: a–c evolving contour, d energy chart

3.5 Adaptation

As mentioned in Sect. 3.2, an actual problem while working with potential active contours is the choice of proper values of the number \(N^{l^o}\) of object sources and \(N^{l^b}\) of background sources. It is crucial, as it determines the class of shapes that can be achieved during evolution. If that number is too small, some shapes may be unreachable, e.g. if only 1 object source or 1 background source is chosen, only simple circular shapes can be found if the whole contour lies inside the image (Fig. 4). If that number is too large, it can significantly slow down the optimization process and make the global optimum difficult to reach, as the dimension of the search space grows. The problem is that it is hard to predict the proper number of sources. The first solution is to choose them arbitrarily or after a series of initial experiments, which is done in this paper. The other solution, presented in [42], is to introduce an adaptation algorithm where the evolution starts with a small number of sources and, after each optimization phase, additional sources are introduced until a satisfactory energy value is reached. That adaptation may depend on the character of the energy function, and the new sources can be introduced in the area where the best improvement is expected. Such an approach can reduce the number \(N^{\text{iter}}\) used in each optimization phase, as at the beginning only the general localization of the object is sought and later only the shape of the contour is improved, where intensive exploration of the search space is not necessary. Those aspects of the method are under research.

4 Example

In this section a simple example of segmentation of artificially generated image data is presented. The data set used illustrates the problem of noisy images, where the necessity of additional shape constraints is evident, and demonstrates a possible methodology for choosing proper evolution parameters.

4.1 Problem

As mentioned above, the data set used in this section was generated artificially. Each image contains the same object, translated and rotated. To make sure that this object can be found using the assumed number of potential sources, it was chosen based on an existing contour with \(N^{l^o}=1\) and \(N^{l^b}=2\) (Fig. 3). 100 images were generated, and to each of them noise in the form of spots was added (Fig. 4). The problem was to find the object regardless of the noise.

4.2 Solution

To localize areas of the image where pixels possess the desired feature (in this case, black color) the following energy function was used:

$$ E^{\text{image}}(c)=\frac{\sum_{(x,y)\in O^{l^o}} (1-F^{\rm pot}_1(x,y))}{n(O^{l^o})}+\frac{\sum_{(x,y)\in O^{l^b}} (1-F^{\rm pot}_0(x,y))}{n(O^{l^b})} $$
(27)

It assigns small values to contours for which few pixels inside the contour lack the given feature and few pixels outside the contour possess it. Due to the noise, however, this energy component is insufficient (Fig. 4) and additional constraints must be added. Because the shape of the sought object is known in this case and the object is composed of one part, its radius-angle characteristic can be found (Fig. 3). To find that characteristic, the natural parametrization of the curve must be considered; the contour is then traversed and, at equally distributed points, the length of the radius connecting that point with the center of mass is calculated, as well as the angle between that radius and the horizontal axis. Such a characteristic unambiguously describes the shape. The additional energy term \(E^{\text{shape}}\) calculates the difference between the known characteristic and the characteristic of the currently evaluated contour (this is possible if the starting point is chosen in the same way for both contours). It is worth mentioning that with this approach the evaluation is translation invariant and, after additional steps (the contour can be rotated so that the angle for the starting point is equal to 0; the radii can be scaled using the maximum value of the radius in a given contour), it can also be made rotation and scale invariant. In this paper, rotation invariance was considered and the final energy was defined as:

$$ E^{\text{example}}(c)=E^{\text{image}}(c)+w^{\text{shape}}E^{\text{shape}}(c) $$
(28)

where the weight \({w^{\text{shape}}\in{\mathbb{R}}}\) decides what is more important: the shape of the contour or its fit to the image.
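The sketch below shows one way the radius-angle characteristic and a shape term of the above kind might be computed. It is a simplified variant that, in addition, normalizes the radii by their maximum (the scale-invariance step mentioned above), and the contour is assumed to be given as an ordered array of boundary points.

```python
import numpy as np

def radius_angle(contour, n=64):
    # Traverse the contour at n equally distributed points and record the
    # length of the radius to the center of mass and its angle.
    center = contour.mean(axis=0)
    idx = np.linspace(0, len(contour) - 1, n).astype(int)
    d = contour[idx] - center
    return np.hypot(d[:, 0], d[:, 1]), np.arctan2(d[:, 1], d[:, 0])

def E_shape(contour, ref_radii, n=64):
    # Difference between the known characteristic and that of the evaluated
    # contour, with radii normalized by their maximum.
    radii, _ = radius_angle(contour, n)
    return np.mean((radii / radii.max() - ref_radii / ref_radii.max()) ** 2)

# Toy usage: an ellipse scored against the characteristic of a circle.
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
circle = np.c_[np.cos(t), np.sin(t)]
ellipse = np.c_[2 * np.cos(t), np.sin(t)]
print(E_shape(ellipse, radius_angle(circle)[0]))  # > 0: the shapes differ
```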

To choose a proper value of \(w^{\text{shape}}\), the generated set of images can be divided into a training and a testing set. Next, using the training set, the contour evolution can be performed for different values of \(w^{\text{shape}}\), selecting the one that gives the best precision and recall values described in Sect. 2.4. That value can then be evaluated using the testing set.

4.3 Results

In this paper the generated set of images was divided into a training set with 20 images and a testing set with 80 images, and 5 different values of \(w^{\text{shape}}\) were tested: 0.2, 0.5, 1, 2 and 5. The best average value of precision and recall was achieved for \(w^{\text{shape}} = 2\).

The found value of \(w^{\text{shape}}\) was then used to perform the segmentation of the images in the testing set. Summary results are presented in the form of a precision-recall chart in Fig. 6. Most of the P and R values are close to 0.9 or above, which means that the proposed methodology, though simple, gave satisfactory results. There are, however, also isolated cases where precision or recall was below 0.7. This can be explained by the fact that the chosen type of noise could significantly damage the objects in the image (the spots used were relatively large in comparison with the size of the sought objects) and in some cases it was not possible to restore the proper shape and localization.

Fig. 6

The precision and recall charts for the 80 images of the testing set (the value \(w^{\text{shape}} = 2\) was chosen in the training phase)

5 Medical application

In this section the application of the potential active contour approach to a real problem of medical image segmentation is presented. The choice of the potential active contour approach resulted from the specific properties of the potential contour model, which in a natural way allows smooth, medical shapes to be described. Other traditional active contour techniques, such as snakes or geometric active contours, require special steps, e.g. the definition of additional smoothing internal energy components, in order to achieve similar results.

5.1 Problem

Obstruction of the pulmonary arteries by emboli is one of the most frequent causes of death in the most developed societies. In recent decades, the development of computed tomography (CT) has allowed new diagnostic methods for pulmonary embolism to be introduced (Fig. 7). Some of them were mentioned in [47]. This paper focuses on a method of right ventricle systolic pressure assessment linked with shifts in interventricular septum curvature, which requires the drawing of endocardial and epicardial contours. Currently they are drawn manually, which is time-consuming as well as prone to errors due to technical mistakes and tiredness (Fig. 8). The aim of this work is to automate this process.

Fig. 7

Image preparation: a four chamber long axis view (original data), b two chamber short axis view (reconstructed data)

Fig. 8

Sample 4D image sequence (the first, sixth and tenth phase of different slices). Each image contains manually drawn: contour of right ventricle (on the left), contour of left ventricle (on the right, inside), contour of myocardium (on the right, outside)

The analyzed image data are obtained using an ECG-gated computed tomography scanner in a standard chest protocol after intravenous injection of contrast media. Next, the heart cycle is reconstructed in 10 phases and a two chamber short axis view is generated, leading to a 4D image sequence, i.e. a set of 2D images for each reconstructed slice and for each phase of the heart cycle.

5.2 Solution

The proposed segmentation process consists of two phases, which are meant to simulate the human top-down process of image analysis. The first one aims at automatic separation of the left and right ventricle. The second one performs the actual segmentation of the left and right ventricles. Both phases utilize the potential active contour algorithm, with a different formulation of the energy function in each phase. It should be emphasized that the presented approach is currently not fully automatic, because an expert must choose, for each 4D image sequence, a proper threshold that allows the areas of the images representing blood with injected contrast to be distinguished from the rest of the image (Fig. 9). This allows the function \(F^{\text{feat}}\) presented in Sect. 3.3 to be defined. This requirement, however, is not a difficult task in comparison with manual contour drawing. Nevertheless, there are also ongoing works that aim at solving this problem.

Fig. 9

Sample results of the proposed algorithm for chosen slices and phases of two heart image sequences: a, b the result of thresholding; c, d potential contour circumscribing both ventricles; e, f line separating left and right ventricle; g, h contour of right ventricle; i, j contour of left ventricle

In the first phase, in order to separate the left and right ventricles automatically, both of them are first sought together using the potential active contour method. The energy function was defined as:

$$ E^{\text{both}}(c)=\frac{n(O^{l^o})}{n(I)}+4\frac{\sum_{(x,y)\in O^{l^b}} F^{\rm pot}_1(x,y)}{n(O^{l^b})} $$
(29)

where I denotes the set of all pixels in the image. It assigns a small value if the contour is relatively small and all the pixels above a given threshold lie inside the contour, which imitates the human way of reasoning (the first component considers the number of pixels inside the contour, while the second one considers the number of pixels representing blood with contrast outside the contour). The potential contour model used here had \(N^{l^o}=1\) object source and \(N^{l^b}=2\) background sources. When the contour was found, the line separating both ventricles was sought. Identification of that line was performed by simply testing each line starting at the top edge of the image and ending at its bottom edge. The chosen line was the one that had the longest part inside the previously found contour and, at that part, the lowest number of points above the given threshold (Fig. 9). In fact, the search for this line can also be considered as another active contour algorithm, where the evolution method is a brute-force algorithm evaluating every possible contour to find the optimal one (a sketch of this search is given below). Further, the segment of the image lying on the left side of the line will be denoted as \(L^{l^l}\) and on the right side as \(L^{l^r}\).
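A brute-force reading of that line search is sketched below. The masks, the scoring (a weighted combination of the two criteria) and the restriction to straight lines between top-edge and bottom-edge pixels are assumptions made for illustration.

```python
import numpy as np

def best_separating_line(inside, above, penalty=4.0):
    # inside: boolean mask of pixels inside the previously found contour;
    # above:  boolean mask of pixels above the chosen threshold.
    H, W = inside.shape
    rows = np.arange(H)
    best, best_score = None, -np.inf
    for x_top in range(W):
        for x_bot in range(W):
            # Rasterize the line from (x_top, 0) to (x_bot, H-1), one column per row.
            cols = np.round(np.linspace(x_top, x_bot, H)).astype(int)
            on_line = inside[rows, cols]
            # Reward a long part inside the contour; penalize above-threshold
            # points on that part.
            score = on_line.sum() - penalty * (above[rows, cols] & on_line).sum()
            if score > best_score:
                best, best_score = (x_top, x_bot), score
    return best

inside = np.ones((16, 16), dtype=bool)
above = np.zeros((16, 16), dtype=bool)
above[:, :7] = True    # "blood" pixels of one ventricle
above[:, 9:] = True    # and of the other
print(best_separating_line(inside, above))  # -> (7, 7): a line through the gap
```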

Having found the line representing the interventricular septum, it is possible, in the second phase, to find the left and right ventricles using an approach similar to the one described above. In order to find those ventricles, the following energy functions were used:

$$ E^{\text{right}}(c)=2\frac{n(O^{l^o})}{n(I)}+4\frac{\sum_{(x,y)\in O^{l^b}\cap L^{l^l}} F^{\rm pot}_1(x,y)}{n(O^{l^b}\cap L^{l^l})} $$
(30)

and

$$ E^{\text{left}}(c)=2\frac{n(O^{l^o})}{n(I)}+4\frac{\sum_{(x,y)\in O^{l^b}\cap L^{l^r}} F^{\rm pot}_1(x,y)}{n(O^{l^b}\cap L^{l^r})} $$
(31)

Both of them utilize the additional information about the line separating the ventricles, as only pixels above the given threshold on the proper side of the line are considered. Here, \(N^{l^o}=1\) object source and \(N^{l^b}=2\) background sources were also used.

In both phases the weights of the energy components were chosen arbitrarily after an initial series of experiments.

5.3 Results

The conducted experiments revealed that the first phase is able to find a proper septum line provided that a proper threshold is chosen by an expert. The visual results of the second phase also seem satisfactory (Figs. 9, 11). Yet, an objective analysis of those results, compared against the reference contours drawn by an expert, reveals certain shortcomings (Fig. 10). Though in most cases the values of precision and recall are above 0.8, the contours are still too imprecise to be used directly in the diagnostic process. This is especially noticeable in the second analyzed image sequence, where the precision-recall charts reveal that the contour found for the left ventricle is almost always too big and the contour of the right ventricle very often does not localize the whole ventricle. Moreover, the differences between the segmentation results of the first and the second image sequence suggest that the characteristics of the images may vary and a better process than thresholding is needed. To sum up, currently the resulting contours can only constitute the basis for further manual segmentation which, when hundreds of images are considered, is still a significant simplification of contour drawing.

Fig. 10

The precision and recall charts for 80 images coming from two heart image sequences: a first image sequence, left ventricle; b first image sequence, right ventricle; c second image sequence, left ventricle; d second image sequence, right ventricle

Fig. 11

Other sample results of the proposed algorithm: a, b, e, f, i, j first image sequence; c, d, g, h, k, l second image sequence

The imperfection of the presented results can be explained by the fact that in the proposed approach only visual information is considered, whereas an expert possesses additional medical knowledge that allows a better contour to be drawn. That knowledge, however, can also be used in the presented framework, because any information can be encoded in the energy function. For example, one can consider conscious expert knowledge (e.g. the relationship between the shapes of both ventricles, such as the information that the interventricular septum should have a constant thickness, the information about the shape of the heart, or the information contained in other slices of the same heart) as well as knowledge that is not conscious or is hard to express directly by a mathematical formula and that can be obtained using machine learning techniques (e.g. a neural network trained to estimate the proper localization of the contour in a given slice, taking into account the image information from other slices). These aspects are under further investigation.

6 Summary

In this paper a new segmentation method, called adaptive potential active contours, is proposed. It originates from the relationship between active contour techniques and methods of optimal classifier construction. This relationship allows the classic potential function classifier to be used to define a new model of contour. The presented approach has the same advantage as other active contour techniques: it allows additional knowledge-based constraints to be imposed on the contour. Moreover, an additional adaptation mechanism was suggested that can improve segmentation results, as it allows the class of shapes considered during optimization to be changed dynamically. The proposed method is illustrated using two different examples. The first one, where artificially generated images were used, revealed the necessity of shape constraints in the presence of noise in the image and introduced a methodology of parameter training. The second one proved that the described method can be of use in real medical applications. Further work will involve the development of adaptation mechanisms and the preparation of better energy functions, especially in cases where analytical formulas are not directly available and machine learning algorithms might be useful.