4.1 Introduction

There are many different image segmentation techniques that can be directly or indirectly applied to the tasks of object localization within an image. The main limitation of the classic methods, such as thresholding or region growing, is that they consider only what is available in the image itself, failing to utilize external knowledge about the structure of interest. Such knowledge is crucial in those tasks where the image itself contains insufficient information for proper semantic interpretation of its content. A typical example here is radiological image interpretation, which requires adequate anatomical knowledge, without which it would be impossible to distinguish between organs that have a similar representation of tissues in an image modality under consideration.

A possible solution to that problem is provided by the active contour techniques. This group of methods operate under the assumption that the space of contours, unambiguously identifying objects in the image, is defined. The main objective is to find an optimal contour within that space by proper selection of the objective function (energy) and the optimization algorithm (evolution). The active contour model owes its name to the fact that optimization is usually an iterative procedure, which results in a change of the contours shape after each iteration.

The external knowledge can be incorporated into the active contour procedure in several ways:

  • Proper selection of the contour model can eliminate semantically incorrect solutions.

  • Proper constraints imposed on the optimization process can prevent obtaining unacceptable solutions.

  • Proper components of the energy function can penalize solutions that do not reflect our expectations.

Active partitions can be considered as a generalization of active contours. During the localization process the contours divide the set of image pixels into two subsets representing the object and the background. Such a partition can be defined, however, for any set of elements describing the image content, e.g. superpixels, line segments, ellipses, etc. The change of image description can significantly reduce the space of analyzed primitives without losing important semantic information, which can be still encoded in the attributes assigned to those primitives. The chief advantages of this approach are as follows:

  • The reduced image representation enables the construction of solutions that resemble more closely a conscious image analysis process specific to human beings.

  • Incorporation of external knowledge seems to be more natural.

  • Reduced search space allows the use of more computationally demanding optimization algorithms.

  • More sophisticated optimization algorithms provide the ability to avoid problems with proper selection of initial solutions.

In this chapter, the concept of active partitions is presented and illustrated. Issues regarding medical image analysis fall outside the scope of this study. For simplicity reasons, the author focuses exclusively on the problem of warning road sign localization. The chapter is organized as follows. In Sect. 4.2 a short overview of the active contour techniques is provided, together with the methods of external knowledge incorporation. Section 4.3 introduces the basic concepts of active partitions. Section 4.4 presents a simplified example of active partition application. The chapter concludes with a short summary in Sect. 4.5.

4.2 Background

As already mentioned, the process of object localization in active contours is formulated as an optimization problem, where the objective function (energy) expresses the expectations about the structure of interest. Thus, the energy, if properly defined, should assign to the contour its optimal value (usually minimal) only when the contour represents the object of interest. It is obvious that this evaluation should consist of at least two types of components:

  • external components taking into account the position of the contour in the image (they can evaluate whether the contour lies on the visible boundaries or circumscribes the region with desired characteristics, etc.)

  • internal components taking into account the characteristics of the contour itself (they can evaluate local contour smoothness, global contour shape, etc.).

In the literature, there are many variants of active contours, each adopting a different contour model, which in turn determines the formulation of energy function components and imposes specific contour evolution strategies. In this section, a short review of active contour techniques is presented, which is followed by a discussion of the methods for encoding knowledge about the expected contour characteristic.

4.2.1 Active Contours

The term active contour was first proposed in [1] by Kass, Witkin and Terzopoulos, who described the snakes model, in which the contour was represented by a parametric curve in the image plane. Since contour parametrization is a function, the energy is a functional, and to analytically find an optimal contour, the calculus of variations needs to be used. The application of Euler-Lagrange equations leads in this case to the system of partial derivative equations. Its numerical solution, which requires contour discretization (the contour is transformed into a polygon), results in an iterative process of optimal solution finding. Since the position of contour points is modified at each iteration, the whole process can be interpreted as the movement of the contour under the influence of some internal and external forces. This provides the ability to avoid an explicit definition of the energy function and replace it by a direct definition of the forces modifying the contour according to user expectations.

Another popular variant of active contours is thegeometric active contours approach. It was proposed simultaneously by Malladi, Sethian and Vemuri in [2] and by Casseles, Catte and Dibos in [3]. In those methods, the internal parametrization of the curves is not considered since it does not influence the contours shape. Consequently, only forces normal to the contour are taken into account. At first, the energy function was not expressed explicitly. It was added in geodesic active contours by Casseles, Kimmel and Sapiro in [4] and by Yezzi, Kichenassamy, Kumar, Olver and Tannenbaum in [5]. Thus, contour evolution is usually defined directly by the forces. Using a level-set approach ([6]), where the contour is defined by sections of 3D surface, it is possible to obtain contours of different topology (describing separate regions or those with holes).

Another interesting solution is offered by active shape models, first described by Cootes and Taylor in [7]. Here, the contour is represented by a set of characteristic points, which need not compose a polygon. The possible relative positions of those points are statistically trained before evolution (the so called point distribution model) on the basis of images that have previously been manually marked. The evolution itself comprises two operations, performed at every iteration. First, locally optimal positions of the characteristic points are sought. This operation usually takes into account the expected image profile around this point. Next, the final position of the points is estimated, allowing only three geometrical modifications of the whole shape (translation, rotation and scale) and some local shape modifications that do not violate point distribution model constraints.

A completely different assumption was made by Grzeszczuk and Levin in [8]. Their method, Brownian strings, represents the contour in a linguistic way using a chain of directions describing how to move along the contour (the contour lies in the cracks between pixels). Another non-standard element introduced by this method is the optimization technique that it employs to detect the optimal contour. In this case, the simulated annealing algorithm is used to avoid problems with precise localization of the initial contour. Its application requires, however, a suitable choice of local contour modifications performed at every iteration. Due to the specificity of the contour model, those modifications are quite complex. The same optimization technique, but another contour model, was applied by Tomczyk and Szczepaniak in the potential active contours, proposed in [9]. In this method, the contour is defined by a set of potential field sources. There are two types of those sources and the contour lies where the summary potentials of both types are equal. The evolution of the contour requires potential source modifications which in this case involve changes in their location as well as in the parameters that control the generated potential field characteristic. The optimization technique applied in both those methods provides significant flexibility in defining energy functions, since there are no special requirements as to their form (they need not be differentiable).

In literature, many other variants of active contours can be found. A comparative study can be found in [10, 11]. The choice of specific variant depends on considered application. If the topology of the sought region may change, geometric active contours should be used. There are modifications of snakes that allow to change region topology but their implementation is less elegant. Snakes are good option if contour can be initialized relatively close to the optimum of the energy functions. Otherwise, methods allowing to explore the whole search space, like Brownian strings or potential active contours, should be considered. The former gives full flexibility of shape description, whereas the latter will be useful if smooth, rounded shapes are to be found. If the sought shapes do not differ too much (mainly in their position, orientation or scale), then properly trained active shape models will be the best choice.

For reasons of space, this section focuses further only on those active contour variants that use specific external knowledge regarding the objects of interest to enhance detection. These approaches are described in more detail below.

4.2.2 Knowledge

The ability to incorporate the external knowledge into the process of object localization is a fundamental advantage of active contour techniques. There are three possible elements where the expectations about the structure of interest can be expressed:

  • Contour model—Knowledge encoded in the contour model makes it possible to reduce the search space if specific properties of the object are known, for example in potential active contours where the space of describable contours contains only smooth and rounded shapes. Additionally, the achievable degree of roundness can be controlled by a number of potential sources. This can be observed also in the models that use Fourier descriptors presented in [12, 13] and splines discussed in [14, 15]. Smoothness, however, is not the only requirement that can be encoded in the model. In active rays ([16]) it assumed that not all the concave shapes need to be described. In this case, a distance to a fixed point in the image plain enables the description of all the desired shapes.

  • Evolution strategy—If the energy function is explicitly given and some general purpose optimization technique is used, then knowledge can be used to add hard constraints forbidding certain contour modifications. Typical examples include the point distribution model used in active shape models and some specific solution generators used in methods that apply the simulated annealing algorithm. If the energy need not be specified explicitly and the evolution strategy is designed directly, then knowledge is encoded during the design process. A good example are forces and force fields defined in snakes or geometric active contours. In [1], for instance, volcano and spring forces were described, whereas in [17] a template force was added to keep a desired shape of the contour.

  • Energy function—The expectations about the contour are typically expressed as soft constraints. They can be encoded in both internal and external components, and may have either local or global character. A standard internal local expectation is contour smoothness. In snakes it is expressed by the characteristic of curve parametrization derivatives. Another approach imposes the reduction of contour length. External local expectations focus mainly on the image characteristics on both sides of the contour. Global expectations concern usually the contours shape and the characteristics of the region inside and outside of the contour. The latter can be found in active regions ([18]) and active appearance models ([19]).

A separate problem is the acquisition and representation of the required knowledge. It is not a trivial task but its detailed discussion falls outside of the scope of this chapter. Let us only mention that many approaches use a kind of training procedures where the selection of optimal active contour parameters relies on manual pre-localization of objects. Such procedures can be found in active shape models or active appearance models, in Brownian strings or in potential active contours.

4.3 Active Partitions

Although the concept of contour is intuitively clear, and has been the subject of many practical implementations within the framework of the active contour model, there is no single, universally applicable formal definition of this term. An attempt to formally describe the concept of contour was made, among others, in [20]. The basic feature of the contour is its ability to unambiguously indicate which part of the image reflects the object described by the contour and which part constitutes the background. In other words, the contour possesses the ability of dividing the image pixel set into two partitions.

In practical applications, however, operating on contours in a pixel space is problematic. The main issue is the cardinality of the pixel set, since the number of pixel subsets (possible partitions) grows exponentially with the increasing size of the image. Active contours try to tackle this problem in different ways, as described in the previous section. A proper definition of the contour model and evolution constraints can reduce the space of available partitions. Moreover, appropriate contour initialization, such that makes it relatively close to the optimal solution, can allow one to use simpler optimization techniques, guaranteeing that the desired structure is detected. Another issue connected with pixel representations is the difficulty of defining contour energy, as it often requires defining energy components at a pixel level as well. This is something of a pitfall, which also manifests itself further while defining evolution strategies, when potential and force fields are required, for example, in snakes and geometric active contours. In such a situation, the process of higher-level (global) knowledge incorporation in the localization procedure becomes significantly more difficult.

Because of those reasons, active partitions, a generalization of active contours, was proposed in [21,22,23]. In that approach, the image is not represented by a set of simple pixels but by a reduced set of more complex, spatially localized elements \(\mathrm {E}=\{e_1,\ldots ,e_N\}\). Naturally, the term contour, understood as a line that separates the object elements from the background elements, is hardly applicable in this context. That is why in active partitions instead of the optimal contour, the optimal partition \(\mathrm {P}=\{\mathrm {E}_{O},\mathrm {E}_{B}\}\) is sought directly where \(\mathrm {E}_{O}\subseteq \mathrm {E}\) and \(\mathrm {E}_{B}\subseteq \mathrm {E}\) represent object and background elements, respectively, under the assumption that \(\mathrm {E}_{O}\cup \mathrm {E}_{B}=\mathrm {E}\) and \(\mathrm {E}_{O}\cap \mathrm {E}_{B}=\emptyset \).

4.3.1 Representation

Although active partitions do not assume anything about the nature of the elements \(\mathrm {E}\), to make the method more natural, it is good to refer to observations of the human visual system ([24]). From that perspective, it seems to be obvious that humans do not analyze images directly at a pixel level, focusing rather on local similarities (homogenous regions) and discontinuities (region borders). To reflect this observation, superpixels and line segments were proposed to represent image content in [21, 22], respectively. Examples of such representations of the images considered in this chapter are presented in Fig. 4.1.

Fig. 4.1
figure 1

Representation of image content: a, d, g sample images, b, e, h superpixels generated with SLIC algorithm, c, f, i line segments generated using LSD algorithm

In [21], to generate a superpixel representation, the simple linear iterative clustering SLIC algorithm was used ([25]). It is an adaptation of the k-means clustering algorithm with a properly defined pixel metric. This representation was used to find regions representing the interior of the left and right heart ventricle in CT images. To avoid problems with insufficient image information (heart muscle grows into heart chambers) the requirement for the partition of a minimum border size was added. This approach was adapted from snakes method. The simulated annealing method was used as an optimization algorithm, with a solution generator ensuring that only connected partitions were generated. In [22], the content of mammograms and road scenes was described using a modified line segment detector LSD algorithm ([26]). The line segments reflected the areas of the image where a significant difference of pixel intensity on both sides of those segments was observed. This representation was used to localize circular and triangular regions, some of which might indicate possible circumscribed lesions or warning signs, respectively. In this case, a heuristic search was proposed to reduce the space of the analyzed subsets of segments. The energy function was employed to evaluate the matching degree between a current solution and a given template.

An alternative region-based representation was presented in [23]. In that work, the MR knee images were represented by ellipses describing the regions of a similar color. Ellipses were generated using the cross-entropy clustering CEC algorithm ([27]). This helped to reduce the number of considered elements required for the correct localization of elongated structures forming the fragments of articulate cartilage. The optimal subset of ellipses was sought systematically, taking into account that a uniform color and constant, relatively small structure thickness was expected.

In all the above examples, the number of elements describing an image content is significantly smaller than the total number of pixels in that image. This may provoke concern that a change of the image representation may lead to a crucial information loss. To prevent this, all the considered elements have some additional attributes assigned. In the case of pixels, these attributes are their coordinates and color components. For more complex elements, the amount of information that can be assigned to them is naturally bigger. For superpixels, this can be their center, bounding box, average color, shape descriptor, etc. In the case of ellipses, one can additionally consider their orientation and flattening degree. Finally, for line segments, their orientation and length as well as the characteristics of the regions on both sides of those segments can be taken into account. Another important fact is that some useful information can be also encoded in the relations between elements, for example, in the neighborhood relation. It is typical for pixels but can be also introduced for other elements. Other relations can also be defined if they are more convenient than storing the attributes of specific elements. To sum up, although the number of the elements is reduced, the information about the image content can be preserved in additional attributes of those elements and relations between them.

4.3.2 Partition

Although in active partitions the contours cannot be defined in the same way as they are in pixel representations, some partition model must be assumed to provide a feasible partition description. The most general model is one that offers a full flexibility of partition description. In that model all subsets \(\mathrm {E}_{O}\) and \(\mathrm {E}_{B}\) are allowed. Since the number of elements describing the image content is reduced, such an approach is acceptable in certain applications if additional constraints are imposed on the energy function and the evolution strategy. Such a model was presented in [21,22,23] and is used during the global analysis described further in Sect. 4.4.1. Sometimes, however, such a flexibility may be a source of problems if there is no convenient way to express expectations about the partition structure (e.g. shape) in a form of soft and hard constraints. This is illustrated in Sect. 4.4.2.

4.3.3 Evolution

Most of the typical active contour approaches use a local search algorithm as an evolution strategy. Thus, it is crucial to initialize the contour close to the desired object boundary. This constitutes one of the key problems of active contour applications. The exceptions are Brownian strings and potential active contours, which apply the simulated annealing algorithm as an optimization technique. And, even though other search techniques could also be used, the same approach is proposed also for active partitions, due to its simplicity and theoretical convergence with the global optimum [28,29,30].

In that approach, at every iteration a new solution is proposed using a solution generator \(\mathrm {G}\). The solution generator should generate a random solution which is close to the current one and it should enable the exploration of the whole search space during the optimization process (there should always be a possibility to generate a solution sequence transforming one solution to the other). If the generated solution is better, it is accepted as a current one. If it is worse, it is accepted with a probability depending on the difference in objective function values and on an artificial parameter, called temperature. The temperature decreases during the whole process, thus reducing also the probability of accepting worse solutions (at the beginning the temperature is selected, such that the probability of accepting worse solutions is equal to 0.8). In theory, if the whole process is sufficiently slow (infinite) this procedure guarantees that local optima are avoided. Naturally, practical applications must have a finite number of iterations, but even then the obtained results are usually satisfactory.

The choice of generator \(\mathrm {G}\) depends on the selected partition model. In a general case, the simplest \(\mathrm {G}^{f}\) generator can be considered, where the generation of a new partition involves the movement of a single element from \(\mathrm {E}_{O}\) to \(\mathrm {E}_{B}\), or in an opposite direction. As shown further, such a generator, flexible as it is, does not take into account the spatial relationships between elements. Those relationships would allow the addition of topological constraints, which in turn would result in more natural partitions.

In the experiments presented in this chapter, two additional modifications were introduced to the standard simulated annealing algorithm. Firstly, the temperature was not decreased at every iteration but every few iterations (after a number of iterations \(\mathrm {L}\)). The 100 temperature changes were allowed during the optimization process, since then the probability of accepting worse solution was almost equal to 0 (the exponential cooling scheme was considered with the 0.95 factor). Secondly, before every change of temperature the best solution found so far was set as the current one. Of course, since simulated annealing is a non-deterministic algorithm, there is a need to ensure that the results obtained are acceptable and repeatable. It can be done by proper choice of \(\mathrm {L}\) which should be selected for a given application in its training phase.

Finally, let us explain the process of partition initialization. Here, also different strategies can be used. For example, it may be assumed that at the beginning \(\mathrm {E}_{O}=\mathrm {E}\) and \(\mathrm {E}_{B}=\emptyset \). In the experiments presented below, other approaches were used, depending on the assumed partition model and solution generator G (initial solutions should not violate generator constraints).

4.4 Example

The present paper examines the active partition approach to object localization, focusing specifically on the problem of warning road sign detection. The problem in question is split into two phases—global and local. The global phase aims at localizing the areas of yellow color. In the local phase, those areas are analyzed in detail to find warning signs. Such an approach should correspond to fast inspection of the viewed scene to find the regions of interest and to the careful analysis of those regions. To some extent it should also imitate a conscious human-specific process of warning sign localization. In both phases, the same superpixel image representation is used. This choice was based on the assumption that human attention focuses on compact, homogenous regions rather than on single pixels. In the rest of this chapter, all the operations connected with colors are performed using the CIELab color space. In particular, it is used in the SLIC algorithm to generate superpixels and whenever the similarity of colors is discussed.

4.4.1 Global Analysis

The goal of this phase is to localize compact, yellow regions of interest. In real scenes, of course, there can be more than one such region and, naturally, not all of them need to represent warning signs. A superpixel representation should provide the ability to avoid problems with local color discontinuities (e.g. due to noise) at the pixel level. A sample image is presented in Fig. 4.2a to illustrate the concepts discussed in this section.

Fig. 4.2
figure 2

Sample input of global analysis: a the image with a generated superpixel representation, b distribution of \(\nu _{yellow}\) (the brighter color, the more yellow color is present in the superpixel and its neighborhood)

4.4.1.1 Energy

Since yellow regions are to be found, the energy evaluating the partitions should ensure that all the superpixels in \(\mathrm {E}_{O}\) are to some extent yellow. This requirement is, however, not sufficient. The regions of interest may be composed of many connected superpixels and the above-mentioned requirement will be satisfied for every subset of those regions. Consequently, a natural expectation is that superpixels in \(\mathrm {E}_{B}\) are not yellow. This can be expressed in the following energy function:

$$\begin{aligned} E^{c}_{color}(\mathrm {P})=w_{O}\sum _{e\in \mathrm {E}_{O}}I(\upmu _{color}(e)< t)+w_{B}\sum _{e\in \mathrm {E}_{B}}I(\upmu _{color}(e)\ge t) \end{aligned}$$
(4.1)

The \(\upmu _{color}\) represents the percentage of pixels of a given color within a superpixel and \(t=0.4\) is an arbitrarily selected threshold. The function \(I\) returns 1 if the given condition is true, otherwise it returns 0. This objective is composed of two components. The first should ensure that, for the optimal partition, \(\mathrm {E}_{O}\) contains only superpixels with a significant number of pixels of a given color. The second should minimize the number of such superpixels in \(\mathrm {E}_{B}\). Weights \(w_{O}\) and \(w_{B}\) provide the ability to control the influence of the components. If not specified otherwise, it is assumed that both of them are equal to 1.

The partition \(\mathrm {P}\) minimizing the above, crisp energy function naturally represents the regions of interest. However, from an optimization perspective, this objective function has one drawback. If subset \(\mathrm {E}_{O}\) is far from the optimal one (the distance in an image plane is considered) all the local modifications of the same size result in the same change of the energy function value. It means that there is no guidance available for the search algorithm on where the optimum is located. Thus, the simulated annealing requires more iterations to find a proper solution (the \(\mathrm {L}\) value must be increased). To overcome this inconvenience, the fuzzy variant of the energy can be defined. First, the color influence for each superpixel is calculated:

$$\begin{aligned} \theta _{color}(e)=\sum _{e'\in \mathrm {E}} \frac{\upmu _{color}(e')}{1+w\rho (e,e')} \end{aligned}$$
(4.2)

Its value depends on the distance \(\rho \) between superpixel centers. Parameter \(w\), which provides the ability to control the strength of the influence, should depend on the image size (in this work \(w=1\)). Next, the obtained values are scaled to fit into the [0, 1] interval:

$$\begin{aligned} \nu _{color}(e)=\frac{\theta _{color}(e)}{\max _{e'\in \mathrm {E}}\theta _{color}(e')} \end{aligned}$$
(4.3)

An example of \(\nu _{yellow}\) distribution among superpixels is depicted in Fig. 4.2b. Finally, the fuzzy energy value is computed using the following formula:

$$\begin{aligned} E^{f}_{color}(\mathrm {P})=w_{O}\sum _{e\in \mathrm {E}_{O}} (1-\nu _{color}(e))+w_{B}\sum _{e\in \mathrm {E}_{B}}\nu _{color}(e) \end{aligned}$$
(4.4)

4.4.1.2 Generator

As already mentioned, the general generator \(\mathrm {G}^{f}\) produces solutions that are close in a subsets space. The generated modifications, however, do not necessarily reflect local, spatial deformations of \(\mathrm {E}_{O}\) in an image plane. To have this property, the solution generator should take into account the spatial relationships (neighborhood relations) between superpixels. Such a spatial relationship can be easily computed. Two superpixels can be considered neighbors if they are adjacent, i.e. if they have at least one pair of adjacent pixels. The neighborhood relationship provides the ability to define some additional topological concepts. For example, borders \(b\left( {\mathrm {E}_{O}}\right) \) and \(b\left( {\mathrm {E}_{B}}\right) \) can be defined as subsets of \(\mathrm {E}_{O}\) and \(\mathrm {E}_{B}\) where elements have at least one neighbor from \(\mathrm {E}_{B}\) and \(\mathrm {E}_{O}\), respectively. Thanks to this, two additional generators, taking into account the spatial distribution of superpixels, can be defined:

  • \(\mathrm {G}^{s}\)—generator which either removes one element on the border \(b\left( {\mathrm {E}_{O}}\right) \) or adds one element on the border \(b\left( {\mathrm {E}_{B}}\right) \) (of course \(b\left( {\mathrm {E}_{B}}\right) \) is adjacent to \(b\left( {\mathrm {E}_{O}}\right) \)),

  • \(\mathrm {G}^{c}\)—generator which behaves in the same way as \(\mathrm {G}^{s}\) except that it prevents new solutions from having holes or being composed of two disconnected parts (preserves connectivity).

The second generator can be useful if only connected subsets \(\mathrm {E}_{O}\) are to be extracted.

Because the initial partition must not violate generator constraints, in all the experiments presented in this section a random element is selected from \(\mathrm {E}\) to initialize the partition. Next, all the elements that are spatially close to this element in the given range are added to constitute \(\mathrm {E}_{O}\). Again, the neighborhood relation is used to decide which superpixels are close to each other. The random choice of the initial element should demonstrate that the proposed methodology helps to avoid problems with careful partition initialization.

4.4.1.3 Repeatability

The simulated annealing is a non-deterministic optimization algorithm. Consequently, there is a concern that this algorithm does not guarantee repeatable solutions. The concern is the more reasonable that in the presented variant of active partitions no special limitations have been imposed on the location of initial solutions.

Thus, in order to prove that the approach presented provides stable results, another experiment was conducted, aimed at selecting a proper value of \(\mathrm {L}\). In this experiment, for selected values of \(\mathrm {L}\), the partition evolution was repeated 50 times for random initial partitions. The obtained results were summarized in several ways. The distribution of final energy values is presented in Fig. 4.3a. Figure 4.3b presents the percentage of superpixels in \(\mathrm {E}\) that always belong to either \(\mathrm {E}_{O}\) or \(\mathrm {E}_{B}\). The graphical representation of repeatability is depicted in Fig. 4.3c–f. The white color indicates the pixels that are always partitioned in the same way, whereas black suggests a lower repeatability of superpixel assignment. The more intense the black, the lower the repeatability. If the evolution is repeatable, the whole image should be white. It can be observed that (\(\mathrm {L}\ge 50\)) several stable local optima are always found. To achieve perfect repeatability (global optimum) the optimization has to last longer (\(\mathrm {L}=5000\)). Of course, those values are applicable only to the class of images under examination and the considered energy and solution generator.

Fig. 4.3
figure 3

Evolution repeatability (50 executions with random initial solutions): a distribution of energy \(E^{f}_{yellow}\) value for optimal solutions for different number of iterations \(\mathrm {L}\), b percentage of superpixels that were always assigned either to \(\mathrm {E}_{O}\) or to \(\mathrm {E}_{B}\), c, d, e, f visualization of the assignment changes

Repeatability is a key issue connected with the evolutions ability to explore the whole search space, in particular if random initial solutions are allowed and local partition modifications are performed by the solution generator. In Fig. 4.4 this exploration ability was presented for different \(\mathrm {L}\) values. The more black color, the more often a given superpixel was assigned to \(\mathrm {E}_{O}\) in a single run of the simulated annealing algorithm. This experiment also proves that for \(\mathrm {L}=5000\), in the presented task, it should be possible to find a global optimum (all the superpixels were assigned to \(\mathrm {E}_{O}\) at least once).

Fig. 4.4
figure 4

Image exploration while partition evolution with \(\mathrm {G}^{c}\) generator (the darker the color the more frequently a given superpixel was assigned to \(\mathrm {E}_{O}\))

4.4.1.4 Multiple Objects

The goal of the global analysis is to quickly find an approximate position of all yellow and connected regions. For that purpose \(E^{f}_{yellow}\) and \(\mathrm {G}^{c}\) are used. Unfortunately, active partitions, just like active contours, are naturally designed to find only single objects (usually only one optimum of the energy function is sought). Here, a simple modification was introduced to enable a multiple object localization. When the optimum is found, the energy function is modified by setting \(\upmu _{yellow}\) values equal to 0 for all those elements that belong to \(\mathrm {E}_{O}\). This process is repeated until no significant optimum is found. Because it does not matter in what order those optima are detected there is no need to explore the whole search space. Thus, it is assumed that \(\mathrm {L}=200\). Thanks to this and the initial reduction of image size it was possible to speed up the whole process. The sample results of the multiple object localization process are shown in Fig. 4.5.

Fig. 4.5
figure 5

The process of detection of multiple regions of interest: a, c, e changes in \(\nu _{yelllow}\) distribution, b, d, f extracted region

Fig. 4.6
figure 6

Optimal partitions for standard partition model, \(E^{f}_{yellow}\) energy and \(\mathrm {G}^{c}\) generator

4.4.2 Local Analysis

The goal of the local analysis is to find warning signs in the previously localized regions of interest. The diversity of the images leads one to assume that there are more than one yellow regions in the analyzed image. Moreover, it may happen that those regions are adjacent, as the sign and information plate in Fig. 4.1d. In such a situation, the application of the active partition technique with \(E^{f}_{yellow}\) and \(\mathrm {G}^{c}\) cannot give satisfactory results. To overcome this problem, additional knowledge is required.

4.4.2.1 Model

A closer analysis of the results presented in Fig. 4.6 reveals that the previous approach does not take into account shape expectations. Those expectations can be added in many different ways. A typical approach would involve defining an additional energy component evaluating the similarity of the partition to the triangle. Although it is not impossible, such soft constraints are usually problematic, especially if they are supposed to be scale and rotation invariant. That is why, in the presented work, knowledge was added by changing the partition model to allow only triangles to be generated. In this model, the partition is described by three superpixels selected from \(\mathrm {E}\). Their centers (it is assumed that they are always organized in an anti-clockwise order) constitute a triangle. Superpixel is an element of \(\mathrm {E}_{O}\) if at least one of its pixels lies inside this triangle. The rest of superpixels forms \(\mathrm {E}_{B}\).

4.4.2.2 Generator

A modified partition model requires specialized solution generators. They should modify the partition by moving triangle vertices. Two such generators are described below:

  • \(\mathrm {G}^{v}\)—the generator which selects a random vertex and replaces it by one of its neighbors (the above-described neighborhood relationship is used); it forbids any modification that would change the anti-clockwise order of vertices,

  • \(\mathrm {G}^{e}\)—the generator which behaves in the same way as \(\mathrm {G}^{v}\) but also prevents any modifications that would lead to a non-equilateral triangle (for every triangle side it checks if the corresponding height has an expected length).

The changes in the model (search space) and in the solution generators (hard constraints) entail that a feasible initial solution for the simulated annealing algorithm is necessary. To achieve this goal, in all the experiments the equilateral triangle of a maximum size with one horizontal side is generated (although it is not random, it still does not depend on the image content). The results obtained for those generators and \(E^{f}_{yellow}\) are presented in Fig. 4.8.

4.4.2.3 Energy

The above results are still not satisfactory if there are two adjacent yellow regions. This can be overcome by considering another objective function. So far, \(E^{f}_{yellow}\) has not taken into account the spatial distribution of colors in \(\mathrm {E}_{O}\) and \(\mathrm {E}_{B}\). It is known, however, that warning signs have a red border enclosing its inner, yellow area. This observation can be expressed in the following way:

$$\begin{aligned} E^{b}(\mathrm {P})=w_{O}\sum _{e\in b\left( {\mathrm {E}_{O}}\right) } (\nu _{red}(e)-\nu _{yellow}(e))+w_{B}\sum _{e\in b\left( {\mathrm {E}_{B}}\right) }(\nu _{yellow}(e)-\nu _{red}(e)) \end{aligned}$$
(4.5)

On the \(\mathrm {E}_{O}\) border this energy function expects yellow superpixels, not the red ones, whereas on the \(\mathrm {E}_{B}\) border the expectation is exactly the opposite. Weights \(w_{O}\) and \(w_{B}\) (here equal to 1) enable the control of a trade-off between those two expectations. Sample distributions of \(\nu _{yellow}\) and \(\nu _{red}\) for image presented in Fig. 4.1d are depicted in Fig. 4.7. In Fig. 4.8c the result of triangle evolution for \(E^{b}\) with \(\mathrm {G}^{e}\) is presented.

Fig. 4.7
figure 7

Sample yellow and red color distribution

Fig. 4.8
figure 8

Results of evolution for triangular partition model with different energy functions and solution generators

4.4.2.4 Missing Objects

Not all of the regions of interest must contain warning signs. The proposed active partition approach will of course work for such images and, what is more, it will generate some optimal results. Samples are shown in Fig. 4.8. Those results are reasonable as the algorithm tries to find the best position of the triangles. To automatically distinguish such cases, without the need of visual inspection of the results, the values of energy functions for optimal partitions can be analyzed. If something is wrong, these values are significantly higher than those obtained for correct structures.

4.5 Summary

The approach proposed in this chapter is a generalization of the active contour technique which can be applied to more sophisticated image content representations than raw pixel data. Its main advantage is the reduction of the search space, which enables the application of evolution strategies that are less sensitive or invariant to the choice of initial solutions (Sect. 4.4.1.3). This also means less strict assumptions about feasible objective functions. Consequently, a more natural way of expressing the expectations about the structures of interest is provided (Sects. 4.4.1.1 and 4.4.2.3).

As in the case of active contours, the knowledge required for proper analysis of the image content in active partitions can be incorporated into the search process in three ways. A proper partition model may limit the set of acceptable partitions (Sect. 4.4.2.1). The same goal may be achieved by using additional evolution constraints (region connectivity and equilateral triangles in Sects. 4.4.1.2 and 4.4.2.2, respectively). Finally, information about the expected characteristics of the partitions may be incorporated into the energy function.

Another remarkable aspect of the presented approach is its flexibility. As demonstrated by the two-stage process of warning sign detection, it can be applied to both global and local image analysis. Moreover, in the global phase, multiple objects can be localized (Sect. 4.4.1.4) by adaptive modification of the energy function.

The proposed methodology endeavours to imitate, at least to some extent, the conscious human-specific process of image analysis. Various approaches have been put forward to model the activity of the human vision system. In the literature, there are many methods that have achieved outstanding results in the field of image content understanding—convolutional neural networks ([31]) being a perfect example. Those models, however, are hardly interpretable and usually require huge data sets from which the expert knowledge could be automatically extracted in a training phase. Those data sets are not always easily available, especially in medical applications. That is why it may be more convenient to encode expert experience directly using the approach presented in this chapter.

As a main challenge for future work with active partitions, the choice of the best image representation should be mentioned. And this is not only the problem of optimal parameter selection (i.e. parameters of SLIC, LSD or CEC algorithms). As it was presented in this chapter, different approaches may be considered. Superpixels and ellipses focus on local region homogeneity, whereas line segments indicate some homogeneity discontinuities. These are not, however, the only possibilities, and, since all of them possess different properties, a good idea could be the fusion of those image content descriptions. Yet, the drawback of those representations is a fact that they are chosen arbitrarily for different classes of images. Perhaps a better approach would be an automatic design of such descriptors for a given object localization problem. All those aspects are under further investigation.