1 Introduction

Image inpainting aims to fill the missing data of an image with new information so that the reconstructed image looks as natural as it has not been damaged. In recent years, much effort has been devoted for developing powerful image inpainting algorithms and tools. Unfortunately, so far neither of these methods can solve this problem perfectly; in particular, the repair of the structural information is still an open issue.

Previous works in image inpainting can be roughly divided into two categories, namely partial differential equation (PDE)-based image inpainting methods and example-based inpainting methods. The former leverages PDE to solve the image restoration issues, using the local consistency property of the image to calculate new information from the known areas adjacent to the regions to be repaired with complex mathematical computation. This method works well for images with small damaged areas; however, it generates less satisfactory results for images with relatively large damaged areas. Instead, example-based inpainting methods restore the damaged areas by searching for the nearest neighbor region in the known areas and use the value of this nearest neighbor to repair the damaged block. These types of methods formulate inpainting as a nearest neighbor searching problem, which are suitable for restoring large damaged areas. The proposed method in this paper also belongs to the latter category.

Example-based image inpainting methods improve the restoration quality by solving the following problems: (1) how to select a good repair order and (2) how to handle the nearest neighbor searching. Recently, researchers devoted to these areas and developed some effective algorithms. For example, the best nearest neighbor currently is regarded as the weighted sum of a few k nearest neighbors, in order to reduce the visual discontinuity; the repair process is then divided into the structure information restoration and the texture information restoration so that the repair accuracy could be improved and, on the other hand, the repair efficiency could be higher since the searching timings for the nearest neighbors would be much lower owing to the image structure divisions.

Although much progress has been made for image inpainting, some problems were still open. First, it can enhance the inpainting quality and efficiency to separately repair the structure information and the texture information; however, previous methods cannot well repair the structures by simply searching the structures only once. As shown in the red ellipse area in Fig. 1a, previous methods fail to reconstruct the scene structures completely. Second, the repair results could be enhanced by weighting k nearest neighbors when searching for the filling information, since more global image information is taken into account. However, another important image property—local coherency—is ignored, which therefore usually leads to unnatural repair results. It can be seen in the red ellipse area in Fig. 1b that obvious visual fractures occur on the roof. Additionally, previous methods would often result in a greedy procedure as they only account for the confidence value and the data value of the patch to be repaired. Hence, once an error is made, it may be inevitably magnified in the greedy repair procedure. As shown in the red ellipse area in Fig. 1c, the grass unnaturally extends to the sea, which is not visually plausible.

Fig. 1
figure 1

Illustration of the problems in previous methods. a The computed scene structures and the missing scenes structures. The red rectangles are the calculated feature points, and the red lines refer to the reconstructed scene structures. Note that in the ellipse marked area, the borders between the shore and the water are missing; b, and c show the uncorrected repair results for the marked areas (color figure online)

To remedy the above problems, this paper developed a novel structure-aware image inpainting method. Our method is suitable for repairing images with relatively larger size of missing areas, which is especially good at repairing the missing patches with obvious structure information. It offers the following contributions:

  1. 1)

    A new iterative image structure searching method is proposed. The structure information is searched and judged in each divided region until the area of the region meets a threshold.

  2. 2)

    Domain -based image repair is developed instead of patch-by-patch repair. The neighboring coherent patches would be clustered as a repair domain, which ensures the coherency and searching accuracy of the results.

  3. 3)

    A novel repair order calculation method is introduced to replace the conventional greedy procedure. This can greatly reduce negative impact of the error propagation.

2 Related work

Image inpainting methods can be roughly classified into two categories: PDE-based methods and example-based methods. Bertalmio et al. [1] proposed the idea of adopting PDE for image inpainting. Two different equations were developed for representation of the structure information and the texture information, and the repair for the damaged region was realized by solving the equations. They also introduced total variation (TV)-based image inpainting algorithms. This type of image inpainting works successfully for repairing images with small damaged areas, but not suitable for restoration of images with relatively large damaged regions. To overcome this limitation, example-based inpainting methods were proposed and received extensive attention. Below we will mainly review the progress of example-based inpainting methods.

Criminisi [2] proposed to fill the unknown patch by searching the most matching patch in the known areas of the image. This example-based image painting method achieved wonderful results for repairing images with large damaged areas. However, two problems of this method exist affecting the repairing quality: the repairing order for the contour regions and the accuracy of the patch matching. Researchers [16] attempted to solve these questions in the following years. Sun et al. [3] proposed structure propagation which allows the user to freely specify important missing structure information by extending a few curves or line segments from the unknown regions. Komodakis [4] and Wong and Orchard [5] improved Criminisi’s method by designing a global optimization method so that the reference information comes from the global area rather than a local region. Xu and Sun [6] introduced a new method for computing the repair order. They employed the sparsity of natural images and judged if a patch belongs to the image structure information with the sparsity of the patch’s nonzero similarity neighborhood. Thus, each patch’s computational priority can be obtained. In comparison with conventional example-based image inpainting methods, this method can better differentiate between the structure information and the texture information, while the patch’s sparsity can make the new synthesized region deform so that it would be consistent with the surrounding textures. Noriega and Roumy [7] observed that the calculation of the repair priority in previous methods was complex, which greatly affected the whole computational efficiency. They simplified the conventional repair priority calculation equations which increased the efficiency while it could still achieve compared image quality.

Ndjiki-Nya et al. [8] proposed to separately handle the texture information and the structure information so as to achieve better inpainting quality. This method consists of four procedures, namely image segmentation, structure difference, texture repairing, and post-processing. Their idea of, respectively, dealing with the texture information and the structure information inspired later researchers. Li and Zhao [9] determined the structure information with wavelet transform, which avoided the complicated image segmentation. In their method, the repair order was calculated by considering the texture and geometry characteristics of the region to be repaired. Cao et al. [10] pointed out that current example-based image inpainting methods can well restore both texture and local geometry and, however, some applications require to reconstruct nonlocal geometric features, e.g., long edges. They therefore presented a geometric sketch-based inpainting method, in which the sketch is interpolated and serves as a guide for the global reconstruction. This was the first to adapt the sketch method to evaluate the nonlocal structure information. Wu and Chou [11] proposed a Bezier curve-based image inpainting method. They applied iterative Otsu threshold method to image segmentation and thus obtained the edge information. This method can reconstruct the skeleton of the damaged area, breaking the restrictions of conventional boundary-based reconstruction method which can only produce linear and circular edges. Lee et al. [12] introduced a region segmentation mapping-based image inpainting method. In this method, the image is divided into segments and the repair priority is automatically computed, which greatly enhances the efficiency. Later, Martinez-Noriega and Roumy [13] demonstrated a new automatic image inpainting algorithm. The prior is defined for judging the structural information. They also presented a novel Macro-Filling Order (MFO) to serve as the scheduling order. This method offers a major advantage that neither segmentation nor manually intervention is involved. Additionally, Bugeau et al. [14] observed that previous image inpainting methods are based on three types of theories: image self-similarity, image consistency, and information propagation. Therefore, they were the first to combine the advantages of the above three aspects so as to realize better inpainting results. He and Sun [15] proposed to fill the missing region by combining a stack of shifted images via optimization, which could produce generally better results and runs faster than conventional methods. Recently, Le Meur et al. [16] introduced a novel example-based inpainting framework. They first performed inpainting on a coarse version of the input image and then employed a hierarchical super-resolution algorithm to further restore the missing details. This method gains in terms of both the algorithm efficiency and the restoration quality. Guillemot et al. [17] proposed a method for object removal and loss concealment by neighbor embedding. Experiments showed the effectiveness of this method. Although the methods in [1317] can achieve good inpainting results for most of the images, they still share the problem of relatively lower inpainting accuracy for images with rich structures, since they ignore the local consistency property of the images.

3 Methods

In this section, we will give the overview of our proposed algorithm and then detail on the main steps.

3.1 Overview

Figure 2 illustrates the overall work flow of our method. It consists of three main steps: judging and searching the structural information, repairing the structural information, and repairing the texture information.

Fig. 2
figure 2

Framework of our method

3.2 Searching and judging the structural information

In this step, we will search the possible structures from the damaged area in the image and further judge whether they can be accepted as structural information. The useful structural information usually refers to the regions intersecting with the boundaries of the unknown regions, or the border regions with significant changes in texture or with obvious geometry structures. In this paper, we select those with a simple geometry structure and consistent gradient which are approximately perpendicular to the border areas as structural information. The reason lies in that more complex structural information, such as a tree-like structure or a ring-like structure, usually belongs to texture information rather than structural information. Note that the damage areas are determined by user annotation as an input in our method [3]. Here, we marked the damage areas as the same color (e.g., blue) which is different from the other parts of the image.

After obtaining the structural information using the above strategy, we would judge whether they share the similar directions, i.e., whether they could be connected through the unknown areas; if they satisfy the above condition, we would accept them as the structural information for the following processing. Below we will introduce the detailed procedures.

Here, following the method in [7], Canny operator is used for extracting the feature pixels. Structural information searching aims to find the pixels which could be connected in the surrounding areas outside the damaged unknown region within 20 pixels, and the found pixels will be listed as structural information candidates. If one pixel i is within the 8 neighborhoods of another pixel j, these two pixels can be considered as connected. Through the above steps, a structural information set candidate would be obtained. Next, we would select in this structural information set candidate for the useful structural information.

One type of structural information to be rejected is the curve-like structures, i.e., the structural information with more than one nonterminal node; this is due to that if a structure has multiple nonterminal nodes, it is more likely to belong to a tree structure instead of a linear structure, while a structure without any nonterminal node should belong to a ring structure which would also be rejected. After the above first round of selection, we would further delete the structural information with too few feature points, i.e., if the number of the useful pixels is less than a threshold, the candidate information would be deleted. As aforementioned, the useful structural information must be approximately perpendicular to the boundaries of the unknown area. Therefore, if the structural information candidate does not satisfy this condition, it would also be rejected.

After the above process, the final useful structural information will be obtained. Then the information would be connected to form the structure which is used for dividing the whole image into regions. In this procedure, we would determine whether there are two pairs of structure information regarded as connected on a same boundary. We perform this process by calculating the distance between two structural lines which is used to determine whether the structure is along the same line. Note that here we also took into account the distance between the endpoints of two structural lines, and the angle between two structural lines.

The above refers to one iteration of the structure searching and connecting; next we will perform it iteratively via multiple region division so as to find more accurate structures. This iterative process can be summarized as follows: after generating a new region, we would change the parameter values of Canny operator in order to get richer structural information. Then, we would further search the structural information in the new region and connect them to generate structures. If the size of the new generated area is larger than the product of factor and the size of the entire image, this area would be further divided until its size is less than the product of factor and the size of the entire image. Note that factor is a user-defined parameter.

As shown in Fig. 3, richer structural information can be computed using our method (e.g., the bottom red line which refers to the structural boundary between the shore and the sea in Fig. 3d). This experiment showed that our structural information searching and connecting strategy is reliable and effective.

Fig. 3
figure 3

Comparison for the structural information searching and connecting. a The original input image, b the marked area to be repaired, c the searched and connected structural information using the method in [13], and d the searched and connected structural information using our novel iterative method (color figure online)

3.3 Repairing the structural information

For the structural information restoration, we mainly adapt the successful method in [3]. Given the unknown region \(\varOmega \), the entire image I, and the known area \(I-\varOmega \), our aim is to obtain the structural information C to be repaired.

We first sample C in the region where C intersects with \(\varOmega \), leading to obtain a collection \(L=\left\{ {p_i } \right\} \), wherein the structure information pixels are the centers of each repair region. Similarly, we sample in the known region and get the resulted collection \(\left\{ {p_j } \right\} \). Then, for each \(p_i\), we would compute its nearest neighbor in \(\left\{ {p_j } \right\} \), and use \(P(x_i)\) as the color of the filling block. The above process will be continued until the structural information for each block is filled. Figure 4 demonstrates the structural information restoration result for a given image.

Fig. 4
figure 4

Structural information restoration result. a The original input image with the marked area to be repaired, and b the repaired structural information using our method

3.4 Repairing the texture information

In contrast to conventional image inpainting algorithms, this paper proposed two improvements for texture information restoration: First, we performed texture information repairing in domains rather than each block; second, different from the previous greedy repairing process, we specially designed a new repair order calculation method, which makes the greedy repair order tend to be plausible and thus greatly improves the quality of the inpainting results.

In traditional example-based image inpainting methods, the nearest neighbor searching was usually performed for each patch individually, and then its color value was used to fill the holes without taking into account the local consistency and the globality of the image. Frédéric et al. [16] used weighted k nearest neighborhood instead of a single nearest neighbor in the process of patch matching; however, they still did not account for the local consistency of the image. Motivated by this, we proposed the definition of repair domain to fully take advantage the local consistency of the image. Here, a domain consists of the adjacent patches, which serves as the minimal repair unit in our method. Experiments showed that our method can produce repairing results which are more consistent with human visual characteristics.

As shown in Fig. 5, \(\varOmega \) is the unknown region, \(I-\varOmega \) denotes the known region, \(\varepsilon \) represents the contour of the unknown region, and \(p_1, p_2, p_3 \), and \(p_4 \) are three patches to be repaired. We first perform matching for \(p_1\), i.e., search n nearest neighbors (here \(n=4\)) which are \(P_{11}, P_{12}, P_{13}\), and \(P_{14}\). Then, we perform matching for \(p_2 \) and acquire its nearest neighbors \(P_{21}, P_{22}, P_{23}\), and \(P_{24}\). In a similar way, the nearest neighbors of \(p_3 \) and \(p_4 \) are, respectively, found as \(P_{31}, P_{32}, P_{33}\), and \(P_{34}\) and \(P_{41}, P_{42}, P_{43}\), and \(P_{44}\). We separately calculate the differences between the horizontal coordinate and the horizontal coordinate of the patch to be matched and its nearest neighbor. For example, the horizontal coordinate difference between \(P_{13}, P_{22}, P_{34}\), and \(p_1, p_2, p_3 \), is 1; the vertical coordinate difference between them is \(-2\). Hence, we can regard \(p_1, p_2 \), and \(p_3 \) belong to a domain \(R_1 \), which means that they can be repaired using the domain \(R_2 \) where \(P_{13}, P_{22}\), and \(P_{34}\) locate. In contrast, neither of the n nearest neighbors of \(p_4 \) satisfies the condition that the horizontal coordinate difference is 1 and the vertical coordinate difference between them is \(-2\) and, therefore, \(p_4\) does not belong to the domain \(R_1 \). In this manner, we can fully exploit the local consistency of the image, making each patch to be repaired not isolated. The restoration result would be therefore more smooth, and more consistent with the human visual characteristics.

Fig. 5
figure 5

Illustration of the domain generation. In the known region \(\varOmega \), the red patch, the blue patch, and the green patch are, respectively, marked as \(p_{1}, p_{2}\), and \(p_{3}\); in the unknown region \(1-\varOmega \), the patches with the same color with that in \(\varOmega \) refer to the corresponding nearest neibor candidate. When \(p_{1}, p_{2}\), and \(p_{3}\) are adjacent, they could be grouped as a repair domain which would serve as the minimal repair unit (color figure online)

Fig. 6
figure 6

Illustration of the calculation of the confidence values. a The confidence values before repairing (note that 1 and 0, respectively, refer to the known pixels and the unknown pixels), b calculation of the confidence value of each patch in the unknown region, c the confidence values after one iteration of repair, and d the confidence values to be used for the next calculation

The filling calculation in previous methods suffers from a greedy process, i.e., once an error occurs, it will be quickly amplified, and thus generates undesired result which is contrary to the human visual habits; this phenomenon is also known as error propagation. In order to solve this problem as much as possible, when calculating the value of each pixel we take into account the filling order (or whether it is repaired) of each pixel. The original greedy process thereby becomes more reasonable and reliable, which can greatly reduce the impact of the error propagation phenomenon on the restoration result.

$$\begin{aligned} P(p)= & {} C(p)D(p), \end{aligned}$$
(1)
$$\begin{aligned} C(p)= & {} \frac{\sum _{q\in \varPsi _p \cap ({I}-\varOmega )} {C(q)} }{\left. {\left| {\varPsi _p } \right. } \right| }, \end{aligned}$$
(2)
$$\begin{aligned} D(p)= & {} \frac{\left| {\left. {\nabla {I}_p^\bot \cdot n_p } \right| } \right. }{\alpha }, \end{aligned}$$
(3)

Our strategy is expressed by Eqs. (1) through (3). Equation (1) gives the calculation for the priority value P, which consists of the confidence value C and the data value D. Here, p is a point on the contour of the unknown region, \(\varPsi _p \) refers to a block with p as the center, \(n_p \) is the normal vector at point p which is perpendicular to the contour, \(\nabla {I}_p^\bot \) denotes the extension direction of the brightness variation, i.e., the isophote direction, at point p. The data value D reflects the strength of the structural information at the position to be repaired; the larger the value D, the stronger the structural information indicates. The confidence value C represents the number of the valid points at the position to be repaired. Here, the valid point refers to the point in the known area or the repaired point. In contrast, the confidence value of the pixels in the unknown region without being repaired is set to 0. Therefore, the larger the confidence value is, the greater the number of pixels in the block of known regions is, and thus, it would own a high calculation priority. Note that the confidence value of the valid point is simply set to 1, which would lead to an error if the repaired pixel is not correct. Thus, this error propagation would further continue in the following process. As shown in Fig. 6, in our new strategy the confidence value of each block would be changed as 1/k multiplying the nonzero confidence value after it has been repaired. Thus, the confidence value would decrease when the distance between the unknown region and the known region gradually becomes larger, i.e., the confidence of this pixel belonging to the known region would decrease so that the confidence value of the block would also gradually reduce accordingly.

4 Results and discussion

We tested the above method on a PC equipped with Intel(R) Core i3-540 3.07GHz CPU, 2GB memory. The programming is written using OpenCV with Visual Studio 2010. Various experiments were performed to validate our method.

In Fig. 7, we compared our method with state-of-the-art methods. It can be seen that in the red ellipse areas in Fig. 7, the repair results for the roof differ greatly. In Fig. 7b through 7d, obvious visual fractures exist, and undesired color appears, which are not consistent with the human visual effects. Similarly, in Fig. 7e, although no obvious abnormal color can be found, in the middle part of the roof the fracture is also very clear, which might be caused by the single patch-based repairing. In contrast, observing the effect of our repair method in Fig. 7f, owing to the domain-based repairing rather than taking single block as a repairing unit, our restoration results seem more complete and continuous, which is in accordance with the human visual characteristics. It can be observed in the area marked with a green ellipse in Fig. 7b that the land extends into the sea which goes against the common sense. The reason for this problem lies in the unreasonable repair order—a greedy repair process, which means once an error occurs during the repair process, it will continue and be magnified, leading to the final unplausible result. Martınez-Noriega and Roumy’s method [7] employed a macro-filling order which can successfully avoid the error propagation [2] when handling the coastal part. However, for the shaded area of the house, the color looks much lighter which is clearly incompatible with the common sense that the shade color should be lighter and darker. Again, this may be caused by the unreasonable repair order, as it performs an equal treatment for the repaired block and the block in the known region when calculating the confidence value. In contrast, our method can obtain a more accurate confidence value by changing the confidence value of each block in each computational cycle. As shown in Fig. 7f, our method achieves better visual effects for the shadow area of the house.

Fig. 7
figure 7

Comparison of the repair results among state-of-the-art methods and our method. a The image with marked area to be repaired, b the result of Criminisi et al.’s method [2], c the result of Photoshop, d the result of Le Meur et al.’s method [16], e the result of Martınez-Noriega et al.’s method [7], and f the result of our method (color figure online)

Figure 8 performs another comparison among our method and other image inpainting methods. It can be observed in the red ellipse area in Fig. 8 that our method (Fig. 8e) achieves better visual repair results: in particular, the structural lines are more complete, and the repaired region looks more consistent with the surrounding areas. Moreover, the repaired result of the contour of the black object is more clear and plausible than Fig. 8a–c, which is comparable with the repair result in Fig. 8d. Figure 9 displays more restoration results generated using our new method. Overall, the structural repair results look naturally and visually plausible, which is difficult to detect the artifact. Figure 10 demonstrates a failure example of our approach. Note that the left side of the entrance and the missing steps are not greatly repaired. The color of the missing patches is similar to that of the surrounding areas, leading to the uncorrected structure information.

Fig. 8
figure 8

Comparison of the repairing results among different methods. a The result of Criminisi et al.’s method [2], b the result of Photoshop, c the result of Le Meur et al.’s method [16], d the result of Martınez-Noriega and Roumy’s method [7], e the result of our method, and f the image with marked area to be repaired (color figure online)

Fig. 9
figure 9

More inpainting results. From left to right, the image with the marked area to be repaired, the result generated by Photoshop, the result of Criminisi et al.’s method [2], the result of Martınez-Noriega and Roumy’s method [7], and the restoration result using our method

Fig. 10
figure 10

A failure example of our method

Thanks to the use of the domain-based repairing strategy and a new nongreedy repair order, our image restoration results are more plausible, which are consistent with the human visual habits. Meanwhile, the computational efficiency of our approach does not decrease with the new priority calculation algorithm and the domain-based computational manner. The overall computational efficiency still depends on the efficiency of the nearest neighbor search processing, which is related to the size of the unknown area as well as the size of the known region in the image to be repaired. However, the novel domain-based image repairing method also has some limitations. For instance, it may give rise to a certain block phenomenon when repairing an image with messy textures, which is one of our future works.

4.1 Quantitative comparison

He and Sun [15] introduced a coherence metric for measuring the inpainting quality of an image, which was expressed as follows:

$$\begin{aligned} d_\mathrm{coherence} =\frac{1}{N}\sum _{P\in \varOmega } {\min }_{Q\in \overline{\varOmega }} ||P-Q||^{2}, \end{aligned}$$
(4)

where N is the number of unknown pixels, P is a patch in the synthesized region \(\varOmega \), and Q is a patch in the known region \(\overline{\varOmega }\). This measure penalizes any patch P in the synthesized region if its best match Q in the known region is not similar to it. We employed this metric for comparison (see Table 1) of the inpainting quality of the results generated by different methods in Fig. 9. It can be seen that our method performed better than other methods under the coherence metric.

Table 1 Comparison of the inpainting quality of different methods using the coherence metric

5 Conclusion and future work

This paper has presented a novel domain-based structure-aware image inpainting method. Our method can generate more complete and accurate structural information by designing an iterative structure searching strategy. For the texture repairing, the new method fully exploited the local consistency of the image and connected the related patches to form domains, which were the minimal repairing units in our method. This method not only ensures the visual continuity between the blocks, but also better accounts for the globality of the whole image. Additionally, we proposed a novel filling order calculation algorithm so that when computing the confidence value, the repaired block and the block in the known region are no longer handled in the same way, which would lead to a more reasonable repairing order.

Although the introduction of domains achieved successful results for most of the cases, it might suffer from relatively clear seams between domains for images with messy textures. We will further improve our method and deeply explore in this area in the future.