1 Introduction

Background subtraction is a hot research topic in computer vision and has many applications in video surveillance, human-machine interaction and sports video analysis [1, 8, 2527]. In real-world applications, the videos are usually recorded by a moving camera in which case standard background subtraction algorithms do not work. A possible solution is to first generate wide-angle view images with image mosaic softwares and then apply standard background subtraction algorithms on the wide-angle view images. In the past few decades, a large number of mosaic algorithms were proposed, which achieve good performances [12, 22, 28]. There are also some existing mosaic softwares available, such as AutoStitch [2], Microsoft ICEFootnote 1, and Panorama MakerFootnote 2. However, to ensure good mosaic effect, most current algorithms and softwares have to meet the following constraints:

  1. 1.

    The scenes must be static. Most mosaic algorithms are designed for static scenes. If the scenes contain moving objects such as moving vehicles, pedestrians, leaves under wind and so on, ghosting effect would appear in mosaic results, or even worse mosaic will fail due to the parallax of moving objects [2, 14].

  2. 2.

    The camera should be kept rotating around its optical center when capturing images. A large deviation from optical center would also cause ghosting effect and thus result in the mosaic failure [2, 4, 10, 17, 20].

  3. 3.

    Large exposure differences are not allowed between images. Large exposure difference will lead to inaccurate registration and mosaic failure [3, 24].

There is some work that tries to solve the dynamic mosaic problems mentioned above. Echigo et al. and Iranio et al.  [7, 13] used a median filtering operator to eliminate the ghost problem in their mosaic processes. However, these methods require large overlapping regions without obvious intensity differences between the images. Shum and Szeliski [21, 23] proposed a method that uses a set of transformations to construct a full view panorama, but they are not suitable for mosaic conditions with strong motion parallax.

David [3] first used phase correlation to perform image registration, and then calculated the intensity difference between the registered images by the overlapping regions, and finally search for the optimal stitching line by using the Dijkstra algorithm. However, if there are large exposure differences between the registered images, the optimal stitching line searching may not be achieved using intensity difference alone.

Milles and Dudek [19] proposed an optimal seam searching criterion that combines intensity difference and gradient difference. The Dijkstra algorithm was then used to optimize the seam line searching. However, if the moving objects are similar to the background of the adjacent image to be matched, the optimal seam line may divide the whole moving object into two parts. In this case, the composite image would contain ghosting effect.

Image registration is a critical step for image mosaic. The commonly used registration methods are mainly divided into three classes: template matching, mutual information (MI), and features-based methods. The template matching methods firstly get a matching template by selecting a window from the overlapping regions [15], and then search in another image until it reaches the highest matching degree. This kind of methods can solve the image registration with low computational complexity when there is only pure translation. MI-based methods were originally proposed for the registration of multi-modality images. It was commonly used for medical image processing in recent years [29]. However, MI-based approaches are often powerless in the presence of serious occlusions. Mikolajczyk and Schmid [18] compared several popular feature point matching algorithms, and drew the conclusion that the SIFT algorithm  [16] is better than other feature-based algorithms in many aspects. The SIFT feature has many advantages, such as invariant to image scale and rotation, robust to affine distortion, and even to occlusion, noise, and illumination variations.

In this paper, to address the problems mentioned above, we propose an algorithm named CSDP, which was mainly designed for dynamic scenes with moving objects, large exposure differences and big parallax. Firstly, SIFT matching was applied to perform image registration.Then we propose an optimal seam searching criterion modified from Mills and Dudek [19]. The optimal seam searching criterion is established by combining gradient difference with edge-enhanced weighting intensity difference, which provides an effective mechanism for avoiding problems caused by moving objects and as a result the stitching line will bypass the moving objects. Subsequently, we searched for the stitching line based on the improved optimal seam searching criterion by applying dynamic programming [6] which is relatively easier than the Dijkstra algorithm. Finally, the multi-resolution image fusion method was used to minimize inconsistency nearby the stitching line.

In the following sections, we will give more details about the proposed method using the example images in Fig. 1. To verify the effectiveness of the proposed method, we compare it with three popular softwares, i.e., AutoStitch, Microsoft ICE, and Panorama Maker as well as Mills’ approach.

Fig. 1
figure 1

Example images for mosaicing to demonstrate the implementation procedure of the proposed CSDP method

2 CSDP algorithm for dynamic scene mosaic

There are mainly three steps in the proposed method: image registration, improved optimal seam searching, and multi-resolution image fusion.

2.1 Image registration

As mentioned in Sect. 1, we adopt SIFT feature matching to register two images including the following four steps:

  • Step 1 Extracting SIFT descriptors from two adjacent images to be registered.

  • Step 2 Applying an approximate nearest neighbor algorithm for SIFT descriptor matching.

  • Step 3 Deleting the mismatching points (outline points) by RANSAC algorithm [9]. As shown in Fig. 2, the correct matching points (inline points) are marked in the red circle, and the mismatching points deleted by RANSAC algorithm are marked in the green circle.

  • Step 4 Computing the projection between two images based on the matching points using the general linear least square method or some recently developed approaches, such as [30].

Fig. 2
figure 2

SIFT matching results. The correct matching points are marked in the red circle and the mismatching points are marked in the green circle

2.2 Improved optimal seam searching

After registering (i.e., the projection matrix was computed), we can figure out the overlapping regions between two adjacent images. And then we stitch the registered images together along the seam in the overlapping region. The ideal optimal seam should satisfy the following requirements [6]:

  • The difference in intensity around the stitch line should be minimal.

  • The difference in geometrical structure around the stitch line should be minimal.

Actually, it is very hard to meet the above two requirements at the same time since there are inevitable exposure differences between the images taken by a hand-held camera. To overcome this problem, we first correct the exposure difference using the inline matching points mentioned in Sect. 2.1. After exposure correction, we introduce an edge-enhanced weighting intensity difference (EWID) operator to replace the traditional intensity difference mentioned in Mills and Dudek [19]. For the similarity of geometric structure in the overlapping region, we formulate it by gradient difference. The detailed procedures are introduced in the following subsections.

2.2.1 Exposure correction

The exposure differences between the images are unavoidable whether taken by the same camera at different times or different cameras at the same time. Assuming the optical reflection characteristics of objects in the overlapping regions remain constant, we can estimate the exposure correction between two images by a linear model:

$$\begin{aligned} pixel_{1}=\alpha \times pixel_{2}+\beta \end{aligned}$$
(1)

where \(pixel_{1}\) and \(pixel_{2}\) represent the pixel values of the original images and the image after exposure correction, respectively, \(\alpha \) means gain and \(\beta \) means bias.

To calculate the parameters \(\alpha \) and \(\beta \) , we propose to denoise image \(I_{1}\) and \(I_{2}\) by Gaussian blur filtering. Let \(I'_{1} \) and \( I'_{2}\) denote the denoised images, respectively. And then we compute \(\alpha \) and \(\beta \) by using the least square method based on the inline points mentioned in Sect. 2.1. Supposing \(pixel'_{1} \in I'_{1},pixel'_{2} \in I'_{2}\) are the pixel values of a pair of inline matching points from adjacent images after Gaussian blur filtering. The linear equation of \(\{ \alpha ,\beta \} \) is formulated as

$$\begin{aligned} A\times m=pixel'_{1i} \end{aligned}$$
(2)

where \(A=[pixel'_{2i},1],m=[\alpha ,\beta ]^{T} ), i=1,2,\ldots ,n \) and \(n\) is the total number of inline points. The results of the exposure correction are illuminated in Fig. 3a (with Gaussian blur filter) and Fig. 3b (without Gaussian blur filter). As we can see that the exposure correction model with Gaussian blur is more robust and accurate.

Fig. 3
figure 3

An illustration of the effectiveness of the proposed exposure correction. The red snowflake points represent the inline matching point pairs extracted by SIFT, and the fitting straight line is fitted using linear least square. a The linear fitting result without Gaussian blur preprocessing and b the linear fitting result after Gaussian blur preprocessing

2.2.2 Edge-enhanced weighting intensity difference

In Mills and Dudek [19], the intensity difference was obtained by a pixel-by-pixel normalized ratio

$$\begin{aligned} \delta ^{I}_{ij}=\frac{abs(Im_{1_{ij}}-Im_{2_{ij}})}{max(Im_{1_{ij}},Im_{2_{ij}})} \end{aligned}$$
(3)

where \(Im_{1_{ij}}\) and \(Im_{2_{ij}}\) are corresponding pixels in the overlapping regions of adjacent images, and obviously \(\delta ^{I}_{ij} \in [0,1]\). The intensity difference computed using Eq. 3 is shown in Fig. 4a.

Fig. 4
figure 4

An illustration of the Intensity difference image of the overlapping region. a The traditional intensity difference image. b The intermediate image that is dilated and connected from the traditional intensity difference image. c The final EWID image

From Fig. 4a, we can see multiple moving objects: a bus, cars, and a biker. Here, we focus on the moving bus that overlaps with the static tree. Traditional intensity difference method divided the moving bus into multiple parts. To overcome this problem, we propose an improved edge-enhanced weighting intensity difference (EWID). The detailed procedures are summarized as follows:

  • Step 1 Denoising images \(Im_{1}\) and \(Im_{2}\) by a Gaussian blur filter of \(5\times 5 \) kernel operators \(G(x,y,\delta )\), we can get:

    $$\begin{aligned} Im'_{1}&= G(x,y,\delta )\times Im_{1} \nonumber \\ Im'_{2}&= G(x,y,\delta )\times Im_{2} \end{aligned}$$
    (4)
  • Step 2 Building intensity difference by

    $$\begin{aligned} \delta ^{I}_{ij}=\frac{abs(Im'_{1_{ij}}-Im_{2'_{ij}})}{max(Im'_{1_{ij}},Im'_{2_{ij}})} \end{aligned}$$
    (5)
  • Step 3 Threshold binarization :

    $$\begin{aligned} Im^{msk}_{ij}= {\left\{ \begin{array}{ll} 1 &{} (\delta ^{I}_{ij} \ge Thrd )\\ 0 &{} (\delta ^{I}_{ij} < Thrd ) \end{array}\right. } \end{aligned}$$
    (6)

    here, \(Im^{msk}_{ij}\) is the binary image of \(\delta ^{I}_{ij}\). We set the threshold \(Thrd\) to 0.7 in this paper which achieves the best performance in the experiments.

  • Step 4 Making a morphological dilation operation for the binary image \(Im^{msk}\) twice, and then connecting the dilated image with 4 pixels both on horizontal and vertical directions to generate the image \(Im^{msk'}\) which is shown in Fig 4b.

  • Step 5 Large weighting intensity-difference value \(weight\) is given to the edge of moving objects by computing the average of image \(Im^{msk'}\). In this paper, we set the parameter \(weight\) to 5. So after enhancing weights on the edge, the intensity difference image \(\delta ^{I'}_{ij}\) can be computed as

    $$\begin{aligned} \delta ^{I'}_{ij}= {\left\{ \begin{array}{ll} weight &{} (Im^{msk'}_{ij}=1 \cap \delta ^{I}_{ij}<thrd )\\ \delta ^{I}_{ij} &{} (else ) \end{array}\right. } \end{aligned}$$
    (7)

Figure 4c shows the EWID image \(\delta ^{I'}\) where the green pixel points take up a value of \(weight\). According to Fig. 4c, the edge of moving objects (bus, cars and biker) are connected as a whole. As described in the following section, the above mentioned improvement can effectively prevent the seam from dividing the moving objects into several parts.

2.2.3 Gradient difference

If the exposure differences between two images are obviously nonlinear, the exposure correction will become ineffective. In this case, the intensity differences cannot effectively express the similarity between images, so we set up the gradient images \(\Delta Im_{1}\) and \(\Delta Im_{2}\) for those images

$$\begin{aligned} \Delta Im_{1_{i,j}}\!&= \!\sqrt{(Im_{1_{i+1,j}}\!-\!Im_{1_{i-1,j}})^{2}\!+\!(Im_{1_{i,j+1}}\!-\!Im_{1_{i,j-1}})^{2}}\nonumber \\ \Delta Im_{2_{i,j}}\!&= \!\sqrt{(Im_{2_{i+1,j}}\!-\!Im_{2_{i-1,j}})^{2}\!+\!(Im_{2_{i,j+1}}\!-\!Im_{2_{i,j-1}})^{2}}\nonumber \\ \end{aligned}$$
(8)

And the computation of gradient difference \(\delta ^{\Delta }\) in this paper is the same as traditional intensity difference \(\delta ^{I}\). An example of the gradient difference image is shown in Fig. 5.

Fig. 5
figure 5

An example of the computed gradient difference image

2.2.4 Optimal seam searching criteria

Based on the combination of EWID and gradient difference, we establish the optimal seam line searching criteria as follows:

$$\begin{aligned} \Delta =w_{1}\delta ^{I'}+w_{2} \delta ^{\Delta } \end{aligned}$$
(9)

In this formula, we select \(w_{1} \) and \(w_{2} \) by adopting the weighting method in Duplaquet [6]:

$$\begin{aligned} w_{2}=(abs(\ln (a))+abs(\beta ))^{2} \end{aligned}$$
(10)

Here, \(w_{2}\) is the weight of the gradient difference. During the mosaic process with huge exposure difference, the accuracy of exposure correction using a linear model will be reduced, and the reliability of the similarity measurement between intensity differences will also be decreased. In that case, the gradient difference will be assigned a greater weight.

$$\begin{aligned} w_{1}=1-w_{2} \end{aligned}$$
(11)

where \(w_{1}\) is the weight of intensity difference. If the value is negative, we set it to be 0.

After the procedure above, we finally create a seam searching criterion, as shown in Fig. 6. Because of the minimal exposure difference between these two example images, \(w_{1}\) trend to 1. Therefore, the image of seam searching criteria is similar to the EWID image as shown in Fig. 4c.

Fig. 6
figure 6

An example of the optimal seam searching results by combining EWID and gradient difference

2.2.5 Optimal seam searching implementation

In this section, we use dynamic programming associated with Eq. 9 to find out the optimal seam criterion. Supposing each pixel position of the first row residing in the overlap region corresponds to a seam line, then the optimal seam should have the minimal value according to the seam searching criterion that was mentioned in the previous section. The main steps are summarized as follows:

  • Initialization. As shown in Fig. 7a, each pixel in each column on the first row corresponds to a seam line, and is assigned with a value calculated by Eq. 9.

    Fig. 7
    figure 7

    An illustration of the optimal seam searching method. Each circle represents a pixel value obtained using Eq. 9. a Initialization. b Expanding for the first time

  • Expanding sequentially until reaching the last row. Adding the current points of each seam line to the minimum value of the three pixels on the next row, as shown in Fig. 7b, and then setting the column with the minimum pixel as extension direction, updating the current point of the column with the minimum value.

The algorithm complexity of the above process is \(O(mn)\) where \(m\) and \(n\) denote the numbers of row and column, respectively. The proposed method is more efficient than the Dijkstra algorithm [5] with complexity of \(O(mn^2)\).

In Fig. 8, the red line is the optimal seam obtained by dynamic programming. The optimal seam without improvement is shown in Fig. 8a. We can see this optimal seam line cuts the moving bus into two parts.The optimal seam after improvement is shown in Fig. 8b. Because of the introduction of edge-enhanced weighting intensity difference algorithm, the moving bus can effectively maintain its integrity, and there is no incomplete object in the mosaic image. Figure 9 is the mosaic result of the example images in Fig. 1 where the stitch line is shown in the black line. In Fig. 9a, the red box demonstrates that the composited image contains residual of the moving bus, but there is no such a problem in Fig. 9b.

Fig. 8
figure 8

Optimal seam searching results. a The traditional optimal seam searching result cuts through the moving bus. b The improved optimal seam searching result bypasses the moving bus

Fig. 9
figure 9

Mosaic renderings of two different methods. The black line is the stitching line. a Mosaic result obtained by the traditional optimal seam searching method. The red box marks the residual of the moving bus. b Mosaic result obtained by the improved optimal seam searching method

2.3 Multi-resolution mosaic based on optimal seam

To minimize the inconsistency nearby the optimal seam, we adopted the multi-resolution method [11] to stitch two images \(I_A\) and \(I_B\) along the optimal seam. We first generated a mask \(I_R\) based on the optimal seam. \(I_R\) is assigned to 0 on the left side of this seam line, and 1 on the right side. The steps of composition are summarized as follows:

  • Step 1 Generating Laplacian pyramid \(L_A\) and \(L_B\) for \(I_A\) and \(I_B\), respectively;

  • Step 2 Generating Gaussian pyramid \(G_R\) for \(I_R\);

  • Step 3 Compositing image pyramid \(S\). The \(L\) layer of \(S\) is formulated as

    $$\begin{aligned} LS_l(i, j)&= G_{Rl}(i, j)L_{Al}(i, j)\nonumber \\&\quad +\,(1-G_{Rl}(i, j))L_{Bl}(i, j) \end{aligned}$$
    (12)

    where \(G_{Rl}(i, j)\) represents the Gaussian pyramid of the \(L\) layer in the mask image, \(L_{Al}(i, j)\) and \(L_{Bl}(i, j)\) represent the Laplacian pyramid of the \(L\) layer in the left and right image, respectively.

  • Step 5 Rebuilding the \(S\) from the top layer to the bottom layer.

3 Experiment results

In this paper, we implemented the proposed CSDP algorithm in OpenCV 2.0 to implement the proposed. To sufficiently verify the CSDP’s effectiveness, we compared it with three popular mosaic softwares (i.e., AutoStitch, Microsoft ICE, and Panorama Maker) and Mills’ approach on multi-group pictures in multiple scenes that contain moving objects, huge exposure difference and strong motion parallax. Among them Microsoft ICE, Panorama Maker and Mills’ approach have the optimal seam searching function, but AutoStitch not yet. We will give more details in the following sections regarding three different scenes.

3.1 General dynamic scene

Figure 10 shows the mosaic results of the example images shown in Fig. 1 obtained by AutoStitch, Microsoft ICE , Panorama Maker, Mills’ approach and our method (CSDP), respectively. From the comparison, we can draw the following conclusions: AutoStitch method has the ghosting problem on moving objects, as shown in Fig. 10c; Mills’ approach and Microsoft ICE can only find the local optimum seam that has no protection mechanisms for the moving objects, the mosaic image by Mills’ approach has slightly incomplete problem as identified by red rectangle in Fig. 10d and Microsoft ICE has an obviously incomplete moving bus as shown in Fig. 10a; CSDP and Panorama Maker can get good mosaic results as shown in Fig. 10b, e, respectively.

Fig. 10
figure 10

Mosaic results in the dynamic scene 1. a Microsoft ICE. b Panorama maker. c AutoStitch. d Mills’approach. e CSDP

Figure 11 shows the mosaic results of another group of pictures taken from the same scene as in Fig. 10. AutoStitch still has the ghosting phenomenon on the moving object. Microsoft ICE and Panorama maker have an incomplete moving car as shown in Fig. 11a, b. CSDP and Mills’ approach obtained good mosaic results in this scene as shown in Fig. 11e, d respectively.

Fig. 11
figure 11

Mosaic results in the dynamic scene 2. a Microsoft ICE. b Panorama maker. c AutoStitch. d Mills’approach. e CSDP

3.2 Dynamic scene with exposure difference

Figure 12 is a group of pictures taken by a hand-held camera in our laboratory. At the moment of shooting, one subject is moving on purpose, and we artificially add huge exposure difference at the same time. As shown in Fig. 12b, e, the ghosting phenomenon appears in the mosaic result of Microsoft ICE due to the head moving, and that Mills’ approach is marked by a red rectangle area. In this scene, Panorama maker cannot register the images well, which makes severe dislocation in the composited image as shown in Fig. 12c. As for AutoStitch, we can see from the red rectangle in Fig. 12d, it has the worse ghosting problem than that of Microsoft ICE. Before improving the searching criterion, as shown in Fig. 12g, we get the optimal seam line as the same as Microsoft ICE and Mills’ approach. After the improved optimal seam as shown in Fig. 12h, we can get good mosaic result as shown in Fig. 12f.

Fig. 12
figure 12

Mosaic results in the dynamic scene with exposure difference. a Images for mosaic. b Microsoft ICE. c Panorama maker. d AutoStitch. e Mills’approach. f CSDP. g Seam before improvement. h Seam after improvement

3.3 Strong motion parallax scene

Figure 13a is a group of images for mosaic with strong motion parallax. As shown in Fig. 13b, c, because of inaccuracy of registration and seam-searching strategy, both Microsoft ICE and Panorama maker have dislocation phenomenons marked by red rectangle areas. And AutoStitch has the huge ghost phenomenon in multiple places as lamp and computer as shown in Fig. 13d while Mills’s approach and CSDP get good mosaic results as shown in Fig. 13e, f, respectively.

Fig. 13
figure 13

Mosaic results in the strong motion parallax scene 1. a Images for mosaic. b Microsoft ICE. c Panorama maker. d AutoStitch. e Mills’approach. f CSDP

Figure 14a contains two pictures taken by a hand-held camera in an outdoor environment with strong motion parallax. As shown in Fig. 14b, c, Panorama maker, Mills’ approach and CSDP can get effective mosaic results. However, Microsoft ICE and AutoStitch cannot register images and result in the incomplete mosaic results.

Fig. 14
figure 14

Mosaic results in the strong motion parallax scene 2. a Images for mosaic. b Panorama maker. c Mills’approach. d CSDP

3.4 Experimental analysis

Since AutoStitch does not have the optimal seam searching function, it will result in the ghosting phenomenon and mosaic failure in the presence of moving objects and strong motion parallax. Microsoft ICE , Panorama Maker and Mills’ approach have the optimal seam function but with limited application scopes. In general in dynamic scene and some uncertain conditions, the mosaic image obtained by Microsoft ICE, Panorama Maker and Mills’ approach have incomplete moving objects. In dynamic scene with huge exposure difference among images, Panorama Maker sometimes cannot finish the matching process. The optimal seam search used by Microsoft ICE and Mills’ approach can only have a local optimum without the consideration of the moving object’s integrality. In the scene with strong motion parallax due to the misregistration, Panorama maker has a dislocation phenomenon, and Microsoft ICE and AutoStitch sometimes cannot finish the mosaic process without matching images to be found. For our CDSP algorithm, we can always get effective mosaic results by using an improved optimal seam searching criterion.

4 Conclusion

Mosaic failure always happens in dynamic scenes due to huge exposure difference and strong motion parallax. To solve these problems, we proposed the CSDP mosaic algorithm based on SIFT matching and dynamic programming. The main contributions are summarized as follows:

  • SIFT descriptors are invariant to rotation, image scaling, brightness variations, and partially invariant to view angles, affine transformation and noise. For these reasons, we use SIFT operator to match images. We can further figure out the inline points and outline points between the matching pairs of point by RANSAC algorithm.

  • Based on Mills and Dudek [19], we proposed an improved optimal seam line searching criteria. The optimal seam searching criterion is established by combining gradient difference with the proposed edge-enhanced weighting intensity difference operator, which provides an effective mechanism for moving objects so as to avoid the ghosting problem and incomplete effects.

  • We proposed the combination of the multi-resolution fusion algorithm and optimal seam line searching strategy to minimize inconsistency nearby the stitching lines.