Introduction

Image registration originated in computer vision in the 1960s, but it received sufficient attention only at the beginning of the 1980s. Generally speaking, image registration is the process of matching two or more images of the same scene acquired by different sensors, at different times, or from different viewpoints; it includes pixel intensity matching and spatial position alignment. As fundamental research in image processing, image registration is a precondition and an essential part of many image processing applications, such as area or object change detection, image fusion, image mosaicing and combination, and automatic target tracking. These have been applied in fields such as the military, medicine, remote sensing, computer vision, and industrial product inspection. Since images of the same scene or object sensed at different times and from different positions may have various sizes and resolutions, finding invariant features and matching them efficiently are hard tasks for automatic image registration.

Feature extraction and feature matching are the two basic tasks in automatic image registration. Point features in a scale space provide a good basis for both tasks because of their scale invariance or affine invariance (Rosten et al. 2010; Gueguen and Pesaresi 2011; Forlenza et al. 2012; Paulo Ricardo et al. 2014). For example: (1) the multi-scale Harris detector has been used to extract image features, with the Mahalanobis distance adopted to match features between images of different resolutions; (2) the scale-invariant feature transform (SIFT) has proved to be stably invariant to image scaling and rotation; (3) the normalized Laplace detector has been applied to image color information and implements feature extraction and matching in real time; (4) the Harris-Laplace detector has shown better performance in repeatability, localization, and scale variation than other scale-space detectors.

Many registration algorithms have been proposed in the last decades (Barbara et al. 2003; Pei et al. 2006; Wang 2008; Dae-Ho et al. 2011). Currently, however, the tasks are mostly done manually, which is not only inefficient but also imprecise. To overcome these shortcomings, this paper proposes a new image registration algorithm that registers two remote sensing images of different sizes and resolutions automatically.

The algorithm mainly contains four parts: (a) the multi-scale Harris-Laplacian corner detector is used to detect and localize corners in the reference and registration images; (b) descriptors of the detected corners are calculated based on the SURF descriptor; (c) multi-resolution corner matching is performed according to Euclidean distance; (d) the LoG operator is finally adopted to determine the scale factor between the reference and registration images automatically. A number of image tests have been performed with the proposed algorithm, and the algorithm works correctly.

Multi-resolution Corner Detection by the Harris-Laplacian Detector

Image registration algorithms are generally based either on pixel intensities, such as mutual information (Chen et al. 2003) and maximum likelihood (Li and Leung 2004), or on image features. Feature-based registration uses interior image features, including lines, corners, and object contours (Wang et al. 2009; Lionel et al. 2012). The corner feature shows great superiority over the other features in registration algorithms, because it is more capable of handling images under different types of transforms and illumination variations.

The Harris corner detector (Harris et al. 1998) is one of the most efficient and stable corner detectors, with good repeatability under translation, rotation, and small illumination variations (Dufournard et al. 2000). However, once there is a large scale change between the images, the Harris corner detector causes mismatching. To overcome this defect, many scholars have proposed improvements to the Harris detector from different aspects (Ying et al. 2009; Zhang et al. 2011). Among them, the Harris-Laplacian corner detector proposed by Mikolajczyk et al. (2001) is an efficient and convenient scale-adaptive corner detector that combines the Harris detector with Laplacian pyramids. The main idea of the Harris-Laplacian detector is as follows:

The Harris function is constructed as a symmetric matrix based on second moments. The matrix is often used for feature detection and for describing local image structure, and it must be adapted to scale changes to make detection invariant to image resolution.

Let I(x, y) be a grey-level image and let L(x, y; σ) = I(x, y) ∗ G(x, y; σ), where \( G\left(x,y;\sigma \right)=\frac{1}{2\pi {\sigma}^2}{e}^{-\left({x}^2+{y}^2\right)/2{\sigma}^2} \). The scale-adapted second moment matrix is denoted as:

$$ \mathbf{C}\left(\mathbf{x},\sigma_I,\sigma_D\right)=\sigma_D^2\, G\left(\sigma_I\right)\ast \begin{bmatrix} I_u^2\left(\mathbf{x},\sigma_D\right) & I_u I_v\left(\mathbf{x},\sigma_D\right) \\ I_u I_v\left(\mathbf{x},\sigma_D\right) & I_v^2\left(\mathbf{x},\sigma_D\right) \end{bmatrix} $$
(1)

where \( \sigma_I \) is the integration scale, \( \sigma_D \) the differentiation scale, \( \sigma_I = \xi^n \sigma_0 \), \( \sigma_D = s\,\sigma_I \), n a scale level, ξ a scale factor, \( \sigma_0 \) an initial scale, and s a constant factor. ξ must be small enough to find the location and the scale of an interest point with high accuracy. s should not be too small, otherwise the derivative estimates are too noisy; at the same time, s should be small enough that the integration kernel \( G(\sigma_I) \) can smooth the matrix C. Generally, \( \sigma_0 = 1.6 \), \( \xi =\sqrt{2} \), and s is chosen in the range 0.5–0.75 (Wang et al. 2009). \( I_u(\mathbf{x}, \sigma_D) \) and \( I_v(\mathbf{x}, \sigma_D) \) are derivatives computed with Gaussian kernels of standard deviation \( \sigma_D \), called the differentiation scale, in two different directions. \( G(\sigma_I) \) is a Gaussian weighting kernel with standard deviation \( \sigma_I \), called the integration scale. The asterisk represents the convolution operation.
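As an illustration, the following is a minimal NumPy/SciPy sketch of Eq. (1). The function name and the default s = 0.7 (a value within the quoted 0.5–0.75 range) are our assumptions, not part of the original description.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def second_moment_matrix(image, n, sigma0=1.6, xi=np.sqrt(2.0), s=0.7):
    """Entries of the scale-adapted second moment matrix C of Eq. (1)."""
    sigma_i = (xi ** n) * sigma0   # integration scale: sigma_I = xi^n * sigma_0
    sigma_d = s * sigma_i          # differentiation scale: sigma_D = s * sigma_I
    img = image.astype(np.float64)
    # Gaussian derivatives I_u, I_v at the differentiation scale
    iu = gaussian_filter(img, sigma_d, order=(0, 1))
    iv = gaussian_filter(img, sigma_d, order=(1, 0))
    # smooth the products with G(sigma_I) and normalize by sigma_D^2
    cuu = sigma_d ** 2 * gaussian_filter(iu * iu, sigma_i)
    cvv = sigma_d ** 2 * gaussian_filter(iv * iv, sigma_i)
    cuv = sigma_d ** 2 * gaussian_filter(iu * iv, sigma_i)
    return cuu, cvv, cuv
```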

The matrix describes the gradient distribution in the local neighborhood of a point, and its eigenvalues represent the two principal signal changes in that neighborhood. This property makes it possible to extract features, such as corners and junctions, for which the signal changes significantly in two orthogonal directions. Such features are stable under arbitrary lighting conditions and are representative of an image. The Harris measure combines the trace and the determinant of the second moment matrix:

$$ \mathrm{cornerness}= \det (C)-k\,{\mathrm{trace}}^2(C)>t $$
(2)

where k is a constant factor and t is a threshold; typical tested values of k and t are 0.04 and 1000, respectively. Among the points that satisfy Eq. (2), a point whose cornerness is the maximum of its 9 neighboring points is selected as a Harris feature.
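A possible realization of Eq. (2) with the 9-neighbor maximum test, reusing the matrix entries from the sketch above (the typical values k = 0.04 and t = 1000 quoted in the text are used as defaults):

```python
import numpy as np
from scipy.ndimage import maximum_filter

def harris_features(cuu, cvv, cuv, k=0.04, t=1000.0):
    """Points satisfying Eq. (2) that are also the maximum of their 9 neighbors."""
    cornerness = (cuu * cvv - cuv ** 2) - k * (cuu + cvv) ** 2
    is_local_max = cornerness == maximum_filter(cornerness, size=3)
    return np.argwhere((cornerness > t) & is_local_max)  # (row, col) coordinates
```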

If a Harris feature makes the normalized Laplacian function \( LP\left(\mathbf{X},\sigma_n\right)=\sigma_n^2\left|L_{xx}\left(\mathbf{X},\sigma_n\right)+L_{yy}\left(\mathbf{X},\sigma_n\right)\right| \) satisfy:

$$ LP\left(X,\sigma_n\right) > LP\left(X,\sigma_{n-1}\right)\ \cap\ LP\left(X,\sigma_n\right) > LP\left(X,\sigma_{n+1}\right)\ \cap\ LP\left(X,\sigma_n\right) > T_l $$
(3)

then this feature is selected as a Harris-Laplace feature. In Eq. (3), \( T_l \) is the threshold of the Laplace function, \( \sigma_n \) equals \( \sigma_I \) in Eq. (1), and n stands for the scale level.
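The scale test of Eq. (3) can be sketched as below; for clarity the normalized Laplacian is recomputed over the whole image at each level, which a practical implementation would cache:

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def normalized_laplacian(image, sigma):
    """LP(X, sigma) = sigma^2 * |Lxx + Lyy|; gaussian_laplace returns Lxx + Lyy."""
    return sigma ** 2 * np.abs(gaussian_laplace(image.astype(np.float64), sigma))

def is_harris_laplace(image, y, x, n, Tl, sigma0=1.6, xi=np.sqrt(2.0)):
    """Eq. (3): LP at level n exceeds both neighboring levels and the threshold Tl."""
    lp = [normalized_laplacian(image, (xi ** m) * sigma0)[y, x]
          for m in (n - 1, n, n + 1)]
    return lp[1] > lp[0] and lp[1] > lp[2] and lp[1] > Tl
```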

The combined translation, rotation, and scaling transformation can be expressed as

$$ \begin{bmatrix} x' \\ y' \\ 1 \end{bmatrix} = \begin{bmatrix} S\cos\left(\theta\right) & -S\sin\left(\theta\right) & x_0 \\ S\sin\left(\theta\right) & S\cos\left(\theta\right) & y_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} $$
(4)

Here it can be written as \( \mathbf{x}' = S\mathbf{H}\mathbf{x} + \mathbf{x}_0 \), where \( \mathbf{H}=\begin{bmatrix} \cos\left(\theta\right) & -\sin\left(\theta\right) \\ \sin\left(\theta\right) & \cos\left(\theta\right) \end{bmatrix} \) is the rotation matrix and S is a scalar.
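For concreteness, a small sketch applying \( \mathbf{x}' = S\mathbf{H}\mathbf{x} + \mathbf{x}_0 \) of Eq. (4) to an array of points (the function name is ours):

```python
import numpy as np

def similarity_transform(points, S, theta, x0, y0):
    """Apply x' = S*H*x + x0 of Eq. (4) to points of shape (N, 2)."""
    c, s = np.cos(theta), np.sin(theta)
    H = np.array([[c, -s],
                  [s,  c]])  # rotation matrix H
    return S * points @ H.T + np.array([x0, y0])
```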

Substituting this transform into the corner function gives the relationship between corner responses at different scales (for the parameters, refer to Eqs. (1) and (4)):

$$ \mathbf{C}'\left(\mathbf{x}',S\sigma_I,S\sigma_D\right)=\mathbf{C}\left(\mathbf{x},\sigma_I,\sigma_D\right)/S^2 $$
(5)

Eqs. (4) and (5) show that if an image is scaled by a factor S, its corner response is scaled by a factor 1/S². This relationship can be exploited to realize multi-scale corner detection.

The Harris-Laplacian corner detector has been tested on a number of remote sensing images; a detection example is shown in Fig. 1. By comparison, SIFT detects many more points than Harris-Laplacian (more than 10 times as many). The matching error of SIFT is slightly lower than that of Harris-Laplacian at different rotation angles (Fig. 1a), but the correct matching probability of SIFT is much lower than that of Harris-Laplacian (Fig. 1b).

Fig. 1 Comparison between Harris-Laplacian and SIFT for matching accuracy and probability

However, at a given detection accuracy level, the computing speed of the Harris-Laplacian detector is much faster than that of the other detectors, as shown in Fig. 2 (comparing with SIFT on two classes of image groups), so it is suitable for the images in this study. If the images are very large, optimised KD-trees can be used for fast image descriptor matching (Herbert et al. 2008).

Fig. 2 Comparison between Harris-Laplacian and SIFT for matching time cost

Speeded-Up Robust Feature Descriptor

Feature point descriptors are now at the core of many computer vision technologies, such as object recognition, 3D reconstruction, image retrieval, and camera localization. Since applications of these technologies must handle ever more data and even run on mobile devices with limited computational resources, there is a growing need for local descriptors that can be computed and matched much faster. One way to speed up matching and reduce memory consumption is to work with short descriptors, which can be obtained by dimensionality reduction, such as SURF: Speeded-Up Robust Features (Herbert et al. 2008).

The corner descriptor has a direct effect on the precision and the speed of a registration algorithm. On the one hand, if the descriptor vector is too long, it imposes a large computational burden on the subsequent registration procedure. On the other hand, if the vector is too short, it has less distinguishing ability and low precision, which may lead to mismatches. SURF is a high-performance scale- and rotation-invariant interest point detector and descriptor. It approximates or even outperforms previously proposed schemes with respect to repeatability, distinctiveness, and robustness, and it can be computed and compared much faster. This is achieved by relying on integral images for image convolutions and by building on the strengths of the leading existing detectors and descriptors.

The SURF descriptor is an efficient corner descriptor based on Haar wavelet responses, and its feature vector has only 64 components. Compared with the SIFT descriptor, which has 128 components (Lowe 2004), both the computational speed and the performance are much improved. Moreover, by utilizing the integral image technique, the computational speed can be increased further.
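The integral image trick mentioned here can be sketched in a few lines: one cumulative-sum pass, after which the sum over any axis-aligned box costs four lookups regardless of box size, which is what makes the Haar box filters in SURF cheap:

```python
import numpy as np

def integral_image(img):
    """Cumulative sums along both axes."""
    return img.astype(np.float64).cumsum(axis=0).cumsum(axis=1)

def box_sum(ii, r0, c0, r1, c1):
    """Sum of the image over the inclusive box [r0..r1] x [c0..c1]."""
    total = ii[r1, c1]
    if r0 > 0:
        total -= ii[r0 - 1, c1]
    if c0 > 0:
        total -= ii[r1, c0 - 1]
    if r0 > 0 and c0 > 0:
        total += ii[r0 - 1, c0 - 1]
    return total
```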

Figure 3 illustrates the descriptor responses to three typical image structures: a constant area, an area whose intensity varies periodically along the X direction, and an area whose intensity varies gradually along the X direction. Below them are their SURF descriptors. These responses show that the SURF descriptor has excellent ability to distinguish different types of image structures; as a result, the feature vectors of different corners lie far apart in the feature space, which increases the anti-noise ability of the algorithm.

Fig. 3 SURF descriptor responses to different image structures

To calculate the feature vector of a corner, the SURF procedure is adopted. SURF first calculates the Haar wavelet responses (Fig. 4a) in both the x and y directions within a circular neighborhood, and the responses are weighted with a Gaussian kernel. Then a sliding orientation window of size π/3 is used to find the dominant orientation. After that, an oriented square region around the corner is constructed (Fig. 4b), and the region is split into 4 × 4 sub-regions. In each sub-region, the Haar wavelet filter responses along the abscissa and ordinate directions are calculated at 5 × 5 regularly spaced sample points and denoted dx and dy. Finally, these responses are summed in each direction, yielding ∑dx and ∑dy as two components of a feature vector. Based on this procedure, the corner feature vector is calculated as follows.

Fig. 4 a Haar wavelet filter coefficients (black represents −1, white represents +1). b SURF descriptor calculation: an oriented square region centered on the corner and weighted with a Gaussian

To increase the anti-noise ability, a Gaussian kernel of standard deviation σ = 3s (where s is the scale), centered at the corner, is first used to weight the wavelet responses. To capture the polarity of the intensity information, the absolute values of the wavelet responses are also summed, giving the other two components of the feature vector, ∑|dx| and ∑|dy|.

Thus a four-dimensional feature vector is formed for each sub-region, \( \mathbf{v}_i = \left(\sum dx, \sum dy, \sum|dx|, \sum|dy|\right) \), i = 1, 2, ⋯, 16, and the corner feature descriptor vector is the concatenation of all 16 sub-region feature vectors, i.e. \( \mathbf{v} = {\left(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{16}\right)}^T \).
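A simplified sketch of this construction is given below. It replaces the Haar wavelet filters with plain finite differences and omits the orientation assignment and the Gaussian weighting with σ = 3s, so it only illustrates the 4 × 4 sub-region accumulation into a 64-component vector; the 20-pixel region size and the final normalization are our assumptions.

```python
import numpy as np

def surf_like_descriptor(image, y, x, region=20):
    """64-component (sum dx, sum dy, sum|dx|, sum|dy|) vector over 4x4 sub-regions.
    Assumes (y, x) lies at least region/2 pixels from the image border."""
    img = image.astype(np.float64)
    patch = img[y - region // 2: y + region // 2,
                x - region // 2: x + region // 2]
    dx = (patch[:, 1:] - patch[:, :-1])[:-1, :]  # crude Haar-like x response
    dy = (patch[1:, :] - patch[:-1, :])[:, :-1]  # crude Haar-like y response
    step = dx.shape[0] // 4
    v = []
    for i in range(4):
        for j in range(4):
            sx = dx[i * step:(i + 1) * step, j * step:(j + 1) * step]
            sy = dy[i * step:(i + 1) * step, j * step:(j + 1) * step]
            v += [sx.sum(), sy.sum(), np.abs(sx).sum(), np.abs(sy).sum()]
    v = np.asarray(v)
    return v / (np.linalg.norm(v) + 1e-12)
```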

Multi-scale Corner Detection and Matching

Multi-scale corner detection is used to fuse detection results at different resolutions in many vision tasks such as tracking, SLAM (simultaneous localisation and mapping), localisation, image matching, and recognition (Matthew et al. 2005; Fan et al. 2010; Wang 2011). Hence, a large number of multi-scale corner detectors exist in the literature. As a key component of many computer vision applications, corner matching may be sufficient on its own, but it is also an ideal platform for bootstrapping denser and more complex image analysis. Of all possible features, corners are the most widely used, since their two-dimensional structure provides the most information about image motion. Matching algorithms typically assume that the correlation windows from each image are related by a simple translation, an assumption that is valid in a number of typical applications. In this study, the following multi-scale corner detection and corner matching algorithms are used.

Multi-scale Corner Detection

In general, a corner-based image registration algorithm should be robust to noise and should reduce the effects of geometric deformation (Hui et al. 2010). Therefore, in our study, an image is first filtered by a median filter, and then the Harris-Laplacian corner detector is applied to the filtered image to find the corners. The detection results are shown in Fig. 5a and b, where the image is a remote sensing image of Ningbo Seaport in China. The image contains buildings, ships, and natural scenery, so the image structure is diverse and there are sufficient corners for evaluating the performance of the Harris-Laplacian detector. For comparison, the Harris detector without multi-scale attributes is applied to the same image, and its result is shown in Fig. 5a.

Fig. 5 a Harris detector applied to the original image (white crosses). b Harris-Laplacian detector applied to the original image (white crosses). c Harris detector applied to the image scaled two times and rotated 20°. d Harris-Laplacian detector applied to the image scaled two times and rotated 20°

In Fig. 5a, the Harris detector is used, with white crosses marking corners. In Fig. 5b, the Harris-Laplacian detector is used on the same image as in Fig. 5a. It is clear that the Harris-Laplacian detector yields a better result than the Harris detector, because there are many more false corners in Fig. 5a than in Fig. 5b. To show the advantage of the multi-scale Harris-Laplacian detector, the Harris detector (Fig. 5c) and the Harris-Laplacian detector (Fig. 5d) are applied separately to the same image scaled two times, and the result of the multi-scale Harris-Laplacian detector is much better. Comparing Fig. 5b with Fig. 5d, the multi-scale Harris-Laplacian detector still has the better result.

The results show that the Harris-Laplacian corner detector adapts well to scaled and rotated images and has higher repeatability in finding the same corners of an object at different scales in this study.

When the image scale is changed, the same corners are hardly detected by the Harris detector. The reason is that when an image is scaled by a factor greater than 1, the local image structure becomes less sharp in a small region because of pixel interpolation. For the Harris detector, the differentiation scale \( \sigma_D \) and the integration scale \( \sigma_I \) are constants (e.g., always 1), so they cannot reflect the actual pixel intensity changes over a larger range after the image scale has changed. Meanwhile, the Harris detector still computes the corner response function at the original image scale, as it did before the image was scaled, so it fails to utilize image gradient information over a larger range to detect a large corner. If the threshold is decreased to increase the detector's sensitivity, too many false corners are accepted; as a result, the detector cannot sense a corner over a larger image scope. After corner detection, the next step is to match the corners.

Corner Matching

Owing to its good performance under perspective changes and its ease of detection, the corner may be the most widely used feature for matching (Zhao et al. 2011). The common approach to corner matching is to take a small region of pixels around the detected corner and compare it with a similar region around each candidate corner in the other image. The matching time usually depends on the initial matching algorithm. There are several grey-level-based ways to complete the initial matching: normalized cross correlation (NCC), sum of squared differences (SSD), and sum of absolute differences (SAD). The most popular similarity measure is the normalized cross correlation.
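For reference, NCC between two equally sized patches can be computed as follows (a standard formula, shown as a minimal sketch):

```python
import numpy as np

def ncc(p, q):
    """Normalized cross correlation between two equally sized patches."""
    p = p.astype(np.float64) - p.mean()
    q = q.astype(np.float64) - q.mean()
    denom = np.sqrt((p ** 2).sum() * (q ** 2).sum())
    return (p * q).sum() / denom if denom > 0 else 0.0
```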

In this study, following the above idea, the corners detected in a reference image and its registration image are matched according to the Euclidean distance between their feature vectors. As described in the previous section, the corner feature descriptor vector is \( \mathbf{v} = {\left(\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{16}\right)}^T \).

For the two sets of corner feature vectors P and Q of the two images, when min‖V_i − V_j‖ < t, V_i ∈ P, V_j ∈ Q, where t is a threshold, the corners (i, j) are treated as a candidate matching pair. To achieve high veracity and anti-noise ability, the histogram correlation in the rounded neighborhood of the candidate corner pairs is also used to eliminate incorrect matches.
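The distance test above can be sketched as a thresholded nearest-neighbor search (the histogram-correlation verification step is not shown):

```python
import numpy as np
from scipy.spatial.distance import cdist

def match_corners(P, Q, t):
    """Candidate pairs (i, j): V_j is V_i's nearest neighbor and ||V_i - V_j|| < t.
    P, Q are arrays of descriptors with shapes (|P|, d) and (|Q|, d)."""
    d = cdist(P, Q)              # Euclidean distance matrix, |P| x |Q|
    nearest = d.argmin(axis=1)
    return [(i, int(nearest[i])) for i in range(len(P)) if d[i, nearest[i]] < t]
```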

Automatic Selection of Scale Factor

The characteristic scale of an image structure can be calculated with the assistance of a scale selection operator. For a key point such as a corner, region center, or cross point, the scale selection operator produces a response curve as the operator's scale increases gradually. When the operator's scale matches the local image structure, the response curve reaches an extremum (Fig. 6). As shown in Fig. 6, by localizing the extremum of the response curve one can quickly find the scale of the local image structure. An evaluation of several types of scale selection operators can be found in Mikolajczyk and Schmid (Mikolajczyk et al. 2001), which shows that the LoG operator outperforms the others in both performance and stability in scale selection tasks, and it is furthermore easy to implement. So the LoG scale selection operator is chosen here. The normalized LoG scale selection operator is defined as:

Fig. 6 LoG scale selection operator responses corresponding to two object scales. The ratio of the extremum locations is the scale factor between the objects

$$ \left| LoG\left(\mathbf{x};\,\sigma_n\right)\right|=\sigma_n^2\left|L_{xx}\left(\mathbf{x};\,\sigma_n\right)+L_{yy}\left(\mathbf{x};\,\sigma_n\right)\right| $$
(6)

Here \( L_{xx} \) and \( L_{yy} \) are the second-order Gaussian derivatives. As shown in Fig. 6, when the operator's scale \( \sigma_n \) increases up to the size that matches the local image structure, the LoG operator's response reaches an extremum. Therefore the LoG kernel can be interpreted as a scale matching operator.
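A sketch of the response curve of Eq. (6) at a single pixel; sweeping σ and locating the extremum gives the characteristic scale (the scale range in the usage comment is illustrative, not from the original text):

```python
import numpy as np
from scipy.ndimage import gaussian_laplace

def log_scale_curve(image, y, x, sigmas):
    """Normalized LoG response |LoG(x; sigma_n)| of Eq. (6) at pixel (y, x)."""
    img = image.astype(np.float64)
    return np.array([s ** 2 * np.abs(gaussian_laplace(img, s)[y, x])
                     for s in sigmas])

# characteristic scale = location of the extremum, e.g.:
# sigmas = np.linspace(1.0, 60.0, 120)
# s_char = sigmas[log_scale_curve(img, y, x, sigmas).argmax()]
```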

Fig. 7 Registration result of Fig. 5a and d, obtained by first applying the scaling transformation to (d) and then applying the rotation and translation transformation

At the bottom of Fig. 6, the left curve is the LoG operator response to the upper-left image, and the right curve is the response to the upper-right image. As clearly shown in Fig. 6, when the image scale changes, the LoG response extremum shifts correspondingly, and the shift distance is tightly associated with the scale change factor. The extremum locations of the left and right curves are 22.5 and 47.0, respectively; their ratio is 2.08, which is very close to the actual scale ratio of 2.0 between the images.

Even if the corner feature vectors are calculated and matched carefully, a few mismatches may remain, and these mismatched corners can render the registration useless. Besides, corners detected at high scale levels generally deviate considerably from their true scales. Therefore, instead of computing the transform matrix directly, the scale factor between the reference image and the registration image is calculated first. For this purpose, some modifications are made to the scale selection scheme described above.

Let \( C_f \) and \( C_g \) be the matched corner sets in the reference image and the registration image, and let \( \mathbf{x}_i \in C_f \), \( \mathbf{x}'_i \in C_g \) be a pair of matched corners. Then, based on Eq. (5), the scale factor of the corner pair \( (\mathbf{x}_i, \mathbf{x}'_i) \) is

$$ S_i=\frac{\left| LoG\left(\mathbf{x}_i;\,\sigma_n\right)\right|}{\left| LoG\left(\mathbf{x}'_i;\,\sigma'_n\right)\right|}=\frac{\sigma_n^2\left|L_{xx}\left(\mathbf{x}_i;\,\sigma_n\right)+L_{yy}\left(\mathbf{x}_i;\,\sigma_n\right)\right|}{{\sigma'}_n^2\left|L_{xx}\left(\mathbf{x}'_i;\,\sigma'_n\right)+L_{yy}\left(\mathbf{x}'_i;\,\sigma'_n\right)\right|} $$
(7)

For all corner pairs, a set of scale factors S is obtained. Each scale factor \( S_i \)'s probability is then computed as follows: for an element \( S_i \in S \), the probability of \( S_i \) is \( P_i = P(S_i) = N_i/N \), where \( N_i \) is the number of occurrences of \( S_i \) and N is the size of the set S. The real scale factor between the two images is the \( S_i \) with the maximum probability, i.e. s = arg max(P_i).
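A compact sketch of this vote: since the ratios S_i are real-valued, they are binned (here to one decimal, an assumption not stated in the text) so that near-identical estimates count as the same value:

```python
import numpy as np
from collections import Counter

def dominant_scale(scale_factors, decimals=1):
    """s = arg max P(S_i): return the most frequent (binned) scale ratio."""
    binned = np.round(np.asarray(scale_factors), decimals)
    return Counter(binned).most_common(1)[0][0]
```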

Experiments

The algorithm contains four procedures, each of which has been elaborated above. To register images automatically, the corresponding corners must first be detected in the images; then their SURF descriptors are computed and matched by means of the minimum Euclidean distance or the nearest-neighbour distance ratio (Schmid et al. 2000; Mikolajczyk and Schmid 2007; Szeliski 2010). After this, the scale factor is calculated with the normalized LoG scale selection operator, and finally the registration operation is performed.

A number of remote sensing images were tested with the proposed algorithm; part of the testing procedure follows Mikolajczyk and Schmid (Szeliski 2010). As one example, the registration transformation procedure is described as follows. The scale factor was found in the last step for the image in Fig. 5b. The scale transformation is applied first to the image in Fig. 5b to bring it to the same scale level as Fig. 5a. Since the two images are now at the same scale, it is easy to verify the true correspondence of the corners used to determine the scale factor in the last section by a fast correlation computation, eliminating possible falsely matched corner pairs. The scaled image is then rotated and translated using these verified corner pairs to produce the final registered image; the result is shown in Fig. 7.
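The final rotation-plus-translation step could, for instance, be realized with OpenCV's partial affine estimator, which fits a similarity model with outlier rejection; this is one possible realization under our assumptions, not the paper's exact implementation:

```python
import numpy as np
import cv2

def register_pairs(sen_pts, ref_pts):
    """Estimate a 2x3 similarity map from verified corner pairs (N >= 2)."""
    sen = np.asarray(sen_pts, dtype=np.float32)
    ref = np.asarray(ref_pts, dtype=np.float32)
    M, inliers = cv2.estimateAffinePartial2D(sen, ref, method=cv2.RANSAC)
    return M

# registered = cv2.warpAffine(scaled_image, M, (ref_width, ref_height))
```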

The comparison between the proposed registration algorithm and the SIFT-based registration algorithm is made in Tables 1 and 2, in which the Calc. MM Time (map matrix calculation time) item is the time cost of calculating the map matrix from the matched points, and the MM Error (map matrix error) item is computed as e = ‖M − M′‖, where M is the predefined map matrix used to obtain the registration image and M′ is the map matrix estimated by the algorithm. From the tables, one can see that, within a certain detection accuracy level, SIFT detects too many points (4–12 times the number detected by Harris-Laplacian) in the feature detection step, which results in a large computing burden for the following registration steps, i.e. the descriptor computation and matching stages. Because an affine registration task needs no more than three pairs of matched points to compute the map matrix, most of these computations contribute little to the registration result. The experimental results show that the proposed algorithm is capable of tackling automatic registration under mixed translation, rotation, and scaling transformations, the three most common types of image transformations encountered in real applications. In the tables, compared with the SIFT algorithm, the studied algorithm reduces the matching time by 85–92 %, and at a given accuracy level the number of matched points is cut down to 58–86 %; Calc. MM Time increases 2.6–5.7 times, but the MM error increases only 7.3–8.4 %. The accuracy of SIFT is slightly higher than that of the proposed algorithm, but the speed of the proposed algorithm is much faster. Although the number of points detected by the Harris-Laplacian detector depends on user-chosen thresholds, it is possible to set the threshold automatically for images with similar features; the threshold can be determined in different ways (Wang 2008; Zhao et al. 2011). Another example, for airport remote sensing images, is shown in Fig. 8. One thing to note is that the scale selection does not affect H-L detection much, as shown in Fig. 9.

Table 1 Proposed algorithm compared with SIFT on image #1 within a certain accuracy level
Table 2 Proposed algorithm compared with SIFT on image #2 within a certain accuracy level
Fig. 8 Registration result of an airport remote sensing image. a Original remote sensing image of the airport. b Detection result of the reference image. c Detection result of the registration image. d Matching result

Fig. 9 Comparison between Harris and H-L detectors at different scales. a σ = 1, Harris detection. b σ = 1, H-L detection. c σ = 2, Harris detection. d σ = 2, H-L detection

Conclusions

In this paper, an automatic remote sensing image registration algorithm is proposed. The algorithm first employs the modified multi-scale Harris-Laplacian detector to detect corners in a registration image and its reference image. Within a certain detection accuracy level, the time cost of the studied algorithm is much lower than that of the SIFT algorithm, and its correct matching probability is higher. The algorithm then utilizes the SURF feature descriptor to construct corner feature vectors. Finally, the algorithm uses the LoG scale selection operator to find the scale factor between the registration image and its reference image automatically. This is what distinguishes the proposed registration algorithm from the other existing algorithms.

Experiments show that the studied algorithm can register images at different scales and resolutions automatically, without manual assistance, as predicted at the beginning of this paper. This algorithm shows its superiority for automatic registration of remote sensing images, and it can also be used for other image registration tasks.

Future work is to compare the studied algorithm with other registration algorithms in more detail (e.g., the algorithm of Mikolajczyk and Schmid (Mikolajczyk and Schmid 2007)) and to try optimised KD-trees for fast image descriptor matching on very large images. Furthermore, it is important to make the algorithm not only fully automatic but also strongly noise-resistant and computationally efficient. In this study, although the threshold controlling the number of points detected by the Harris-Laplacian detector is set manually, the testing results show that the number of detected points is much lower than that of SIFT (by a factor of 4–12), which is meaningful. Further work is to develop an auto-thresholding algorithm that keeps the number of detected points at the same level for images with similar features.