1 Introduction

With the widespread application of imaging technology across many fields, the demand for high image resolution keeps growing. However, improving image resolution through hardware alone is expensive, and during image acquisition the inherent resolution limits, blur, and noise of the optical system often prevent the image quality from meeting practical requirements. It is therefore necessary to obtain higher-resolution images through image processing algorithms. High-resolution (HR) reconstruction is one such approach: it refers to the process of reconstructing the original HR image [1, 2] from one or several low-resolution (LR) images.

Image registration plays an indispensable role in HR image reconstruction. SIFT (Scale-Invariant Feature Transform) [3] is a robust scale-invariant feature description method that has been applied to face recognition, image mosaicking, image registration, and other fields [4]. However, the SIFT algorithm involves a large amount of data, high time complexity, and long running time. To address these shortcomings, an image registration method based on Speeded Up Robust Features (SURF) was proposed. Magdy et al. [5] first used the SURF method to extract feature points and obtained the matching point pairs of two images by nearest-neighbor matching; the transformation relationship between the images was then estimated by combining Random Sample Consensus [6] with the least-squares method, finally yielding the registered image. For the registered images, an image fusion algorithm can be used to obtain a high-resolution image. Image fusion integrates the collected image data, through image processing and computer technology, into a high-quality image, improving the utilization of image information as well as the accuracy and reliability of computer interpretation. Among image fusion algorithms, the Sum of Modified Laplacian (SML) reflects the edge information of an image and can clearly represent image details at every scale and resolution [7]. By comparing the scale images corresponding to the two source images, the prominent details of the sources can be merged into the fused image, enriching its information and achieving a better fusion effect.

However, in the field of imaging, how to simplify the experimental equipment and improve the efficiency of the imaging system while maintaining imaging quality is also a problem that needs to be solved. Ghost imaging, a new imaging method, exploits quantum entanglement or spatial intensity correlation to image an object, breaking through the traditional linear-optics imaging concept [8,9,10], and it has attracted extensive attention in recent years. Non-classical light sources and classical thermal light sources were initially used in ghost imaging experiments [11]. With further research, the technology developed rapidly [12,13,14], from pseudo-thermal ghost imaging [15], computational ghost imaging [16], compressed-sensing ghost imaging [17], differential ghost imaging [18], and blind ghost imaging [19] to quantum ghost imaging [20]. Liangsheng et al. [21] proposed a two-layer watermarking scheme based on computational ghost imaging and singular value decomposition. Dongfeng et al. [22] proposed a polarization-multiplexed ghost imaging technique that simultaneously obtains multiple polarimetric measurements with a single detector. Fei et al. [23] proposed a quantum-circuit implementation of the ghost imaging experiment, in which the speckle patterns and phase mask are encoded using a quantum representation of images.

This paper proposes a high-resolution reconstruction method for ghost imaging via SURF-NSML. First, a series of low-resolution images is acquired through repeated measurements with a ghost imaging system. Then the Speeded Up Robust Features (SURF) registration method and the New Sum of Modified Laplacian (NSML) fusion algorithm are applied: the LR images are registered and fused to obtain an initial HR image. Finally, the initial HR image is further optimized by a flattening-based super-resolution strategy to obtain a high-resolution image with better visual characteristics. Two natural-scene images and two medical images were reconstructed, and the SSIM and RMSE indexes were used to verify and analyze the reconstruction method.

2 HR reconstruction method of ghost imaging via SURF-NSML

The HR reconstruction method of ghost imaging via SURF-NSML proposed in this paper is shown in Fig. 1. It mainly includes four parts: acquisition of low-resolution images, image registration via Speeded Up Robust Features, fusion reconstruction via New Sum of Modified Laplacian, and image high-resolution optimization.

Fig. 1
figure 1

Schematic diagram of the reconstruction method presented in this paper

2.1 Acquisition of low resolution images

In this method, a ghost imaging system is used to obtain a series of low-resolution images. In the ghost imaging system, a bucket detector with no spatial resolution records the light intensity after diffraction from the object. The laser illuminates a spatial light modulator (such as an SLM or DMD) at normal incidence, and the light field modulated by the SLM illuminates the target image \(T\left( {x,y} \right)\); \(I_{i} \left( {x,y} \right)\) is the distribution of the light field hitting the target. The light intensity reflected from the object is collected by the bucket detector.

The light intensity value is obtained according to the Fresnel propagation function, and the calculation formula is as follows:

$$B = \iint {I\left( {x,y} \right)T\left( {x,y} \right)\;{\text{d}}x\,{\text{d}}y} = \sum\limits_{x} {\sum\limits_{y} {I\left( {x,y} \right)T\left( {x,y} \right)} }$$
(1)

In the imaging process, the measurement on the ghost imaging platform is repeated M times, and a series of low-resolution images can be reconstructed from the light intensity values and modulation matrices obtained in each run using the correlation function. The calculation formula is as follows:

$$T_{M} \left( {x,y} \right) = \frac{1}{N}\sum\limits_{i = 1}^{N} {\left( {B_{M \, i} - \left\langle {B_{M} } \right\rangle } \right)\left( {I_{i} \left( {x,y} \right) - \left\langle {I_{i} \left( {x,y} \right)} \right\rangle } \right)}$$
(2)

where N represents the collection times of experiments carried out by the platform for each operation, and images are reconstructed using N SLM modes during the operation. \(\left\langle {B_{M} } \right\rangle\) represents the average light intensity value of the \(M{\text{th}}\) operation, \(T_{M}\) represents the LR image reconstructed at the \(M{\text{th}}\) operation.
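As a minimal sketch of the correlation reconstruction of Eqs. (1) and (2), the following NumPy snippet (the function name `gi_reconstruct` and the 8×8 binary-speckle demo are illustrative stand-ins for the experimental SLM patterns and object, not the actual setup) averages the fluctuation product of the bucket values and the speckle patterns:

```python
import numpy as np

def gi_reconstruct(patterns, bucket):
    """Correlation reconstruction of Eq. (2): average of
    (B_i - <B>) * (I_i(x,y) - <I(x,y)>) over the N measurements."""
    patterns = np.asarray(patterns, dtype=float)   # shape (N, H, W)
    bucket = np.asarray(bucket, dtype=float)       # shape (N,)
    dB = bucket - bucket.mean()
    dI = patterns - patterns.mean(axis=0)
    # einsum averages the fluctuation product over the N measurements
    return np.einsum('n,nxy->xy', dB, dI) / len(bucket)

# toy demo: random binary speckle, bucket values computed via Eq. (1)
rng = np.random.default_rng(0)
target = np.zeros((8, 8))
target[2:6, 2:6] = 1.0                              # square "object"
patterns = rng.integers(0, 2, size=(2000, 8, 8)).astype(float)
bucket = (patterns * target).sum(axis=(1, 2))       # B = sum I*T
recon = gi_reconstruct(patterns, bucket)
```

With enough patterns, the correlation map converges to a scaled copy of the target, which is why the toy reconstruction is brightest over the square region.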

2.2 Image registration via SURF

After a series of LR images is obtained through the ghost imaging system, the SURF algorithm is adopted to perform image registration and accurately estimate the shift between the images. SURF determines the matching degree by calculating the Euclidean distance between two feature points: the shorter the Euclidean distance, the better the match. A check on the sign of the trace of the Hessian matrix is also added: if the traces of two feature points have the same sign, the two features have contrast changes in the same direction; if the signs differ, the contrast changes are in opposite directions. The image registration process based on SURF is shown in Fig. 2.

Fig. 2
figure 2

Image registration process based on SURF

The overall thought process of SURF feature detection and description is as follows:

Step 1 Feature detection

Feature point detection is based on scale-space theory. For a point \(\hat{x} = \left( {x,y} \right)\) in image \(I\left( {x,y} \right)\), the Hessian matrix at scale \(\delta\) is defined as

$$H = \left[ {\begin{array}{*{20}c} {L_{xx} \left( {\hat{x},\delta } \right)} & {L_{xy} \left( {\hat{x},\delta } \right)} \\ {L_{xy} \left( {\hat{x},\delta } \right)} & {L_{yy} \left( {\hat{x},\delta } \right)} \\ \end{array} } \right]$$
(3)

where \(L_{xx}\), \(L_{xy}\) and \(L_{yy}\) are the convolutions of the image \(I\) at the point \(\hat{x}\) with the second-order Gaussian derivatives \(\frac{\partial^{2}}{\partial x^{2}}g\left( \delta \right)\), \(\frac{\partial^{2}}{\partial x\partial y}g\left( \delta \right)\) and \(\frac{\partial^{2}}{\partial y^{2}}g\left( \delta \right)\), respectively, where \(g\left( \delta \right) = \frac{1}{{2\pi \delta^{2} }}e^{{ - \left( {x^{2} + y^{2} } \right)/2\delta^{2} }}.\)

Here, the scale-image pyramid is constructed with a method similar to SIFT, and candidate feature points are selected from the extrema of the Hessian determinant. Then, interpolation in scale space and image space yields the stable position and scale value of each feature point.
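The determinant-of-Hessian response of Eq. (3) can be sketched with plain finite differences on a Gaussian-smoothed image. This is only an illustration of the quantity SURF thresholds: the real detector approximates the second derivatives with box filters over integral images for speed, and the 0.9 weight on \(L_{xy}\) is the standard SURF correction for that approximation.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian smoothing, kernel truncated at 3*sigma."""
    r = int(3 * sigma)
    x = np.arange(-r, r + 1)
    k = np.exp(-x ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    out = np.apply_along_axis(np.convolve, 0, img, k, mode='same')
    return np.apply_along_axis(np.convolve, 1, out, k, mode='same')

def hessian_det(img, sigma=1.2):
    """det(H) of Eq. (3) via finite differences of a blurred image."""
    L = gaussian_blur(img.astype(float), sigma)
    Lxx = np.gradient(np.gradient(L, axis=1), axis=1)
    Lyy = np.gradient(np.gradient(L, axis=0), axis=0)
    Lxy = np.gradient(np.gradient(L, axis=1), axis=0)
    return Lxx * Lyy - (0.9 * Lxy) ** 2

# toy demo: the response peaks on a blob-like structure
img = np.zeros((32, 32))
img[14:18, 14:18] = 1.0
det = hessian_det(img, sigma=2.0)
peak = np.unravel_index(np.argmax(det), det.shape)
```

Candidate feature points are then the local maxima of `det` above a threshold, across neighboring scales.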

Step 2 Determine the main direction

To ensure rotation invariance, first, with the feature point as the center, the Haar wavelet responses in the x and y directions are calculated for the points in a neighborhood of radius 6s (s is the scale value of the feature point), and Gaussian weight coefficients are assigned to these responses so that points near the feature point contribute more than points far from it, which better matches reality. Then, the responses within a sliding 60° sector are summed to form a new vector, and the direction of the longest such vector is selected as the main direction of the feature point. In this way, the main direction of each feature point detected in Step 1 can be obtained.

Step 3 Descriptor formation

With the feature point as the center, the coordinate axes are first rotated to the main direction, and a square region with a side length of 20s is selected along the main direction. This window is divided into 4*4 sub-regions. In each sub-region, the Haar wavelet responses are calculated, and the response values are again weighted to increase robustness to geometric transformation. Then, the responses and their absolute values in each sub-region are summed to form a description vector, and the vector is normalized to provide a degree of robustness to illumination changes.

Step 4 Feature matching

After the feature points of the reference image and the image to be registered are obtained by the SURF method, feature matching is carried out, as shown in Fig. 3. The feature vector contains information about the neighborhood of the feature point, so the nearest-neighbor matching method can find potential matching pairs without additional information. In this paper, nearest-neighbor matching is adopted: all potential matching pairs are found by calculating the Euclidean distance between the feature vectors extracted from the two images.

Fig. 3
figure 3

Feature matching diagram using SURF method
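The matching rule of Step 4 can be sketched as follows, assuming descriptors are plain NumPy vectors. Candidates are first filtered by the sign of the Hessian trace, then ranked by Euclidean distance; a Lowe-style ratio test (an addition not stated explicitly in the text) rejects ambiguous matches:

```python
import numpy as np

def match_features(desc_a, desc_b, trace_a, trace_b, ratio=0.7):
    """Nearest-neighbour matching on Euclidean distance, with the
    SURF trace-sign check and a ratio test against the 2nd neighbour."""
    matches = []
    for i, d in enumerate(desc_a):
        # candidates must have the same contrast polarity (trace sign)
        mask = np.sign(trace_b) == np.sign(trace_a[i])
        if not mask.any():
            continue
        idx = np.flatnonzero(mask)
        dist = np.linalg.norm(desc_b[idx] - d, axis=1)
        order = np.argsort(dist)
        if len(order) == 1 or dist[order[0]] < ratio * dist[order[1]]:
            matches.append((i, idx[order[0]]))
    return matches

# toy demo: desc_b is a shuffled, slightly noisy copy of desc_a
rng = np.random.default_rng(1)
desc_a = rng.normal(size=(10, 64))
perm = rng.permutation(10)
desc_b = desc_a[perm] + 0.01 * rng.normal(size=(10, 64))
trace_a = np.ones(10)
trace_b = np.ones(10)
matches = match_features(desc_a, desc_b, trace_a, trace_b)
```

Each returned pair `(i, j)` says feature `i` of the first image matches feature `j` of the second.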

Step 5 Parameters estimation and resampling

Random Sample Consensus (RANSAC) is combined with the least-squares method to calculate the transformation relationship between the images and remove the influence of mismatches. Finally, the image to be registered is resampled to obtain the registration result in the same coordinate system.
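The RANSAC-plus-least-squares idea of Step 5 can be illustrated with the simplest motion model, a pure translation; the paper estimates a fuller transformation, and the function name and toy data below are illustrative:

```python
import numpy as np

def ransac_shift(pts_a, pts_b, n_iter=200, tol=1.0, seed=0):
    """RANSAC for a pure-translation model: one matched pair fixes a
    hypothesis; the largest consensus set is refined by least squares
    (for a translation, the least-squares solution is the mean shift)."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(pts_a), dtype=bool)
    for _ in range(n_iter):
        i = rng.integers(len(pts_a))          # minimal sample: 1 pair
        t = pts_b[i] - pts_a[i]
        resid = np.linalg.norm(pts_a + t - pts_b, axis=1)
        inliers = resid < tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # least-squares refinement over the inlier set
    t = (pts_b[best_inliers] - pts_a[best_inliers]).mean(axis=0)
    return t, best_inliers

# toy demo: a known shift plus a few gross mismatches (outliers)
rng = np.random.default_rng(2)
pts_a = rng.uniform(0, 100, size=(30, 2))
pts_b = pts_a + np.array([2.5, -1.0])
pts_b[:5] += rng.uniform(10, 20, size=(5, 2))  # 5 mismatched pairs
t, inliers = ransac_shift(pts_a, pts_b)
```

The mismatched pairs are excluded from the consensus set, so the refined shift is computed from correct correspondences only.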

2.3 Fusion reconstruction via NSML

After the series of low-resolution images is registered, NSML is used for image fusion to obtain the initial high-resolution reconstructed image. To obtain better visual characteristics, more detailed information, and a more prominent fusion effect, the improved New Sum of Modified Laplacian (NSML) fusion algorithm is adopted here. Whereas the traditional SML computes the variable-step modified Laplacian of each pixel only in the horizontal and vertical directions, NSML adds the diagonal directions. The algorithm is defined as follows:

$$\begin{aligned} {\text{ML}}\left( {x,y} \right) & = \left| {2I\left( {x,y} \right) - I\left( {x - {\text{step}},y} \right) - I\left( {x + {\text{step}},y} \right)} \right| \\ & \quad + \left| {2I\left( {x,y} \right) - I\left( {x,y - {\text{step}}} \right) - I\left( {x,y + {\text{step}}} \right)} \right| \\ & \quad + \left| {1.4I\left( {x,y} \right) - 0.7I\left( {x - {\text{step}},y - {\text{step}}} \right) - 0.7I\left( {x + {\text{step}},y + {\text{step}}} \right)} \right| \\ & \quad + \left| {1.4I\left( {x,y} \right) - 0.7I\left( {x + {\text{step}},y - {\text{step}}} \right) - 0.7I\left( {x - {\text{step}},y + {\text{step}}} \right)} \right| \\ \end{aligned}$$
(4)
$${\text{NSML}}\left( {x,y} \right) = \sum\limits_{i = x - N}^{x + N} {\sum\limits_{j = y - N}^{y + N} {{\text{ML}}\left( {i,j} \right)} }$$
(5)

Here, \({\text{step}}\) is the variable step-size parameter, the \({\text{ML}}\) accumulation window size is \(\left( {2N + 1} \right) * \left( {2N + 1} \right)\), and the settings of \({\text{step}}\) and \(N\) depend mainly on the noise. The NSML fusion algorithm has excellent sharpness discrimination: the higher the sharpness of an image, the larger the corresponding \({\text{NSML}}\) value. Assuming that \(F_{1}\), \(F_{2}\) and \(F\) represent the pixel values of the two source images and the fused image at the same position, and \({\text{NSML}}_{1}\), \({\text{NSML}}_{2}\) represent the \({\text{NSML}}\) clarity of \(F_{1}\), \(F_{2}\), the fusion rule of the NSML algorithm is as follows:

$$F\left( {x,y} \right) = \left\{ {\begin{array}{*{20}c} {F_{1} \left( {x,y} \right)\begin{array}{*{20}c} , & {{\text{NSML}}_{1} \left( {x,y} \right) \ge {\text{NSML}}_{2} \left( {x,y} \right)} \\ \end{array} } \\ {F_{2} \left( {x,y} \right)\begin{array}{*{20}c} , & {{\text{NSML}}_{2} \left( {x,y} \right) > {\text{NSML}}_{1} \left( {x,y} \right)} \\ \end{array} } \\ \end{array} } \right.$$
(6)
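Eqs. (4)-(6) translate directly into NumPy. In the sketch below, `nsml` and `fuse_nsml` are illustrative names, edge pixels are handled by replicate padding (an assumption, since the paper does not specify border handling), and ties are broken toward \(F_1\):

```python
import numpy as np

def nsml(img, step=1, N=1):
    """Eqs. (4)-(5): variable-step modified Laplacian in horizontal,
    vertical and diagonal directions, summed over a (2N+1)x(2N+1) window."""
    img = img.astype(float)
    H, W = img.shape
    s = step
    p = np.pad(img, s, mode='edge')
    def sh(dy, dx):                  # image shifted by (dy, dx) pixels
        return p[s + dy: s + dy + H, s + dx: s + dx + W]
    ml = (np.abs(2 * img - sh(0, -s) - sh(0, s))
          + np.abs(2 * img - sh(-s, 0) - sh(s, 0))
          + np.abs(1.4 * img - 0.7 * sh(-s, -s) - 0.7 * sh(s, s))
          + np.abs(1.4 * img - 0.7 * sh(s, -s) - 0.7 * sh(-s, s)))
    q = np.pad(ml, N, mode='edge')   # Eq. (5): window accumulation
    return sum(q[N + dy: N + dy + H, N + dx: N + dx + W]
               for dy in range(-N, N + 1) for dx in range(-N, N + 1))

def fuse_nsml(f1, f2, step=1, N=1):
    """Eq. (6): at each pixel keep the source with the larger NSML."""
    return np.where(nsml(f1, step, N) >= nsml(f2, step, N), f1, f2)

# toy demo: each source is sharp on one half, blurred on the other
base = (np.indices((32, 32)).sum(axis=0) % 2).astype(float)  # checkerboard
blur = sum(np.roll(np.roll(base, dy, 0), dx, 1)
           for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
f1 = base.copy(); f1[:, 16:] = blur[:, 16:]
f2 = base.copy(); f2[:, :16] = blur[:, :16]
fused = fuse_nsml(f1, f2)
```

Because NSML is larger on the sharp half of each source, the fused result recovers the sharp checkerboard over almost the whole image.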

2.4 Image high-resolution optimization

In this process, the initial HR image is further optimized by an image super-resolution reconstruction strategy based on flattening. Specifically, for an image \(F\left( {x,y} \right)\) of size \(m \times n\), the gray value of the pixel at any position \(\left( {x,y} \right)\) \(\left( {1 \le x \le m,1 \le y \le n} \right)\) is \(f\left( {x,y} \right)\) \(\left( {0 \le f\left( {x,y} \right) \le L - 1} \right)\), where \(L\) is the total number of image gray levels. According to the probability density function of the gray levels in the image, the occurrence probability of each gray level can be expressed as:

$$p_{i} = \frac{1}{m \times n}\sum\limits_{x = 1}^{m} {\sum\limits_{y = 1}^{n} {\delta \left( {f\left( {x,y} \right) - i} \right)} } ,\quad 0 \le i \le L - 1$$
(7)

Here, \(\delta \left( x \right) = \left\{ {\begin{array}{*{20}c} {1,x = 0} \\ {0,x \ne 0} \\ \end{array} } \right.\). Using the histogram equalization method, the cumulative distribution function is used to transform the input gray levels as follows.

$$\sum\limits_{k = 0}^{i} {p_{k} - \frac{f\left( i \right)}{{L - 1}}} \ge 0,\quad 0 \le i < L$$
(8)

Function \(f\left( i \right)\)\(\left( {0 \le i < L} \right)\) represents the mapping relationship between the image gray level \(i\) before optimization and the gray level \(i^{\prime} = f\left( i \right)\) after optimization. In general, the grayscale mapping expression commonly used in histogram equalization is:

$$i^{\prime} = f\left( i \right) = {\text{round}}\left( {\left( {L - 1} \right) \cdot \sum\limits_{k = 0}^{i} {p_{k} } } \right),\quad 0 \le i < L$$
(9)

The above transformation produces an image whose gray levels are relatively balanced and cover the whole range \([0, L-1]\). After histogram equalization, the contrast and average brightness of the image are significantly improved, the histogram is spread over the whole brightness scale, and the image quality is improved.
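The flattening step of Eqs. (7)-(9) is ordinary histogram equalization, which can be sketched as follows (assuming 8-bit gray levels, so L = 256):

```python
import numpy as np

def equalize(img, L=256):
    """Histogram equalization: Eqs. (7)-(9)."""
    hist = np.bincount(img.ravel(), minlength=L)
    p = hist / img.size                    # Eq. (7): level probabilities
    cdf = np.cumsum(p)                     # cumulative distribution
    # Eq. (9): map each level through the scaled, rounded CDF
    mapping = np.round((L - 1) * cdf).astype(np.uint8)
    return mapping[img]

# toy demo: a low-contrast image confined to levels 100..130
rng = np.random.default_rng(3)
low = rng.integers(100, 131, size=(16, 16)).astype(np.uint8)
out = equalize(low)
```

After equalization the gray levels spread over nearly the full [0, 255] range, which is exactly the contrast-stretching effect described above.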

3 Result analysis

High-resolution images can be obtained using the proposed algorithm, which is suitable for reconstructing both natural-scene images and medical images. Generally speaking, the larger the sampling interval, the lower the spatial resolution and the poorer the imaging quality, with a mosaic effect appearing in severe cases; the smaller the sampling interval, the higher the spatial resolution and the better the image quality. Therefore, to obtain high-spatial-resolution images, our method uses the full sampling information for reconstruction. The reconstruction results are shown in Table 1.

Table 1 Reconstruction results of images in different scenes

Through the reconstruction of natural-scene images and medical images, two reference-based evaluation indexes, structural similarity (SSIM) and root mean squared error (RMSE), were used for evaluation and analysis, and the proposed algorithm was compared with images reconstructed by ghost imaging (GI) and Sum of Modified Laplacian ghost imaging (SML-GI).

3.1 Feasibility analysis

Structural similarity (SSIM) was adopted to verify the feasibility of the scheme. The structural similarity algorithm measures the degree of similarity between the image to be evaluated and the original image and agrees well with subjective human perception. The larger the value, the better; the maximum value is 1. The structural information comprises brightness \(l\), contrast \(c\), and structure \(s\), measured respectively by the means \(\mu_{x} ,\mu_{y}\), the standard deviations \(\delta_{x} ,\delta_{y}\), and the covariance \(\delta_{xy}\). The expression is:

$$\left\{ \begin{aligned} l\left( {x,y} \right) & = \frac{{2\mu_{x} \mu_{y} + C_{1} }}{{\mu_{x}^{2} + \mu_{y}^{2} + C_{1} }},c\left( {x,y} \right) = \frac{{2\delta_{x} \delta_{y} + C_{2} }}{{\delta_{x}^{2} + \delta_{y}^{2} + C_{2} }},s\left( {x,y} \right) = \frac{{\delta_{xy} + C_{3} }}{{\delta_{x} \delta_{y} + C_{3} }} \\ {\text{SSIM}} & = \left[ {l\left( {x,y} \right)} \right]^{\alpha } \left[ {c\left( {x,y} \right)} \right]^{\beta } \left[ {s\left( {x,y} \right)} \right]^{\gamma } \\ \end{aligned} \right.$$
(10)

\(C_{1} ,C_{2} ,C_{3}\) are small positive constants introduced to avoid division by zero. When \(\alpha = \beta = \gamma = 1\) and \(C_{3} = C_{2} /2\), the SSIM formula in formula (10) can be simplified as:

$${\text{SSIM}} = l\left( {x,y} \right) \cdot c\left( {x,y} \right) \cdot s\left( {x,y} \right) = \frac{{\left( {2\mu_{x} \mu_{y} + C_{1} } \right)\left( {2\delta_{xy} + C_{2} } \right)}}{{\left( {\mu_{x}^{2} + \mu_{y}^{2} + C_{1} } \right)\left( {\delta_{x}^{2} + \delta_{y}^{2} + C_{2} } \right)}}$$
(11)
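Eq. (11) can be computed globally over the whole image as below; practical SSIM implementations use a sliding window, and the constants \(C_1=(0.01L)^2\), \(C_2=(0.03L)^2\) for dynamic range \(L\) are the conventional choices, assumed here rather than taken from the paper:

```python
import numpy as np

def ssim_global(x, y, L=1.0):
    """Global (single-window) SSIM of Eq. (11)."""
    x = x.astype(float); y = y.astype(float)
    C1, C2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()                    # variances
    cxy = ((x - mx) * (y - my)).mean()           # covariance
    return ((2 * mx * my + C1) * (2 * cxy + C2)) / \
           ((mx ** 2 + my ** 2 + C1) * (vx + vy + C2))

# toy demo: identical images give 1; noise lowers the score
rng = np.random.default_rng(4)
ref = rng.uniform(0, 1, size=(32, 32))
noisy = np.clip(ref + 0.05 * rng.normal(size=(32, 32)), 0, 1)
```

A lightly degraded image scores close to 1, while a structurally unrelated (e.g. inverted) image scores much lower.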

SSIM is used to evaluate the degree of similarity between the resulting image and the ideal image, as shown in Fig. 4.

Fig. 4
figure 4

SSIM values of reconstruction results of different images at different stages

①-LR, ②-LR, ③-LR, ④-LR, ⑤-LR and ⑥-LR represent a series of low-resolution images obtained by the imaging system; six of them are selected here as examples. \(I{\text{ - HR}}\) denotes the initial high-resolution image obtained after the fusion algorithm, and \(O - {\text{HR}}\) denotes the optimized high-resolution image finally obtained by the proposed algorithm. As can be seen from Fig. 4, the images obtained by fusion reconstruction with this method have higher SSIM values than the low-resolution images directly produced by the imaging system, and the reconstruction is closer to the ideal image. Taking natural image 1 and medical image 2 as examples, the SSIM values of the reconstruction results obtained by this method are 0.9953 and 0.9927, close to 1. Moreover, the method applies not only to natural images but also to medical images, demonstrating the applicability of the imaging technique.

3.2 Fidelity analysis

For the problem of image fidelity evaluation, the root mean squared error (RMSE) is used to measure image quality; its main purpose is to quantify the difference between the fused image and the ideal image. The calculation formula is as follows:

$${\text{RMSE}} = \sqrt {\frac{{\sum\limits_{i = 1}^{M} {\sum\limits_{j = 1}^{N} {\left( {V_{i,j} - V_{i,j}^{\prime } } \right)^{2} } } }}{M \times N}}$$
(12)

where \(M \times N\) represents the size of the image, and \(V_{i,j}\) and \(V_{i,j}^{\prime }\) represent the pixel values of the original image and the reconstructed image, respectively. A smaller RMSE value means the fused image is closer to the ideal image, reflecting a better reconstruction. RMSE values of the reconstruction results of natural image 1 and medical image 2 at different stages are shown in Figs. 5 and 6.
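Eq. (12) in NumPy, for completeness:

```python
import numpy as np

def rmse(v, v_hat):
    """Eq. (12): root mean squared error between two images."""
    v = np.asarray(v, dtype=float)
    v_hat = np.asarray(v_hat, dtype=float)
    return np.sqrt(np.mean((v - v_hat) ** 2))
```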

Fig. 5
figure 5

RMSE value of natural image reconstruction

Fig. 6
figure 6

RMSE value of medical image reconstruction

As can be seen from Fig. 5, the reconstructed natural image 1 obtained by the proposed method has a low RMSE value of 0.1034, while the lowest RMSE among the low-resolution images obtained by the imaging system is 0.3411; the proposed method thus reduces the RMSE by 0.2377 and reconstructs the natural image with better quality. In Fig. 6, the RMSE of the low-resolution images obtained by the imaging system ranges from 0.3673 to 0.6618, whereas the RMSE of reconstructed medical image 2 obtained by the proposed method is 0.0964, a reduction of 0.2709–0.5654. Figures 5 and 6 show that the fused image of the proposed method is closer to the ideal image and better satisfies human visual perception.

Fig. 7
figure 7

Diagram of experimental equipment

3.3 Comparative analysis of experiments

To better verify the superiority of the proposed algorithm, several different reconstruction methods are compared. The experimental platform is shown in Fig. 7. A 10UJ-5KHZ-532 nm laser is used, with a spectral width of 0.2 nm and a jitter error of the optical synchronization trigger signal of 0.4 ns. The light source illuminates the target object via a mirror (here, natural and medical images are used as target images; the pixel size is 64*64 and the physical size is 2.26*2.26 cm). The light reflected from the object passes through a lens onto the DMD modulator (Model V-7001). The reflected light intensity on the DMD surface is received by a photomultiplier tube (PMT, Model H10721.01), transmitted to the acquisition card of the computer (Model M2I.2030.exp), and the image is finally reconstructed on the computer by the correlation algorithm. The experimental setup does not require beam-splitting or filtering devices, so it is simplified; the light is received by a single bucket detector, which greatly increases the utilization of the experimental equipment.

Reconstructed images of different methods were evaluated and analyzed by SSIM and RMSE indexes, as shown in Table 2.

Table 2 SSIM and RMSE of images reconstructed by different methods

It can be seen from Table 2 that, for both natural and medical images, the reconstruction results of the proposed method have higher SSIM values and lower RMSE values than the GI and SML-GI algorithms. This shows that the reconstructed image has better structural similarity to the ideal image, is closer to it in visual perception, and reproduces the target image with better visual characteristics and higher quality.

4 Conclusion

To obtain high-resolution images, this paper proposes a high-resolution reconstruction method for ghost imaging via SURF-NSML. First, a series of low-resolution images is produced by the ghost imaging system. Then, the Speeded Up Robust Features (SURF) registration method and the New Sum of Modified Laplacian (NSML) fusion algorithm are applied: the low-resolution images are registered and fused to obtain an initial high-resolution (HR) image. Finally, the initial HR image is further optimized using the flattening-based super-resolution strategy to obtain an HR image with better visual characteristics and more detailed information. This HR reconstruction method does not require beam-splitting devices, filters, or other such components, which simplifies the experimental equipment; the light is received by a single bucket detector, which greatly increases the utilization of the equipment. Through reconstruction of images from natural scenes and the medical field, evaluation with the SSIM and RMSE indexes, and comparison of the reconstructed images with those of the GI and SML-GI algorithms, it is shown that the proposed method, combining the device platform with the reconstruction algorithm, can reproduce the target image with higher quality, and it can promote research on image reconstruction.