Abstract
In the last three decades, multi-frame and single-frame super-resolution and reconstruction techniques have received increasing attention because of the large number of applications that benefit from increased image resolution. For example, in high-definition television, displays have reached a new level and resolution enhancement can no longer be ignored; in some remote sensing applications, the pixel size is a limitation; and in medical imaging, fine details enable a more accurate diagnosis, and high-resolution images can be acquired while reducing the patient's exposure to radiation. Among the problems in this area that still require more effective treatment are the inadequate representation of edges, inaccurate motion estimation between images, sub-pixel registration, and computational complexity. In this chapter, an overview of the most important methods, classified into two taxonomies, multiple- and single-image super-resolution, is given. Moreover, two new techniques for single-image SR are proposed.
1 Introduction
Image super-resolution (SR) refers to the process of creating clear and high-resolution (HR) images from a single low-resolution (LR) image or from a sequence of low-resolution observations (Schultz and Stevenson 1994). In this chapter, the most important SR techniques are explained.
The methods for SR are addressed, including a definition of each one. We cover the major lines of work in super-resolution: pure interpolation with large amplification scales, the use of dictionaries, variational procedures, and the exploitation of gradient sharpening. Each section of this chapter serves as a guide to the technical understanding of each procedure. The technical details of the cited articles are not fully reproduced, but neither do we give a superficial description devoid of ideas for a practical realization.
The first main division between SR methods is determined by the resources employed in the process. In the first case, a group of LR images is used; these procedures correspond to the earliest publications on the topic. In the second case, motivated by practical situations, SR is carried out using only a single low-resolution input image. Figures 5.1 and 5.2 show the taxonomies of the most evident classification of the methods, multiple-image SR and single-image SR.
Within the second class of methods, we distinguish by the domain of application, spatial or frequency. A further differentiation between SR methods is based on the mathematical models used to reach high resolution. Transformations, probabilistic prediction, direct projection, learned dictionaries, dimensionality reduction, and reconstruction models based on minimization procedures and residual priors are discussed. A common goal is the incorporation of the lost high-frequency details. Finally, we propose two new methods for single-image SR: the first is based on gradient control, and the second is a hybrid method based on gradient control and total variation.
The rest of the chapter is organized as follows: In Sect. 5.2, the methods are explained. In Sect. 5.3, the results of the proposed methods are presented. In Sect. 5.4, the metrics used to characterize the methods are presented. Finally, the chapter concludes in Sect. 5.5.
2 Methods
The accuracy of the HR image estimate results from the right selection of mathematical tools and signal-processing procedures, such as transformations, learning models, and minimization techniques, in order to recover the largest possible content of high spatial frequencies, or details, in the output image. In this section, current methods as well as the proposed SR procedures are explained. SR models for single and multiple images are considered.
2.1 Image Model
Down-sampling and warping are the two processes considered for a more realistic representation of the image at low resolution. In the first process, the image is averaged over equal areas of size \( q\, \times \,q \), as can be seen from Eq. (5.1). In the warping process, the image is shifted along the x and y directions by distances a and b in pixels; a rotation \( \theta \) of the image is also assumed (Irani and Peleg 1990; Schultz and Stevenson 1994), as can be observed in Eq. (5.2).
In an SR algorithm, the degradation model is fundamental for comparative purposes and for evaluating the effectiveness of the algorithm. Equation (5.1) considers the blurring and down-sampling processes, and Eq. (5.2) represents the warping operation. For a number k of LR images with added noise, the model becomes (Irani and Peleg 1990),
Equation (5.3) incorporates the distortions that yield an LR image: d is the down-sampling operator, \( h_{k} \) is the blurring operator, \( w_{k} \) is the warping operator, and f(x, y) is the HR image. Furthermore, the blurring process can be composed of distortions due to the displacement \( h_{d} \), the lens \( h_{l} \), and the sensors \( h_{s} \); the result is a convolution operation. The transformations are shown in Fig. 5.3.
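As a concrete illustration of the degradation model of Eq. (5.3), the NumPy sketch below simulates one LR observation: warping (reduced here to an integer shift), Gaussian blurring, \( q \times q \) averaging, and AWGN. The kernel size, shift convention, and parameter names are our own illustrative choices, not taken from the cited works.

```python
import numpy as np

def gaussian_blur(img, sigma=1.0, radius=3):
    """Separable Gaussian blur implemented with plain NumPy (edge padding)."""
    ax = np.arange(-radius, radius + 1)
    k = np.exp(-(ax ** 2) / (2 * sigma ** 2))
    k /= k.sum()
    pad = np.pad(img, radius, mode='edge')
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode='valid'), 1, pad)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='valid'), 0, rows)

def degrade(f, shift=(1, 2), q=2, noise_sigma=0.01, rng=None):
    """One LR observation of Eq. (5.3): warp w_k (integer shift), blur h_k,
    q x q averaging down-sample d, plus additive Gaussian noise eta_k."""
    rng = np.random.default_rng(0) if rng is None else rng
    warped = np.roll(f, shift, axis=(0, 1))                    # w_k
    blurred = gaussian_blur(warped)                            # h_k
    h, w = blurred.shape
    lr = blurred[:h - h % q, :w - w % q]
    lr = lr.reshape(h // q, q, w // q, q).mean(axis=(1, 3))    # d
    return lr + noise_sigma * rng.standard_normal(lr.shape)    # eta_k
```

For a \( 16 \times 16 \) HR input and \( q = 2 \), `degrade` produces an \( 8 \times 8 \) LR observation.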
2.2 Image Registration
The translation and rotation relationship between an LR and an HR image is calculated using Eq. (5.4) (Keren et al. 1998),
where \( x_{k}^{t} \) and \( y_{k}^{t} \) are the displacements, \( q_{x} \) and \( q_{y} \) the sampling rates, and \( \theta \) the rotation angle. Two acquisitions \( g_{1} \) and \( g_{2} \) with rotation and displacements can be related using Eq. (5.5).
2.3 Approximation Between Acquisitions
The approximation of these parameters has been solved using the Taylor series representation. In the first step, \( \sin \,\theta \) and \( \cos \,\theta \) are expressed by the first two terms of their series expansions.
Then, the function g1 can be expanded with Taylor series,
Finally, the parameters a, b, and \( \theta \) of Eq. (5.7) are determined using partial derivatives on the final expansion and solving the equation system.
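The least-squares solution of this linearized system can be sketched as follows. This is a minimal NumPy version of the registration idea of Keren et al., assuming small motion; the use of first-order image gradients and the centering of the rotation term at the image center are our own assumptions.

```python
import numpy as np

def keren_register(g1, g2):
    """Estimate small shift (a, b) and rotation theta between two frames by
    least squares on the first-order Taylor expansion:
        g2 - g1 ~ a*g1_x + b*g1_y + theta*(x*g1_y - y*g1_x)."""
    gy, gx = np.gradient(g1)                 # derivatives along rows (y) and cols (x)
    h, w = g1.shape
    yy, xx = np.mgrid[0:h, 0:w].astype(float)
    yy -= h / 2.0
    xx -= w / 2.0                            # rotation measured about the image center
    r = xx * gy - yy * gx                    # derivative with respect to theta
    M = np.stack([gx.ravel(), gy.ravel(), r.ravel()], axis=1)
    d = (g2 - g1).ravel()
    a, b, theta = np.linalg.lstsq(M, d, rcond=None)[0]
    return a, b, theta
```

On a smooth synthetic pair related by a sub-pixel translation, the estimate recovers the shift to within a few hundredths of a pixel.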
2.4 Frequency Domain
The models in the frequency domain rely on sampling theory. There, a 2D array of Dirac deltas (\( D_{T} \)) performs the sampling function. The array has the same form in the spatial and frequency domains (a 2D impulse train). The acquisition process multiplies the array \( D_{T} \) with the image point by point in the spatial domain; in the frequency domain, this operation becomes a convolution. The advantage is that the resolution of the convolution kernel (the sampling array in the frequency domain over the interval [−π, π]) can be increased for optimal amplification scales by checking the high-frequency content at the output of the process. The Fourier transform of the sampling is shown in Eq. (5.8),
and the convolution with the image can be expressed as in Eq. (5.9),
The high-frequency content in \( S_{\text{amp}} \) must be maximized. This strategy has been used in (Morera 2015). Figure 5.4 shows a 1D sampling array in space and frequency domains.
2.5 Wavelet Transform
The wavelet transform decomposes the image, in general, into four fields of information. The common decomposition provides directional information about the fluctuations of the image signal. The transform coefficients come in four groups: the low-frequency coefficients, which are a coarse representation of the image, and the horizontal, vertical, and diagonal coefficients, which represent details of the directional variations of the image. The most common strategy for SR using wavelets applies a non-sub-sampled (stationary) wavelet transform before a wavelet reconstruction; the first step produces a decomposition into four images with the same dimensions as the input, and the wavelet reconstruction then produces an amplified image with a scale factor of 2. This strategy is employed in (Morera 2014).
2.6 Multiple-Image SR
The main goal of this group of techniques is to simulate the image formation process in order to reject the aliasing effects due to down-sampling. A group of acquisitions of the same scene at low resolution is required to estimate the HR image.
2.6.1 Iterative Back-Projection
Iterative back-projection (IBP) methods were the first methods developed for spatial-domain SR. The IBP algorithm yields the desired image by driving the reconstruction error close to zero; in other words, IBP is convergent. Having defined an imaging model like the one given in Eq. (5.3), the distance \( \left\| {Af - g} \right\|_{2}^{2} \) is minimized, where the matrix \( A \) includes the blur, down-sampling, and warping operations, \( f \) is the original HR image, and \( g \) is the observed image. An HR image estimate is generated and afterward refined. Such a guess can be obtained by registering the LR images over an HR grid and then averaging them (Irani and Peleg 1990, 1991, 1992, 1993). The iterative model given in Eq. (5.10) refines the estimate from the set of available LR observations: the error between the simulated LR images and the observed ones is obtained and back-projected to the coordinates of the HR image to improve the initial estimation (Irani and Peleg 1993). The Richardson iteration is commonly used in these techniques.
where \( w_{k}^{ - 1} \) is the inverse of the warping operator, \( \dot{d} \) is the up-sampling operator, \( \dot{h} \) is a deblurring kernel, k = 1…K indexes the LR acquisitions, \( f^{(t + 1)} (x,y) \) is the reconstructed SR image at the (t + 1)th iteration, and \( f^{(t)} (x,y) \) is the reconstructed SR image at the previous, (t)th, iteration. The shortcoming of this algorithm is that it produces artifacts along salient edges.
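The iteration of Eq. (5.10) can be sketched as follows. This is a deliberately simplified stand-in: the blur/deblur kernels are omitted, warping is reduced to integer shifts, and the back-projection kernel is the adjoint of the averaging down-sampler; none of these simplifications come from the cited papers.

```python
import numpy as np

def ibp(lr_images, shifts, q=2, n_iter=30, step=1.0):
    """Iterative back-projection sketch (after Irani & Peleg): simulate each LR
    observation from the current HR estimate, back-project the residuals."""
    H, W = lr_images[0].shape[0] * q, lr_images[0].shape[1] * q
    f = np.kron(lr_images[0], np.ones((q, q)))            # initial HR guess
    for _ in range(n_iter):
        corr = np.zeros_like(f)
        for g, (dy, dx) in zip(lr_images, shifts):
            sim = np.roll(f, (dy, dx), axis=(0, 1))       # w_k (integer shift)
            sim = sim.reshape(H // q, q, W // q, q).mean(axis=(1, 3))  # d
            err = g - sim                                 # LR-domain residual
            up = np.kron(err, np.ones((q, q))) / (q * q)  # adjoint of averaging
            corr += np.roll(up, (-dy, -dx), axis=(0, 1))  # w_k^{-1}
        f += step * corr / len(lr_images)
    return f
```

A constant scene is a fixed point of the iteration, which is a quick sanity check of the residual computation.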
2.6.2 Maximum Likelihood
The noise term in the imaging model given in Eq. (5.3) is assumed to be additive white Gaussian noise (AWGN) with zero mean and variance \( \sigma^{2} \). Assuming the measurements are independent and the error between images is uncorrelated, the likelihood function of an observed LR image \( g_{k} \) for an estimated HR image \( \hat{f} \) (Cheeseman et al. 1994; Capel and Zisserman 1998; Elad and Hel-Or 2001; Farsiu et al. 2004; Pickup et al. 2006; Pickup 2007; Prendergast and Nguyen 2008; Jung et al. 2011a) is,
The log-likelihood transforms the product into a summation. Therefore, Eq. (5.11) becomes the summation of a term C that does not depend on \( f \) and the summation of the exponents of the exponential function as shown in Eq. (5.12),
The maximum likelihood (ML) solution (Woods and Galatsanos 2005) seeks a super-resolved image \( \hat{f}_{\text{ML}} \) that maximizes the log-likelihood over all observations. Notice that after maximization the constant term vanishes. Therefore, the super-resolved image can be obtained by maximizing Eq. (5.12) or, equivalently, by minimizing the distance between the simulated observations \( \hat{g}_{k} \) and \( g_{k} \) as,
2.6.3 Maximum a Posteriori
Given the LR images \( g_{k} \), the maximum a posteriori (MAP) method (Cheeseman et al. 1994) finds an estimate \( \hat{f}_{\text{MAP}} \) of the HR image by using the Bayes rule in Eq. (5.14),
The estimate can be found by maximizing the log of Eq. (5.14). Notice that the denominator is a constant term that normalizes the conditional probability; it does not depend on \( f \) and therefore vanishes after maximization, so that,
Applying statistical independence between the images \( g_{k} \), Eq. (5.15) can be written as,
where
The prior probability \( p\left( f \right) \) acts as the regularization term. This term has been modeled in many different forms; some cases are:
1. Natural image prior (Tappen et al. 2003; Kim and Kwon 2008, 2010).
2. Stationary simultaneous autoregression (SAR) (Villena et al. 2004), which applies uniform smoothness to all the locations in the image.
3. Non-stationary SAR (Woods and Galatsanos 2005), in which the variance of the SAR prediction can differ from one location in the image to another.
4. Soft edge smoothness prior, which estimates the average length of all level lines in an intensity image (Dai et al. 2007, 2009).
5. Double-exponential Markov random field, which is simply the absolute value of each pixel value (Debes et al. 2007).
6. Potts–Strauss MRF (Martins et al. 2007).
7. Non-local graph-based regularization (Peyre et al. 2008).
8. Corner and edge preservation regularization term (Shao and Wei 2008).
9. Multi-channel smoothness prior, which considers the smoothness between frames (temporal residual) and within frames (spatial residual) of a video sequence (Belekos et al. 2010).
10. Non-local self-similarity (Dong et al. 2011).
11. Total subset variation, a convex generalization of the total variation (TV) regularization strategy (Kumar and Nguyen 2010).
12. Mumford–Shah regularization term (Jung et al. 2011b).
13. Morphological-based regularization (Purkait and Chanda 2012).
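As a toy illustration of the MAP formulation, the sketch below minimizes a data-fidelity term plus a Tikhonov smoothness prior by gradient descent. This simple quadratic prior merely stands in for the more elaborate priors listed above, and the blur and warp operators are omitted for brevity.

```python
import numpy as np

def map_sr(g, q=2, lam=0.05, n_iter=100, step=0.5):
    """MAP super-resolution sketch: minimize ||d(f) - g||^2 + lam * ||grad f||^2
    by gradient descent, with d an averaging down-sampler (blur/warp omitted)."""
    H, W = g.shape[0] * q, g.shape[1] * q
    f = np.kron(g, np.ones((q, q)))                    # start from NN up-sampling
    for _ in range(n_iter):
        sim = f.reshape(H // q, q, W // q, q).mean(axis=(1, 3))
        data_grad = np.kron(sim - g, np.ones((q, q))) / (q * q)
        lap = (np.roll(f, 1, 0) + np.roll(f, -1, 0)
               + np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f)
        f -= step * (data_grad - lam * lap)            # prior gradient = -Laplacian
    return f
```

The smoothness prior acts as heat flow on the HR estimate, while the data term keeps the down-sampled estimate consistent with the observation.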
2.7 Single-Image SR
Single-image SR is a severely ill-posed problem; effective prior knowledge about the HR image is necessary to obtain a well-posed HR estimation. These algorithms are designed to work from a single low-resolution acquisition of the image. Some of the proposed strategies are summarized as follows:
1. Pure interpolation, using estimation of the unknown pixels in the HR image, modification of the interpolation kernel, and checking of the high-frequency content in the estimated HR output image.
2. Learning the HR information from external databases. In this case, many strategies for concentrating the image information and clustering are used. The image is divided into overlapping patches, and this information is mapped over a dictionary of LR–HR pairs of patches from external images.
3. Management of the gradient information in the image.
4. Hybrid models that reconstruct the image with a minimization procedure in which some prior knowledge about the estimation error is included.
2.7.1 Geometric Duality
The concept of geometric duality is one of the most useful tools in parametric SR with least-squares estimation for interpolation, and one of the most cited algorithms in SR comparisons is the new edge-directed interpolation (NEDI) (Li and Orchard 2001).
The idea behind it is that each low-resolution pixel also exists in the HR image while its neighbor pixels are unknown. Hence, with two orthogonal pairs of directions around the low-resolution pixel in the HR image (horizontal, vertical, and diagonal directions), a least-squares estimation can be applied to each pair. The equation system is constructed on the LR image, and the resulting coefficients are then used to estimate pixels in the initial HR image. The first estimation is made using Eq. (5.17),
where the coefficients are obtained in the same configuration as in the LR image. In this case, the unknown pixels between LR pixels that exist in the HR image (in vertical and horizontal directions) are estimated. In the next step, the unknown pixels between LR pixels that exist in the HR image (in diagonal directions) are estimated. The pixels of each category are shown in Fig. 5.5.
Zhang and Wu (2008) take advantage of NEDI. There, a new restriction is applied that includes the pixels estimated in the second step, and the least-squares estimation is made using the 8-connected pixels around a central pixel in the diamond configuration shown in Fig. 5.6. They define a 2D piecewise autoregressive (PAR) image model with parameters \( a_{k} ,k \in [1, \ldots ,4] \) to characterize the diagonal correlations in a local diamond window \( W \), and extra parameters \( b_{k} ,k \in [1, \ldots ,4] \) to impose horizontal and vertical correlations in the LR image, as shown in Fig. 5.6b. The parameters are obtained with a linear least-squares estimator using the four 4-connected neighbors for \( b_{k} \) (horizontal and vertical) and the four 8-connected diagonal neighbors available in the LR image for \( a_{k} \).
To interpolate the missing HR pixel in the window, the least-square strategy of Eq. (5.18) is carried out.
where \( x_{i} \) and \( y_{i} \) are the LR and the HR pixels, respectively, \( x_{i\diamondsuit k}^{(8)} \) are the four 8-connected LR neighbors available for a missing \( y_{i} \) pixel and for a \( x_{i} \) pixel, and \( y_{i\diamondsuit k}^{(8)} \) denotes its HR missing four 8-connected pixels.
Another approach to NEDI algorithms (Ren et al. 2006; Hung and Siu 2012) uses a weighting matrix \( {\mathbf{W}} \) to assign different influences of the neighbor pixels on the pixel under estimation; the correlation is affected by the distance between pixels. The diagonal correlation model parameter is estimated using a weighted least-squares strategy.
where \( {\mathbf{A}} \in R^{4 \times 1} \) is the diagonal correlation model parameter, \( {\mathbf{L}} \in R^{64 \times 1} \) is a vector of the LR pixels, \( {\mathbf{L}}_{LA} \in R^{64 \times 4} \) are the neighbors of \( {\mathbf{L}} \), and \( {\mathbf{W}} \in R^{64 \times 64} \) is the weighting matrix of Eq. (5.20).
where \( \sigma_{1} \) and \( \sigma_{2} \) are global filter parameters, \( {\mathbf{L}}_{c} \in R^{4 \times 1} \) is the HR geometric structure, \( {\mathbf{L}}_{LAi} \in R^{4 \times 1} \) is the ith LR geometric structure, \( \left\| {\, \cdot \,} \right\|_{p} \) denotes the p-norm (1 or 2), and \( {\mathbf{V}}_{LAi} ,{\mathbf{V}}_{c} \in R^{2 \times 1} \) are the coordinates of \( {\mathbf{L}}_{LAi} \) and \( {\mathbf{L}}_{c} \). The all-rounded correlation model parameter \( {\mathbf{B}} \in R^{8 \times 1} \) is given by,
where \( {\mathbf{L}}_{LB} \in R^{64 \times 8} \) are the neighbor’s positions in \( {\mathbf{L}} \).
2.7.2 Learning-Based SR Algorithms
In these algorithms, the relationship between some HR and LR examples (from a specific class, such as face images or animals) is learned. The training database, such as the example shown in Fig. 5.8, needs to have proper characteristics (Kong et al. 2006); the learned knowledge serves as an a priori term for the reconstruction. The measure of the two factors of sufficiency and predictability is explained in (Kong et al. 2006). In general, a larger database yields better results, but a larger number of irrelevant examples only increases the computational search time and can disturb the results. The content-based classification of image patches (as in a codebook) during training is suggested as an alternative in (Li et al. 2009).
The dictionary used can be self-learned or external. Typically, techniques like K-means are used to cluster the n observations into k clusters, and the principal component analysis (PCA) algorithm is employed to reduce the dimensionality, an important step in dictionary learning. Figure 5.7 shows a typical projection model for dictionary learning in SR, and Fig. 5.8 shows an example of an LR input image together with a pair of LR–HR dictionary images.
The PCA projection is based on finding the eigenvectors and eigenvalues of an autocorrelation matrix. This can be expressed as,
where \( {\varvec{\Omega}} \) is the autocorrelation matrix of the input data \( {\mathbf{U}} \), \( {\varvec{\Psi}} \) is the matrix of eigenvectors, and \( {\varvec{\Lambda}} \) is a diagonal matrix containing the eigenvalues. The eigenspace is the projection of \( {\mathbf{U}} \) into the eigenvectors. The data at high and low resolution \( {\mathbf{U}}_{h} \) and \( {\mathbf{U}}_{l} \) are used to find the minimum distance in a projection over the found eigenspace.
In the dictionary search, the patches are the rows or columns of the data matrix \( {\mathbf{U}}^{h} \) or \( {\mathbf{U}}^{l} \). The strategy is to find the position of the HR patch with minimum distance with respect to the projection of an LR patch in the HR eigenspace.
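The eigenspace projection and nearest-patch search can be sketched as follows. Patches are stored as columns of a data matrix; the mean-centering step and the function names are our own choices for this illustration.

```python
import numpy as np

def pca_basis(U, k):
    """Top-k eigenvectors of the autocorrelation matrix of the column-wise
    patch data U (patches are columns, mean-centered here)."""
    Uc = U - U.mean(axis=1, keepdims=True)
    omega = Uc @ Uc.T / Uc.shape[1]          # autocorrelation matrix (Eq. 5.x form)
    vals, vecs = np.linalg.eigh(omega)       # symmetric matrix -> eigh
    order = np.argsort(vals)[::-1]           # sort eigenvalues descending
    return vecs[:, order[:k]]

def nearest_patch(psi, dict_lr, query_lr):
    """Index of the dictionary patch closest to the query in the eigenspace."""
    proj_dict = psi.T @ dict_lr              # project all dictionary patches
    proj_q = psi.T @ query_lr                # project the query patch
    d = np.linalg.norm(proj_dict - proj_q[:, None], axis=0)
    return int(np.argmin(d))
```

The returned index selects the LR–HR pair whose HR half is then used to synthesize the output patch.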
2.7.3 Diffusive SR
Perona and Malik (1990) developed a method that employs a diffusion equation for the reconstruction of the image. The local context of the image is processed using a function to restore the edges.
where \( c \) is a function controlling the diffusivity. For example, if \( c = 1 \), the process is linear, isotropic, and homogeneous; if \( c \) is a function depending on \( {\mathbf{I}}_{x} \), i.e., \( c = c({\mathbf{I}}_{x} ) \), the process becomes a nonlinear diffusion; and if \( c \) is a matrix-valued diffusivity, the process is called anisotropic, leading to a diffusion that differs across directions. The image is differentiated along the cardinal directions, and a group of coefficients is obtained at each point using the gradient information.
Finally, the image is reconstructed by adding the variations in the iterative process of Eq. (5.28).
This principle has served as a guide for processes that are local in time over the image, used in image-processing algorithms for adaptation to the local context of the image.
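A minimal NumPy implementation of Perona–Malik diffusion with the exponential edge-stopping function (one of the two diffusivities proposed in the original paper) can look like this; `kappa` and `dt` are illustrative values, and the `np.roll` boundary handling is a simplification.

```python
import numpy as np

def perona_malik(I, n_iter=20, kappa=0.1, dt=0.2):
    """Perona-Malik nonlinear diffusion: c(d) = exp(-(d/kappa)^2) slows
    diffusion across strong edges while smoothing flat regions."""
    I = I.astype(float).copy()
    for _ in range(n_iter):
        # differences toward the four cardinal neighbours (wrap-around borders)
        dN = np.roll(I, -1, 0) - I
        dS = np.roll(I, 1, 0) - I
        dE = np.roll(I, -1, 1) - I
        dW = np.roll(I, 1, 1) - I
        c = lambda d: np.exp(-(d / kappa) ** 2)   # edge-stopping diffusivity
        I += dt * (c(dN) * dN + c(dS) * dS + c(dE) * dE + c(dW) * dW)
    return I
```

With a large `kappa` the scheme behaves like linear smoothing; with a small `kappa` strong edges are preserved because their diffusivity is driven toward zero.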
2.7.4 TFOCS
Reconstruction methods require powerful mathematical tools for minimizing the estimation error. A commonly used resource is the split-Bregman model, in which the L1 norm is employed. In (Becker et al. 2011), the library Templates for First-Order Conic Solvers (TFOCS) was designed to facilitate the construction of first-order methods for a variety of convex optimization problems. Its development was motivated by its authors' interest in compressed sensing, sparse recovery, and low-rank matrix completion. In a general form, this tool lets us solve for x in the inverse problem:
where the function f is smooth and convex, h is convex, \( {\rm A} \) is a linear operator, and \( b \) is a bias vector. The function h must also be prox-capable; in other words, it must be inexpensive to compute its proximity operator of Eq. (5.30).
The following is an example of solution with TFOCS; consider the following problem,
this problem can be written as:
where \( h(x) = 0 \) if \( \left\| x \right\|_{1} \le \tau \) and +∞ otherwise. Translated to a single line of code:
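TFOCS is a MATLAB library, and its one-line call is not reproduced here. As a self-contained stand-in, the same constrained problem, minimize \( \left\| Ax - b \right\|_{2}^{2} \) subject to \( \left\| x \right\|_{1} \le \tau \), can be solved with projected gradient descent; the sort-based l1-ball projection below follows the well-known algorithm of Duchi et al., which is our choice, not part of TFOCS.

```python
import numpy as np

def project_l1(v, tau):
    """Euclidean projection of v onto the l1 ball of radius tau (sort-based)."""
    if np.abs(v).sum() <= tau:
        return v.copy()
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - tau) / (np.arange(len(u)) + 1) > 0)[0][-1]
    theta = (css[rho] - tau) / (rho + 1.0)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def lasso_pg(A, b, tau, n_iter=500):
    """min ||Ax - b||^2  s.t.  ||x||_1 <= tau, by projected gradient descent."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2      # step from the Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = project_l1(x - step * A.T @ (A @ x - b), tau)
    return x
```

For an identity operator the solution is simply the projection of b onto the l1 ball, which gives a quick correctness check.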
The library was employed in Ren et al. (2017) to minimize an estimation function in which two priors are employed: the first, an adaptive high-dimensional non-local total variation (AHNLTV) prior, compares a TV-based estimate of a central patch against a search window; the second is a weighted adaptive geometric duality (AGD) prior. Figure 5.9 shows the visual comparison between bicubic interpolation and the AHNLTV-AGD method after HR image estimation.
2.7.5 Total Variation
Total variation (TV) uses a regularization term as in the MAP formulation. It applies similar penalties to a smooth and a step edge, preserves edges, and avoids ringing effects; Eq. (5.33) is the TV term,
where ∇ is the gradient operator. The TV term can be weighted with an adaptive spatial algorithm based on differences in the curvature. For example, the bilateral total variation (BTV) (Farsiu et al. 2003) is used to approximate TV, and it is defined in Eq. (5.34),
where \( S_{x}^{k} \) and \( S_{y}^{l} \) shift \( f \) by k and l pixels in the x and y directions to present several scales of derivatives, \( 0 < \alpha < 1 \) imposes a spatial decay on the results (Farsiu et al. 2003), and P is the largest scale at which the derivatives are calculated, so that derivatives are computed at multiple scales of resolution (Farsiu et al. 2006). In (Wang et al. 2008), the authors discuss that this a priori term generates saturated data when applied to unmanned-aerial-vehicle imagery; therefore, they suggest combining it with the Huber function, resulting in the BTV-Huber term of Eq. (5.35),
where A is the BTV regularization term and α is obtained as \( \alpha = {\text{median}}\left[ {\left| {A - {\text{median}}(A)} \right|} \right] \). This term keeps the smoothness of the continuous regions and preserves edges in discontinuous regions (Wang et al. 2008). In (Li et al. 2010), a locally adaptive version of BTV, called LABTV, was introduced to provide a balance between noise suppression and the preservation of image details. To do so, an Lp norm is used instead of the L1 norm. The value of p for each pixel is defined based on the difference between the pixel and its surroundings: in smooth regions, where noise reduction is important, p is set to a large value, close to two; in non-smooth regions, where edge preservation is important, p is set to small values, close to one. The same idea of adaptive norms, but with different methods for obtaining the weights, has been employed in (Omer and Tanaka 2010; Song et al. 2010; Huang et al. 2011; Liu and Sun 2011; Mochizuki et al. 2011).
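The BTV penalty of Eq. (5.34) can be evaluated directly; in the sketch below the sum runs over all shifts with \( |k|, |l| \le P \), a choice of convention on our part (some formulations restrict the shift range to avoid counting each pair twice).

```python
import numpy as np

def btv(f, P=2, alpha=0.6):
    """Bilateral total variation value of Eq. (5.34):
    sum over shifts of alpha^(|k|+|l|) * || f - S_x^k S_y^l f ||_1."""
    total = 0.0
    for k in range(-P, P + 1):
        for l in range(-P, P + 1):
            if k == 0 and l == 0:
                continue                       # zero shift contributes nothing
            shifted = np.roll(f, (k, l), axis=(0, 1))
            total += alpha ** (abs(k) + abs(l)) * np.abs(f - shifted).sum()
    return total
```

A constant image has zero BTV, while any edge contributes a positive penalty that decays with the shift distance through \( \alpha^{|k|+|l|} \).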
2.7.6 Gradient Management
Gradients are a topic of interest in SR. Changes in the image are fundamental evidence of resolution, and high-frequency content carries the maximal changes of value between consecutive pixels. Gradient management has been addressed in two forms: first, by using an external dictionary of HR gradients, and second, by working directly on the LR image and reconstructing the HR gradients from the context of the image and regularization terms.
In these methods (Sun et al. 2008; Wang et al. 2013), a relationship is established in order to sharpen the edges. In the first case, the gradients of an external HR database are analyzed, and the gradients of the LR input image are reconstructed with a dictionary technique. In the second case, the technique does not require external dictionaries: the procedure is guided by the second derivative of the same LR image amplified by pure interpolation, and a gradient scale factor extracted from the local characteristics of the image is then incorporated.
In this chapter, we propose a new gradient-management algorithm and its application in a novel SR procedure in which a bidirectional and orthogonal gradient field is employed. In our algorithm, two new procedures are proposed; in the first, the gradient field employed is calculated as:
Then, the procedure is integrated as shown in Fig. 5.10; for deeper understanding, refer to (Wang et al. 2013).
The second form of our procedure applies the gradient fields independently. That is, the gradient fields are calculated by convolving the image with the discrete gradient operators of Eq. (5.37) to obtain the differences along the diagonal directions. The resulting model is shown in Fig. 5.11.
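A sketch of this second form follows. Since the kernels of Eq. (5.37) are not reproduced above, the 2 × 2 diagonal-difference operators below are our own illustrative guess, and the convolution is implemented in cross-correlation form with 'valid' boundaries.

```python
import numpy as np

def diagonal_gradients(img):
    """Differences along the two diagonal directions, computed with 2x2
    discrete operators (illustrative kernels, cross-correlation form)."""
    k45 = np.array([[0.0, 1.0], [-1.0, 0.0]])    # 45-degree direction
    k135 = np.array([[1.0, 0.0], [0.0, -1.0]])   # 135-degree direction

    def xcorr2_valid(a, k):
        h, w = k.shape
        out = np.zeros((a.shape[0] - h + 1, a.shape[1] - w + 1))
        for i in range(h):
            for j in range(w):
                out += k[i, j] * a[i:i + out.shape[0], j:j + out.shape[1]]
        return out

    return xcorr2_valid(img, k45), xcorr2_valid(img, k135)
```

On a horizontal ramp both diagonal responses are constant, with opposite signs, which matches the orientation of the two kernels.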
2.7.7 Hybrid BTV and Gradient Management
This section proposes the integration of two powerful tools for SR: TV and gradient control. In the proposed case, the gradient regularization is applied first, using the model proposed in Sect. 5.2.7.6. That technique produces some artifacts when the amplification scale is high and the regularization term takes large values. This problem is addressed by the TV component, also described previously, which averages similar pixels across the image to estimate the high resolution. Here, the two characteristics can cooperate for a better result.
The general procedure of the proposed method is shown in Fig. 5.12, and the visual comparison between the LR image and the HR image is shown in Fig. 5.16. The proposed algorithm is named orthogonal and directional gradient management with bilateral total variation (ODGM-BTV). It is only one illustration of the multiple possibilities for the creation of SR algorithms.
3 Results of the Proposed Methods
In this section, the results of the proposed methods are illustrated. Experiments on test and real images are presented with scaling factors of 2, 3, and 4. The objective metrics used were peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM), and results are given in Tables 5.1, 5.2, and 5.3. Subjective performance of our SR schemes is evaluated in Figs. 5.13, 5.14, 5.15, and 5.16.
3.1 Gradient Management
In these experiments, the group of images shown in Fig. 5.15, included in the BSDS500 database, was used. The amplification factors were 2, 3, and 4. Tables 5.1, 5.2, and 5.3 show the increment in PSNR and SSIM of the second proposed alternative, with independent treatment of the two gradient fields. Also, the test image was used to observe the sharpening effect around contours, and the results are shown in Fig. 5.13. Figure 5.14 shows the plot of row 60, taken from the test image of Fig. 5.13, to illustrate the edge transitions of the recovered HR image.
3.2 Hybrid BTV and Gradient Management
Figure 5.16 shows the result of the proposed method ODGM-BTV using a scale of amplification of 4.
Algorithm:
- Input: LR image, iteration number
- For i = 1 : iteration number
  1. Apply the BTV algorithm to the LR input image.
  2. Apply the bidirectional orthogonal gradient management.
  3. Update the LR input image with the HR output image.
- end
- Output: HR image
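The loop above can be sketched as follows. `btv_step` and `gradient_step` are deliberately simplified stand-ins (a sign-based descent step on a BTV-like penalty and an unsharp-mask sharpening via the Laplacian), not the exact operators of the proposed method.

```python
import numpy as np

def btv_step(f, alpha=0.6, P=1, lr=0.05):
    """One descent step on a simplified BTV penalty: pull each pixel toward
    its shifted copies, weighted by alpha^(|k|+|l|)."""
    g = np.zeros_like(f)
    for k in range(-P, P + 1):
        for l in range(-P, P + 1):
            if k == 0 and l == 0:
                continue
            g += alpha ** (abs(k) + abs(l)) * np.sign(f - np.roll(f, (k, l), (0, 1)))
    return f - lr * g

def gradient_step(f, amount=0.3):
    """Simplified gradient management: unsharp-mask sharpening via the Laplacian."""
    lap = (np.roll(f, 1, 0) + np.roll(f, -1, 0)
           + np.roll(f, 1, 1) + np.roll(f, -1, 1) - 4 * f)
    return f - amount * lap

def odgm_btv(lr_img, n_iter=3):
    """Alternate the two steps of the listed algorithm."""
    f = lr_img.astype(float).copy()
    for _ in range(n_iter):
        f = btv_step(f)        # step 1: BTV regularization
        f = gradient_step(f)   # step 2: gradient management (sharpening)
    return f
```

The alternation lets the smoothing step suppress the artifacts introduced by the sharpening step, which is the complementarity the chapter argues for.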
4 Metrics
The PSNR in dB of Eq. (5.38) and the SSIM of Eq. (5.39) are the metrics most used to evaluate SR algorithms.
where x and y are the two signals to compare, \( {\text{MSE}}(x,y) \) is the mean square error, and \( v_{\hbox{max} } \) is the maximum possible value in the range of the signals. The SSIM factor (Wang et al. 2004) is calculated as,
where \( \mu_{x} \) and \( \mu_{y} \) are the mean values of x and y; \( \sigma_{x}^{2} \), \( \sigma_{y}^{2} \), and \( \sigma_{xy} \) are the variances and the covariance of x and y; and \( c_{1} \) and \( c_{2} \) are constant terms. Another metric derived from the SSIM is the mean SSIM (MSSIM) of Eq. (5.40),
where M is the number of the areas being compared.
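Both metrics can be computed directly. The sketch below implements PSNR and a single-window (global) SSIM with the usual constants \( c_{1} = (0.01\,v_{\max})^{2} \) and \( c_{2} = (0.03\,v_{\max})^{2} \); the MSSIM of Eq. (5.40) would average the SSIM over M local windows.

```python
import numpy as np

def psnr(x, y, vmax=255.0):
    """Peak signal-to-noise ratio in dB, Eq. (5.38)."""
    mse = np.mean((x.astype(float) - y.astype(float)) ** 2)
    return 10.0 * np.log10(vmax ** 2 / mse)

def ssim(x, y, vmax=255.0):
    """Global (single-window) SSIM, Eq. (5.39); MSSIM averages this over windows."""
    c1, c2 = (0.01 * vmax) ** 2, (0.03 * vmax) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

An image compared with itself gives SSIM = 1, and a maximally wrong constant image gives PSNR = 0 dB, which provides the two boundary checks.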
4.1 Discussion of the Results
Tables 5.1, 5.2, and 5.3 show an enhancement of the quality parameters SSIM and PSNR for our proposed method over the management of a single gradient, with the largest increments of the quality factors occurring at amplification scales greater than 3. Our procedure follows the gradients naturally and gives a more precise account of the slopes, which is a contribution to the state of the art in gradient-management algorithms, although the goal of this chapter is an overview of complements for super-resolution rather than a novel algorithm or an improvement over the state of the art. The overview shows that SR is a very rich field of investigation: at each step, one can find an opportunity to apply some method by exploiting its strongest principle of operation. An example is the combination of BTV and ODGM; the visual effect in Fig. 5.16 is very interesting, and the greater resolution per area can be observed. The contribution in this case avoids the artifacts of gradient management and, at the same time, yields a less blurred image than the BTV method alone, thanks to the sharpening procedure applied to the edges.
The review of the literature leads to some conclusions. Research on this topic is extensive, and the contributions to the state of the art are, in most cases, small changes to well-known procedures. Unfortunately, the goal is driven by quality measurements, and the benchmarks that guide the results are based on different configurations of Eqs. (5.38), (5.39), and (5.40); as a consequence, comparison between many reported algorithms and models is difficult and not always possible. For this reason, the borders between classifications of the methods become blurred, and in an overview, a direct comparison between methods is less useful than the classification itself and its explanation. Moreover, the great creativity exhibited in the different methods and their totally different mathematical solutions make it difficult to establish mathematical comparisons and objective conclusions beyond empirical results based on measurement metrics.
5 Conclusions
SR is an exciting and diverse subject in digital image processing, and it takes many forms. Each algorithm has a place in this area of research, and the field as a whole is extremely complex and comprehensive. The study of these techniques should be oriented from the beginning, because the development of each of them is broad and difficult to reproduce, and sometimes only a small advance can be made in one of them. Also, the initial conditions differ in each case, so some common bases of comparison are required; the literature proposes some standard measurements, but the application conditions are diverse. A useful strategy for approaching SR research is understanding the rationale behind preexisting algorithms. Advantages and disadvantages are important factors to consider in order to combine characteristics that produce more convincing effects and better output-image quality in a system. The proposed example performs edge sharpening and averaging for estimation; the first method produces artifacts, while the second fails to produce clear edges, and a case was proposed in which these two characteristics complement each other positively. For future work, we will continue the study of the multiple possibilities in the field of SR estimation using image transformations and learning from different characterizations, such as wavelet fluctuations with dictionary learning. Another interesting field is minimization procedures with multiple residual priors in the estimation, as was done in works such as (Ren et al. 2017).
References
Becker, S., Candès, E., & Grant, M. (2011). Templates for convex cone problems with applications to sparse signal recovery. Mathematical Programming Computation, 3, 165–218.
Belekos, S., Galatsanos, N., & Katsaggelos, A. (2010). Maximum a posteriori video super-resolution using a new multichannel image prior. IEEE Transactions on Image Processing, 19(6), 1451–1464.
Capel, D., & Zisserman, A. (1998). Automated mosaicing with super-resolution zoom. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Santa Barbara, California, USA, 1, 885–891.
Cheeseman, P., Kanefsky, B., Kraft, R., Stutz, J., & Hanson, R. (1994). Super-resolved surface reconstruction from multiple images (1st ed.). London, United Kingdom: Springer Science + Business Media.
Dai, S., Han, M., Xu, W., Wu, Y., & Gong, Y. (2007). Soft edge smoothness prior for alpha channel super resolution. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Minneapolis, Minnesota, USA, 1, 1–8.
Dai, S., Han, M., Xu, W., Wu, Y., Gong, Y., & Katsaggelos, A. (2009). SoftCuts: a soft edge smoothness prior for color image super-resolution. IEEE Transactions on Image Processing, 18(5), 969–981.
Debes, C., Wedi, T., Brown, C., & Zoubir, A. (2007). Motion estimation using a joint optimisation of the motion vector field and a super-resolution reference image. In Proceedings of IEEE International Conference on Image Processing (ICIP), San Antonio, Texas, USA, 2, 479–500.
Dong, W., Zhang, L., Shi, G., & Wu, X. (2011). Image deblurring and super-resolution by adaptive sparse domain selection and adaptive regularization. IEEE Transactions on Image Processing, 20(7), 1838–1856.
Elad, M., & Hel-Or, Y. (2001). A fast super-resolution reconstruction algorithm for pure translational motion and common space-invariant blur. IEEE Transactions on Image Processing, 10(8), 1187–1193.
Farsiu, S., Robinson, D., Elad, M., & Milanfar, P. (2003). Robust shift and add approach to super-resolution. In Proceedings of SPIE Conference on Applications of Digital Signal and Image Processing, San Diego, California, USA, 1, 121–130.
Farsiu, S., Robinson, D., Elad, M., & Milanfar, P. (2004). Fast and robust multi-frame super-resolution. IEEE Transactions on Image Processing, 13(10), 1327–1344.
Farsiu, S., Elad, M., & Milanfar, P. (2006). A practical approach to super-resolution. In Proceedings of SPIE Conference on Visual Communications and Image Processing, San Jose, California, USA, 6077, 1–15.
Huang, K., Hu, R., Han, Z., Lu, T., Jiang, J., & Wang, F. (2011). A face super-resolution method based on illumination invariant feature. In Proceedings of IEEE International Conference on Multimedia Technology (ICMT), Hangzhou, China, 1, 5215–5218.
Hung, K., & Siu, W. (2012). Robust soft-decision interpolation using weighted least squares. IEEE Transactions on Image Processing, 21(3), 1061–1069.
Irani, M., & Peleg, S. (1990). Super-resolution from image sequences. In Proceedings of 10th IEEE International Conference on Pattern Recognition, Atlantic City, New Jersey, USA, 1, 115–120.
Irani, M., & Peleg, S. (1991). Improving resolution by image registration. CVGIP Graphical Models and Image Processing, 53(3), 231–239.
Irani, M., & Peleg, S. (1992). Image sequence enhancement using multiple motions analysis. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Champaign, Illinois, USA, 1, 216–222.
Irani, M., & Peleg, S. (1993). Motion analysis for image enhancement: Resolution, occlusion, and transparency. Journal of Visual Communication and Image Representation, 4(4), 324–335.
Jung, C., Jiao, L., Liu, B., & Gong, M. (2011a). Position-patch based face hallucination using convex optimization. IEEE Signal Processing Letters, 18(6), 367–370.
Jung, M., Bresson, X., Chan, T., & Vese, L. (2011b). Nonlocal Mumford-Shah regularizers for color image restoration. IEEE Transactions on Image Processing, 20(6), 1583–1598.
Keren, D., Peleg, S., & Brada, R. (1998). Image sequence enhancement using subpixel displacements. In Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR), Ann Arbor, Michigan, USA, 1, 742–746.
Kim, K., & Kwon, Y. (2008). Example-based learning for single-image super-resolution. Pattern Recognition, LNCS, 5096, 456–465.
Kim, K., & Kwon, Y. (2010). Single-image super-resolution using sparse regression and natural image prior. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(6), 1127–1133.
Kong, D., Han, M., Xu, W., Tao, H., & Gong, Y. (2006). A conditional random field model for video super-resolution. In Proceedings of 18th International Conference on Pattern Recognition (ICPR), Hong Kong, China, 1, 619–622.
Kumar, S., & Nguyen, T. (2010). Total subset variation prior. In Proceedings of IEEE International Conference on Image Processing (ICIP), Hong Kong, China, 1, 77–80.
Li, X., & Orchard, M. (2001). New edge-directed interpolation. IEEE Transactions on Image Processing, 10(10), 1521–1527.
Li, F., Jia, X., & Fraser, D. (2008). Universal HMT based super resolution for remote sensing images. In Proceedings of 15th IEEE International Conference on Image Processing (ICIP), San Diego, California, USA, 1, 333–336.
Li, X., Lam, K., Qiu, G., Shen, L., & Wang, S. (2009). Example-based image super-resolution with class-specific predictors. Journal of Visual Communication and Image Representation, 20(5), 312–322.
Li, X., Hu, Y., Gao, X., Tao, D., & Ning, B. (2010). A multi-frame image super-resolution method. Signal Processing, 90(2), 405–414.
Liu, C., & Sun, D. (2011). A Bayesian approach to adaptive video super resolution. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, Colorado, USA, 1, 209–216.
Mallat, S., & Yu, G. (2010). Super-resolution with sparse mixing estimators. IEEE Transactions on Image Processing, 19(11), 2889–2900.
Martins, A., Homem, M., & Mascarenhas, N. (2007). Super-resolution image reconstruction using the ICM algorithm. In Proceedings of IEEE International Conference on Image Processing (ICIP), San Antonio, Texas, USA, 4, 205–208.
Mochizuki, Y., Kameda, Y., Imiya, A., Sakai, T., & Imaizumi, T. (2011). Variational method for super-resolution optical flow. Signal Processing, 91(7), 1535–1567.
Morera, D. (2015). Determining parameters for images amplification by pulses interpolation. Ingeniería Investigación y Tecnología, 16(1), 71–82.
Morera, D. (2014). Amplification by pulses interpolation with high frequency restrictions for conservation of the structural similitude of the image. International Journal of Signal Processing, Image Processing and Pattern Recognition, 7(4), 195–202.
Omer, O., & Tanaka, T. (2010). Image super-resolution based on locally adaptive mixed-norm. Journal of Electrical and Computer Engineering, 2010, 1–4.
Perona, P., & Malik, J. (1990). Scale-space and edge detection using anisotropic diffusion. IEEE Transactions on Pattern Analysis and Machine Intelligence, 12(7), 629–639.
Peyre, G., Bougleux, S., & Cohen, L. (2008). Non-local regularization of inverse problems. In Proceedings of European Conference on Computer Vision, Marseille, France, 5304, 57–68.
Pickup, L., Capel, D., & Roberts, S. (2006). Bayesian image super-resolution, continued. Neural Information Processing Systems, 19, 1089–1096.
Pickup, L. (2007). Machine learning in multi-frame image super-resolution. Ph.D. thesis, University of Oxford.
Prendergast, R., & Nguyen, T. (2008). A block-based super-resolution for video sequences. In Proceedings of 15th IEEE International Conference on Image Processing (ICIP), San Diego, California, USA, 1, 1240–1243.
Purkait, P., & Chanda, B. (2012). Super resolution image reconstruction through Bregman iteration using morphologic regularization. IEEE Transactions on Image Processing, 21(9), 4029–4039.
Ren, C., He, X., Teng, Q., Wu, Y., & Nguyen, T. (2016). Single image super-resolution using local geometric duality and non-local similarity. IEEE Transactions on Image Processing, 25(5), 2168–2183.
Ren, C., He, X., & Nguyen, T. (2017). Single image super-resolution via adaptive high-dimensional non-local total variation and adaptive geometric feature. IEEE Transactions on Image Processing, 26(1), 90–106.
Schultz, R., & Stevenson, R. (1994). A Bayesian approach to image expansion for improved definition. IEEE Transactions on Image Processing, 3(3), 233–242.
Shao, W., & Wei, Z. (2008). Edge-and-corner preserving regularization for image interpolation and reconstruction. Image and Vision Computing, 26(12), 1591–1606.
Song, H., Zhang, L., Wang, P., Zhang, K., & Li, X. (2010). An adaptive L1–L2 hybrid error model to super-resolution. In Proceedings of 17th IEEE International Conference on Image Processing (ICIP), Hong Kong, China, 1, 2821–2824.
Sun, J., Xu, Z., & Shum, H. (2008). Image super-resolution using gradient profile prior. In Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Anchorage, Alaska, USA, 1, 1–8.
Tappen, M., Russell, B., & Freeman, W. (2003). Exploiting the sparse derivative prior for super-resolution and image demosaicing. In Proceedings of IEEE 3rd International Workshop on Statistical and Computational Theories of Vision (SCTV), Nice, France, 1, 1–24.
Villena, S., Abad, J., Molina, R., & Katsaggelos, A. (2004). Estimation of high resolution images and registration parameters from low resolution observations. Progress in Pattern Recognition, Image Analysis and Applications, LNCS, 3287, 509–516.
Wang, Z., Bovik, A., Sheikh, H., & Simoncelli, E. (2004). Image quality assessment: from error visibility to structural similarity. IEEE Transactions on Image Processing, 13(4), 600–612.
Wang, Y., Fevig, R., & Schultz, R. (2008). Super-resolution mosaicking of UAV surveillance video. In Proceedings of 15th IEEE International Conference on Image Processing (ICIP), San Diego, California, USA, 1, 345–348.
Wang, L., Xiang, S., Meng, G., Wu, H., & Pan, C. (2013). Edge-directed single-image super-resolution via adaptive gradient magnitude self-interpolation. IEEE Transactions on Circuits and Systems for Video Technology, 23(8), 1289–1299.
Woods, N., & Galatsanos, N. (2005). Non-stationary approximate Bayesian super-resolution using a hierarchical prior model. In Proceedings of IEEE International Conference on Image Processing (ICIP), Genova, Italy, 1, 37–40.
Zhang, X., & Wu, X. (2008). Image interpolation by adaptive 2-D autoregressive modeling and soft-decision estimation. IEEE Transactions on Image Processing, 17(6), 887–896.
© 2018 Springer International Publishing AG, part of Springer Nature
Morera-Delfín, L., Pinto-Elías, R., Ochoa-Domínguez, HdJ. (2018). Overview of Super-resolution Techniques. In: Vergara Villegas, O., Nandayapa, M., Soto, I. (eds) Advanced Topics on Computer Vision, Control and Robotics in Mechatronics. Springer, Cham. https://doi.org/10.1007/978-3-319-77770-2_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-77769-6
Online ISBN: 978-3-319-77770-2