
1 Introduction

Super resolution (SR) image reconstruction refers to a signal processing approach that obtains a high resolution (HR) image from one or more observed low resolution (LR) images [1]. For the single-image SR problem, simple interpolation methods such as bilinear or bicubic interpolation do not perform well because only neighboring pixels are used to estimate the missing information; the enlarged image is blurry, especially in regions containing edges and textures. Some researchers [2, 3] incorporated natural image priors into traditional interpolation and produced images with sharper edges, but these methods are unstable and prone to artifacts. Another class of methods employs machine learning techniques. Freeman et al. [4, 5] first proposed this idea and named it “example-based super-resolution”. They estimated the missing details with the help of abundant example pairs extracted from an external training set. The algorithm learns the co-occurrence prior between LR and HR image patches, and this prior is then used in SR reconstruction. Example-based SR breaks through the bottleneck of conventional SR methods and generates sharper images. Inspired by manifold learning, neighbor embedding for SR was proposed [6]; in this method, multiple examples contribute simultaneously to the generation of each HR image patch. In 2008, Yang et al. [7] employed “sparse coding” to represent an image patch sparsely over an example set. Moreover, the link between LR and HR examples is replaced by a common sparse representation coefficient over the LR or HR example set. Then, Wang [8] and Yang [9] successively used a pair of learned compact over-complete dictionaries to replace the two huge example sets, which greatly speeds up sparse coding. However, learning the over-complete dictionary-pair is very time-consuming, so some researchers have tried to accelerate the learning process [10–12]. In recent years, Dong et al. [13] proposed a nonlocal centralized sparse representation technique and Peleg et al. [14] presented a statistical prediction model based on sparse representations, both of which generate better SR images.

Here, we propose a fast dictionary learning technique, K-Eigen Decomposition (K-EIG), and employ it in a sparse representation based SR framework. In addition, we integrate the classification of image patches and the extension of edge patches into the whole algorithm to improve the quality of the SR images. The outline of this paper is as follows. In Sect. 2, we analyze the essence of sparse representation based image SR methods. The details of our proposed algorithm are presented in Sect. 3. In Sect. 4, we show the simulations and results. The conclusion is given in Sect. 5.

2 Reviews and Analysis

Let \( \varvec{I}_{l} \) be the given LR image and \( \varvec{I}_{h} \) be the corresponding HR image which needs to be estimated. Usually, the degradation model can be written as:

$$ \varvec{I}_{l} = (\varvec{I}_{h} * f) \downarrow s, $$
(1)

where \( f \) is a blurring kernel, \( * \) is the convolution operation and \( \downarrow s \) means down-sampling by a factor \( s \). In addition to the known degradation model, an external training set \( \varvec{TS} \) containing various HR images \( \varvec{T}_{h} \) can be used. The objective of the example-based image SR methods is to predict the missing details in \( \varvec{I}_{l} \) by “borrowing” information from similar examples in \( \varvec{TS} \) [15] and then reconstruct \( \varvec{I}_{h} \). Next, we analyze the essence of sparse representation based image SR methods from the following three aspects; some representative prior work is reviewed along the way.
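Before turning to these aspects, a minimal sketch of the degradation model in Eq. (1) is given below; the Gaussian blur and plain decimation are illustrative stand-ins for the generic kernel \( f \) and the operator \( \downarrow s \) (the experiments in Sect. 4.2 use a bicubic filter instead), so this is a sketch under stated assumptions rather than the exact pipeline.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def degrade(I_h: np.ndarray, s: int, sigma: float = 1.2) -> np.ndarray:
    """Simulate Eq. (1): I_l = (I_h * f) down-sampled by s.

    A Gaussian kernel stands in for the blur f and simple decimation
    stands in for the down-sampling operator; both are illustrative
    choices, not the paper's exact settings.
    """
    blurred = gaussian_filter(I_h.astype(np.float64), sigma=sigma)
    return blurred[::s, ::s]
```

The same routine can be applied to every training image \( \varvec{T}_{h} \) to form its LR counterpart \( \varvec{T}_{l} \).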

The first aspect is how to build the link between LR and HR image patches based on \( \varvec{TS} \) through learning. The link is a learned prior for SR reconstruction. With the degradation model, the LR version \( \varvec{T}_{l} \) of each \( \varvec{T}_{h} \) can be formed by \( \varvec{T}_{l} = (\varvec{T}_{h} * f) \downarrow s \). For sampling convenience, \( \varvec{T}_{l} \) is enlarged to the same size as \( \varvec{T}_{h} \) by interpolation. Then, a number of image patch-pairs \( (\varvec{Tp}_{l} ,\varvec{Tp}_{h} ) \) are sampled from every image-pair \( (\varvec{T}_{l} ,\varvec{T}_{h} ) \), representing the LR and HR versions of the same region respectively. In the sparse representation based image SR methods, an effective link is the common sparse representation coefficient [7–10]. Specifically, a \( \varvec{Tp}_{l} \) can be represented by a sparse vector over an LR over-complete dictionary \( \varvec{D}_{l} \), while the corresponding \( \varvec{Tp}_{h} \) shares the same sparse vector over an HR dictionary \( \varvec{D}_{h} \). Therefore, a key problem is how to construct the over-complete dictionary-pair \( (\varvec{D}_{l} ,\varvec{D}_{h} ) \) to meet this requirement.

The second aspect is how to express the LR and HR image patches. In [5], the authors pointed out that the highest spatial-frequency components of the LR image are the most important for predicting the extra details. There are usually three ways to obtain them: high-pass filtering [4, 5, 10], image primitives [15, 16] and low-order derivatives [6, 9]. For each LR image patch, a signal vector \( \varvec{v}_{l} \) is formed from its highest spatial-frequency components. For an HR image patch, its mean value or LR version is often subtracted and then \( \varvec{v}_{h} \) is formed. These signal vector-pairs \( (\varvec{v}_{l} ,\varvec{v}_{h} ) \) represent the raw image patch-pairs.

The third aspect is how to reconstruct \( \varvec{I}_{h} \) from \( \varvec{I}_{l} \). Usually, the input \( \varvec{I}_{l} \) is scaled up to the size of \( \varvec{I}_{h} \) by interpolation. Since the SR scheme works on patches, \( \varvec{I}_{l} \) is broken into patches in advance. Then, the signal vector \( \varvec{v}_{l} \) of each patch \( \varvec{Ip}_{l} \) is calculated and its sparse representation vector \( \varvec{\alpha} \) is inferred over \( \varvec{D}_{l} \). Based on the link learned in training, an HR version \( \hat{\varvec{I}}\varvec{p}_{h} \) can be estimated. By rationally combining the results, for example averaging the multiple estimates for each pixel, an HR image can be obtained.
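As an illustration of the second aspect, the sketch below forms \( \varvec{v}_{l} \) from first- and second-order derivatives of an (interpolated) LR patch, in the spirit of the low-order-derivative features of [6, 9]; the specific kernels and the per-patch filtering are illustrative assumptions, not necessarily the exact configuration used later.

```python
import numpy as np
from scipy.ndimage import convolve

def lr_feature(patch_l: np.ndarray) -> np.ndarray:
    """Form v_l by stacking four derivative responses of an LR patch."""
    g1 = np.array([[-1, 0, 1]], dtype=np.float64)        # 1st-order kernel
    g2 = np.array([[1, 0, -2, 0, 1]], dtype=np.float64)  # 2nd-order kernel
    responses = [convolve(patch_l.astype(np.float64), k, mode='nearest')
                 for k in (g1, g1.T, g2, g2.T)]          # horizontal and vertical
    return np.concatenate([r.ravel() for r in responses])
```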

3 The Details of Our Proposed SR Method

3.1 A New Dictionary Learning Algorithm K-EIG

Assume that the given set of signal examples is \( \varvec{Y} = \{ \varvec{y}_{i} \}_{i = 1}^{M} \), where \( \varvec{y}_{i} (\varvec{y}_{i} \in {\mathbb{R}}^{n} ) \) is a signal example and \( M \) is the number of signal examples. According to the theory of signal sparse representation [17], each \( \varvec{y}_{i} \) can be represented approximately by a sparse vector \( \varvec{\alpha}_{i} (\varvec{\alpha}_{i} \in {\mathbb{R}}^{L} ) \) over an over-complete dictionary \( \varvec{D} \), \( \varvec{D} \in {\mathbb{R}}^{n \times L} (L > n) \). Let \( \varvec{A} = \{\varvec{\alpha}_{i} \}_{i = 1}^{M} \) be the set of sparse vectors. Note that \( \varvec{Y} \) and \( \varvec{A} \) can also be treated as matrices whose columns are \( \varvec{y}_{i} \) and \( \varvec{\alpha}_{i} \). Then, the dictionary learning problem can be described as:

$$ \mathop {\hbox{min} }\limits_{{\varvec{D},\varvec{A}}} \{ ||\varvec{Y} - \varvec{DA}||_{F}^{2} \} \, s.t. \, \forall i, \, ||\varvec{\alpha}_{i} ||_{0} \le T_{0} , $$
(2)

where \( \parallel\varvec{\alpha}_{i} \parallel_{0} \) counts the nonzero entries in \( \varvec{\alpha}_{i} \) and \( T_{0} \) is the maximum allowed number of nonzero entries. To date, K-SVD [12] is the most prevalent dictionary learning algorithm [18–20]. It alternates between sparse coding of the signal vectors and updating the dictionary atoms. The key idea of K-SVD is to update one column of \( \varvec{D} \) at a time: all columns of \( \varvec{D} \) except one atom \( \varvec{d}_{k} \) are fixed, and the kth row of \( \varvec{A} \), denoted \( \varvec{\alpha}_{T}^{k} \), is also allowed to change. The penalty term can then be rewritten as:

$$ ||\varvec{Y} - \varvec{DA}||_{F}^{2} = ||(\varvec{Y} - \sum\limits_{j \ne k} {\varvec{d}_{j}\varvec{\alpha}_{T}^{j} } ) - \varvec{d}_{k}\varvec{\alpha}_{T}^{k} ||_{F}^{2} = ||\varvec{E}_{k} - \varvec{d}_{k}\varvec{\alpha}_{T}^{k} ||_{F}^{2} . $$
(3)

\( \varvec{E}_{k} \) stands for the representation error matrix based on the current \( \varvec{D} \) and \( \varvec{A} \) when \( \varvec{d}_{k} \) is removed. Note that some signal examples do not use \( \varvec{d}_{k} \), so \( \varvec{E}_{k} \) should be restricted to \( \varvec{E}_{k}^{R} \) by keeping only the related columns. In order to use \( \varvec{E}_{k}^{R} \) to find the unknown \( \tilde{\varvec{d}}_{k} \) and \( \tilde{\varvec{\alpha }}_{T}^{k} \) minimizing \( ||\varvec{E}_{k}^{R} - \varvec{d}_{k}\varvec{\alpha}_{T}^{k} ||_{F}^{2} \), the singular value decomposition (SVD) of \( \varvec{E}_{k}^{R} \) is employed:

$$ \varvec{E}_{k}^{R} = {\mathbf{USV}}^{T} ,\quad \tilde{\varvec{d}}_{k} = {\mathbf{U}}(:,1),\quad \tilde{\varvec{\alpha }}_{T}^{k} = {\mathbf{S}}(1,1) \cdot {\mathbf{V}}(:,1)^{T} . $$
(4)

After updating the dictionary atoms one by one, a new \( \varvec{D} \) is obtained and the algorithm returns to the sparse coding stage. A dictionary that fits \( \varvec{Y} \) well is generated after several such alternations. Unfortunately, the SVD of \( \varvec{E}_{k}^{R} \) is time-consuming since the column number of \( \varvec{E}_{k}^{R} \) is usually large, which causes a high computational cost. Rubinstein et al. [11] offered an approximate K-SVD (AK-SVD) algorithm to accelerate K-SVD, but it sacrifices some quality of the learned dictionary for training speed. Next, we propose a new algorithm, K-EIG, to accelerate the learning process while keeping the performance of the learned dictionary close to that of K-SVD. The derivation is as follows.
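For reference, a minimal sketch of the K-SVD atom update reviewed above is given here, assuming \( \varvec{Y} \), \( \varvec{D} \) and \( \varvec{A} \) are stored as NumPy matrices; it only illustrates Eqs. (3)–(4) and is not the reference implementation of [12].

```python
import numpy as np

def ksvd_atom_update(Y, D, A, k):
    """One K-SVD atom update: refit column d_k and row alpha_T^k from the
    restricted error matrix E_k^R via a rank-1 SVD (Eqs. (3)-(4))."""
    omega = np.nonzero(A[k, :])[0]          # signals that actually use d_k
    if omega.size == 0:
        return D, A
    E_kR = Y[:, omega] - D @ A[:, omega] + np.outer(D[:, k], A[k, omega])
    U, S, Vt = np.linalg.svd(E_kR, full_matrices=False)
    D[:, k] = U[:, 0]                       # new atom
    A[k, omega] = S[0] * Vt[0, :]           # new coefficients
    return D, A
```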

For convenience of derivation, the target function \( ||\varvec{E}_{k}^{R} - \varvec{d}_{k}\varvec{\alpha}_{T}^{k} ||_{F}^{2} \) is denoted as \( ||\varvec{E} - \varvec{d\alpha }||_{F}^{2} \). Expanding the squared Frobenius norm gives:

$$ \begin{aligned} fu & = ||\varvec{E} - \varvec{d\alpha }||_{F}^{2} = \sum\limits_{j} {\sum\limits_{i} {(d(i)\alpha (j) - E(i,j))^{2} } } \\ & = \sum\limits_{j} {\sum\limits_{i} {(d(i)^{2} \alpha (j)^{2} - 2d(i)\alpha (j)E(i,j) + E(i,j)^{2} )} } \\ & = \sum\limits_{j} {[(\sum\limits_{i} {d(i)^{2} } )\alpha (j)^{2} - 2(\sum\limits_{i} {d(i)E(i,j)} )\alpha (j) + \sum\limits_{i} {E(i,j)^{2} } ]}. \\ \end{aligned}$$
(5)

Note that \( \sum\limits_{i} {d(i)^{2} } = 1 \) because every dictionary atom is normalized to unit \( \ell_{2} \)-norm. Besides, \( \sum\limits_{i} {E(i,j)^{2} } \) can be ignored because it is a constant with respect to \( \varvec{d} \) and \( \varvec{\alpha} \). Therefore, \( fu \) reduces to:

$$ fu = \sum\limits_{j} {[\alpha (j)^{2} - 2(\sum\limits_{i} {d(i)E(i,j)} )\alpha (j)]} . $$
(6)

The expression in \( [ \cdot ] \) is a quadratic function of \( \alpha (j) \), so for each \( j \) its minimum is attained at the vertex of the parabola, i.e. when \( \alpha (j) = \sum\limits_{i} {d(i)E(i,j)} \). Substituting this value back gives:

$$ fu = - \sum\limits_{j} {(\sum\limits_{i} {d(i)E(i,j)} )^{2} } . $$
(7)

Thus \( \alpha (j) \) is eliminated from \( fu \), and \( \varvec{d} \) can be solved directly under the unit-norm constraint with the help of the Lagrange multiplier method. The Lagrange function is:

$$ Lf = - \sum\limits_{j} {(\sum\limits_{i} {d(i)E(i,j)} )^{2} } + \lambda (\sum\limits_{i} {d(i)^{2} } - 1). $$
(8)

Let \( \partial Lf/\partial d(m) = 0 \) where \( d(m) \) is the mth element in \( \varvec{d} \). Then we can obtain:

$$ \begin{aligned} \partial Lf/\partial d(m) & = - \sum\limits_{j} {(\sum\limits_{i} {d(i)E(i,j)} )} E(m,j) + \lambda d(m) = 0 \\ & \Rightarrow \sum\limits_{i} {d(i)\sum\limits_{j} {E(i,j)E(m,j)} } - \lambda d(m) = 0,\quad m = 1,2, \ldots ,n. \\ \end{aligned}$$
(9)

Let \( \varvec{R} \) be the autocorrelation matrix of \( \varvec{E} \), i.e. \( \varvec{R} = \varvec{E} \cdot \varvec{E}^{T} \), so that \( \sum\limits_{j} {E(i,j)E(m,j)} = R(i,m) \). Then, Eq. (9) becomes:

$$ \begin{aligned} & \sum\limits_{i} {R(i,m)d(i)} - \lambda d(m) = 0, \, m = 1,2, \ldots ,n \\ & \Rightarrow R(:,m)^{T} \varvec{d} = \lambda d(m), \, m = 1,2, \ldots ,n \Rightarrow \varvec{R}^{T} \varvec{d} = \lambda \varvec{d}\varvec{.} \\ \end{aligned} $$
(10)

Obviously, the solution of the above problem can be obtained by computing the eigenvectors of \( \varvec{R}^{T} \). Since minimizing \( fu = -\varvec{d}^{T}\varvec{Rd} \) under the unit-norm constraint is equivalent to maximizing \( \varvec{d}^{T}\varvec{Rd} \), we choose the eigenvector corresponding to the largest eigenvalue as the optimal solution. Then, \( \varvec{\alpha} \) is recovered by \( \varvec{\alpha}= \varvec{d}^{T} \varvec{E} \). Thus, we replace the SVD of \( \varvec{E} \) with the eigen-decomposition of \( \varvec{R}^{T} \) and name the whole algorithm K-Eigen Decomposition (K-EIG). Compared with K-SVD, the advantage of K-EIG is its computational simplicity, for the following reason. In K-SVD, the object of the SVD is the matrix \( \varvec{E}_{k}^{R} \) of size \( n \times M_{k} \), where \( M_{k} \) is the number of signals using \( \varvec{d}_{k} \). Usually, \( M_{k} \) grows with the total number of input signals \( M \), so the column number of \( \varvec{E}_{k}^{R} \) becomes large and its SVD slow. In contrast, K-EIG performs an eigen-decomposition of \( \varvec{R}^{T} \), a square matrix of size \( n \times n \), which is much smaller than \( \varvec{E}_{k}^{R} \) since \( n \ll M_{k} \). Hence the eigen-decomposition is much faster, which makes K-EIG far more time-efficient.
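For comparison with the K-SVD sketch above, a minimal K-EIG atom update might look as follows; it forms \( \varvec{R}=\varvec{EE}^{T} \) (symmetric, so \( \varvec{R}^{T}=\varvec{R} \)), takes the eigenvector of the largest eigenvalue, and recovers the coefficient row by \( \varvec{\alpha}= \varvec{d}^{T} \varvec{E} \). The data layout mirrors the earlier sketch and is our assumption.

```python
import numpy as np

def keig_atom_update(Y, D, A, k):
    """One K-EIG atom update: replace the SVD of E_k^R with an
    eigen-decomposition of the n x n autocorrelation matrix R = E E^T."""
    omega = np.nonzero(A[k, :])[0]          # signals that actually use d_k
    if omega.size == 0:
        return D, A
    E = Y[:, omega] - D @ A[:, omega] + np.outer(D[:, k], A[k, omega])
    R = E @ E.T                             # n x n, independent of the signal count
    w, V = np.linalg.eigh(R)                # R is symmetric; eigh sorts ascending
    d = V[:, -1]                            # eigenvector of the largest eigenvalue
    D[:, k] = d
    A[k, omega] = d @ E                     # alpha = d^T E
    return D, A
```

Swapping `ksvd_atom_update` for `keig_atom_update` inside the usual sparse-coding/atom-update loop is the only change needed; the sparse coding stage is unaffected.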

As described in Sect. 2, based on the signal vector-pairs \( (\varvec{v}_{l} ,\varvec{v}_{h} ) \) sampled from \( (\varvec{T}_{l} ,\varvec{T}_{h} ) \), we train the dictionary-pair \( (\varvec{D}_{l} ,\varvec{D}_{h} ) \) to satisfy the following formula:

$$ \left\{ {\begin{array}{*{20}c} {\mathop {\hbox{min} }\limits_{{\varvec{D}_{l} ,\varvec{\alpha}}} \parallel \varvec{v}_{l}^{(i)} - \varvec{D}_{l} \cdot\varvec{\alpha}^{(i)} \parallel_{2}^{2} \, s.t. \, \parallel\varvec{\alpha}^{(i)} \parallel_{0} \le T_{0} } \\ {\mathop {\hbox{min} }\limits_{{\varvec{D}_{h} ,\varvec{\alpha}}} \parallel \varvec{v}_{h}^{(i)} - \varvec{D}_{h} \cdot\varvec{\alpha}^{(i)} \parallel_{2}^{2} \, s.t. \, \parallel\varvec{\alpha}^{(i)} \parallel_{0} \le T_{0} } \\ \end{array} } \right.,\quad \forall i \in \{ 1,2, \ldots ,M\} . $$
(11)

In order to get \( (\varvec{D}_{l} ,\varvec{D}_{h} ) \), we first generate \( \varvec{D}_{l} \) based on \( \{ \varvec{v}_{l}^{(1)} ,\varvec{v}_{l}^{(2)} , \ldots ,\varvec{v}_{l}^{(M)} \} \) by solving:

$$ \varvec{D}_{l} ,\{\varvec{\alpha}^{(i)} \} = \mathop {\arg \hbox{min} }\limits_{{\varvec{D}_{l} ,\{\varvec{\alpha}^{(i)} \} }} \sum\limits_{i = 1}^{M} {\parallel \varvec{v}_{l}^{(i)} - \varvec{D}_{l} \cdot\varvec{\alpha}^{(i)} \parallel_{2}^{2} } , \, s.t. \, \parallel\varvec{\alpha}^{(i)} \parallel_{0} \le T_{0} ,\quad \forall i \in \{ 1,2, \ldots ,M\} . $$
(12)

K-EIG can solve the above problem, yielding \( \varvec{D}_{l} \) and \( \{\varvec{\alpha}^{(i)} \} \). Then, \( \{\varvec{\alpha}^{(i)} \} \) and \( \{ \varvec{v}_{h}^{(1)} ,\varvec{v}_{h}^{(2)} , \ldots ,\varvec{v}_{h}^{(M)} \} \) directly determine \( \varvec{D}_{h} \), in the same way as in [10]. Let \( \varvec{V}_{h} = [\varvec{v}_{h}^{(1)} ,\varvec{v}_{h}^{(2)} , \ldots ,\varvec{v}_{h}^{(M)} ] \) and \( \varvec{A} = [\varvec{\alpha}^{(1)} ,\varvec{\alpha}^{(2)} , \ldots ,\varvec{\alpha}^{(M)} ] \); then \( \varvec{D}_{h} \) is obtained by \( \varvec{D}_{h} = \varvec{V}_{h} \cdot \varvec{A}^{T} \cdot (\varvec{AA}^{T} )^{ - 1} \). Now, \( (\varvec{D}_{l} ,\varvec{D}_{h} ) \) meets the requirement of a common sparse representation coefficient. Compared with the joint dictionary training used in [9], this direct dictionary-pair construction employing K-EIG is fast and effective.
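A sketch of this direct \( \varvec{D}_{h} \) construction, assuming \( \varvec{V}_{h} \) and \( \varvec{A} \) are dense NumPy matrices; the pseudo-inverse is an implementation convenience for a possibly ill-conditioned \( \varvec{AA}^{T} \), not something prescribed by the paper.

```python
import numpy as np

def build_hr_dictionary(V_h: np.ndarray, A: np.ndarray) -> np.ndarray:
    """Direct HR dictionary: D_h = V_h A^T (A A^T)^{-1}."""
    return V_h @ A.T @ np.linalg.pinv(A @ A.T)
```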

3.2 Sparse Representation Based SR Framework with Example Classification and Extension

Considering the different types of image patches, we perform an important step before training: classifying the sampled image patches. Each patch-pair \( (\varvec{Tp}_{l} ,\varvec{Tp}_{h} ) \) is put into a classifier and labeled “smooth” or “edge”. We employ the sorted quadrant median vector (SQMV) scheme proposed in [21] to classify every \( \varvec{Tp}_{h} \); SQMV is a simple but efficient tool to recognize whether an image patch belongs to a uniform region or an edge region. The set of example-pairs is thus separated into a “smooth” set and an “edge” set. To increase the number of “edge” patch-pairs before learning, we rotate each HR image \( \varvec{T}_{h} \) by three angles (90°, 180° and 270°), forming three additional images from which more “edge” patch-pairs can be sampled. For the “smooth” class, there is no need to increase the patch number. Based on the two sets of signal vector-pairs, two dictionary-pairs \( (\varvec{D}_{l}^{smooth} ,\varvec{D}_{h}^{smooth} ) \) and \( (\varvec{D}_{l}^{edge} ,\varvec{D}_{h}^{edge} ) \) are constructed respectively.
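The edge-example extension step can be sketched as follows; the SQMV classifier itself follows [21] and is not reproduced here.

```python
import numpy as np

def augment_for_edges(T_h: np.ndarray) -> list:
    """Rotate an HR training image by 90, 180 and 270 degrees so that
    more 'edge' patch-pairs can be sampled from the rotated copies."""
    return [np.rot90(T_h, r) for r in (1, 2, 3)]
```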

In reconstruction, the input LR image \( \varvec{I}_{l} \) is first interpolated to the target size and then broken into many overlapping patches \( \varvec{Ip}_{l} \) of size \( w \times w \). For each \( \varvec{Ip}_{l} \), we use its first- and second-order derivatives to form a signal vector \( \varvec{v}_{l} \). By labeling \( \varvec{Ip}_{l} \) with the SQMV scheme, we choose the appropriate dictionary-pair \( (\varvec{D}_{l}^{smooth} ,\varvec{D}_{h}^{smooth} ) \) or \( (\varvec{D}_{l}^{edge} ,\varvec{D}_{h}^{edge} ) \) for SR. A key problem is to infer the sparse representation vector of \( \varvec{v}_{l} \) over \( \varvec{D}_{l} \) by \( \hat{\varvec{\alpha }} = \mathop {\arg \hbox{min} }\limits_{\varvec{\alpha}} \parallel \varvec{v}_{l} - \varvec{D}_{l} \cdot\varvec{\alpha}\parallel_{2}^{2} \, s.t. \, \parallel\varvec{\alpha}\parallel_{0} \le T_{0} \). Orthogonal matching pursuit (OMP) [22] or convex relaxation techniques [17] can be used to solve this problem. After obtaining \( \hat{\varvec{\alpha }} \) for \( \varvec{Ip}_{l} \), we reconstruct its HR version by \( \hat{\varvec{I}}\varvec{p}_{h} = \varvec{Ip}_{l} + reshape(\varvec{D}_{h} \cdot \hat{\varvec{\alpha }}) \), where \( reshape( \cdot ) \) rearranges a vector into a patch. By simply averaging the overlapping \( \hat{\varvec{I}}\varvec{p}_{h} \), we obtain the estimated HR image \( \hat{\varvec{I}}_{h} \). Finally, to satisfy the degradation model, a “backprojection” step [9] is applied to refine \( \hat{\varvec{I}}_{h} \) and produce the final result.
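Putting the pieces together, a simplified reconstruction loop might look like the sketch below. For brevity it uses a single dictionary-pair (the actual method switches between the “smooth” and “edge” pairs per patch), a patch stride of 1, and scikit-learn's OMP solver in place of [22]; the back-projection refinement is omitted. `feature` is any function mapping a \( w \times w \) patch to \( \varvec{v}_{l} \), such as the earlier `lr_feature` sketch.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

def reconstruct(I_l_up, D_l, D_h, feature, w=5, T0=3):
    """Patch-wise SR of an LR image already interpolated to the HR size."""
    H, W = I_l_up.shape
    acc = np.zeros_like(I_l_up, dtype=np.float64)   # sum of patch estimates
    cnt = np.zeros_like(I_l_up, dtype=np.float64)   # overlap counts
    omp = OrthogonalMatchingPursuit(n_nonzero_coefs=T0, fit_intercept=False)
    for r in range(H - w + 1):                      # dense overlapping patches
        for c in range(W - w + 1):
            Ip_l = I_l_up[r:r + w, c:c + w]
            omp.fit(D_l, feature(Ip_l))             # sparse code over D_l
            alpha = omp.coef_
            Ip_h = Ip_l + (D_h @ alpha).reshape(w, w)   # add predicted detail
            acc[r:r + w, c:c + w] += Ip_h
            cnt[r:r + w, c:c + w] += 1
    return acc / cnt                                # average overlapped estimates
```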

4 Simulations and Results

4.1 Testing the Performance of K-EIG

We test the performance of K-EIG in a synthetic experiment and compare it with other typical dictionary learning algorithms. A random matrix \( \varvec{D}_{g} \) of size \( 20 \times 50 \) is generated with i.i.d. uniformly distributed entries, and each column is normalized to unit \( \ell_{2} \)-norm. Then a number of data signals \( \{ \varvec{y}_{i} \}_{i = 1}^{M} \) of dimension 20 are produced. Each signal is created as a linear combination of three different atoms of \( \varvec{D}_{g} \), with i.i.d. uniformly distributed coefficients in random and independent locations. K-SVD [12], AK-SVD [11] and K-EIG are then run on \( \{ \varvec{y}_{i} \}_{i = 1}^{M} \) to obtain three recovered dictionaries \( \varvec{D}_{r} \), and we observe which \( \varvec{D}_{r} \) is closest to \( \varvec{D}_{g} \). Readers can refer to [12] for details of the success-ratio computation. To test robustness, white Gaussian noise with varying signal-to-noise ratio (SNR) is added to \( \{ \varvec{y}_{i} \}_{i = 1}^{M} \). We let \( M = 1500 \) and \( T_{0} = 3 \), and set the number of alternating iterations to 30. The atom recovery results are shown in Table 1. K-SVD obtains the best performance at every SNR level, the average success ratio of K-EIG is very close to that of K-SVD, and AK-SVD is inferior to the other two algorithms. Besides, we compare the running speed of the three algorithms. We change the input signal number from \( M = 1000 \) to \( M = 96000 \) with an interval of 5000 and keep the other parameters unchanged. The running platform is MATLAB 2009 and the computer is equipped with a 2.60 GHz Intel(R) Core(TM) i5 CPU and 4.0 GB memory. Figure 1 shows the time-consumption curves. The computation time of K-SVD rises rapidly with the number of input signals, whereas the time consumption of K-EIG and AK-SVD remains low even for large \( M \).
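A sketch of the synthetic data generation described above, under the stated settings (n = 20, L = 50, \( T_{0} = 3 \)); the particular uniform distributions and the noise scaling are our reading of the setup and may differ in detail from the original experiment.

```python
import numpy as np

def make_synthetic_signals(M=1500, n=20, L=50, T0=3, snr_db=None, seed=0):
    """Random ground-truth dictionary and M sparse combinations of T0 atoms."""
    rng = np.random.default_rng(seed)
    D_g = rng.uniform(size=(n, L))
    D_g /= np.linalg.norm(D_g, axis=0, keepdims=True)   # unit l2-norm columns
    Y = np.zeros((n, M))
    for i in range(M):
        idx = rng.choice(L, size=T0, replace=False)     # random atom locations
        Y[:, i] = D_g[:, idx] @ rng.uniform(size=T0)    # i.i.d. uniform coefficients
    if snr_db is not None:                              # optional white Gaussian noise
        sigma = np.sqrt(np.mean(Y ** 2) / 10 ** (snr_db / 10))
        Y += rng.normal(scale=sigma, size=Y.shape)
    return D_g, Y
```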

Table 1. Comparison of several dictionary learning algorithms in average success ratio of atom recovery
Fig. 1. Time-consumption curves of the three dictionary learning algorithms

4.2 Testing Image SR Performance

In our SR experiments, we use a training set \( \varvec{TS} \) containing HR natural images such as flowers, architecture, animals and human faces. In simulation, we assume that the degradation model \( \varvec{I}_{l} = (\varvec{I}_{h} *f) \downarrow s \) is implemented by the “bicubic filter” with a zoom factor \( 1/s \). During training, we extract about \( 8 \times 10^{4} \) image patch-pairs as training examples for each class. The patch size is \( 5 \times 5 \). The number of columns of \( \varvec{D}_{l} \) or \( \varvec{D}_{h} \) is set to 1000, and the number of alternating iterations of K-EIG is set to 40. Five color images are chosen as test images: “lena”, “child”, “flower”, “building”, and “zebra”. Five corresponding LR images are generated by the known degradation model, and our task is to reconstruct their HR versions. As in other literature, we apply our SR algorithm only to the luminance channel (Y); the other channels (Cb, Cr) are simply processed by bicubic interpolation. Several representative and related image SR methods are also implemented for comparison: bicubic interpolation, neighbor embedding [6], Yang's method [9] and Zeyde's method [10]. We choose the peak signal-to-noise ratio (PSNR) and the mean structural similarity index measure (MSSIM) [23] as performance indexes to measure the quality of the reconstructed images.
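For reference, the two indexes can be computed on the luminance channel as sketched below; the scikit-image SSIM defaults used here are an assumption and may differ slightly from the exact MSSIM configuration of [23].

```python
import numpy as np
from skimage.metrics import structural_similarity

def psnr(ref: np.ndarray, est: np.ndarray, peak: float = 255.0) -> float:
    """PSNR of the reconstructed luminance channel against the original."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def mssim(ref: np.ndarray, est: np.ndarray) -> float:
    """Mean SSIM via scikit-image (default window settings)."""
    return structural_similarity(ref, est, data_range=255)
```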

We show the quantitative SR results in Tables 2 and 3 for \( s = 2 \) and \( s = 3 \). Clearly, our proposed method achieves the highest PSNR and MSSIM values for all test images. To compare the reconstruction results visually, we present the HR images estimated by nearest interpolation, bicubic interpolation, neighbor embedding, Yang's method, Zeyde's method and our method for \( s = 3 \) in Figs. 2 and 3, with some details magnified to clearly display the subtle differences. We see that bicubic interpolation eliminates the sawtooth appearance of nearest interpolation but produces blurred edges. Neighbor embedding is prone to artifacts; for example, a discontinuity can be observed between the hat and the face in Fig. 2(d). The sparse representation based methods generate clearer HR images. Among them, our algorithm obtains the best results, closest to the original HR images, thanks to the integration of patch classification and edge-patch extension into the SR framework. In Fig. 3, the lattice in Fig. 3(g) is clearly reconstructed better than in Fig. 3(b) and (f).

Table 2. PSNR (MSSIM) results of the reconstructed images with s = 2
Table 3. PSNR (MSSIM) results of the reconstructed images with s = 3
Fig. 2. (a) Input LR ‘child’ image, (b) estimated HR ‘child’ by nearest interpolation, (c) by bicubic interpolation, (d) by neighbor embedding, (e) by Yang's method, (f) by Zeyde's method, (g) by our method, (h) original ‘child’ image.

Fig. 3. (a) Input LR ‘building’ image, (b) estimated HR ‘building’ by nearest interpolation, (c) by bicubic interpolation, (d) by neighbor embedding, (e) by Yang's method, (f) by Zeyde's method, (g) by our method, (h) original ‘building’ image.

4.3 Testing the Time Performance of Dictionary-Pair Construction

Here, we compare the time performance of the three dictionary-pair construction methods (Yang's, Zeyde's and ours). The platform and computer configuration are the same as described in Sect. 4.1. The CPU running time is presented in Table 4. Obviously, our method and Zeyde's are much faster than Yang's. Our method is slightly slower than Zeyde's because our training process includes patch classification, edge-patch extension and the learning of two dictionary-pairs; spending a few more minutes to obtain better dictionary-pairs is worthwhile.

Table 4. Time comparison of the three dictionary-pair construction methods

5 Conclusion

In this paper, we propose a sparse representation based single-image super resolution algorithm with a new dictionary learning technique, K-EIG. K-EIG is an improved algorithm based on the classical K-SVD: it replaces the SVD of the representation error matrix with an eigen-decomposition of its autocorrelation matrix. This change overcomes the drawback that the SVD operation is time-consuming when the number of training examples is large. We employ a direct dictionary-pair construction method based on K-EIG to accelerate the whole training process. Besides, considering the types of image patches and the number of edge examples, patch classification and edge-patch extension are integrated into the SR framework. Extensive experimental results show that our SR algorithm generates better SR images than several classical and related methods.