1 Introduction

Motion blur is one of the most common types of image blurring. Long exposure times combined with fast-moving objects or camera shake cause motion blur in the captured image, which degrades its visual quality and hinders both the transmission of image information and post-processing [1, 2]. In computer vision, motion blur reduces the accuracy and efficiency of image recognition and classification. Therefore, the restoration of motion-blurred images is of great significance. Most early blurred-image restoration approaches are based on the following blur model [3,4,5,6]:

$$\begin{array}{*{20}c} {I_{\text{B}} = K*I_{\text{S}} + N} \\ \end{array}$$
(1)

where \(I_{\text{B}}\), \(I_{\text{S}}\), \(K\), and \(N\) are the blurred image, the latent sharp image, the blur kernel, and noise, respectively, and \(*\) denotes the convolution operation. The blurred image can thus be seen as the convolution of the sharp image with a blur kernel, followed by the addition of random noise. Image restoration algorithms can be divided into blind and non-blind restoration depending on whether the blur kernel is known. A non-blind restoration algorithm recovers the blurred image by estimating the inverse process of (1) with the known blur kernel; classical approaches include the Lucy–Richardson (LR) algorithm, the Wiener filter, and the Kalman filter [7]. A blind restoration algorithm, in contrast, reconstructs the image by first estimating the blur kernel. Estimating an unknown blur kernel is time-consuming and inefficient, because the blur encountered in practice is complex and uncertain in most cases.
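
As an illustration of the blur model in (1), the following Python sketch synthesises a blurred image by convolving a sharp image with a simple kernel and adding Gaussian noise; the kernel and noise level are illustrative assumptions, not values taken from any dataset used in this paper.

```python
import numpy as np
from scipy.ndimage import convolve

def blur_image(sharp, kernel, noise_sigma=0.01):
    """Eq. (1): I_B = K * I_S + N (convolution with a blur kernel plus additive noise)."""
    blurred = convolve(sharp, kernel / kernel.sum(), mode='reflect')
    noise = np.random.normal(0.0, noise_sigma, size=sharp.shape)
    return np.clip(blurred + noise, 0.0, 1.0)

# a crude horizontal motion kernel (object moving 9 pixels to the right)
K = np.zeros((9, 9))
K[4, :] = 1.0

sharp = np.random.rand(128, 128)   # stand-in for a greyscale image in [0, 1]
blurry = blur_image(sharp, K)
```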

Generative adversarial networks (GANs) [8] are a rapidly developing research direction in the field of deep learning. GANs have been proven to have excellent performance in image super-resolution, image style transfer, and high-precision sample estimation. Image deblurring can be considered a type of image style transfer, that is, the change of image from “blurred” style to “sharp” style.

Given GANs’ excellent performance on image-to-image translation tasks and the remaining shortcomings of existing deblurring algorithms, applying GANs to motion-blurred image restoration is a promising new direction. In this paper, an enhanced GAN model, named EDGAN, is proposed that achieves higher PSNR and SSIM on motion blur removal; in addition, the deblurred images have better visual appearance and quality. Section 2 discusses related work on image restoration. Section 3 describes the proposed method and algorithm and elaborates the mathematical analysis. Section 4 validates the proposed method experimentally, comparing results on benchmark datasets and on simulated images from the Chang’e 3 space mission. Sections 5 and 6 give the discussion, conclusions, and implications for the design of the Chang’e missions’ camera pointing systems. The contributions of this paper are as follows:

  1. A new network structure and a hybrid loss function are proposed, which can recover motion-blurred images efficiently;

  2. Experiments on different datasets demonstrate that the algorithm can be applied to real scenes.

2 Related work and background

2.1 Related study

In recent years, numerous blurred-image restoration approaches based on deep learning have been proposed. Sun et al. [9] used a CNN to predict the probability distribution of motion blur in each image block and restored the image from these per-block distributions. Nah et al. [10] used a multi-scale CNN to restore the sharp image directly; however, that algorithm has high complexity and low efficiency, so it cannot process images quickly. Ramakrishnan et al. [11] proposed a novel network structure to improve network efficiency while maintaining the deblurring effect; although efficiency improved, the sharpness of the generated images was similar to that of Nah’s method. Kupyn et al. [12] proposed a motion blur removal algorithm based on conditional generative adversarial networks. This approach uses WGAN-GP [13] and perceptual loss [14] as the final loss function and achieves a strong image restoration effect. Deep learning has also been applied to motion-blurred video restoration. Su et al. [15] adopted a CNN to aggregate multiple frames and generate a sharp output. Zhang et al. [16] used a GAN with 3D convolutions to capture the spatial and temporal information encoded in neighbouring frames and restore blurred video. Chen et al. [17] fine-tuned existing deblurring networks in a self-supervised fashion, which improves the performance of video deblurring algorithms.

2.2 GANs

GANs consist of a generator and a discriminator. The basic principle is that the generator receives a random noise signal and generates new data samples, while the discriminator determines whether a sample comes from the real sample set or was produced by the generator. The purpose of the generator is to generate samples so close to the distribution of the real data that the discriminator cannot determine their source. The objective function of GANs is as follows:

$$\begin{array}{*{20}c} {\mathop {\hbox{min} }\limits_{G} \mathop {\hbox{max} }\limits_{D} V\left( {D,G} \right) = {\mathbb{E}}_{x\sim p\left( x \right)} \left[ {\log D\left( x \right)} \right] + {\mathbb{E}}_{{z\sim p_{z} \left( z \right)}} \left[ {\log \left( {1 - D\left( {G\left( z \right)} \right)} \right)} \right]} \\ \end{array}$$
(2)

where \(p\left( x \right)\), \(p_{z} \left( z \right)\), \(G\left( {z;\,\theta_{g} } \right)\), and \(D\left( x \right)\) are the data distribution, a predefined noise distribution, the mapping from noise space to data space, and the probability that \(x\) comes from the real sample set, respectively. \({\mathbb{E}}\) denotes mathematical expectation. The discriminator is trained to maximise \(\log D\left( x \right)\), and the generator is trained to minimise \(\log \left( {1 - D\left( {G\left( z \right)} \right)} \right)\). To enable the network to produce outputs matching given expectations, researchers added extra conditional information \(y\) to the original GANs, where \(y\) can be any kind of information [18]. The objective function of conditional generative adversarial networks is as follows:

$$\begin{array}{*{20}c} {\mathop {\hbox{min} }\limits_{G} \mathop {\hbox{max} }\limits_{D} V\left( {D,G} \right) = {\mathbb{E}}_{x\sim p\left( x \right)} \left[ {\log D\left( {x|y} \right)} \right] + {\mathbb{E}}_{{z\sim p_{z} \left( z \right)}} \left[ {\log \left( {1 - D\left( {G\left( {z|y} \right)} \right)} \right)} \right]} \\ \end{array}$$
(3)

However, training the original GANs suffers from many challenges, such as mode collapse, training instability, and vanishing gradients. The reason for these problems is that the original GANs use the Jensen–Shannon (JS) divergence to measure the difference between the two distributions: whenever the two distributions do not overlap, the JS divergence equals the constant \(\log 2\), so it provides no useful gradient. Thus, the Wasserstein GAN (WGAN) [19] uses the Wasserstein distance to measure the difference between the two distributions.

$$\begin{array}{*{20}c} {W\left( {P_{r} ,P_{g} } \right) = \mathop {\inf }\limits_{{\gamma \in \prod \left( {P_{r} ,P_{g} } \right)}} {\mathbb{E}}_{{\left( {x,y} \right)\sim \gamma }} \left[ {\left\| {x - y} \right\|} \right]} \\ \end{array}$$
(4)

For two distributions \(P_{r}\) and \(P_{g}\), \(\prod \left( {P_{r} ,P_{g} } \right)\) denotes the set of all joint distributions whose marginals are \(P_{r}\) and \(P_{g}\). For each joint distribution \(\gamma\), the expected distance between paired samples \(x\) and \(y\) drawn from \(\gamma\) is computed, and the infimum over all joint distributions is taken as the Wasserstein distance. Compared with the JS divergence, the Wasserstein distance varies continuously regardless of whether the two distributions overlap.

3 Proposed method

The proposed method improves the performance of an existing end-to-end adversarial deblurring network by giving the network the ability to learn the weights of feature channels and by removing draughtboard artefacts. The key idea is to extract image features with a novel convolution unit that combines squeeze-and-excitation networks with residual networks. In addition, resize convolution is used as the up-sampling method; comparative experiments show that it effectively removes draughtboard artefacts.

3.1 Generator

The generator structure is shown in Fig. 1, which is similar to the structure used by Kupyn et al. [12]. It contains two strided convolution blocks with stride two, nine residual blocks, and two transposed convolution blocks. To improve the quality of the image, some improvements are made on the basis of the network structure of Kupyn et al. [12] and are elaborated as follows.

Fig. 1
figure 1

Generator network structure

3.1.1 Without batch normalisation

Traditional neural networks only normalise the data before it is fed into the network, whereas batch normalisation [20] layers normalise the inputs of the hidden layers. Batch normalisation can alleviate the vanishing and exploding gradient problems in back-propagation and accelerate the convergence of the network. Nevertheless, the batch normalisation layers are removed from the network, following Nah et al. [10] and Bee et al. [21]: because batch normalisation normalises the features, it limits the network’s flexibility.

3.1.2 SE-ResBlock

The SENet (squeeze-and-excitation network) proposed by Hu et al. [22] is used to improve the performance of the residual network. In a CNN, the convolution kernel aggregates spatial information and channel-wise feature information. Many approaches improve CNN performance along the spatial dimension, such as the Inception module [20, 23,24,25]. In contrast, SENet improves performance along the feature-channel dimension: it learns the relationships between feature channels, obtains a weight for each channel, and uses these weights to promote useful features and suppress features that are less useful for the current task.

SENet involves three key operations: squeeze, excitation, and reweight. The squeeze operation compresses each two-dimensional feature channel into a single real number. The excitation operation assigns a weight to each feature channel through a parameter \(W\) that captures the correlation between channels. In the reweight operation, the output of excitation is treated as the importance of each feature channel and is multiplied onto the original features, re-calibrating them channel by channel. SENet has been proven to perform excellently in image classification. In this paper, SENet is applied to image restoration, because the adopted approach reconstructs the entire image.

In the reconstruction process, the importance of different feature channels should be taken into account. Therefore, a combined convolution unit of SENet and ResNet (residual networks) [26], named SE-ResBlock, is used, as shown in Fig. 2. Global average pooling serves as the squeeze operation. The excitation operation computes the inter-channel correlation using two fully connected layers that form a bottleneck structure. The reweight operation scales the original feature channels by the normalised weights.

Fig. 2
figure 2

SE-ResBlock network structure
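
A minimal PyTorch sketch of such an SE-ResBlock is given below; the channel count and reduction ratio are assumptions for illustration rather than the exact values used in the paper.

```python
import torch
import torch.nn as nn

class SEResBlock(nn.Module):
    """Residual block followed by squeeze-excitation channel reweighting (no batch norm)."""
    def __init__(self, channels=256, reduction=16):
        super().__init__()
        # residual branch: two 3x3 convolutions
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # squeeze: global average pooling; excitation: bottleneck of two FC layers
        self.squeeze = nn.AdaptiveAvgPool2d(1)
        self.excite = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x):
        res = self.body(x)
        b, c, _, _ = res.shape
        w = self.squeeze(res).view(b, c)       # squeeze each channel to one value
        w = self.excite(w).view(b, c, 1, 1)    # per-channel weights in (0, 1)
        return x + res * w                     # reweight the residual, then add the skip
```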

3.1.3 Resize convolution [27]

Using a CNN to generate images is a process that transforms low-resolution feature maps into a high-resolution image, and it is usually realised by deconvolution. Owing to the “uneven overlap” in the deconvolution process, artefacts resembling a draughtboard pattern appear in the image details; these are called “draughtboard artefacts”. One way to mitigate them is to make the kernel size divisible by the stride, but draughtboard artefacts still arise easily. Instead, resize convolution is used as the up-sampling method in place of deconvolution. Resize convolution is implicitly weight-tying in a way that discourages high-frequency artefacts. It first resizes the image (using nearest-neighbour or bilinear interpolation) and then applies a convolutional layer, as shown in Fig. 3.

Fig. 3
figure 3

Deconvolution and resize convolution
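
A minimal PyTorch sketch of a resize-convolution block is given below; the kernel size and default interpolation mode are assumptions. Each transposed-convolution block in the generator of Fig. 1 would be replaced by such a unit.

```python
import torch.nn as nn

class ResizeConv(nn.Module):
    """Up-sample by interpolation, then convolve, instead of transposed convolution."""
    def __init__(self, in_ch, out_ch, scale=2, mode='nearest'):
        super().__init__()
        self.up = nn.Upsample(scale_factor=scale, mode=mode)   # nearest or bilinear resize
        self.conv = nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(self.up(x))   # no uneven overlap, so no draughtboard artefacts
```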

3.2 Discriminator

Here, the Markovian discriminator [28], PatchGAN, is used as the discriminator of EDGAN. Since the content loss (the combination of perceptual loss and gradient loss used in this paper) already handles the low-frequency components of the image well, the discriminator only needs to process the high-frequency components. Therefore, the receptive field of the discriminator output does not need to cover the whole input image. The output can instead be a feature map in which the receptive field of each pixel corresponds to a patch of the input image, which accelerates the discriminator while still yielding high-quality images. Following the experiments of Isola et al. [29], the output of the discriminator is set to \(50 \times 50\), which balances image quality and network speed. The model parameters of the discriminator are shown in Table 1.

Table 1 Model parameters of discriminator
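
For illustration, a generic PatchGAN-style critic can be sketched in PyTorch as follows; the layer widths and depth below are assumptions rather than the exact parameters of Table 1, and no sigmoid is applied because WGAN-GP expects an unconstrained critic output.

```python
import torch
import torch.nn as nn

class PatchCritic(nn.Module):
    """Markovian (PatchGAN-style) critic: outputs a score map, one score per image patch."""
    def __init__(self, in_ch=3, base=64):
        super().__init__()
        def block(cin, cout, stride):
            return [nn.Conv2d(cin, cout, 4, stride=stride, padding=1),
                    nn.LeakyReLU(0.2, inplace=True)]
        layers = block(in_ch, base, 2)
        layers += block(base, base * 2, 2)
        layers += block(base * 2, base * 4, 2)
        layers += block(base * 4, base * 8, 1)
        layers += [nn.Conv2d(base * 8, 1, 4, stride=1, padding=1)]  # no sigmoid: WGAN-GP critic
        self.net = nn.Sequential(*layers)

    def forward(self, x):
        return self.net(x)   # roughly a 30x30 score map for a 256x256 input in this sketch
```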

3.3 Loss function

The generator loss function is a combination of adversarial loss, perceptual loss, and gradient loss, formulated as follows:

$$\begin{array}{*{20}c} {l = \ell_{{{\text{WGAN}} - {\text{GP}}}}^{\text{Generator}} + \alpha \cdot \ell_{\text{percep}} + \beta \cdot \ell_{\text{grad}} } \\ \end{array}$$
(5)

where \(\alpha\) and \(\beta\) are the weight parameters of the perceptual loss and the gradient loss, respectively.
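
As a reading aid, a minimal PyTorch-style sketch of how these three terms might be combined is given below; the individual terms are defined in the following subsections, and the helper names (`critic`, `perceptual_loss`, `gradient_loss`) are hypothetical wrappers around those definitions rather than code from the original work.

```python
# alpha and beta follow Eq. (5); the values 1 and 5 are reported in Section 4.1
def generator_loss(fake, sharp, critic, perceptual_loss, gradient_loss,
                   alpha=1.0, beta=5.0):
    adv = -critic(fake).mean()   # Eq. (6): adversarial term from the WGAN-GP critic
    return adv + alpha * perceptual_loss(fake, sharp) + beta * gradient_loss(fake, sharp)
```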

3.3.1 Adversarial loss

Although WGAN overcomes the problems of the original GANs and reduces the difficulty of training, it can still fail to converge under certain conditions, and the generated images do not always meet expectations. WGAN clips the weights to a fixed range after each iteration, whereas WGAN-GP computes the gradient of the discriminator with respect to its input and penalises it according to the gradient norm. WGAN-GP effectively resolves these problems of WGAN. The adversarial loss is calculated as follows, where \(I^{B}\) is the blurry image:

$$\begin{array}{*{20}c} {\ell_{{{\text{WGAN}} - {\text{GP}}}}^{\text{Generator}} = \mathop \sum \limits_{n = 1}^{N} - D_{{\theta_{D} }} \left( {G_{{\theta_{G} }} \left( {I^{B} } \right)} \right)} \\ \end{array}$$
(6)

3.3.2 Perceptual loss

The basic idea is to use the features extracted by a CNN as part of the objective function. By minimising the Euclidean distance between the CNN feature maps of the generated image and of the target image, the generated image becomes more consistent with the target than it would under a pixel-level loss function. Perceptual loss is a kind of high-level loss, defined as follows, where \(I^{S}\) is the sharp image:

$$\begin{array}{*{20}c} {\ell_{\text{percep}} = \frac{1}{{W_{i,j} H_{i,j} }}\mathop \sum \limits_{x = 1}^{{W_{i,j} }} \mathop \sum \limits_{y = 1}^{{H_{i,j} }} \left( {\phi_{i,j} \left( {I^{S} } \right)_{x,y} - \phi_{i,j} \left( {G_{{\theta_{G} }} \left( {I^{B} } \right)} \right)_{x,y} } \right)^{2} } \\ \end{array}$$
(7)

where \(W_{i,j}\) and \(H_{i,j}\) are the dimensions of the feature maps; \(\phi_{i,j}\) is the feature map obtained by the j-th convolution before the i-th maxpooling layer within a CNN. The CNN used in this paper is a VGG19 (layer 1–14) network [31], pretrained on ImageNet [32].
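
A possible PyTorch sketch of this perceptual loss, using torchvision's pretrained VGG19 truncated after its early feature layers, is shown below; the mapping from "layer 1–14" to a torchvision slice index and the input preprocessing are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

class PerceptualLoss(nn.Module):
    """Mean squared error between VGG19 feature maps of the restored and sharp images."""
    def __init__(self, layer=14):
        super().__init__()
        # newer torchvision versions use the weights= argument instead of pretrained=True
        vgg = models.vgg19(pretrained=True).features[:layer]
        for p in vgg.parameters():
            p.requires_grad = False            # VGG is kept fixed during training
        self.vgg = vgg.eval()
        self.mse = nn.MSELoss()

    def forward(self, restored, sharp):
        # inputs are assumed to be normalised as VGG expects (ImageNet mean/std)
        return self.mse(self.vgg(restored), self.vgg(sharp))
```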

3.3.3 Gradient loss [33]

In addition, image information in the gradient domain is also leveraged as a high-level loss term as follows:

$$\begin{array}{*{20}c} {\ell_{\text{grad}} = \frac{1}{2N}\mathop \sum \limits_{i = 1}^{N} \left( {\left| {\nabla_{h} \left( {I^{S} } \right) - \nabla_{h} \left( {G_{{\theta_{G} }} \left( {I^{B} } \right)} \right)} \right| + \left| {\nabla_{v} \left( {I^{S} } \right) - \nabla_{v} \left( {G_{{\theta_{G} }} \left( {I^{B} } \right)} \right)} \right|} \right)} \\ \end{array}$$
(8)

where \(\nabla_{h}\) and \(\nabla_{v}\) indicate the horizontal and vertical gradients. N indicates the number of training image pairs.
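
A per-batch PyTorch sketch of this gradient loss, using finite differences for \(\nabla_{h}\) and \(\nabla_{v}\), might look as follows; averaging over the batch stands in for the sum over the N training pairs.

```python
import torch

def gradient_loss(restored, sharp):
    """Eq. (8): L1 distance between horizontal and vertical image gradients."""
    def grads(img):
        dh = img[..., :, 1:] - img[..., :, :-1]   # horizontal finite difference
        dv = img[..., 1:, :] - img[..., :-1, :]   # vertical finite difference
        return dh, dv
    dh_s, dv_s = grads(sharp)
    dh_r, dv_r = grads(restored)
    return 0.5 * (torch.abs(dh_s - dh_r).mean() + torch.abs(dv_s - dv_r).mean())
```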

WGAN-GP is used as the critic function; the discriminator loss is as follows:

$$\begin{array}{*{20}c} {\ell_{{{\text{WGAN}} - {\text{GP}}}}^{\text{Discriminator}} = \mathop {\mathbb{E}}\limits_{{\widetilde{x}\sim P_{g} }} \left[ {D\left( {\widetilde{x}} \right)} \right] - \mathop {\mathbb{E}}\limits_{{x\sim P_{r} }} \left[ {D\left( x \right)} \right] + \lambda \mathop {\mathbb{E}}\limits_{{\widehat{x}\sim P_{{\widehat{x}}} }} \left[ {\left( {\left\| {\nabla_{{\widehat{x}}} D\left( {\widehat{x}} \right)} \right\|_{2} - 1} \right)^{2} } \right]} \\ \end{array}$$
(9)

where \(\lambda\) is the penalty coefficient, \(p_{g}\) is the sample distribution of the generator, \(p_{r}\) is the sample distribution of the sharp image, and \(p_{{\widehat{x}}}\) is the distribution uniformly sampled along the straight lines between \(p_{g}\) and \(p_{r}\).
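
A PyTorch sketch of this critic loss with gradient penalty is given below; the penalty coefficient of 10 is the value commonly used for WGAN-GP and is an assumption, since the paper does not state \(\lambda\).

```python
import torch

def critic_loss(critic, sharp, fake, lam=10.0):
    """Eq. (9): Wasserstein critic loss with gradient penalty (lam is the penalty weight)."""
    # fake is assumed to be detached from the generator graph
    loss = critic(fake).mean() - critic(sharp).mean()
    # sample x_hat uniformly along straight lines between real and generated samples
    eps = torch.rand(sharp.size(0), 1, 1, 1, device=sharp.device)
    x_hat = (eps * sharp + (1 - eps) * fake).requires_grad_(True)
    d_hat = critic(x_hat)
    grad = torch.autograd.grad(outputs=d_hat, inputs=x_hat,
                               grad_outputs=torch.ones_like(d_hat),
                               create_graph=True)[0]
    penalty = ((grad.view(grad.size(0), -1).norm(2, dim=1) - 1) ** 2).mean()
    return loss + lam * penalty
```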

4 Experimental evaluation

4.1 Experimental settings

The following describes the validation of the proposed method and algorithms. First, benchmark datasets are used to test performance; the method is then tested on images obtained from the Chang’e 3 space mission. All proposed models are implemented with the PyTorch deep learning framework, and training and testing are carried out on an NVIDIA GTX 1070 GPU and an Intel i7-4790 CPU. The first model, which uses SE-ResBlock, is referred to as \({\text{EDGAN}}_{\text{SE}}\); the second, which adds gradient loss, as \({\text{EDGAN}}_{{{\text{SE}} - {\text{G}}}}\); and the third, which uses SE-ResBlock together with resize convolution, as \({\text{EDGAN}}_{{{\text{SE}} - {\text{R}}}}\). Random crops of \(250 \times 250\) pixels from the GOPRO dataset [10] are used to train the proposed models. The optimisation algorithm is ADAM [34] with a learning rate of \(10^{ - 4}\). To speed up convergence, the learning rate is kept fixed for the first 150 epochs and then gradually reduced to zero over the following 150 epochs. To balance the generator and the discriminator, the discriminator performs five gradient descent steps for every generator step. The weights \(\alpha\) and \(\beta\) of the generator loss function are set to 1 and 5, respectively. The approaches proposed in this paper do not adopt batch normalisation layers. Other specific differences are shown in Table 2.
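
As an illustration of this training schedule, the following PyTorch sketch sets up the ADAM optimiser with the constant-then-linear-decay learning rate described above; the helper name and the LambdaLR wiring are implementation assumptions, not taken from the original code.

```python
import torch

def make_optimizer_and_scheduler(model, lr=1e-4, fixed_epochs=150, decay_epochs=150):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    def lr_lambda(epoch):
        # constant learning rate for the first 150 epochs, then linear decay to zero
        if epoch < fixed_epochs:
            return 1.0
        return max(0.0, 1.0 - (epoch - fixed_epochs) / float(decay_epochs))
    sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda)
    return opt, sched

# as noted above, the discriminator is stepped five times for every generator step
```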

Table 2 Configuration differences of the proposed models

4.2 Datasets

4.2.1 GOPRO dataset

The GOPRO dataset consists of 3214 image pairs, of which 2103 pairs form the training set and 1111 pairs the test set. The dataset is built from 240 fps high-frame-rate video captured with a GoPro Hero 4 camera: groups of about 7–14 consecutive frames are averaged to obtain a blurry image, and the middle frame of each group is taken as the sharp label image. Since this dataset has been used in many related studies, it is also used in this paper for ease of comparison. The proposed model is compared with Kupyn et al. [12], and the results are shown in Table 3; image details are shown in Fig. 4. The results show that EDGAN restores image details better. The comparison between \({\text{EDGAN}}_{\text{SE}}\) and \({\text{EDGAN}}_{{{\text{SE}} - {\text{R}}}}\) is shown in Fig. 5: deconvolution causes abnormal colour artefacts in the image texture, which resize convolution eliminates.
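
The frame-averaging procedure used to build such blurry/sharp pairs can be sketched as follows; this is a simplified illustration that ignores camera response function handling, and the helper name is hypothetical.

```python
import numpy as np

def synthesize_blur_pair(frames):
    """GOPRO-style pair: average a group of consecutive frames to get the blurry image,
    and use the middle frame as the sharp label (frames: list of HxWx3 arrays in [0, 1])."""
    blurry = np.mean(np.stack(frames, axis=0), axis=0)
    sharp = frames[len(frames) // 2]
    return blurry, sharp
```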

Table 3 Mean PSNR and SSIM on GOPRO dataset of 1111 images
Fig. 4
figure 4

Comparison of deblurred images by the proposed model and DeblurGAN [12] on one of the images taken from GOPRO dataset

Fig. 5
figure 5

Comparison of deblurred images by deconvolution and resize convolution

4.2.2 Lai dataset

The Lai dataset includes a real subset and a synthetic subset. The real subset consists of 100 blurred images collected from real-world scenes, captured with different devices, settings, and subjects. The synthetic subset includes 100 blurred images generated by convolving 25 sharp images collected from the Internet with four different blur kernels; the blur kernels are generated from recorded 6D camera trajectories. Only the synthetic subset is used in the experiments, and the results are shown in Table 4. EDGAN shows superior results both qualitatively and quantitatively. Deblurred images from tests on the Lai dataset are shown in Fig. 6.

Table 4 Mean PSNR and SSIM on Lai dataset of 100 images
Fig. 6
figure 6

Comparison of deblurred images by the proposed model and DeblurGAN [12] on one of the images taken from Lai dataset

4.2.3 Chang’e 3 space mission images [35]

The impact of the proposed algorithm is also evaluated using the open-source image data available from the Chang’e 3 space mission. China’s Chang’e 3 mission was launched successfully, and its rover “Yutu” landed on the lunar surface in December 2013; it is one of many successful Chinese space missions of recent decades. One of the projects led by an author of this paper, Prof. K. L. Yung [1, 2, 36], and his team at the Hong Kong Polytechnic University was the design and development of the camera pointing system mounted on the lander. The equipment operated on the moon for over 3 years before being retired in 2017.

The following simulated experiments were conducted to test the algorithm, which can be used to improve the design and implementation of future Chang’e missions. Ten pictures were randomly selected from the Chang’e 3 and rover “Yutu” image databases, and the corresponding blurred images were artificially generated and then restored. The blurred images were generated by convolving a sharp image with a motion blur operator. The motion blur operator has two parameters, the angle \(\theta\) and the pixel number \(n\); the convolution simulates the photographed object moving by \(n\) pixels at an angle of \(\theta\) degrees counter-clockwise. The results are shown in Tables 5, 6, 7, and 8, and the deblurred images in Figs. 7, 8, 9, and 10.
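
A possible NumPy/SciPy sketch of this motion blur operator, parameterised by the pixel number \(n\) and angle \(\theta\), is shown below; the construction via a rotated line kernel is an assumption about implementation details.

```python
import numpy as np
from scipy.ndimage import convolve, rotate

def motion_kernel(n=20, theta=15.0):
    """Linear motion blur operator: movement of n pixels at theta degrees counter-clockwise."""
    k = np.zeros((n, n))
    k[n // 2, :] = 1.0                            # horizontal streak of length n
    k = rotate(k, theta, reshape=False, order=1)  # rotate counter-clockwise by theta degrees
    return k / k.sum()

def apply_motion_blur(image, n=20, theta=15.0):
    """Convolve each channel of an HxWx3 image in [0, 1] with the motion kernel."""
    k = motion_kernel(n, theta)
    return np.stack([convolve(image[..., c], k, mode='reflect') for c in range(3)], axis=-1)
```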

Table 5 Mean PSNR and SSIM on Chang’e 3 image database of 10 images (\(n = 20, \theta = 15\))
Table 6 Mean PSNR and SSIM on Chang’e 3 image database of 10 images (\(n = 10, \theta = 15\))
Table 7 Mean PSNR and SSIM on rover “Yutu” image database of 10 images (\(n = 20, \theta = 15\))
Table 8 Mean PSNR and SSIM on rover “Yutu” image database of 10 images (\(n = 10, \theta = 15\))
Fig. 7
figure 7

Chang’e 3 image database \(n = 20, \theta = 15\)

Fig. 8
figure 8

Chang’e 3 image database \(n = 10, \theta = 15\)

Fig. 9
figure 9

Rover “Yutu” image database \(n = 20, \theta = 15\)

Fig. 10
figure 10

Rover “Yutu” image database \(n = 10, \theta = 15\)

Two conclusions are drawn from the simulated experiments: (1) although resize convolution eliminates draughtboard artefacts, it causes PSNR and SSIM to decrease; and (2) when the blur is severe (\(n = 20\), \(\theta = 15\)), DeblurGAN does not restore image details well, especially line structures, whereas SE-ResBlock restores these details better.

5 Discussion

This paper demonstrates through comparative experiments on different datasets that the images generated by the proposed algorithm are sharper. Although the algorithm performs well on image deblurring, it cannot process high-resolution images in real time because the network structure has not been further optimised for efficiency. In addition, for image frames taken from video, the proposed algorithm does not support multiple images as input, so interframe information cannot be exploited. Future work will focus on improving the speed of the algorithm and on extracting interframe information.

6 Conclusion

In this paper, EDGAN, a novel model designed to restore motion-blurred images, is proposed. EDGAN sets a new state of the art on public benchmark datasets in terms of the PSNR and SSIM metrics [37]. In addition, it is confirmed that using resize convolution as the up-sampling method effectively eliminates “draughtboard artefacts” in the generated images, although resize convolution reduces the quality of image details; eliminating colour artefacts without reducing the performance of EDGAN will be part of future work. Moreover, the method is tested and evaluated with future Chang’e space missions in mind. The camera pointing system developed by the Hong Kong Polytechnic University in early 2013 was used to capture images of the moon as well as the movement of the rover. It was capable of 360-degree image capture, as well as positioning and navigation of the rover. Based on this experience, a new deep learning algorithm for future space missions is proposed. The results indicate that deep learning can achieve good performance in high-precision image restoration and can be incorporated into the design of the camera pointing system’s cylindrical projection of sequential images for image construction, as well as into feature recognition in future deep space exploration.