1 Introduction

Dynamic scene motion deblurring aims to recover the original sharp image from a blurry image caused by camera shake, moving objects, or low shutter speeds. Blur artifacts significantly degrade the quality of captured images, which is harmful to many high-level vision applications, e.g., face recognition, surveillance, and autonomous driving systems. Therefore, an accurate and efficient technique for eliminating blur artifacts and recovering sharp images is highly desirable. To handle the blind motion deblurring problem, many conventional approaches attempt to estimate the blur kernel via hand-crafted priors [1,2,3,4,5,6,7]. However, estimating a satisfactory blur kernel remains a challenging computer vision problem. Such hand-crafted priors can hardly generalize to complex real-world examples, which results in degraded performance. To address these challenges, many deep learning-based approaches [8,9,10,11,12,13,14,15,16] utilize neural networks to deblur images in an end-to-end manner and significantly improve performance. In particular, among these architectures, the coarse-to-fine scheme has been widely employed to restore the blurred image at either multiple scales (MS) [8,9,10,12] or multiple patch (MP) levels [11,13,14,15,16]. However, these methods usually employ a vanilla convolution filter as the base module, which suffers from two main issues in motion deblurring tasks:

Fig. 1.

The vanilla convolution and the proposed DDF. For DDF, both the offsets and the filter weights are adaptively predicted from multi-scale local regions. The red squares are the sampling positions, and the different colors of the cubes indicate the spatially-varying filters of DDF. (Color figure online)

1) It is difficult to handle large variations in motion magnitude. More specifically, vanilla convolution filters apply the identical geometric shape and receptive field to every pixel of an image, whereas the magnitude of motion blur varies across regions (e.g., moving-vehicle pixels vs. sky). To tackle this problem, extremely deep networks and multi-scale architectures have been exploited to enhance the generalization ability and handle large motion variance. Consequently, this kind of approach suffers from very heavy computational complexity and is hard to deploy on lightweight devices, e.g., smartphones and self-driving cars.

2) The weights of vanilla convolution filters are content-agnostic, as shown in Fig. 1(a), regardless of the texture of the local region. A spatially shared filter can be sub-optimal for extracting features across all pixels. In addition, once the network has been trained, the identical filters are applied to different images, leading to ineffective feature extraction.

To tackle the problems mentioned above, in this work we focus on the design of an efficient and adaptive filtering module, dubbed Decomposed Dynamic Filter (DDF), which is illustrated in Fig. 1(b). DDF decouples a conventional convolution filter into adaptive offsets and adaptive weights. The proposed DDF consists of two major components:

1) The offsets estimation module, which learns from local multi-scale features and generates the optimal filter offsets. The proposed module can be trained end-to-end without explicit supervision. Deformable convolution [17, 18] has been proposed to adapt to local image geometry and enlarge the receptive field without sampling extra pixels. However, the offsets estimator of deformable convolution is merely a single-layer structure. Consequently, it does not perform well in large motion situations, which frequently occur in dynamic scenes. To address this issue, we propose a novel adaptive offsets estimation module that can generate pixel-level representative offsets across the multi-scale spatial space. The proposed offsets estimator naturally solves the large motion problem by capturing long-range dependencies from multi-scale feature regions.

2) The pixel-specific filter weight generator. The weight of DDF is dynamically reconstructed as a linear combination of a filter basis with assembling coefficients. Both the filter basis and the assembling coefficients are adaptively learned from multi-scale local regions by lightweight sub-networks. In contrast to dynamic filter-based methods [19, 20], which explicitly predict all pixel-specific filter weights from feature maps, our method achieves a better trade-off between performance and memory usage. Finally, the learned weights and offsets are combined into a dynamic filter that extracts adaptive features for the motion deblurring task. Deploying the proposed DDF in a motion deblurring network enables a compact structure, which a conventional network [9, 11, 13] cannot achieve without stacking a large number of subnet blocks.

Overall, our contributions can be summarized as follows:

• We propose a novel adaptive operator module, DDF, which is capable of learning sampling positions and filter weights simultaneously, thereby adaptively solving the motion variation problem. Moreover, the dynamic nature of DDF enables us to design an efficient network structure without sacrificing accuracy.

• We conduct extensive ablation evaluations on multiple benchmarks and confirm that the spatial adaptability of DDF can empower various vanilla networks, e.g., U-Net [21], to achieve state-of-the-art performance.

• To demonstrate the generalization capability of DDF and verify its effectiveness as a plug-and-play adaptive operator, we also plug it into baseline networks for the real image noise removal task, which significantly improves performance.

2 Related Works

2.1 Single-Scale Deep Image Deblurring

Single-scale approaches [22,23,24] aim to recover blurred images in an end-to-end manner. For instance, DeblurGAN [22] adopts adversarial learning with multiple residual blocks to restore the sharp image. DeblurGAN-v2 [23] advances DeblurGAN [22] by employing a much deeper encoder-decoder architecture. However, GAN-based methods often introduce unexpected artifacts into the image and struggle to handle large motion. Yuan et al. [24] utilized optical flow information to guide the offset generation of deformable convolutions [18]. However, optical flow information is not always available in real-world applications.

2.2 Multi-scale, Multi-patch-Based Approaches

Multi-scale approaches have been verified as an effective direction in image restoration. In pioneering work, Nah et al. [8] proposed a multi-scale deblurring network, which starts from a coarse scale and progressively deblurs the image at finer scales. Tao et al. [9] proposed SRN, a scale-recurrent network that aggregates features from different scales, so that motion information from the previous coarser scale can be exploited in subsequent processing. Cho et al. [25] proposed MIMO-UNet, which employs a multi-scale U-Net structure to deblur the image in a coarse-to-fine strategy. The approach in [10] proposed a sharing scheme that takes the different levels of blur at each stage into consideration. Zhang et al. [11] introduced a multi-patch hierarchical scheme (DMPHN) that keeps the spatial resolution without any image down-sampling. MPRNet [13] combines the multi-patch hierarchical structure with a global attention mechanism to further advance the state of the art.

However, these methods usually adopt vanilla convolution kernels and consist of multiple large subnetworks, which leads to long processing times. Recently, vision transformers (ViTs), with their ability to model long-range dependencies, have shown promising performance in image restoration tasks [26,27,28]. However, as suggested in [29], the execution time of SwinIR [28] is approximately 1.99 s on the GoPro dataset, which is unacceptable for real-time applications. In contrast, we attempt to empower a lightweight network with pixel-adaptive ability, resulting in a lighter network for real-world applications.

2.3 Dynamic Filters

Image-level dynamic filters predict filter weights from input features at the image level. In particular, DyNet [30], DynamicConv [31], and CondConv [32] predict coefficients that combine several expert filters via attention mechanisms. However, since the dynamic weights are generated at the image scale, such methods can hardly handle motion blur, which varies at the pixel level.

Pixel-level dynamic filters [17,18,19,20, 33,34,35,36,37,38,39,40,41,42] further extend the adaptiveness to the spatial dimension by using a per-pixel dynamic filter whose weights are predicted from the input features. Deformable convolutions [17, 18] learn a per-position offset map to adapt to geometric variations while keeping the kernel weights fixed. Su et al. [19] proposed to adaptively generate pixel-specific [20] filters from an input image. CARAFE [33, 34] proposed an adaptive operator for feature map upsampling, where an auxiliary branch predicts a 2D filter at each pixel. However, such channel-wise shared filters can hardly capture cross-channel information, leading to sub-optimal results. Furthermore, various dynamic filter-based methods have been applied to other computer vision tasks, such as video frame interpolation [35], video denoising [36], super-resolution [37, 38], semantic segmentation [39, 40], and point cloud segmentation [41, 42].

However, the abovementioned methods directly predict all the parameters of the dynamic filters, which requires a large amount of memory for storing the gradients during backpropagation. In contrast, our method learns to predict dynamic pixel-adaptive convolution kernels in a memory-efficient manner while inheriting the merits of the pixel-adaptive paradigm; detailed comparison results are given in Sect. 3.4.

3 Our Approach

3.1 Overview

The success of dynamic convolutions [17,18,19,20, 33, 34] suggests that adaptively predicting the filter weights at run-time improves accuracy. However, existing operators lack the ability to dynamically adapt to geometric variations and local image textures at the same time. To this end, we propose DDF, an operator that benefits from modeling both geometric variations and local textures. The learning framework of DDF is shown in Fig. 2. In this section, we first revisit the general concept of dynamic convolutions and then introduce the architecture of the proposed DDF in detail.

Fig. 2.

The learning framework of DDF, which consists of two components: 1) The offsets estimator and 2) the pixel-specific filter weight generator.

3.2 Decomposed Dynamic Filters

We begin by revisiting vanilla convolution in order to define the proposed DDF. Let \( \textbf{X} \in \mathbb {R}^{H \times W \times C} \) denote the input feature map, where H, W, and C represent its height, width, and number of channels, respectively. \(\textbf{X}_{i, j} \in \mathbb {R}^{C}\) is the feature vector of \( \textbf{X}\) at the i-th row and j-th column. We also use \(\mathcal {M}_{\textbf{X}_{i, j}}^{\delta } \in \mathbb {R}^{\delta \times \delta \times C}\) to represent the size-\(\delta \) square region centered at (i, j) inside the tensor \(\textbf{X}\). Hence, the conventional vanilla convolution can be expressed as

$$\begin{aligned} \textbf{X}^{\prime }_{i, j}=\mathcal {F}\left( \mathcal {M}_{\textbf{X}_{i, j}}^{\delta } ; \mathbf {\Theta }\right) , \end{aligned}$$
(1)

where \(\mathcal {F}\) denotes the convolution operation, \(\textbf{X}^{\prime }\) is the output feature map, \({\delta }\) refers to the kernel size, and the convolution filter parameter \(\mathbf {\Theta }\) remains the same across all pixels of the image. In contrast, the proposed pixel-adaptive operator DDF is conditioned on the corresponding local region of the feature map \(\textbf{X}\), which is formulated as

$$\begin{aligned} \hat{\textbf{X}}_{i, j}=\mathcal {F}\left( \mathcal {M}_{\textbf{X}_{i, j}}^{{\delta }^{\prime }} ; \mathbf {\Theta }_{DDF} \mid \mathcal {M}_{\textbf{X}_{i, j}}^{{\delta }^{\prime }}\right) , \end{aligned}$$
(2)

where \(\hat{\textbf{X}}\) is the output feature map generated by DDF, and the parameter \(\mathbf {\Theta }_{DDF}\) of DDF is composed of two independent parts: 1) the pixel-specific filter weights \(\textbf{D}\), which consist of a list of spatially-varying filters \(\left\{ \textbf{D}_{i,j}\right\} \in \mathbb {R}^{K^2 \times C}\), where \(i \in \{1,2, \ldots , H\}, j \in \{1,2, \ldots , W\}\); and 2) the adaptive offsets \(\textbf{S}\), \(\left\{ \textbf{S}_{i,j}\right\} \in \mathbb {R}^{K^2 \times 2}\), where \(i \in \{1,2, \ldots , H\}, j \in \{1,2, \ldots , W\}\), and K is the kernel size. To handle large motion variations for all pixels and enable DDF to see a larger region of the corresponding feature area, we utilize a set of \(K \times K\) atrous convolution filters \(\left\{ \boldsymbol{W}^{r}\right\} _{r=1}^{n}\) with dilation rate r to extract features for the subsequent filter prediction. \({\delta }^{\prime }\) refers to the maximum receptive field of the atrous convolution filter set \(\left\{ \boldsymbol{W}^{r}\right\} _{r=1}^{n}\) and is calculated as \({\delta }^{\prime }=K+(K-1)(r-1)\); for example, \(K=3\) and \(r=3\) give \({\delta }^{\prime }=7\). More specifically, the proposed DDF is represented as

$$\begin{aligned} \hat{\textbf{X}}_{i, j, k}=\sum _{(u, v) \in \varDelta _K} \textbf{D}_{i, j, u+\lfloor K / 2\rfloor , v+\lfloor K / 2\rfloor , k} \textbf{X}_{i+u +\varDelta {x^{u+\lfloor K / 2\rfloor ,v+\lfloor K / 2\rfloor }_{i,j}}, j+v+\varDelta {y^{u+\lfloor K / 2\rfloor ,v+\lfloor K / 2\rfloor }_{i,j}}, k}, \end{aligned}$$
(3)

where \(\boldsymbol{\varDelta }_K \in \mathbb {Z}^2\) indicates the set of sampling positions of the convolution operation, written as \(\boldsymbol{\varDelta }_K=[-\lfloor K / 2\rfloor , \cdots ,\lfloor K / 2\rfloor ] \times [-\lfloor K / 2\rfloor , \cdots ,\lfloor K / 2\rfloor ]\) (\(\times \) is the Cartesian product), and \( \{\varDelta {x_{i,j}}, \varDelta {y_{i,j}} \} \in \textbf{S}_{i,j}\) are the learnable offsets in the horizontal and vertical directions, respectively. k is the channel index; since we use depth-wise convolution, each kernel in \(\textbf{D}\) has C channels instead of \(C_{\text {in}} \times C_{\text {out}}\). Our work aims to design a filtering operator with a content-adaptive property. In contrast to the content-agnostic vanilla convolution operator, the proposed dynamic filter leverages two meta branch modules to learn the parameters \(\textbf{D}_{i, j}\) and \(\textbf{S}_{i,j}\) from the local feature region, formulated as follows:

$$\begin{aligned} \textbf{D}_{i, j}=\mathbf {\Phi }\left( \mathcal {M}_{\textbf{X}_{i,j}}^{{\delta }^{\prime }} ; \theta _D\right) \end{aligned}$$
(4)
$$\begin{aligned} \textbf{S}_{i,j}=\mathbf {\Psi }\left( \mathcal {M}_{\textbf{X}_{i,j}}^{{\delta }^{\prime }} ; \theta _S\right) , \end{aligned}$$
(5)

where \(\mathbf {\Phi (\cdot )}\) and \(\mathbf {\Psi (\cdot )}\) are the meta generation networks parameterized by \(\theta _D\) and \(\theta _S\), responsible for filter weight learning and offsets learning, respectively. More details are given in the following sections.
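
To make the formulation concrete, the sketch below implements Eq. (3) as a naive reference in PyTorch: each of the \(K^2\) kernel taps is sampled at its offset position with bilinear interpolation and weighted by the per-pixel, depth-wise filter. The tensor layouts, the use of `grid_sample`, and all names are our own illustrative assumptions; the paper does not prescribe an implementation.

```python
import torch
import torch.nn.functional as F

def ddf_apply(x, weights, offsets):
    """Naive reference of the DDF filtering step in Eq. (3).

    x:       (B, C, H, W) input feature map X
    weights: (B, C, K*K, H, W) per-pixel depth-wise filter weights D
    offsets: (B, K*K, 2, H, W) per-pixel (dy, dx) offsets S for each kernel tap
    returns: (B, C, H, W) output feature map X_hat
    """
    B, C, H, W = x.shape
    K = int(weights.shape[2] ** 0.5)

    # Pixel-centre coordinates of the output grid.
    ys, xs = torch.meshgrid(
        torch.arange(H, dtype=x.dtype, device=x.device),
        torch.arange(W, dtype=x.dtype, device=x.device),
        indexing="ij")

    out = torch.zeros_like(x)
    taps = [(u, v) for u in range(-(K // 2), K // 2 + 1)
                   for v in range(-(K // 2), K // 2 + 1)]
    for t, (u, v) in enumerate(taps):
        # Fractional sampling position of tap t: regular grid + learned offset.
        sample_y = ys + u + offsets[:, t, 0]                      # (B, H, W)
        sample_x = xs + v + offsets[:, t, 1]                      # (B, H, W)
        grid = torch.stack((2.0 * sample_x / max(W - 1, 1) - 1.0,
                            2.0 * sample_y / max(H - 1, 1) - 1.0), dim=-1)
        sampled = F.grid_sample(x, grid, mode="bilinear",
                                padding_mode="zeros", align_corners=True)
        # Depth-wise, pixel-specific weight of tap t.
        out = out + weights[:, :, t] * sampled
    return out
```

A vectorized implementation would fold the loop over taps into a single gather, but the loop form mirrors the summation in Eq. (3) most directly.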

3.3 The Offsets Estimation Module

Deformable convolution [17, 18] merely employs a one-layer vanilla convolution with an identical receptive field to estimate the offset map for the entire input feature. However, the same receptive field cannot handle large motion variations for all pixels, leading to sub-optimal offset estimation. Furthermore, meaningful context information can hardly be captured with such a limited receptive field. To address this issue, we propose a multi-scale dynamic offsets estimator, which enables DDF to see a larger region of the corresponding feature area and generate optimal offsets. Our offsets estimator \(\mathbf {\Psi (\cdot )}\) is composed of two parts arranged in sequence: the offsets extractor and the offsets refiner.

The Offsets Extractor

Atrous convolution has been verified as a powerful operator for enlarging the receptive field. To this end, our offsets estimator \(\mathbf {\Psi (\cdot )}\) first utilizes a set of \(K \times K\) atrous convolution filters \(\left\{ \boldsymbol{W}^{r}\right\} _{r=1}^{n}\) with dilation rate r to extract n corresponding sets of multi-scale offsets \(\mathbf {\hat{S}}=\left\{ \mathbf {\hat{S}}^0_{i,j}, \ldots , \mathbf {\hat{S}}^{n-1}_{i,j} \mid \mathbf {\hat{S}}_{i j} \in \mathbb {R}^{K \times K \times 2}\right\} \), along with the modulation scalars \(\mathbf {\Delta }m=\left\{ \mathbf {\Delta }m^{0}_{i,j}, \ldots , \mathbf {\Delta }m^{n-1}_{i,j} \mid \mathbf {\Delta }m_{i j} \in \mathbb {R}^{K \times K }\right\} \). Because the modulation scalar \(\mathbf {\Delta }m\), introduced in Deformable ConvNets v2 [18], evaluates the reliability of each of the \(K \times K\) offsets, we feed these candidates into a learnable guided refiner that decides the final offsets \(\mathbf {{S}}_{i j}\) at position (i, j) from the multi-scale candidate offsets.
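
A minimal PyTorch sketch of such a multi-scale offsets extractor is given below: one \(K \times K\) atrous convolution per dilation rate, each predicting \(2K^2\) offset channels and \(K^2\) modulation channels per pixel. The channel layout, the sigmoid on the modulation scalars, and all names are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class OffsetsExtractor(nn.Module):
    """Multi-scale offsets extractor: one K x K atrous conv per dilation rate."""

    def __init__(self, channels, K=3, n=3):
        super().__init__()
        self.K, self.n = K, n
        # Each branch predicts K*K (dy, dx) offsets plus K*K modulation scalars.
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, 3 * K * K, kernel_size=K,
                      padding=r * (K // 2), dilation=r)
            for r in range(1, n + 1)
        ])

    def forward(self, x):
        offsets, modulations = [], []
        for branch in self.branches:
            out = branch(x)                                     # (B, 3*K*K, H, W)
            off, mod = out.split([2 * self.K ** 2, self.K ** 2], dim=1)
            offsets.append(off)                                 # candidate offsets S^r
            modulations.append(torch.sigmoid(mod))              # confidence Δm^r
        # n candidate sets, later merged by the offsets refiner.
        return torch.stack(offsets, dim=1), torch.stack(modulations, dim=1)
```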

The Offsets Refiner

To adaptively select the offsets from the n sets of generated candidates, we design a sub-module named the offsets refiner. Intuitively, a larger confidence value indicates a better offset estimate. To this end, the learnable guided refiner decides the final offsets by selecting, for each position, the candidate with the maximum confidence value. We use \(\mathcal {G} \in \mathbb {R}^{K \times K}\) to denote the refined index, given the candidate modulation scalar set \(\mathbf {\Delta }m \in \mathbb {R}^{K \times K \times n}\). For each position (u, v) in the spatial domain, we have

$$\begin{aligned} \mathcal {G}_{u, v}={\text {argmax}}\left( \mathbf {\Delta }m_{u, v}^{0}, \mathbf {\Delta }m_{u, v}^{1}, \cdots , \mathbf {\Delta }m_{u, v}^{n-1}\right) , \end{aligned}$$
(6)

where \({\text {argmax}}(\cdot )\) returns the index of the maximum value. Values in the refined index therefore range from 0 to \(n-1\) and indicate which candidate offsets should be chosen for the corresponding positions. However, \({\text {argmax}}(\cdot )\) is not differentiable, so its gradient cannot be obtained directly. To solve this problem, we employ a temperature-controlled softmax with Gumbel noise to obtain gradients for \({\text {argmax}}(\cdot )\) during backpropagation. As detailed in Eq. 7, \(\zeta \) is noise sampled from the Gumbel(0, 1) distribution, and \(\tau \) is the temperature parameter controlling the distribution: as \(\tau \) increases, the distribution becomes more uniform, and as \(\tau \) approaches 0, it approaches a one-hot distribution.

$$\begin{aligned} \mathcal {G}_{u, v}^{j}=\frac{e^{\left( \mathcal {G}_{u, v}^{j}+\zeta \right) / \tau }}{\sum _{m=0}^{n-1} e^{\left( \mathcal {G}_{u, v}^{m}+\zeta \right) / \tau }} \quad j \in [0, n-1] \end{aligned}$$
(7)

Therefore, the refiner module can be trained with the task loss and yields, for each pixel, the optimal offsets with respect to the different scales.
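
The following sketch shows one plausible realization of the refiner (Eqs. 6–7) using PyTorch's built-in Gumbel-softmax: the modulation scalars are treated directly as the logits of the candidate-selection distribution, and a straight-through one-hot selection picks one candidate offset per kernel tap. Tensor layouts and the straight-through trick are our assumptions.

```python
import torch
import torch.nn.functional as F

def refine_offsets(offsets, modulations, tau=1.0):
    """Select one offset candidate per kernel tap (Eqs. 6-7), sketch version.

    offsets:     (B, n, 2*K*K, H, W) candidates, (dy, dx) interleaved per tap
    modulations: (B, n, K*K, H, W) confidence (modulation) scalars
    returns:     (B, 2*K*K, H, W) refined offsets
    """
    B, n, KK, H, W = modulations.shape
    # Gumbel noise + temperature-controlled softmax over the n candidates.
    # hard=True makes the forward pass one-hot (argmax) while gradients flow
    # through the soft weights (straight-through estimator).
    logits = modulations.permute(0, 2, 3, 4, 1).reshape(-1, n)
    sel = F.gumbel_softmax(logits, tau=tau, hard=True, dim=-1)
    sel = sel.reshape(B, KK, H, W, n).permute(0, 4, 1, 2, 3)    # (B, n, K*K, H, W)
    # Apply the same per-tap selection to both the dy and dx channels.
    sel = sel.repeat_interleave(2, dim=2)                       # (B, n, 2*K*K, H, W)
    return (sel * offsets).sum(dim=1)
```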

Fig. 3.

The architecture of DDF-UNet (a) with the DDF Bottleneck module (b) and the DDF-Upsampling module (c). DDF-UNet is a unified framework suitable for various image restoration tasks, e.g., real-world image denoising.

3.4 The Pixel-Specific Filter Weight Generator

Previous dynamic filter-based methods propose to generate pixel-adaptive filter weights from local features. However, generating such a large number of filters (\(C_{in} \times C_{out} \times K \times K\) per pixel) causes extremely large memory usage, where \(C_{in}\) and \(C_{out}\) are the numbers of input and output channels, respectively, because the gradient of every pixel's filter must be stored during backpropagation (e.g., for an input feature map of size \(H \times W \times C_{in}\), the gradient size is \(H \times W \times C_{in} \times C_{out} \times K \times K\)). Thus, image-level dynamic filters such as CondConv [32] and DynamicConv [31] can hardly be extended to pixel-level filters.

Motivated by the observation that a traditional convolution filter can be well represented by a linear combination of a low-rank filter basis with decomposition coefficients without losing performance [43], we propose the pixel-specific filter weight generator \(\mathbf {\Phi (\cdot )}\). The generated dynamic filter \(\textbf{D}_{i, j}\in \mathbb {R}^{K \times K \times C}\) is decomposed into a pixel-adaptive basis \(\textbf{B}_{i,j}\in \mathbb {R}^{m \times K \times K}\) and dynamic coefficients \(\textbf{A}_{i,j} \in \mathbb {R}^{C \times m}\), formulated as \(\textbf{D}_{i,j}=\textbf{B}_{i,j} \textbf{A}_{i,j}\), where m is a pre-defined small value, e.g., \(m=4\).

The work in [43] leaves the coefficients \(\textbf{A}\) as a global parameter shared throughout the image; however, such globally shared filters can hardly capture cross-channel information. Hence, we make the coefficients \(\textbf{A}\) a pixel-specific dynamic parameter and utilize a two-layer MLP (multilayer perceptron) to capture cross-channel variations. The pixel-adaptive basis \(\textbf{B}\) and dynamic coefficients \(\textbf{A}\) are generated as follows:

$$\begin{aligned} \textbf{B}_{i, j}\in \mathbb {R}^{m \times K \times K}=\mathcal {B}\left( \mathcal {M}_{\textbf{X}_{i,j}}^{\delta } ; \theta _B\right) ; \textbf{A}_{i, j}\in \mathbb {R}^{C \times m}=\mathcal {A}\left( {\textbf{X}_{i,j}} ; \theta _A\right) , \end{aligned}$$
(8)

where \(\mathcal {B}(\cdot )\) and \(\mathcal {A}(\cdot )\) are the generation networks with parameters \(\theta _B\) and \(\theta _A\), respectively. In our experiments, \(\mathcal {B}(\cdot )\) is implemented by one \(1 \times 1\) convolution followed by a single \(K \times K\) atrous convolution with dilation rate \(r=n\) (the same as in the offsets estimator) in order to capture long-range dependencies. As for \(\mathcal {A}(\cdot )\), we only utilize a two-layer MLP to generate the coefficients \(\textbf{A}\). With the help of the decomposed coefficients, the number of parameters is significantly reduced from (HWKKC) to \(\left( CKKm + mHW\right) \). Specifically, m can be set to a small value, e.g., \(m=1\), yielding approximately a \((HWKKC)/(HWKK+H W) \approx C\)-fold reduction in parameters.
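
A minimal sketch of this weight generator is shown below: the basis branch uses a \(1 \times 1\) convolution followed by an atrous convolution, the coefficient branch uses a two-layer point-wise MLP, and the per-pixel depth-wise filter is assembled from the basis and coefficients. Hidden widths, the activation, and the output layout are our assumptions; only the overall structure follows the text.

```python
import torch
import torch.nn as nn

class FilterWeightGenerator(nn.Module):
    """Pixel-specific filter weights as basis x coefficients (Eq. 8), sketch."""

    def __init__(self, channels, K=3, m=4, dilation=3):
        super().__init__()
        self.K, self.m = K, m
        # B(.): 1x1 conv followed by a K x K atrous conv (long-range context).
        self.basis_net = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Conv2d(channels, m * K * K, kernel_size=K,
                      padding=dilation * (K // 2), dilation=dilation))
        # A(.): two-layer point-wise MLP over the channel dimension.
        self.coeff_net = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels * m, kernel_size=1))

    def forward(self, x):
        b, c, h, w = x.shape
        basis = self.basis_net(x).view(b, self.m, self.K * self.K, h, w)
        coeff = self.coeff_net(x).view(b, c, self.m, h, w)
        # Per pixel: a (C x m) @ (m x K^2) product yields the depth-wise filter.
        weights = torch.einsum("bcmhw,bmkhw->bckhw", coeff, basis)
        return weights  # (B, C, K*K, H, W), usable by the ddf_apply sketch above
```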

4 Experiments

4.1 Datasets

To evaluate the proposed method, we conduct extensive experiments on three image deblurring benchmark datasets: 1) GoPro [8], which consists of 3,214 image pairs, of which 2,103 pairs are used for training our model and the remaining 1,111 pairs for testing; 2) HIDE [44], which contains 2,025 pairs of images, all of which are used for testing; and 3) the RealBlur [45] dataset, whose RealBlur-R and RealBlur-J test sets each consist of 980 image pairs. We train our models on the 2,103 pairs of blurry and sharp images from the GoPro dataset and then test on the test sets of all three datasets. The average PSNR and SSIM [46] values are used for performance comparison.

4.2 Implementation Details

Setting. We train our model for 2000 epochs using the Adam optimizer [47] with the default settings \(\beta _1=0.9\), \(\beta _2=0.999\), and \(\epsilon =10^{-8}\). The learning rate is set to \(2\times 10^{-4}\) and decays to 0 following a power-0.3 schedule. All parameters are initialized using Xavier initialization. The batch size is set to 16. We randomly crop the images into \(256\times 256\) patches for training and also use horizontal flips and rotations for data augmentation. Experiments are conducted on a server with an Intel E5-2690 CPU and 4 NVIDIA Tesla V100 GPUs.
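
The snippet below is a hypothetical reconstruction of this training setup in PyTorch. The optimizer settings come from the text; the scheduler encodes one possible reading of the power-0.3 decay, namely lr·\((1-\text{epoch}/2000)^{0.3}\), and the model is only a placeholder.

```python
import torch
import torch.nn as nn

model = nn.Conv2d(3, 3, kernel_size=3, padding=1)   # placeholder for DDF-UNet
optimizer = torch.optim.Adam(model.parameters(), lr=2e-4,
                             betas=(0.9, 0.999), eps=1e-8)

total_epochs = 2000
# Assumed interpretation: polynomial decay of the base lr toward 0 with power 0.3.
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer, lr_lambda=lambda epoch: (1.0 - epoch / total_epochs) ** 0.3)

for epoch in range(total_epochs):
    # ... one training epoch over 256x256 random crops with flip/rotation ...
    scheduler.step()
```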

Fig. 4.

The illustration of the learned offsets (a) and filters (b). As shown in (a), for slightly blurred regions the estimated offsets are approximately uniformly distributed over the local area, while for large motion regions the sampling geometry of the filter is adjusted adaptively according to the motion blur pattern. The filter weights also adapt to each local region (b), which confirms that our DDF has indeed learned spatial adaptability. (c) Visualization of the learned offsets of DCN v2 [18] and the proposed offsets estimator; compared to DCN v2, our offsets estimator captures a larger region of context information along the motion pattern. (d) Comparison of the original UNet, UNet equipped with DCN v2, and the proposed DDF-UNet. Thanks to the spatial-adaptive capability of DDF, the large motion-blurred patch is recovered better than by the original UNet and UNet-DCN v2.

Network Architecture. The architecture of our U-Net-based network [21] is illustrated in Fig. 3(a). We also propose an extension operator, DDF-Bottleneck, shown in Fig. 3(b), and a DDF-Upsampling module, shown in Fig. 3(c). For an upsampling rate r, \(r^{2}\) DDFs are used, and pixel-shuffle [48] assembles their output features into the upsampled feature map.
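
The sketch below illustrates this assembly: \(r^{2}\) parallel branches (plain convolutions stand in for DDFs so the snippet runs on its own) produce \(r^{2}\) feature maps that pixel-shuffle rearranges into an r-times larger grid. Everything except the pixel-shuffle assembly described in the text is an assumption.

```python
import torch
import torch.nn as nn

class DDFUpsampling(nn.Module):
    """DDF-Upsampling sketch: r^2 dynamic-filter branches + pixel-shuffle [48]."""

    def __init__(self, channels, r=2):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(channels, channels, kernel_size=3, padding=1)  # placeholder DDF
            for _ in range(r * r)
        ])
        self.shuffle = nn.PixelShuffle(r)

    def forward(self, x):
        # Stack the r^2 branch outputs along channels, then rearrange them
        # into an r-times larger spatial grid.
        feats = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.shuffle(feats)  # (B, C, r*H, r*W)
```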

Table 1. Quantitative comparisons on GoPro [8], HIDE [44], and RealBlur [45] dataset. Our methods and the best results are highlighted.

4.3 Evaluation of DDF

Because our DDF module is a plug-and-play operator, we verify its effectiveness in improving motion deblurring accuracy by plugging it into various deblurring architectures, such as U-Net and MIMO-UNet [25]. For U-Net, we evaluate two variants of our model: 1) DDF-UNet, which consists of 8 encoder blocks and 8 decoder blocks, and 2) DDF-UNet+, which consists of 20 blocks each for the encoder and decoder. The \(\ell _{1}\) loss is used in our implementation, formulated as \( L=\frac{1}{N} \sum _{n=1}^{N}\left\| \hat{\textbf{Y}}^{(n)}-\textbf{Y}^{(n)}\right\| _{1} \), where \(\textbf{Y}^{(n)}\) is the n-th corresponding sharp image, \(\hat{\textbf{Y}}^{(n)}\) is the network output, and N is the number of sample images in a mini-batch.
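
For completeness, a direct transcription of this loss (our own helper, not from the paper) is:

```python
import torch

def batch_l1_loss(pred, target):
    """L = (1/N) * sum_n || Y_hat^(n) - Y^(n) ||_1 over a mini-batch."""
    # Sum the absolute error over each image, then average over the N images.
    return (pred - target).abs().flatten(1).sum(dim=1).mean()
```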

The detailed architecture is shown in Fig. 3(a). For MIMO-UNet and MIMO-UNet+ [25], we replace all the ResBlocks with our DDF-Bottleneck module and replace the upsampling layers with our DDF-Upsampling module, dubbed DDF-MIMO and DDF-MIMO+, respectively. We also use the same loss function as the original work [25], the multi-scale frequency reconstruction (MSFR) loss. Embedding the DDF module contributes remarkable performance gains for both architectures, as detailed in Table 1: e.g., DDF in DDF-UNet leads to a 1.37 dB improvement in PSNR on the GoPro dataset (a qualitative example is shown in Fig. 4), and for DDF-MIMO, a 0.95 dB PSNR gain is obtained by plugging DDF into the MIMO-UNet [25] architecture. These results confirm that the spatial adaptability of DDF can empower a vanilla network to achieve better performance.

4.4 Comparisons with State-of-the-Art Methods

Table 2. Ablation study on the GoPro [8] test set using different component combinations in DDF-UNet, where DCN v2 denotes Deformable Convolution V2 [18]. DDF Offsets denotes the DDF offsets estimation module, and DDF Weight denotes the DDF pixel-specific filter weight generator. The best result is highlighted.
Table 3. Real Image Denoising results on the SIDD [51] and DND [52] datasets. Our methods and the best results are highlighted.

We extensively compare our proposed method with state-of-the-art dynamic scene motion deblurring approaches, including DeepBlur [8], DeblurGAN v1 and v2 [22, 23], SRN [9], Gao et al. [10], DMPHN [11], MPRNet [13], and others. The quantitative results on the GoPro [8], HIDE [44], and RealBlur [45] test sets are listed in Table 1, and visual comparisons are shown in Fig. 5 and Fig. 6. As illustrated in the images, our method outperforms the other approaches in terms of deblurring quality and handles large dynamic motion blur quite well. The quantitative results show that our DDF-MIMO+ outperforms the previous state-of-the-art MPRNet [13] and ranks first in terms of PSNR. In terms of inference time, the proposed DDF-UNet deblurs one image in 41 ms. In contrast, stacked architectures, e.g., SRN [9], DMPHN [11], and SAPHN [16], suffer from expensive computational costs because they need to stack more layers to achieve larger receptive fields. Our method is 17.9\(\times \) and 31\(\times \) faster than SRN [9] and DeepBlur [8], respectively. The experimental results demonstrate the favorable trade-off between performance and computational complexity achieved by the proposed method.

Fig. 5.

Visual comparisons for the GoPro dataset [8].

Fig. 6.

Visual comparisons for the HIDE [44] dataset.

4.5 Ablation Studies

Analyses of the Spatial Adaptability. As illustrated in Fig. 4, DDF handles large dynamic motion blur quite well. We observe that the UNet result suffers from artifacts and insufficient deblurring, as depicted in Fig. 4(d). In contrast, owing to the proposed DDF module, our model is capable of restoring sharper boundaries and richer details in regions containing large motion blur. In addition, according to the visualization of the offset estimation results in Fig. 4(a), for the background region the estimated offsets are approximately uniformly distributed over the area, while for large motion regions the sampling geometry of the filter is adjusted adaptively according to the motion blur pattern. Moreover, as shown in Fig. 4(b), the filter weights also adapt to each local region, which demonstrates that our DDF has acquired spatial adaptability.

The Effectiveness of Offsets Estimator and Refiner. To demonstrate the effectiveness of each component of the proposed DDF, including the adaptive offsets prediction module and the kernel weight prediction module, we compare several versions. The experimental results are listed in Table 2. Version 1 is the original U-Net, serving as the baseline. Version 2 is U-Net equipped with Deformable ConvNets V2 [18]. Version 3 replaces the deformable convolution with the adaptive offsets prediction module; it achieves an increase of 0.46 dB in PSNR, which shows that the offset adaptation ability is enhanced by the proposed module. Compared with deformable convolution, which can only capture a limited area of the motion pattern as shown in Fig. 4(c), the proposed DDF demonstrates stronger fitting capability.

The Effectiveness of the Pixel-Specific Filter Weight Generator. To further evaluate the effectiveness of the pixel-specific filter weight generator, we conduct the Version 4 test, which incrementally adds the filter weight generator to DDF. Thanks to the adaptive adjustment of the weights, the PSNR further increases by 0.43 dB.

The Effectiveness of the DDF-Upsampling Module. Finally, we further add the DDF-Upsampling module to the UNet as the Version 5 test and obtain a 0.21 dB performance gain, which confirms the effectiveness of the DDF-Upsampling module.

Fig. 7.

Visual comparisons for the SIDD [51] dataset.

4.6 Generalization to Real Image Noise Removal Task

To demonstrate the generalization capability of DDF, we also apply DDF-UNet+ to the real-world image noise removal task. Real-world noise removal aims at restoring high-quality images from noisy inputs whose noise is spatially non-uniformly distributed; thus, a spatially-adaptive operator is a natural fit for this kind of problem. The real-world noise removal datasets SIDD [51] and DND [52] are used for evaluation. We train our model on 320 high-resolution images of the SIDD dataset and evaluate on the test sets of SIDD [51] and DND [52]. The quantitative results are listed in Table 3. We compare DDF-UNet+ with state-of-the-art denoising methods, including CBM3D [54], CBDNet [55], RIDNet [56], SADNet [58], DnCNN [53], CycleISP [59], NBNet [61], DANet [57], MIRNet [60], and MPRNet [13]. Our DDF-UNet+ achieves 39.82 dB PSNR on the SIDD [51] dataset, outperforming the other methods in our experiments. Visual results on the SIDD dataset are shown in Fig. 7. These results indicate that DDF can handle spatially-varying degradations and improve performance.

5 Conclusion

In this paper, we proposed a new adaptive plug-and-play operator, DDF, for the challenging task of handling large motion blur variations across spatial locations. The weights and offsets of DDF are adaptively generated from local features by the proposed meta networks, which are trained end-to-end without explicit supervision. We also proposed a U-Net-based architecture powered by our DDF and DDF-Upsampling modules. Extensive experimental results demonstrated that the proposed DDF empowers the baseline to achieve better performance. Furthermore, DDF generalizes to other computer vision tasks as a plug-and-play adaptive operator, e.g., the real image noise removal task, where it also achieves state-of-the-art performance.