1 Introduction

Digital image processing is an extremely computationally demanding, strategic research field. This has led, over the years, to the development of many complex image processing algorithms on highly parallel, specialized hardware platforms. With the rapid progress of parallel hardware, suitably high performance is now available also for sophisticated imaging tasks. In the present study we focus on image denoising, a prerequisite for most other image processing tasks, which refers to the recovery of a clean, sharp image from a noisy observation.

In [20] Rudin, Osher and Fatemi presented one of the main mathematical models and algorithms for image denoising, the so-called Total Variation denoising (also known as the TV model). Since then, the TV model has become a quite popular technique, whose use improves overall image quality when images are affected by noise or corruption, while well preserving edges and details. Efficient solutions have been proposed over time for the numerical optimization of the TV model [3, 4, 11, 25], but they are not able to fully exploit parallel computational resources. In [13], the authors designed a new optimization algorithm which is simple and highly parallelizable, and relies on median value computations, thus reducing the computational effort to a sorting problem. Thanks to its low complexity, this algorithm is well suited to implementation on low-end devices or, more generally, in situations where a reduced amount of resources is available. That is the case of Quantum Image Processing (QIP), a novel and promising research field whose goal is the development of image processing techniques for quantum computers, exploiting peculiar features of the quantum world such as entanglement, superposition and interference [24]. Quantum computing is universally acknowledged for its ability to process data providing computational speedups compared to the classical paradigm. However, image processing is actually one of the most resource-demanding applications for quantum computers. As a consequence, QIP is still in its early stage, and thus faces several fundamental problems, such as how to appropriately represent and store an image in a quantum computer, and how to efficiently implement image processing algorithms [10, 16].

Nowadays, quantum devices are subject to several issues (e.g. noise, absence of error correction, a low number of available qubits), which define what researchers commonly refer to as the Noisy Intermediate-Scale Quantum (NISQ) era [18]. This is an intermediate development phase in which, given the limits of current quantum devices, algorithms have to be designed to be as optimized and simple as possible [2]. In this context, QIP is a complex subject to deal with, because several difficulties may arise when attempting to implement classical image processing algorithms on quantum devices (some of them will be discussed later in this work). Many proposals have been put forward during the past few years [22] in the attempt to provide QIP algorithms that fulfill NISQ requirements. QIP algorithms are usually tested using quantum simulators or similar execution environments.

Along this line, the present work aims to solve the TV model in a quantum environment. The result is a Quantum TV filter, which integrates the Quantum Median Filter proposed by Li et al. [10, 14]. This work focuses not only on the development of a QIP technique for image denoising which mimics its variational TV counterpart, but also on presenting a useful review of the problems encountered in the newly developed QIP field, and on offering possible solutions and ideas for future developments and optimizations.

The paper is organized as follows. In Sect. 2 we introduce the basic theory of the Total Variation denoising technique and a median formula for efficiently solving an anisotropic TV problem. In Sect. 3 we discuss Quantum Image Processing concepts and related issues. In Sect. 4 we explain in detail how the Quantum TV Filter works, highlighting its quantum components, and we analyse its circuit complexity in Sect. 5. In Sect. 6 we focus on the analysis of experimental results, comparing the quantum and classical implementations of the TV algorithm. Section 7 reports conclusions and future work. In the Appendix we briefly report details on a few quantum modules used in the design of the proposed Quantum TV.

For basic notation and fundamental knowledge about quantum image processing and quantum computing, we refer the reader to [24].

2 Total variation image denoising

The goal of denoising is to obtain an image \(u^*\) not only with small variations in intensity between neighboring pixels but also close to the observation f. To this aim, the class of variational methods for image restoration relies on determining the restored image \(u^*\in {\mathbb {R}}^N\), given a noisy image \(f\in {\mathbb {R}}^N\), as the minimizer of a suitable cost functional \(J: {\mathbb {R}}^N \rightarrow {\mathbb {R}}\); typically, restoration is cast as an optimization problem of the form:

$$\begin{aligned} u^* \,\;{\leftarrow }\;\, \arg \min _{u \in {\mathbb {R}}^N} \left\{ \, J(u) \;{:=}\; R(u) \;{+}\; \lambda \, F(u;f) \, \right\} \, , \end{aligned}$$
(1)

where the functionals R(u) and F(u;f), commonly referred to as the regularization and the fidelity term, encode prior information on the clean image u and the observation model, respectively, with the so-called regularization parameter \(\lambda > 0\) controlling the trade-off between the two terms. In particular, the functional form of the fidelity term is strictly connected to the characteristics of the noise corruption. A classical choice for the fidelity measures the data fitting in terms of the \(\ell _2\)-norm, in formulas:

$$\begin{aligned} F(u;f) \,\;{:=}\;\, \, \Vert u - f \Vert _2^2. \end{aligned}$$
(2)

For what regards the regularization term R(u) in (1), a very popular choice is the Total Variation, which comes in the following two forms:

$$\begin{aligned} \text {[Isotropic TV]}&R_{\text {iso}}(u) :=&\Vert Du \Vert _{2,1} = \Vert \sqrt{ (D_{X} u)^2 + (D_{Y} u )^2} \Vert _1 \end{aligned}$$
(3)
$$\begin{aligned} \text {[Anisotropic TV]}&R_{\text {ani}}(u) :=&\Vert Du \Vert _{1} = \Vert D_{X}u\Vert _1 + \Vert D_{Y}u\Vert _1 \end{aligned}$$
(4)

where \(D_{X}, D_{Y}\) denote the horizontal and vertical finite difference operators, respectively, and \(D = (D_{X}, D_{Y})\) is the discrete counterpart of the gradient operator \(\nabla \).
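
As a concrete reference for (3) and (4), the following NumPy sketch evaluates both regularizers using forward finite differences with a replicated last row/column; the discretization details (difference scheme, boundary handling) are our own illustrative choices, not prescribed by the paper.

```python
import numpy as np

def tv_aniso(u: np.ndarray) -> float:
    """Anisotropic TV (4): sum of the l1 norms of horizontal and vertical differences."""
    u = np.asarray(u, dtype=float)
    dx = np.diff(u, axis=1, append=u[:, -1:])   # D_X u, zero at the right border
    dy = np.diff(u, axis=0, append=u[-1:, :])   # D_Y u, zero at the bottom border
    return float(np.abs(dx).sum() + np.abs(dy).sum())

def tv_iso(u: np.ndarray) -> float:
    """Isotropic TV (3): l1 norm of the pointwise gradient magnitude."""
    u = np.asarray(u, dtype=float)
    dx = np.diff(u, axis=1, append=u[:, -1:])
    dy = np.diff(u, axis=0, append=u[-1:, :])
    return float(np.sqrt(dx ** 2 + dy ** 2).sum())
```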

By substituting the TV regularizer \(TV(u):=R_{iso}(u)\) in (3) or \(TV(u):=R_{ani}(u)\) in (4), and the fidelity term (2), for R and F in (1), respectively, one obtains the so-called TV-L\(_2\) restoration model, originally introduced in [20]. In formulas:

$$\begin{aligned} u^* \,\;{\leftarrow }\;\, \arg \min _{u \in {\mathbb {R}}^N} \left\{ \, J(u)=\textrm{TV}(u) \,\;{+}\;\, \lambda \, \Vert u - f \Vert _2^2 \, \right\} \, . \end{aligned}$$
(5)

2.1 Median formula for TV

Li and Osher in [13] proposed an efficient and highly parallelizable method for solving the TV model (5) with the anisotropic TV regularizer (4). They proposed to iteratively solve N one-dimensional optimization problems to obtain an accurate solution of the N-dimensional optimization problem (5). In particular, for each pixel \(u\in {\mathbb {R}}\), we consider the local minimization problem:

$$\begin{aligned} u^* = \arg \min _{u \in {\mathbb {R}}} \left\{ E(u) := \sum _{i = 1}^{\kappa }w_{i}|u - u_{i} |+ F(u)\right\} \end{aligned}$$
(6)

where \(F(u) = \lambda (f - u)^{2}\), \(u^*, f \in {\mathbb {R}}\) are the denoised and noisy versions of the same pixel, respectively, while \(u_i\) belongs to the set of \(\kappa \) neighboring pixels and \(w_i\ge 0\) are given weights. In [13], a simple method for computing \(u^*\) in (6) is derived; it is reported here for self-consistency.

Theorem 1

Suppose that the \(w_i\) are non-negative, that the \(u_i\) are sorted as \(u_1 \le u_2 \le \ldots \le u_{\kappa }\), and that the function F is strictly convex and differentiable with \(F'\) bijective; then the minimizer of (6) is a median:

$$\begin{aligned} u^* = \text {median} \{u_1, u_2, \ldots , u_{\kappa }, p_0, p_1, \ldots , p_{\kappa } \} \end{aligned}$$
(7)

where \(p_i = (F')^{-1}(W_i)\) and

$$\begin{aligned} W_i = -\sum _{j = 1}^{i}w_j +\sum _{j = i + 1}^{\kappa }w_j,\,\,\,i = 0, 1, \ldots , {\kappa }. \end{aligned}$$
(8)

In our formulation, the neighborhood of the current pixel u consists simply of \(u_u, u_d, u_l, u_r\), its direct vertical and horizontal neighbors. The adopted 4-neighbor strategy allows us to apply the median formula in parallel on multiple pixels at a time using a proper configuration, since each pixel is directly affected only by its 4 neighbors.

The fidelity is defined as \(F(u) = \lambda (f - u)^{2}\), and consequently

$$\begin{aligned} F'(u) = -2\lambda (f - u) \Rightarrow (F')^{-1}(W) = f + \frac{W}{2\lambda }, \end{aligned}$$

where W is a sum of weights previously defined in (8). The denoised pixel \(u^*\) is then obtained as the median value:

$$\begin{aligned} u^* = \text {median}\{u_l, u_r, u_u, u_d, p_0, p_1, p_2, p_3, p_4\}, \end{aligned}$$
(9)

where the p-values \(p_i\) are calculated following Theorem 1, with \(w_i=1\) and \(\kappa = 4\):

$$\begin{aligned} \begin{array}{lll} W_0 = &{}4 \Rightarrow p_0 = f + \frac{4}{2\lambda } \Rightarrow &{} p_0 = f + \frac{2}{\lambda } \\ W_1 = &{}2 \Rightarrow p_1 = f + \frac{2}{2\lambda } \Rightarrow &{} p_1 = f + \frac{1}{\lambda } \\ W_2 = &{}0 \Rightarrow p_2 = f + \frac{0}{2\lambda } \Rightarrow &{} p_2 = f \\ W_3 = &{}-2 \Rightarrow p_3 = f + \frac{-2}{2\lambda } \Rightarrow &{} p_3 = f - \frac{1}{\lambda } \\ W_4 =&{} -4 \Rightarrow p_4 = f + \frac{-4}{2\lambda } \Rightarrow &{} p_4 = f - \frac{2}{\lambda } \end{array} \end{aligned}$$
(10)

The image denoising problem can hence be solved by iteratively computing (7) pixel-by-pixel over the whole image until convergence, which is guaranteed by the following result.

Theorem 2

The algorithm defined by repeatedly applying (7) at the j-th pixel, i.e. \(u_j^{(k+1)} = \arg \min _{u_j\in {\mathbb {R}}}E^{(k)}(u_j)\), converges:

$$\begin{aligned} u^{(k)} \rightarrow \arg \min _{u \in {{\mathbb {R}}}^N} J(u) \quad \text {as} \quad k \rightarrow \infty . \end{aligned}$$

That is, as the number of iterations grows, the iterates approach the minimizer of problem (5).

For the numerical implementation, the stopping criterion is the following: given a small tolerance value \(\epsilon \), the process stops when, at iteration k,

$$\begin{aligned} \frac{\Vert u^{(k-1)} - u^{(k)}\Vert _2}{\Vert u^{(k-1)}\Vert _2} \le \epsilon \end{aligned}$$
(11)

where \(u^{(k-1)}\) and \(u^{(k)}\) are consecutive processed images.

The resulting algorithm is described in Algorithm 1.

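
For readers who prefer code to pseudocode, a minimal NumPy sketch of the classical median-formula iteration (Eqs. (9)–(11)) follows; it updates all pixels simultaneously and uses edge-replicated borders, which is a simplification of Algorithm 1 whose exact sweeping order and border handling are not reproduced here.

```python
import numpy as np

def tv_median_denoise(f: np.ndarray, lam: float, eps: float = 1e-4,
                      max_iter: int = 100) -> np.ndarray:
    """Median-formula TV denoising, Eqs. (9)-(11), with edge-replicated borders."""
    f = f.astype(float)
    u = f.copy()
    for _ in range(max_iter):
        p = np.pad(u, 1, mode="edge")
        up, down = p[:-2, 1:-1], p[2:, 1:-1]       # u_u, u_d
        left, right = p[1:-1, :-2], p[1:-1, 2:]    # u_l, u_r
        stack = np.stack([left, right, up, down,
                          f + 2 / lam, f + 1 / lam, f,
                          f - 1 / lam, f - 2 / lam])
        u_new = np.median(stack, axis=0)           # Eq. (9) at every pixel
        # Stopping rule (11): relative change between consecutive iterates.
        if np.linalg.norm(u - u_new) <= eps * np.linalg.norm(u):
            return u_new
        u = u_new
    return u
```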

3 Quantum image processing

QIP aims to design quantum algorithms that, once a quantum state encoding an image has been constructed, implement image processing techniques in a quantum environment. QIP is a novel topic in quantum computing and many issues are far from being solved, as explained in [19]. A central issue regards image encoding in a quantum environment, which will be discussed in Sect. 3.1. Another limiting issue is that, nowadays, QIP represents a demanding branch of quantum computing, as it needs an amount of resources that is far from being offered by current or medium-term devices. At the moment, researchers are proposing many different approaches to the problem, even though only a limited number of these are commonly accepted and used [22,23,24]. In order to tackle the memory restrictions, we subdivide the image into a set of sub-images, as described in Sect. 3.2.

The QIP denoising approach proposed here consists of three steps, illustrated in Fig. 1. The image is first encoded in the quantum environment, then it is processed by quantum circuits to perform the denoising task, and finally the denoised image is measured to convert it into a classical image format. The adopted measurement process is detailed in Sect. 3.3.

Fig. 1: Scheme illustrating the Quantum TV (QTV) denoising process

3.1 Quantum image representation

A quantum image encoding is called a Quantum Image Representation (QIR). Unlike classical image processing, where a set of well-known, well-assessed standard formats is available, in QIP many QIR techniques have been proposed and tested [14], but none has yet established itself as a standard. The most used QIR technique is the Novel Enhanced Quantum Representation (NEQR).

This kind of representation encodes the following image data:

  • Pixel coordinates, encoded by qubits \(|XY\rangle \). An image of dimension \(D_{X} \times D_{Y}\), usually needs to use \(d_{X} = \lceil \log _{2}D_{X} \rceil \) qubits for horizontal coordinates and \(d_{Y} = \lceil \log _{2}D_{Y} \rceil \) qubits for vertical ones. In this way \( |XY\rangle = |X_0 X_1 ... X_{d_x-1} Y_0 Y_1 ... Y_{d_y-1}\rangle \).

  • Pixel value, encoded by one or more qubits \(|C\rangle \). It uses q qubits for representing \(N_q=2^q\) possible values in a binary encoding.

Without loss of generality, we consider grayscale images I with a square \(2^n \times 2^n\) domain and values in the range \([0,2^q-1]\). In this way we consider \(|XY\rangle \) with 2n qubits, where \(n = \lceil \log _{2}D\rceil \), and \(|C\rangle \) with \(q=\lceil \log _{2}N_q\rceil \) qubits to represent the \(2^q\) gray levels. The NEQR encoding of the image is then a quantum state of \(2n + q\) qubits:

$$\begin{aligned} |I(n)\rangle = \frac{1}{2^n}\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|C_{XY}\rangle \otimes |XY\rangle , \end{aligned}$$
(12)

where

$$\begin{aligned} \begin{aligned} |C_{XY}\rangle = |C_{XY}^{q-1} \ldots C_{XY}^2 C_{XY}^1 C_{XY}^0\rangle \in \{0,1,\ldots ,2^{q}-1\}\\ \end{aligned} \end{aligned}$$
(13)
Fig. 2: NEQR representation of an image: (left) the \(2\times 2\) image; (right) NEQR circuit, encoding pixels 00, 01, 10, 11 (left to right). Figure on the left from [24]

and each qubit \(|C_{XY}^i\rangle \) is either \(|0\rangle \) or \(|1\rangle \). A simple example of the NEQR representation of a \(2 \times 2\) grayscale image is illustrated in Fig. 2 (left), where each corner number denotes the pixel coordinates, while the centered value indicates the pixel intensity [24]. The quantum wave function of this image in NEQR is hence the following:

$$\begin{aligned} |I\rangle&= \frac{1}{2}\big (|C_{00}\rangle \otimes |00\rangle + |C_{01}\rangle \otimes |01\rangle + |C_{10}\rangle \otimes |10\rangle + |C_{11}\rangle \otimes |11\rangle \big ) \nonumber \\&= \frac{1}{2}\big (|11110000\rangle \otimes |00\rangle + |01000100\rangle \otimes |01\rangle \nonumber \\&\quad + |10010100\rangle \otimes |10\rangle + |01001001\rangle \otimes |11\rangle \big ). \end{aligned}$$
(14)

This state is obtained by putting the coordinate qubits \(|X\rangle \) and \(|Y\rangle \) in a superposition state using Hadamard gates, and then entangling them with the qubits \(|C_{XY}\rangle \) using a series of (multi-)controlled NOT gates. The resulting NEQR circuit for the image in Fig. 2 (left) is illustrated in Fig. 2 (right).
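
A minimal Qiskit sketch of the NEQR encoding of the \(2\times 2\) image of Fig. 2 is reported below; the register names, the qubit ordering and the use of qc.mcx are illustrative choices of ours, not the exact circuit of Fig. 2 (right).

```python
from qiskit import QuantumCircuit, QuantumRegister

pixels = {          # coordinate bits "XY" -> 8-bit intensity, taken from Eq. (14)
    "00": 0b11110000,
    "01": 0b01000100,
    "10": 0b10010100,
    "11": 0b01001001,
}

q = 8
color = QuantumRegister(q, "c")    # |C_XY>
coord = QuantumRegister(2, "xy")   # |XY>
qc = QuantumCircuit(color, coord)

qc.h(coord)                        # uniform superposition over the 4 positions

for bits, value in pixels.items():
    # Select the position |XY> = |bits> by flipping the zero-valued controls.
    zero_controls = [coord[i] for i, b in enumerate(bits) if b == "0"]
    if zero_controls:
        qc.x(zero_controls)
    # Flip every color qubit whose bit of `value` is 1 (multi-controlled X).
    for i in range(q):
        if (value >> i) & 1:
            qc.mcx(list(coord), color[i])
    if zero_controls:
        qc.x(zero_controls)        # undo the selection for the next pixel
```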

NEQR is a very versatile representation for image computation. However, it presents some drawbacks. As pixels are encoded one by one, large images produce long NEQR circuits; this means that circuit depth grows at least linearly with the number of pixels. As the image size grows, the number of coordinate qubits becomes larger and the control qubits needed for encoding quickly become more than two. A Toffoli gate with more than two control qubits is called a multi-controlled NOT gate, or simply MCX: this operator is decomposed before execution into many Toffoli gates plus some auxiliary qubits called ancillae. From an efficiency point of view, more coordinate qubits mean larger MCX gates, which lead to a larger amount of gates.

3.2 Image pre-processing

In order to reduce the memory usage, we split a pre-padded image (with a one-pixel border) into many smaller overlapping \(4 \times 4\) patches. Once extracted, each patch of pixels is processed by the quantum TV algorithm. At the end of each iteration, it is necessary to reassemble the resulting image from the output patches.

Fig. 3: Image pre-processing: image (left), padded image (center) and extracted patches (right)

Figure 3 illustrates the image pre-processing procedure for a sample image: the red square frames the original image, while the green squares in the patches highlight the processed pixels.
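
The patch extraction and reassembly can be sketched in plain NumPy as follows; the \(2\times 2\) processed core per \(4\times 4\) patch (hence a stride of 2), the edge-replicated padding and image dimensions divisible by 2 are our reading of Fig. 3, not choices explicitly stated above.

```python
import numpy as np

def extract_patches(image: np.ndarray, core: int = 2):
    """Pad with a one-pixel border and cut overlapping (core+2)x(core+2) patches."""
    padded = np.pad(image, 1, mode="edge")
    patches = []
    for y in range(0, image.shape[0], core):
        for x in range(0, image.shape[1], core):
            patches.append((y, x, padded[y:y + core + 2, x:x + core + 2]))
    return patches

def reassemble(patches, shape, core: int = 2) -> np.ndarray:
    """Rebuild the image from the processed inner core of each patch."""
    out = np.empty(shape, dtype=patches[0][2].dtype)
    for y, x, patch in patches:
        out[y:y + core, x:x + core] = patch[1:1 + core, 1:1 + core]
    return out
```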

In order to speed up the generation of the quantum circuits, we have subdivided the workload using multi-threading: each thread is tasked with assembling the QTV circuit of an image patch, using pre-assembled circuits and generating the remaining ones. This approach considerably accelerates the circuit generation phase.

3.3 Image measurement

The extraction of the image from its QIR format is a non-trivial process and its performance depends on the quantum representation used in the algorithm.

A quantum representation collects all the image data in a single quantum state. Due to the nature of this quantum state, the extraction is not a deterministic process: for each measurement, one of the possible pixel coordinate-value associations is randomly obtained as the outcome of the collapse of the quantum state. For a complete recovery of the image, it is therefore necessary to execute the same algorithm many times (i.e., with a sufficiently large number of shots).

The image measurement is the last step in Fig. 1.
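
The classical post-processing of the measurement outcomes can be sketched as below, assuming that each measured bitstring concatenates the q color bits and the 2n coordinate bits and that, over many shots, the most frequent color observed at each coordinate is kept; the exact bit order depends on the register and measurement layout of the actual circuit.

```python
import numpy as np

def counts_to_image(counts: dict, n: int, q: int) -> np.ndarray:
    """Decode measurement counts into a 2^n x 2^n image: for every coordinate,
    keep the most frequently observed color value."""
    size = 2 ** n
    image = np.zeros((size, size), dtype=np.uint16)
    best = np.zeros((size, size), dtype=np.int64)
    for bits, hits in counts.items():
        bits = bits.replace(" ", "")          # Qiskit separates registers with spaces
        color = int(bits[:q], 2)
        x = int(bits[q:q + n], 2)
        y = int(bits[q + n:q + 2 * n], 2)
        if hits > best[y, x]:
            best[y, x] = hits
            image[y, x] = color
    return image
```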

4 Quantum TV filter

In this section we introduce the Quantum TV Filter, named QTV, a quantum implementation of the TV regularization algorithm described in Sect. 2, applied to the qubits of the NEQR quantum image representation. This work extends and improves the work in [10], where the authors proposed a quantum solution for implementing a simple median filter for image processing.

Fig. 4: Quantum TV general scheme (left); a detail of the QTV sub-circuits (right)

The proposed circuit processes an input image in NEQR representation and provides a denoised output image in the same NEQR form, according to the scheme in Fig. 4 (left).

According to the TV algorithm presented in Sect. 2, we have to iterate a core process for each pixel of the input image. Considering a four-pixel neighborhood configuration, this process is composed of three steps, defined as three different quantum operators acting on each pixel:

  1. Neighborhood Preparation (NP): collect the neighboring pixels and extract their values \(u_u,u_d,u_l,u_r\);

  2. P-values Computation (PC): compute the weighted values \(p_0,p_1,p_2,p_3,p_4\);

  3. Median Function (MF): extract the median value from the set \(\{u_u,u_d,u_l,u_r,p_0,p_1,p_2,p_3,p_4\}\).

The quantum operators are assembled into the QTV structure illustrated in Fig. 4(right).

In the following, we describe in detail the three operators which characterize the Quantum TV. Each one is composed of several sub-circuits, or modules.

Fig. 5: Neighborhood Preparation module

Fig. 6: P-values Computation module

4.1 Neighborhood preparation

This operator is in charge of extracting the neighboring pixels from the NEQR representation. To this aim, we use the Cycle-Shift (CS) module to change the value of a coordinate register, which allows us to shift the image up, down, left or right; see Appendix A for details.
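
For illustration, a possible Qiskit realization of the CS(+1) shift on an n-qubit coordinate register is sketched below as a standard modular incrementer; this is an assumption about the module's internals, whose actual design is given in Appendix A.

```python
from qiskit import QuantumCircuit

def cycle_shift_plus(n: int) -> QuantumCircuit:
    """Increment an n-qubit coordinate register by 1 modulo 2^n:
    flip each bit controlled on all less-significant bits (qubit 0 = LSB)."""
    qc = QuantumCircuit(n, name="CS+")
    for target in range(n - 1, 0, -1):
        qc.mcx(list(range(target)), target)
    qc.x(0)
    return qc
```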

The structure of the NP operator, whose output is a quantum superposition state of the central f, up \(u_u\), down \(u_d\), left \(u_l\) and right \(u_r\) pixel values, is illustrated in Fig. 5. Specifically, once a pixel value has been obtained from NEQR, we use CS to shift the coordinate values, and then we re-apply NEQR to gather a new pixel value corresponding to the new position. With the exception of the first NEQR used for image encoding, we avoid applying H-gates to the coordinate registers, as we want to extract a specific (and not random) color outcome.

The output of the NP operator is a quantum state obtained starting from \(|\psi \rangle = |0\rangle ^{\otimes (6q+2n)}\) (six q-qubit color registers plus 2n coordinate qubits), which reads as

$$\begin{aligned} |\psi _{NC}\rangle =\frac{1}{2^n}\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_l\rangle |u_u\rangle |u_r\rangle |u_d\rangle |f\rangle |0\rangle |X\rangle |Y\rangle \end{aligned}$$
(15)

where each qubit in the \(|C\rangle \) register is reset to the \(|0\rangle \) state.

More in detail, following the scheme in Fig. 5, we have

$$\begin{aligned} \begin{array}{rll} \text {NEQR}\cdot \text {H}_{XY}|\psi \rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|f\rangle |X\rangle |Y\rangle )\otimes |0\rangle ^{\otimes 5} &{}=|\psi _1\rangle \\ \text {SWAP}|\psi _1\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|f\rangle |0\rangle |X\rangle |Y\rangle )\otimes |0\rangle ^{\otimes 4} &{}=|\psi _2\rangle \\ \text {CS}_{y+}|\psi _2\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|f\rangle |0\rangle |X\rangle |Y+1\rangle )\otimes |0\rangle ^{\otimes 4} &{}=|\psi _3\rangle \\ \end{array} \end{aligned}$$
$$\begin{aligned} \begin{array}{rll} \text {NEQR}|\psi _3\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|f\rangle |u_d\rangle |X\rangle |Y+1\rangle )\otimes |0\rangle ^{\otimes 4} &{}=|\psi _4\rangle \\ \text {SWAP}|\psi _4\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_d\rangle |f\rangle |0\rangle |X\rangle |Y+1\rangle )\otimes |0\rangle ^{\otimes 3} &{}=|\psi _5\rangle \\ \text {CS}_{y-}\text {CS}_{x+}|\psi _5\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_d\rangle |f\rangle |0\rangle |X+1\rangle |Y\rangle )\otimes |0\rangle ^{\otimes 3} &{}=|\psi _6\rangle \\ \text {NEQR}|\psi _6\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_d\rangle |f\rangle |u_r\rangle |X+1\rangle |Y\rangle )\otimes |0\rangle ^{\otimes 3} &{}=|\psi _7\rangle \\ \text {SWAP}|\psi _7\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_r\rangle |u_d\rangle |f\rangle |0\rangle |X+1\rangle |Y\rangle )\otimes |0\rangle ^{\otimes 2} &{}=|\psi _8\rangle \\ \text {CS}_{y-}\text {CS}_{x-}|\psi _8\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_r\rangle |u_d\rangle |f\rangle |0\rangle |X\rangle |Y-1\rangle )\otimes |0\rangle ^{\otimes 2} &{}=|\psi _9\rangle \\ \text {NEQR}|\psi _9\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_r\rangle |u_d\rangle |f\rangle |u_u\rangle |X\rangle |Y-1\rangle )\otimes |0\rangle ^{\otimes 2} &{}=|\psi _{10}\rangle \\ \text {SWAP}|\psi _{10}\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_u\rangle |u_r\rangle |u_d\rangle |f\rangle |0\rangle |X\rangle |Y-1\rangle )\otimes |0\rangle &{}=|\psi _{11}\rangle \\ \text {CS}_{y+}\text {CS}_{x-}|\psi _{11}\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_u\rangle |u_r\rangle |u_d\rangle |f\rangle |0\rangle |X-1\rangle |Y\rangle )\otimes |0\rangle &{}=|\psi _{12}\rangle \\ \text {NEQR}|\psi _{12}\rangle = &{}\frac{1}{2^n}(\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_u\rangle |u_r\rangle |u_d\rangle |f\rangle |u_l\rangle |X-1\rangle |Y\rangle )\otimes |0\rangle &{}=|\psi _{13}\rangle \\ \text {SWAP}|\psi _{13}\rangle = &{}\frac{1}{2^n}\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_l\rangle |u_u\rangle |u_r\rangle |u_d\rangle |f\rangle |0\rangle |X-1\rangle |Y\rangle &{}=|\psi _{14}\rangle \\ \text {CS}_{x+}|\psi _{14}\rangle = &{}\frac{1}{2^n}\sum _{X=0}^{2^{n}-1}\sum _{Y=0}^{2^{n}-1}|u_l\rangle |u_u\rangle |u_r\rangle |u_d\rangle |f\rangle |0\rangle |X\rangle |Y\rangle &{}=|\psi _{NC}\rangle \\ \end{array} \end{aligned}$$

4.2 P-values computation

Starting from the obtained neighborhood values, we compute the p-values according to relations (10). This reduces to adding some constant values to f, the central pixel.

However, the QTV algorithm only works in unsigned integer arithmetic, while the classical TV algorithm works with floating point numbers, thus obtaining more accurate and precise results. Therefore, instead of the exact p-values \(p_i = f + \frac{W_i}{2\lambda },\) we use the rounded values

$$\begin{aligned} p_i = f + round(\frac{W_i}{2\lambda }). \end{aligned}$$
(16)

The p-values are then given by:

$$\begin{aligned} \begin{array}{lll} W_0 = 4, &{} p_0 = &{} f + round(2/\lambda )\\ W_1 = 2, &{} p_1 = &{} f + round(1/\lambda )\\ W_2 = 0, &{} p_2 = &{} f\\ W_3 = -2, &{} p_3 = &{} f - round(1/\lambda )\\ W_4 = -4, &{} p_4 = &{} f - round(2/\lambda ). \end{array} \end{aligned}$$
(17)

The final design of the P-values Computation module is hence illustrated in Fig. 6; it is composed of three sub-circuits for setting, adding and subtracting the mentioned constants (a classical sketch of this integer arithmetic is given after the list). In particular:

  • SETTER module: assigns to a register a given constant rounded to the nearest integer; it is used to encode our constants;

  • ADDER module: adds two values encoded in two quantum registers; we refer to Appendix A for more details;

  • SUBTRACTOR module: subtracts two values by using an adder module, according to the complement identity \(\alpha - \beta = \overline{ {\overline{\alpha }} + \beta }.\)
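
The integer arithmetic performed by these modules can be mimicked classically as follows; the q-bit masking and the clipping of the p-values to the valid gray range are our own assumptions, introduced only to make the sketch self-contained.

```python
def p_values(f: int, lam: float, q: int = 8) -> list:
    """Integer p-values of Eq. (17) for a central pixel f, clipped to q bits."""
    top = (1 << q) - 1
    c1, c2 = round(1 / lam), round(2 / lam)
    return [min(max(v, 0), top) for v in (f + c2, f + c1, f, f - c1, f - c2)]

def sub_via_adder(a: int, b: int, q: int = 8) -> int:
    """Subtraction through an adder in q-bit arithmetic: a - b = NOT(NOT(a) + b)."""
    mask = (1 << q) - 1
    return ~((~a & mask) + b) & mask
```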

4.3 Median function

Fig. 7: The basic sorting strategy

To determine the median of a set of values, we need to sort them. The sorting strategy adopted here follows the proposal in [10] for a set of 9 sortable elements. If we re-arrange these values in a matrix form, then we simply need to follow three steps in a pipeline: sort each column, sort each row, and sort the right diagonal. This guarantees that the median value ends up in the central quantum register. The strategy is illustrated in Fig. 7.
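
The correctness of this strategy is easy to check classically: the NumPy sketch below sorts the columns, then the rows, then the anti-diagonal (the "right diagonal" of Fig. 7) of the \(3\times 3\) arrangement and returns its central element.

```python
import numpy as np

def median9(values) -> float:
    """Median of 9 values via the Fig. 7 strategy: sort columns, rows, anti-diagonal."""
    m = np.sort(np.array(values, dtype=float).reshape(3, 3), axis=0)  # columns
    m = np.sort(m, axis=1)                                            # rows
    diag = np.sort([m[2, 0], m[1, 1], m[0, 2]])                       # right diagonal
    return float(diag[1])                                             # central element

assert median9([5, 3, 8, 1, 9, 2, 7, 4, 6]) == 5.0
```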

Fig. 8: Median Function module

The corresponding quantum circuit is shown in Fig. 8.

Fig. 9: Sort module (left); Swapper module (right)

Fig. 10: Lena denoising results for AWGN

Fig. 11: Cameraman denoising results for AWGN

Fig. 12: QRCode denoising results for AWGN

Fig. 13: Lena denoising results for SPN

Fig. 14: Cameraman denoising results for SPN

Fig. 15: QRCode denoising results for SPN

The Sort module is the core of the Median Function module. The Sort module has to order its three input registers. The total ordering of three elements is a trivial problem, as it reduces to comparing two positive integer values and swapping them if they are out of order. A sequence of three comparisons is necessary and sufficient to reach a correct ordering. A Sort module is then a sequence of three sub-circuits, named Swappers (SWPR). A Swapper module is in turn composed of two sub-circuits: a Comparator (COMP) and a Controlled Swap (C-SWAP). The Sort and Swapper modules are shown in Fig. 9.

The Comparator module evaluates two register values a, b and provides, on an auxiliary qubit e, the result of the comparison \(a>b\), as follows:

$$\begin{aligned} \text {if} \quad a \le b \quad \text {then} \quad e = |0\rangle \quad \text {else} \quad e = |1\rangle . \end{aligned}$$
(18)

This result will be next used to control a C-SWAP gate, so that if \(e = |1\rangle \), then the values \(a \,\text {and}\, b\) are swapped.
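
The C-SWAP stage can be realized with Fredkin gates acting qubit-wise on the two color registers, as in the following Qiskit sketch; the comparator producing the control qubit e is the one described in Appendix A and is not reproduced here.

```python
from qiskit import QuantumCircuit, QuantumRegister

def cswap_registers(q: int) -> QuantumCircuit:
    """Conditionally exchange two q-qubit color registers a, b, controlled by e."""
    a, b = QuantumRegister(q, "a"), QuantumRegister(q, "b")
    e = QuantumRegister(1, "e")
    qc = QuantumCircuit(a, b, e, name="C-SWAP")
    for i in range(q):
        qc.cswap(e[0], a[i], b[i])   # Fredkin gate: swap bit i of a and b iff e = |1>
    return qc
```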

The Comparator module is described in Appendix A. The designed quantum circuit for the Comparator module is more efficient than other proposals. For example, for \(q=8\), it uses fewer elementary quantum gates than other existing methods: its depth is 64, compared to the Sort module applied in [10], which has depth 1,091,767.

5 Circuit complexity analysis

In order to estimate the efficiency of a quantum circuit, we have to look at its depth, that is, the length of the longest path in it. The path length is always an integer number, representing the number of gates to be executed along that path. To this aim, we analyze each operator of the Quantum TV to derive an approximate estimate of its depth.

Neighborhood Preparation is by far the most demanding operator of the whole filtering algorithm, because it uses multiple instances of the NEQR circuit, whose length varies with the number of pixels to encode. Considering a NEQR implementation that employs the smallest number of MCX gates and encodes N pixels, its depth grows polynomially with N as follows:

$$\begin{aligned} \text {NEQR}_{\text {depth}} = N(4(2\log _2{\sqrt{N}} -1) + 8) \Rightarrow {\mathcal {O}}(8N\log _2{\sqrt{N}} + 4N). \end{aligned}$$

Other modules involved are SWAP and Cycle-Shift. The SWAP depth depends on the number q of color qubits and is always equal to that value. The Cycle-Shift depth, instead, depends on the coordinate register size and is equal to \(\log _2{\sqrt{N}}\cdot (\log _2{\sqrt{N}}-1)\).

Neighborhood Preparation uses NEQR, SWAP and CS five times in a row. Hence the NP module overall depth is polynomial in N and can be estimated as:

$$\begin{aligned} \text {NP}_{\text {depth}} = 5\cdot \left( 8N\log _2{\sqrt{N}} + 4N + q + \log _2{\sqrt{N}}\cdot (\log _2{\sqrt{N}}-1)\right) \end{aligned}$$

For what concerns the P-values Computation depth, the setting phase involves four SET modules which, however, are applied in parallel, so they count as a single one. The Adder module depth instead depends on the value of q: the Full-Adder is composed of q Half-Adders, each of constant depth (Reset gates included), followed by \(4q + 1\) sequentially placed gates. Thus we have:

$$\begin{aligned} \text {ADD}_{\text {depth}} = 11q+4q+1 = 15q+1 \\ \text {SUB}_{\text {depth}} = \text {ADD}_{\text {depth}}+2 = 15q+3 \end{aligned}$$

From these results, we can derive the P-values Computation module total depth which is polynomial in q:

$$\begin{aligned} \text {PC}_{\text {depth}} = 60q+8\Rightarrow {\mathcal {O}}(poly(q)) \end{aligned}$$

To compute the Median Function depth, we just need to estimate the SWPR module depth and count its occurrences in the circuit. To do so, we add up the Comparator and C-SWAP depths.

The Comparator depth depends on q, as it uses 6 gates for each color qubit. Moreover, each Toffoli gate used in this module counts as three gates, since two additional X-gates need to be added for it to work. This means that the module depth is estimated as 8q. The C-SWAP depth is the same as that of SWAP. The Swapper total depth is therefore estimated as \(8q + q = 9q\).

In the Median Function there are multiple occurrences of the SWPR module, whose number can be further reduced if row and column sorting are executed simultaneously:

$$\begin{aligned} \text {MF}_{\text {depth}} = 9\cdot 9q = 81q \Rightarrow {\mathcal {O}}(poly(q)). \end{aligned}$$

From this analysis we have derived that the Quantum TV algorithm implements the NP module with polynomial complexity in the number of pixels N. The PC and MF modules instead have polynomial complexity in \(q=\lceil \log _{2}N_q\rceil \), i.e. the logarithm of the number of colors, hence providing an exponential advantage with respect to the number of gray levels \(N_q\).
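
The estimates above can be collected in a small helper that evaluates them for given N and q; the numbers it returns are only the plug-in values of the formulas derived in this section, shown here for a \(4\times 4\) patch with 8-bit gray levels.

```python
import math

def qtv_depth_estimates(N: int, q: int) -> dict:
    """Plug-in evaluation of the depth formulas derived in this section."""
    n = math.log2(math.sqrt(N))                  # coordinate qubits per axis
    neqr = N * (4 * (2 * n - 1) + 8)             # NEQR depth
    cs = n * (n - 1)                             # Cycle-Shift depth
    return {
        "NP": 5 * (neqr + q + cs),               # Neighborhood Preparation
        "PC": 60 * q + 8,                        # P-values Computation
        "MF": 81 * q,                            # Median Function
    }

print(qtv_depth_estimates(N=16, q=8))            # a 4x4 patch with 8-bit gray levels
```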

6 Experimental results

In this section we evaluate the performance of the quantum TV algorithm (QTV) on denoising grayscale images, and we present some preliminary results from the comparison with the variational anisotropic TV in Algorithm 1.

The reference images used for the test have dimension \(128\times 128\) pixels and grayscale values (8-bit color depth). The discrete model of the image degradation process under noise corruptions can be written as:

$$\begin{aligned} f = {\mathcal {N}}({\bar{u}}) \end{aligned}$$
(19)

where \({\bar{u}}, f \in {\mathbb {R}}^{N}\) represent vectorized forms of the unknown clean image and of the observed corrupted image, respectively, while \({\mathcal {N}}(\,\cdot \,)\) denotes the noise corruption operator, which in most cases is of random nature.

In this work we considered two important types of noise, namely the additive (zero-mean) white Gaussian noise (AWGN) and the impulsive salt and pepper noise (SPN), which models saturated or dead pixels.

Denoting by \(\Omega := \{1,\ldots ,N\}\) the set of all pixel positions in the images, for these two kinds of noise the general degradation model in (19) reads as

$$\begin{aligned} \begin{array}{ccc} \mathrm {AWGN:} &{} \mathrm {SPN:} \\ f_i \,\;{=}\;\, {\bar{u}}_i \;{+}\; n_i \;\;\, \forall \, i \in \Omega \, ; &{} f_{i} \,\;{=}\;\, \left\{ \begin{array}{ll} {\bar{u}}_i \;\; &{} \textrm{for} \;\;\, i \in \Omega _0 \subseteq \Omega \\ n_i \in \{V_{min},V_{max}\} &{} \textrm{for} \;\;\, i \in {\Omega _1} :=\Omega \setminus \Omega _0. \end{array} \right. \, \end{array} \end{aligned}$$

In the case of SPN, only a subset \(\Omega _1\) of the pixels is corrupted by noise, whereas the complementary subset \(\Omega _0\) is noise-free. In particular, the corrupted pixels can take only the two extreme values \(V_{min}\) and \(V_{max}\), which in our case are 0 and 255, each with the same probability. The amount of noise can be measured with an error rate computed as follows:

$$\begin{aligned} ER_{\%} =\frac{\text {number of corrupted pixels}}{\text {number of pixels in image}} \times 100. \end{aligned}$$

For what concerns AWGN, the additive corruptions \(n_i \sim {\mathcal {G}}(\sigma ,0)\), \(i \in \Omega \), are independent realizations of the same univariate Gaussian distribution with zero mean and standard deviation \(\sigma \).

The performance has been evaluated by the Root-Mean-Square Error (RMSE) metric, defined as:

$$\begin{aligned} RMSE({\bar{u}},u):= \sqrt{\frac{\sum _{i=1}^{N}({\bar{u}}_{i} - u_{i})^2}{N}}, \end{aligned}$$

where \({\bar{u}}\) is the original reference image and u is the denoised output image. A lower RMSE indicates a more precise reconstruction.
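
For reproducibility, the two degradation models and the RMSE metric can be sketched in NumPy as follows; rounding and clipping the AWGN result to the 8-bit range and the fixed random seed are our own choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def add_awgn(u: np.ndarray, sigma: float) -> np.ndarray:
    """AWGN: add zero-mean Gaussian noise, round and clip to the 8-bit range."""
    noisy = u.astype(float) + rng.normal(0.0, sigma, u.shape)
    return np.clip(np.round(noisy), 0, 255).astype(np.uint8)

def add_spn(u: np.ndarray, er_percent: float) -> np.ndarray:
    """SPN: set an er_percent fraction of pixels to 0 or 255 with equal probability."""
    noisy = u.copy()
    mask = rng.random(u.shape) < er_percent / 100.0
    noisy[mask] = rng.choice(np.array([0, 255], dtype=u.dtype), size=int(mask.sum()))
    return noisy

def rmse(u_ref: np.ndarray, u: np.ndarray) -> float:
    """Root-Mean-Square Error between reference and denoised images."""
    diff = u_ref.astype(float) - u.astype(float)
    return float(np.sqrt(np.mean(diff ** 2)))
```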

In the remainder of this work, the terms classical/quantum algorithm refer to the respective implementations of the TV model.

The selection of the regularization parameter \(\lambda \), which has a crucial effect on the solution, has been carried out, for each test, by running the classical TV algorithm for a range of \(\lambda \) values and selecting the optimal regularization parameter by a trial-and-error strategy. The selected optimal \(\lambda \) has then been used in the quantum TV algorithm, so that the results of the two algorithms are compared with the same optimal parameter.
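
The trial-and-error selection can be expressed as a simple grid search, reusing the tv_median_denoise and rmse helpers sketched earlier; the candidate range in the usage line is purely illustrative.

```python
import numpy as np

def select_lambda(f: np.ndarray, u_clean: np.ndarray, candidates) -> float:
    """Grid search: run classical TV for each candidate and keep the lowest-RMSE one."""
    scores = {lam: rmse(u_clean, tv_median_denoise(f, lam)) for lam in candidates}
    return min(scores, key=scores.get)

# Example (illustrative range): best_lam = select_lambda(noisy, clean, np.linspace(0.05, 1.0, 20))
```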

A fundamental aspect to keep in mind when analyzing the obtained results is that classical TV denoising uses floating point numbers, usually along with a value normalization, to be as accurate as possible. This leads to a more precise outcome and a finer quantization of the output image. On the other hand, our quantum computation uses integer numbers, so an approximation had to be applied (as described in Sect. 4). This difference has an impact on the output images, which present coarser improvements than the classical ones.

For testing purposes, we created an implementation of the Quantum Median Filter using Qiskit, a Python library for quantum computing and simulation [9, 15, 17]. The simulations of the quantum algorithm have been run on the Galileo100 supercomputer (CINECA), with the following cluster configuration:

  • Nodes: 348 standard nodes

  • Processors: 2 × Intel Xeon Platinum 8276-8276L CPUs (x86, 2.4 GHz) per node

  • Cores: 16704 (48 cores/node)

  • RAM: 384 GB

6.1 Example 1: AWGN denoising

We consider the problem of denoising the three test images lena, QR and cameraman, corrupted only by AWGN with standard deviation \(\sigma \in \{5,10,15\}\), as shown in the first column of Figs. 10, 11 and 12.

The denoised images obtained by applying the TV and QTV algorithms are illustrated in Figs. 10, 11 and 12, column-wise for the three images. For each denoised image, the obtained RMSE value is reported below it.

From a visual inspection, the denoised images from TV and QTV present comparable quality, even if the RMSE highlights the loss in accuracy due to the integer arithmetic representation adopted by QTV.

Moreover, the quantum TV algorithm is able to denoise images corrupted by severe Gaussian noise, as illustrated in the last rows of Figs. 10, 11 and 12 for \(\sigma =15\). By contrast, the quantum median filter proposed in [10], like any standard median filter, is not appropriate for removing this kind of noise. The median filter is instead well known to be an excellent image denoiser in the case of salt-and-pepper noise, because it does not blur the image as a mean filter would do. However, despite its name, the median filter is not a linear filter, because it does not satisfy the linearity property.

6.2 Example 2: SPN denoising

In this example we applied TV and QTV to the denoising of the three test images lena, QR and cameraman, corrupted by SPN with error rate ER\(_\%\in \{5,10,30\}\).

The noisy images are shown in the first column of Figs. 13, 14 and 15, together with the denoised images in the second (TV) and third (QTV) columns, along with the associated RMSE values, reported at the bottom.

From these results, we can see how well the quantum algorithm performs when compared to its variational counterpart. The QTV results present excellent qualitative performance, with minimal RMSE differences.

However, by a visual inspection of the denoised images, we notice that, for both algorithms, some pixel clusters were not completely denoised. This can be due either to the limited pixel neighborhood considered (4 pixels) or to the \(L_2\)-norm fidelity in our model (5); it is indeed well known that SPN can be better treated by an \(L_1\)-norm fidelity, which, however, leads to a non-differentiable fidelity term.

7 Conclusions and discussion

In this work a quantum approach is proposed for total variation denoising, and its corresponding quantum circuit is designed. Specifically, the quantum TV implements the anisotropic median formula presented in [13]. The main idea of the approach is that the classical image is first converted into a quantum version based on the NEQR quantum representation of digital images, and then three quantum modules are applied to realize, for each pixel in the image, the neighborhood collection, the weight calculation and the median extraction. Finally, an image measurement process collapses the quantum state into the resulting denoised image. From the complexity analysis in Sect. 5 we derived a polynomial complexity in the number of pixels N for the NP module, and a polynomial complexity in the logarithm of the number of colors \(N_q=2^q\) for the PC and MF modules.

The experimental results show that the quantum TV performance is comparable to that of the classical variational TV approach. However, we have highlighted several issues that need to be addressed to make the proposal a competitive QIP algorithm, like its variational counterpart. For example, the Neighborhood Preparation module is an expensive operator due to the NEQR image representation. Even though NEQR is still one of the most used representation methods in QIP and the most suitable choice for this work, future developments should definitely search for other QIR alternatives, or develop more efficient versions of the same representation, such as parametric quantum circuits that take advantage of the data structure to improve image processing activities.

Fig. 16: Quantum modules used in QTV