1 Introduction

Quantum mechanics is considered one of the most important fields in modern physics. It provides a complete mathematical and conceptual framework to study systems on the microscopic scale. The theory of quantum mechanics associates a set of allowed states to a particle. The state that a particle occupies, determines its physical properties. However, this state could change depending on the external forces acting on the particle. The dynamics of this process can be computed from Schrödinger’s equation by evolving the initial states. The theory postulates the existence of an operator called the Hamiltonian which represents the total energy of the system. This operator incorporates the information regarding the external forces and controls the dynamics of the change. The result of this evolution is the particle existing in a superposition of states. This superposition condition is maintained until a measurement operation is performed. Then, the particle collapses to one of the states with a certain probability depending also on the external forces. Thus, by manipulating the external forces applied on a particle, the dynamics of the evolution and the probabilities of each state can be controlled.

It should be noted that quantum mechanics only provides the framework and the general laws that any microscopic-scale system must obey. However, it does not specify details about particular systems. For example, the theory does not specify exactly the form of the Hamiltonian operator for different systems. It is the role of a physicist to determine the representation of that operator for a particular system. This is considered a powerful point in the theory, since it provides a generic platform that could be used to address issues in diverse disciplines. In this case, the selection of the framework elements (such as the Hamiltonian) could be adapted, and consequently, the general rules of quantum mechanics could be applied to solve a particular problem.

Recently, quantum mechanics-based algorithms for signal and image processing have been proposed, motivated by the theoretical and practical progress in the fields of quantum computing and quantum information processing. One of the earliest attempts to use quantum mechanics in signal processing was proposed by Eldar and Oppenheim [8]. The focus in that work was on the quantum measurement operation which was applied to solve problems in signal processing such as signal estimation and detection. This framework was extended by Tseng and Tsung [22] and was applied in the domain of image processing to solve some problems such as image half-toning, edge detection, and visual cryptography. Additional image cryptography methods using the quantum theory can be found in [25, 26, 31]. Quantum image representation [13, 29, 30] and watermarking techniques [11, 24, 27, 28] are among the applications of the quantum mechanics. In [17], Nasios and Bors proposed the use of kernel-based classification using quantum mechanics.

Image segmentation is the process of separating the foreground of one or more objects from the background in a digital image. Many techniques exist for accomplishing such task. These techniques include edge detectors, thresholding, region-based methods (such as region growing), morphological methods (such as watersheds), and deformable models (such as active contours). Machine learning algorithms are also used for image segmentation, in both the supervised schemes (such as neural networks) and the unsupervised schemes (such as K-means). Such methods are discussed in [10, 15] and [23]. Image segmentation was recently addressed utilizing the quantum formulation. The authors of [21] developed an extension of Gaussian mixture models and constructed a quantum statistical mechanical framework, and then, they applied it to segment images. Talbi et al. [20] used a quantum-inspired genetic algorithm approach to image segmentation. In [9], the use of quantum mechanics was proposed to perform edge detection on medical images. The use of Schrödinger’s transform for image analysis was proposed in [14]. This transform was used to extract boundaries of objects in images. Another quantum mechanics-based contour extraction algorithm was proposed in [12]. In [2], a different approach of using quantum mechanics for object extraction was proposed. The method depends on solving Schrödinger’s equation on the image, resulting in a large number of possible wave functions representing possible segments. Casper et al. [5] used a quantum version of K-means clustering algorithm for image segmentation. The authors in [16] presented a quantum mechanics-based image segmentation based on hierarchical analysis of the 2-D histograms.

In this work, a novel generic approach for utilizing quantum mechanics in signal and image analysis is introduced. In order to realize this approach, an image or signal element is modeled by a quantum system. This modeling enables the usage of quantum mechanics important principals such as the evolution of states and the superposition property in image processing problems. Image segmentation is selected to develop the formulation of the framework. The quantum-based algorithm for image segmentation is implemented and tested against other existing methods from the literature.

2 Methods

This section starts with the basic postulates of quantum mechanics. These postulates are essential to understand the theoretical basis of utilizing the quantum theory in image processing. An overview of the framework will be discussed in the second subsection. Next, we will demonstrate and develop a customized algorithm based on the framework for addressing the single-object image segmentation problem. In the following subsection, a special treatment is devoted to the two-state quantum system because it forms an analogy to the problem of image segmentation. This analogy is described in detail in the next part. Then, the steps of the proposed algorithm are explained. After that, two important related issues are to be discussed separately as they form a key role in the realization and practical usage of the algorithm. These are the design of the Hamiltonian operator responsible for the states evolution and the solution of the associated Schrödinger’s equation. Finally, the performance measures used for evaluating the algorithm as well as comparing it with other algorithms are discussed.

2.1 Basic quantum mechanics postulates

In this part, the basic postulates of quantum mechanics are stated. These form the foundation and the concepts necessary for constructing framework. Moreover, it will be used in the subsequent discussions on the proposed methodology. For further details, see [18].

  1. 1.

    For every quantum mechanical system, there is an associated Hilbert space called the state space of the system. All information about the system can be extracted by knowing its state which is a unit vector in the state space. The state of the system is denoted in Dirac’s notation by the “ket” \(|\psi \rangle \). The adjoint of the state vector is denoted by the “bra” \(\langle \psi |\). The inner product of two-state vectors \(|\psi \rangle \) and \(|\phi \rangle \) is denoted by \(\langle \phi | \psi \rangle \). Since the state vector has unity norm, then \(\langle \psi | \psi \rangle =1\).

  2. 2.

    A unitary transformation depending only on the initial and final time instants describes the evolution of a closed system (system not interacting with other systems). This is represented by

    $$\begin{aligned} |\psi (t_2)\rangle =U(t_2,t_1)|\psi (t_1)\rangle \end{aligned}$$
    (1)

    where U is a unitary operator. The operator U must be unitary so as to preserve the unity norm of the state vector \(|\psi \rangle \) at any time instant.

  3. 3.

    The state of a closed system evolves according to Schrödinger’s equation:

    $$\begin{aligned} i\hbar \frac{\partial {|\psi \rangle }}{\partial {t}}=H|\psi \rangle \end{aligned}$$
    (2)

    where H is called the Hamiltonian operator, representing physically the total energy (sum of kinetic and potential energies) of the system. This postulate (describing the evolution of states in continuous time) is the generalization of the previous one (describing the evolution of states in discrete time).

  4. 4.

    Each observable (any physical quantity that can be measured such as position, momentum, and energy) is represented by an operator that acts on the state of the system. The eigenvalues of the operator represent physically the expectation of the physical quantity represented by that operator.

  5. 5.

    The system remains in a superposition of all its states, until a measurement occurs. At this case, the system collapses to one of its states with a predefined probability. A measurement is a physical operation done on the system to determine certain outcome of an observable. Measurements in quantum mechanics are inherently probabilistic. This means that we could get a different outcome each time we set up the system and perform the measurement. This is opposed to the classical picture, where identical setups must yield identical outcomes. Measurements are represented mathematically by projection operators acting on the state vectors, for each possible outcome m, there exists a measurement operator \(M_m\). The probability of getting an outcome m is given by

    $$\begin{aligned} p(m)=\langle \psi |M_m^\dagger M_m|\psi \rangle \end{aligned}$$
    (3)

    where \(M^\dagger \) is the hermitian conjugate of M. After measurement, the state vector becomes

    $$\begin{aligned} \frac{M_m|\psi \rangle }{\sqrt{\langle \psi |M_m^\dagger M_m|\psi \rangle }} \end{aligned}$$
    (4)

2.2 Overview of the framework

A class of image processing problems could be modeled using the concept of the state space. For example, image segmentation can be reduced to the task of determining the final states of pixels or regions. Different states should be assigned to each object in the foreground and the background. Pixels belonging to the same object should ideally reach the same steady state regardless of the initial state. This can be applied directly to other problems such as image classification and object recognition. Moreover, by replacing the image element with a signal element (e.g., a sample at a time point), the same framework can be used in signal-processing applications. The main idea behind the proposed framework is to consider each image element (pixels, windows, or custom shapes) a quantum particle that satisfies the postulates. This enables the association of a multi-state quantum system to each image element. Thus, a state space is assigned to each element having an initial state. The final state is an evolution of the initial state governed by Schrödinger’s equation.

In quantum theory, the states change under the effect of an external force. The effect of this force is embedded into Schrödinger’s equation in a form of an operator to control the evolution process. In this work, features derived from the image are used to construct the operator. The operator should satisfy certain conditions to ensure the convergence and the cross-state transformation. According to the superposition property, the result of the evolution is that the image element residing in a superposition of states. The image dependent external force determines the probabilities associated with each state. The measurement step that causes the particle to occupy a single state is then performed. In the image domain, this could correspond to a thresholding operation on the state probabilities.

The segmentation of a single object in an image is selected in this paper to formulate an algorithm based on the proposed framework. Each individual pixel is treated as a particle. In this case, the number of states is two corresponding to the foreground and background of the image. Therefore, the two-state quantum system is considered in the next subsection. The design of the Hamiltonian operator and the solution of the evolution equation will be discussed in detail later in this section.

2.3 Two-state quantum system

The single-object image segmentation problem will be approached by modeling the pixels in the image with a two-state quantum system. A physical example of such a system is a stationary spin-\(\frac{1}{2}\) particle such as a stationary electron. From the postulates of quantum mechanics, we find that the state space of the system is two-dimensional since there are only two independent states, which will be denoted by \(|0\rangle \) and \(|1\rangle \). The system will be in a superposition of those two states until a measurement is performed. This is represented mathematically by taking the general state of the system

$$\begin{aligned} |\psi \rangle =\alpha |0\rangle +\beta |1\rangle \end{aligned}$$
(5)

\(|\alpha |^2\) represents the probability of the system to be found in the ground state \(|0\rangle \) after measurement, while \(|\beta |^2\) represents the probability of the system to be found in the excited state \(|1\rangle \) after measurement. Since the total probability must add up to unity, then

$$\begin{aligned} |\alpha |^2+|\beta |^2=1 \end{aligned}$$
(6)

As the system evolves according to Schrödinger’s equation given in Eq. 2, these probabilities change with time. The Hamiltonian operator (which represents the total energy of the system) controls the evolution process. The total energy of a particle is the sum of its kinetic and potential energy. In this case, the particle is stationary, and thus, the kinetic energy vanishes, leaving only the potential energy which accounts for any external forces applied on this particle. So by controlling the external forces, we can control the evolution of the states of this system. For example in the case of the electron, applying a magnetic field to this electron results in changing its spin state. By carefully choosing the magnitude and the direction of the applied magnetic field, the spin state of the electron can be driven to any desired state, starting from a known initial state. The Hamiltonian of an electron at rest, present in a magnetic field \(\mathbf {B}=(B_x,B_y,B_z)\), is given by

$$\begin{aligned} H=-\gamma \begin{pmatrix}B_z &{} B_x- i B_y \\ B_x + i B_y &{} -B_z \end{pmatrix} \end{aligned}$$
(7)

where \(\gamma \) is called the gyromagnetic ratio.

2.4 Analogy to image segmentation

In this subsection, the two-state quantum system is shown to be analogous to the problem of foreground image segmentation. The pixels will be treated individually where each pixel in the image is represented by a two-state closed quantum system. Then, each pixel should occupy one of the two quantum states (foreground or background). The pixels belonging to the foreground class should evolve to the same final steady state, while the pixels belonging to the background class should end occupying the other steady state with higher probability. The background class will be the ground state \(|0\rangle \), while the foreground class will be the excited state \(|1\rangle \). This is the convention taken in this paper. However, the interchange of the two states will not affect the steps of the algorithm. So, generally the pixel will be in a superposition of the two states, since its actual classification is not yet known. This can be represented by the general state

$$\begin{aligned} |\psi \rangle =\alpha |0\rangle +\beta |1\rangle \end{aligned}$$
(8)

Thus, the probability of a pixel to be in the background is \(|\alpha |^2\), while the probability for a pixel to belong to the foreground is \(|\beta |^2\). For the determination of the pixel class, the state must evolve according to Schrödinger’s equation, to reach its final state. The applied external force responsible for driving the system to its steady state is constructed from a set of image-based features. These features are organized in a vector form. Therefore, the Hamiltonian of the system must depend on this feature vector, as the case of the external magnetic field. Finally, in order to take the decision, a thresholding process is performed on the probability of foreground, which resembles the measurement operation in the case of the quantum system. This analogy is summarized in Table 1.

Table 1 Analogy between the problem of single-object segmentation and quantum mechanics

2.5 Proposed algorithm

The general procedure of the algorithm consists of the four main steps summarized in Algorithm 1. Each pixel in the input image is considered a quantum particle and modeled by a quantum system having two possible states as previously discussed. The first step is to construct the Hamiltonian controlling the evolution. Theoretically, each system can have its own operator. Nevertheless, for practical purposes, we have decided to use a single form of the operator for all the pixels in an image or in similar images. The parameters of the Hamiltonian are estimated from image-based features using supervised learning from a selected image window that does not exceed 10 % of the image size. In case of segmenting many images of the same type, a small set of the images could be used for learning and the majority for testing. Features can be the gray levels of pixels or can be derived from a neighborhood surrounding a pixel such as statistical features (e.g., mean, median, or variance), texture features, or even a combination of different types. The evolution operator must satisfy certain conditions and be carefully designed. Therefore, its design will be elaborated in detail in Sect. 2.7. Second, the pixels are initialized to a particular state which is taken to be the \(|0\rangle \) state in this paper. However, choosing any arbitrary initial state will not affect the results. The system evolves in time to get the probabilities of the final states in the third step. Since the Hamiltonian is a function of the features at a certain pixel, it is evaluated for each pixel. Then, the computed operator is inserted in Schrödinger’s equation, and it is solved. It is worth noting that the solution is dependent on the type of the Hamiltonian as will be illustrated in Sect. 2.6. A form of the operator is constructed that will lead to a closed-form solution of Schrödinger’s equation, thus significantly reducing the processing time of the algorithm. The solution should result in the convergence of the probabilities to particular values in order to reach a steady state avoiding time-dependent measurement. Finally, a measurement operation is performed using thresholding. A threshold T is specified. The pixel’s class will be foreground if the probability of the pixel being foreground \(|\beta |^2\) is greater than the threshold T, otherwise it will be classified as background. This procedure is repeated for every pixel in the input image.

figure a

2.6 Solution of Schrödinger’s equation

From the postulates of quantum mechanics that was discussed earlier, the general solution of Schrödinger’s equation given in Eq. 2, evaluated at any time instant t, is

$$\begin{aligned} |\psi (t)\rangle =U(t,t_0)|\psi (t_0)\rangle \end{aligned}$$
(9)

where \(|\psi (t_0)\rangle \) is the initial state and \(U(t,t_0)\) is the unitary time evolution operator (\(U U^\dagger =I\)). There are three cases for getting the operator U according to the Hamiltonian.

  1. 1.

    The Hamiltonian is time independent:

    In this case,

    $$\begin{aligned} U(t,t_0)=e^{-\frac{i}{\hbar } (t-t_0) H } \end{aligned}$$
    (10)

    Taking \(t_0=0\), a closed-form solution can be obtained:

    $$\begin{aligned} |\psi (t)\rangle =e^{-\frac{i}{\hbar } t H} |\psi (0)\rangle \end{aligned}$$
    (11)
  2. 2

    The Hamiltonian is time dependent, but the operators H corresponding to different moments of time commute.

    In other words,

    $$\begin{aligned}{}[H(t_1),H(t_2)]=H(t_1)H(t_2)-H(t_2)H(t_1)=0 \end{aligned}$$
    (12)

    In this case,

    $$\begin{aligned} U(t,t_0)=e^{-\frac{i}{\hbar }\int _{t_0}^t{H(\tau ) \hbox {d}\tau }} \end{aligned}$$
    (13)

    Taking \(t_0=0\), a closed-form solution can be obtained

    $$\begin{aligned} |\psi (t)\rangle =e^{-\frac{i}{\hbar }\int _0^t{H(\tau ) \mathrm{d}\tau }} |\psi (0)\rangle \end{aligned}$$
    (14)

    Clearly, if the Hamiltonian is time independent, then the solution will automatically reduce to that of the first case.

  3. 3.

    The Hamiltonian is time dependent, but the operators H corresponding to different moments of time do not commute.

    In other words,

    $$\begin{aligned}{}[H(t_1),H(t_2)]=H(t_1)H(t_2)-H(t_2)H(t_1)\ne 0 \end{aligned}$$
    (15)

    In this case, no closed-form solution can be obtained. Only an infinite series (Dyson series) or a numerical solution can be found.

2.7 Hamiltonian design

As clear from the previous discussion, the key for this algorithm to succeed is to construct the appropriate Hamiltonian operator that allows the states to evolve leading to correct pixel classification. The Hamiltonian operator is what determines how the state evolves in time. The state \(|0\rangle \) can be represented by the vector \(\begin{pmatrix} 1&0 \end{pmatrix}^T \), while the state \(|1\rangle \) can be represented by the vector \(\begin{pmatrix} 0&1 \end{pmatrix}^T\), and thus, the general state of the system can be represented by the vector \(\begin{pmatrix} \alpha&\beta \end{pmatrix}^T\). Since this operator acts on the state space, then it will be represented by an \(n\times n\) matrix, where n is the dimension of the state space. The Hamiltonian operator must be a Hermitian operator (\(H^\dagger =H\)) to ensure that its eigenvalues (which represent physically the allowed energy levels of the system) are real. Other than the Hamiltonian’s size, and being Hermitian, no other constraints are put on the selection of this operator from the quantum mechanics point of view. In the case considered in this work, the Hamiltonian will be represented by a \(2\times 2\) Hermitian matrix.

In order to prevent using an infinite series solution or an approximate numerical solution, the Hamiltonian should satisfy cases 1 or 2 in the previous subsection. In this paper, we chose the Hamiltonian matrix entries such that they all have the same dependency on time. This condition guarantees the commutation of the operator at different time instants. This leads to

$$\begin{aligned} H(t)=g(t)\mathbf {S} \end{aligned}$$
(16)

where \(\mathbf {S}\) is a \(2\times 2\) matrix and g is any function in t. This form of the Hamiltonian clearly satisfies the condition in Eq. 12. This can be proven by the following equation using direct substitution.

$$\begin{aligned}{}[H(t_1),H(t_2)]=g(t_1)\mathbf {S}g(t_2)\mathbf {S}-g(t_2)\mathbf {S}g(t_1)\mathbf {S}=0 \end{aligned}$$
(17)

Now, the solution of Schrödinger’s equation should be evaluated at a certain time instant t where the measurement operation will be performed. The problem is the choice of this time instant. In order to solve this problem, an extra condition is imposed on the Hamiltonian. The probability calculated from the final state of the system must converge to a particular value. This will ensure that the longer the system evolves, the more accurate the obtained probabilities will be. Selecting the Hamiltonian to be time independent, \((g(t)=\text {const})\) will result in an oscillatory-varying probability, and thus, this choice is refused. To ensure the convergence of probability, the function g must be chosen such that integral in Eq. 14 converges as \(t \rightarrow \infty \). The function selected was

$$\begin{aligned} g(t)=e^{-t} \end{aligned}$$
(18)

which satisfies the conditions and ensures that the integral is convergent.

$$\begin{aligned} \int _0^\infty {e^{-\tau } \hbox {d}\tau }=1 \end{aligned}$$
(19)

In quantum control, according to [3], given the initial and final states of a two-state quantum mechanical system, a Hamiltonian could be obtained in a closed form that leads to the correct evolution of this initial state to the required final state. Physically this corresponds, for example, to how the magnitude and the direction of a magnetic field to control the spin of a stationary electron are determined. If the initial state is assumed to be \(|0\rangle \) and the pixel is to be classified a background pixel, then the final state should be \(|0\rangle \). The Hamiltonian which could result in such evolution will be

$$\begin{aligned} H=2 i \hbar \pi e^{-t} \begin{pmatrix} 0 &{} -1 \\ 1 &{} 0 \end{pmatrix} \end{aligned}$$
(20)

On the other hand, if the initial state is \(|0\rangle \) and the pixel is a foreground pixel, then the final state should be \(|1\rangle \). The Hamiltonian which could result in such evolution will be

$$\begin{aligned} H=\frac{1}{2} i \hbar \pi e^{-t} \begin{pmatrix} 0 &{} -1 \\ 1 &{} 0 \end{pmatrix} \end{aligned}$$
(21)

This shows that for that particular initial state, the matrix part of the Hamiltonian will not differ in either case. The difference between the two scenarios is the value of the constant that multiplies all the matrix entries.

According to the previous results, the Hamiltonian can take the same form as before, up to a multiplicative constant. This constant obviously should depend on the features extracted at that pixel, and it should lead to the correct evolution of the state. So, finally the Hamiltonian will take the following form

$$\begin{aligned} H=i \hbar e^{-t} f(\mathbf {x}) \begin{pmatrix} 0 &{}\quad -1 \\ 1 &{}\quad 0 \end{pmatrix} \end{aligned}$$
(22)

where f is a function in the feature vector \(\mathbf {x}\) at the pixel. This form of the Hamiltonian leads to a closed-form solution of system’s evolution equation as will be proven in “Appendix.” This is important since the equation is solved for each pixel and the closed-form solution highly increases the speed of the algorithm.

The function f is the only unknown in the Hamiltonian. The function f can take several forms; however, it is chosen to be a fifth-degree polynomial. Thus, calculating the evolution operator is then reduced to correctly find the coefficients of that polynomial. A supervised learning method is proposed to estimate the coefficients. It is important to mention that the samples used in the learning could come from a limited size image window and its ground truth segmentation in case of segmenting a single image. This approach is used in this work by randomly selecting a rectangular window. This window has a size of less than 10 % of the image size and should contain pixels from both the foreground and the background of the image. If the task is to segment an object in a dataset of images of the same type such as identifying the lung in X-ray computed tomography images of several patients, a subset of the data containing the entire images and the corresponding ground truth segmentation could be used for training. In both cases, a single operator could be determined and used to segment the rest of the pixels from a single or multi-images. The learning starts by deriving the features from all the training pixels. The problem becomes a typical curve-fitting problem. The target is to calculate the polynomial coefficients that minimize the total error between the segmentation output by the algorithm and the ground truth on the training window. The optimization is based on the steepest descent iterative procedure using the gradient of the total error to update the coefficients. This procedure terminates when either the error is below a certain value or the number of iterations exceeds a maximum value. The Hamiltonian design algorithm is demonstrated in Algorithm 2.

figure b

2.8 Performance measures

The problem of image segmentation is the identification of an object or multiple objects (foreground) on a background of an image. The object is represented as a group of pixels assigned a class that corresponds to the foreground. The pixels comprising the background are assigned a second class. Segmentation algorithms output either the pixel class or the boundary of an object. Foreground pixels may be misclassified as background pixels and vice versa resulting in segmentation errors. These errors can be captured by comparing the obtained pixel class from the algorithm to the corresponding pixel class in the ground truth. If both match, then it is a correct classification, otherwise it is a misclassification. Therefore, the performance of the algorithms needs to be evaluated. Two of the most widely used measures for evaluating the accuracy of a segmentation algorithm are the sensitivity and specificity. The sensitivity and specificity reflect the precision of identifying the foreground and background, respectively. They both rely on counting the correctly classified pixels. Table 2 is often called the confusion matrix and summarizes the four possible outcomes of the pixel segmentation. In this table, the term “positive” refers to the foreground class, while the term “negative” refers to the background class. The term “true” refers to a correct classification by the algorithm, while the term “false” refers to a misclassification. True positive and true negative are the categories of correctly classified foreground and background pixels, respectively. False positives are the pixels wrongly classified as foreground, while false negatives are the pixels wrongly classified as background. Sensitivity is the true-positive rate, which is defined as the ratio of the number of pixels correctly classified as foreground by the segmentation algorithm to the total number of foreground pixels in the image.

$$\begin{aligned} \hbox {Sensitivity}=\frac{\hbox {TP}}{\hbox {TP}+\hbox {FN}} \end{aligned}$$
(23)

Specificity is the true-negative rate, which is defined as the ratio of the number of pixels correctly classified as background by the segmentation algorithm to the total number of background pixels in the image.

$$\begin{aligned} \hbox {Specificity}=\frac{\hbox {TN}}{\hbox {FP}+\hbox {TN}} \end{aligned}$$
(24)

These two measures can be used to assess and compare different segmentation algorithms. The perfect (ideal) segmentation algorithm should produce a sensitivity of 100 % (meaning all foreground pixels are correctly classified) and a specificity of 100 % (meaning all background pixels are correctly classified). However, practical algorithms may misclassify some pixels. Consequently, as the sensitivity and specificity approach 100 %, the segmentation algorithm approaches the perfect case. The sensitivity is computed by counting the pixels that are correctly classified as foreground as well as the total number of foreground pixels from the ground truth. The pixels correctly classified as background are counted, and the total number of background pixels in the ground truth and the specificity is obtained.

Table 2 Confusion matrix for performance evaluation

3 Materials

For testing the proposed algorithm, two datasets were used for proving the concept. The first one was created manually and consists of synthetic images formed of simple geometric shapes, with different levels of noise. The noise is additive white Gaussian with zero mean and a variance in the range from 0.1 to 0.5. This dataset resembles a subset of the image dataset used for evaluating the level set segmentation algorithm proposed in [6]. The other dataset utilized is from the image segmentation database presented in [1] and made available online. This database consists of natural images of objects together with the segmentation ground truth for assessing image segmentation algorithms. Figures 1, 2, 3, 4, 5, and 6 show the first type of images; while Figs. 8, 9, and 10 show the second type of images.

Fig. 1
figure 1

Segmentation of a two-level image. a Original, b ground truth, c result of proposed algorithm, d result of Canny edge detector, e result of active contours, f result of normalized cuts

Fig. 2
figure 2

Segmentation of a three-level image. a Original, b ground truth, c result of proposed algorithm, d result of Canny edge detector, e result of active contours, f result of normalized cuts

Fig. 3
figure 3

Segmentation of an image with a gradient. a Original, b ground truth, c result of proposed algorithm, d result of Canny edge detector, e result of active contours, f result of normalized cuts

Fig. 4
figure 4

Segmentation of an image with a texture. a Original, b ground truth, c result of proposed algorithm, d result of Canny edge detector, e result of active contours, f result of normalized cuts

Fig. 5
figure 5

Segmentation of a noisy two-level image with noise variance of 0.1. a Original, b ground truth, c result of proposed algorithm, d Result of Canny edge detector, e result of active contours, f result of normalized cuts

Fig. 6
figure 6

Segmentation of a noisy two-level image with noise variance of 0.5. a Original, b ground truth, c result of proposed algorithm, d result of Canny edge detector, e result of active contours, f result of normalized cuts

4 Results

The results of applying the proposed algorithm to the dataset described before is shown in Figs. 1, 2, 3, 4, 5, 6, 7, 8, and 9. Table 3 demonstrates the sensitivities and the specificities of the automatically segmented images using the proposed algorithm, as well as the features used for Hamiltonian construction. In order to further evaluate the performance of the proposed method, a comparison was made with the classical Canny edge detector [4], active contours without edges [6], and normalized cuts [7, 19]. The application of these methods directly to the gray-scale images yielded very poor results, not shown. Thus, the same set of features, extracted from each image, was used for all methods. The only exception was in the natural images in Figs. 8 and 9. In this case, a single feature that produced the best result was utilized. The normalized cut approach produced multiple possible segmentations, so the most accurate one was chosen. Canny detector provided the object boundary. Therefore, region filling was used to provide the entire set of foreground pixels corresponding to the segmented object. The visual results of carrying out the comparison between the quantum-based system and the other algorithms are also demonstrated in Figs. 1, 2, 3, 4, 5, 6, 7, 8, and 9. The average sensitivity and specificity of the presented methodology were 98.5 and 99.7 %, respectively. The Canny edge detector produced an average of 88.5 % for sensitivity and 96 % for specificity. The active contours method showed an average sensitivity of 87 % and average specificity of 99.8 %. Finally, the normalized cuts had average results of 84.9 and 98.6 % for the sensitivity and specificity, respectively. The details of this evaluation are shown in Table 4. The quantum-based algorithm achieved higher overall segmentation accuracy compared to the other techniques. These outcomes are discussed thoroughly in the next section.

Fig. 7
figure 7

Segmentation of a noisy three-level image with noise variance of 0.3. a Original, b ground truth, c result of proposed algorithm, d result of Canny edge detector, e result of active contours, f result of normalized cuts

Fig. 8
figure 8

Segmentation of a natural image (from [1]) of a flower. a Original, b ground truth, c result of proposed algorithm, d result of Canny edge detector, e result of active contours, f result of normalized cuts

Fig. 9
figure 9

Segmentation of a natural image (from [1]) of a bird. a Original, b ground truth, c result of proposed algorithm, d result of Canny edge detector, e result of active contours, f result of normalized cuts

Table 3 Summary of the results of the proposed algorithm
Table 4 Comparison with other algorithms

5 Discussion

In this work, a framework for signal and image processing based on the theory of quantum mechanics was introduced. Image or signal elements are regarded as quantum particles and accompanied by quantum systems. This allows to use important quantum concepts such as state space, external force modeling, dynamics of evolution, and the superposition property to solve some image-related problems. The goal is to determine the final states of the elements based on information specific to the images under consideration. The class of the elements is obtained from the steady state. This approach is suitable for image classification, object recognition, or segmentation applications. The suggested framework was applied to the image segmentation problem and specifically to the single-object case. The correspondence between the pixel and two-state systems in addition to the mathematical formulation was developed. Starting with an initial quantum state, the system evolved in time controlled by a set of features derived from the images. The Hamiltonian operator, controlling the evolution process, was constructed from the image-based features by supervised learning. The class of the pixels was determined from the steady state of the evolved system by a measurement operation. The algorithm was tested on two sets of images to evaluate the segmentation capabilities of the framework.

A set of synthetic images was created to assess different aspects of segmentation. Noise was added to these images to evaluate the algorithm performance in the presence of noise. In addition, the publicly available database of natural images in [1], dedicated for image segmentation, was used for testing. The system was able to isolate the target object in the test images with high accuracy as evident from the results. The overall average sensitivity and specificity were 98.5 and 99.7 %, respectively. The sharp edges of the object in Fig. 1 were perfectly captured with a 100 % sensitivity and specificity. The algorithm was also capable of identifying the unconnected object in the synthetic image in Fig. 2 and the natural image in Fig. 8 with 100 and 98.06 % sensitivities, respectively. Segmenting the unconnected object is attributed to dealing with each pixel as an individual quantum system. Therefore, the framework would also be suitable for applications where the objects are unconnected and distributed over the image such as skin detection. Another advantage from a computational perspective is that parallel or distributed computation can be used to solve Schrödinger’s equation for each individual pixel. This reduces the required runtime.

A drawback for the pixel-wise approach for segmentation is the neighborhood and object information cannot be explicitly included. In addition, the lack of connectivity information degrades the noise handling performance in the presence of high level of certain types of noise such as salt and pepper noise. A possible solution to this problem is to use features that encode the connectivity properties. For example, the use of neighborhood mean or median as features resulted in nearly exact segmentation of the noisy image in Fig. 6. Another solution which can be investigated in the future is to introduce coupling between the quantum systems associated with the pixels. The general laws of quantum mechanics provide the necessary methods for studying coupled systems.

The external force applied to the quantum system is a powerful point that enables the system to adapt to various types of images. This is reflected in the design of the Hamiltonian using image-based features. This flexibility allowed the construction of the Hamiltonian from different types of features depending on the image under consideration. This can be noticed in the features used in Table 3. For the gradient image in Fig. 3, the mean of a seven-by-seven window is used to achieve 99.52 % sensitivity and 99.93 % specificity, while for the textured image in Fig. 4, Morlet wavelet transform coefficients leaded to 95.44 % sensitivity and 100 % specificity.

The choice of the Hamiltonian based on the proposed method provides many advantages. The necessity to numerically solve Schrödinger’s equation was avoided by selecting the time dependence of the Hamiltonian to satisfy the condition shown in Eq. 12. This enables us to calculate a closed-form solution of Schrödinger’s equation given in Eq. 14. This obviates the need to numerically solve differential equations, thus providing a significant computational speed gain in runtime. Another advantage is that the same proposed form of the Hamiltonian given in Eq. 22 with the specified conditions can be easily adapted to different problems. Moreover, the exponential form of time dependence and the polynomial form of the feature dependence were arbitrary selected as examples to demonstrate the functionality of the system. The Hamiltonian form used in this work satisfies the physical and practical conditions imposed by quantum mechanics. However, other forms could be utilized as well. The design and optimization of the operator to achieve the best results for different applications should be investigated in the future. In addition, there is no limitation on the type or the number of features to be used. This makes the framework easily adaptable to any type of images by selecting the appropriate features.

The proposed algorithm handled the added noise efficiently. The sensitivity of the segmentation in Figs. 5 and 6, the noisy versions of Figs. 1 and 7, the noisy version of Fig. 2 only decreased by less than 0.1 %, while the specificity was almost maintained (less than 0.8 % decrease) compared to the noise-free images. Natural images were segmented with high accuracy using simple features such as the mean and the median of the gray values. Depending only on the gray level, a sensitivity of 93.35 % and a specificity of 99.96 % were reached for the image in Fig. 9. The misclassified pixels were due to relying only on the gray level. The dark part inside the object could not be identified as a foreground class. A solution to this issue is to use a morphological post-processing step to close this region. Figure 10 shows the results after the post-processing step. The results improved greatly, and the sensitivity becomes 99.30 %.

Fig. 10
figure 10

Segmentation of the natural image of the bird after post-processing. a Original, b ground truth, c result of proposed algorithm

The quantum mechanics theory is inherently probabilistic. This is the result of the superposition property which provides the likelihood that each particle occupies a certain state. By incorporating it in the proposed framework, this property could be beneficial. For example, the degree of membership of a pixel to a certain class could be determined or the confidence in the results. Moreover, if a pixel contains more than one object, the probabilistic information could help associating this pixel with all the objects sharing this pixel. However, the final state of the system is determined by the measurement operation. A simple hard thresholding for all the pixels in an image produced high accuracy in this paper. Nevertheless, more sophisticated techniques such as pixel-wise thresholding or adaptive thresholding could be used to improve the results.

In Fig. 1, the quantum-based algorithm produced maximum sensitivity and specificity compared to the others. This is due to the fact that it is based on pixel-wise segmentation, so the pixels on the boundary of the object were correctly segmented. These boundary pixels were the main reason that slightly affected the other approaches. Figure 2 consists of simple geometrical shapes. All the objects in the image are circles of different sizes. The circles are divided into two groups based on their color (black or white) on a single-level gray background. The Canny edge detector failed to differentiate between the two groups and identified all the circles as foreground. Thus, it produced poor specificity of less than 70 %. The active contour method was highly dependent on initialization. The best results, we could achieve, for most initializations was segmenting almost half of the object with a sensitivity of 52.2 %. The normalized cuts obtained 100 % specificity as the quantum algorithm, while the sensitivity was slightly less. The quantum algorithm had both maximum sensitivity and specificity at the same time among the tested algorithms. The edge detection method failed to capture the object with a gradient foreground in Fig. 3. The active contours captured a larger portion of the object, but still not all (sensitivity was less than 70 %). The proposed algorithm comes second and was able to capture most of the object resulting in a sensitivity of 99.52 %. The graph-based method had the highest sensitivity of 99.99 % which is slightly higher than the sensitivity of the new approach. However, in terms of specificity, the quantum mechanics-based framework resulted in a higher specificity than the normalized cuts method. Figure 4 is composed of textured object and background. For this image, the quantum method resulted in the best specificity of 100 % among the others. In terms of sensitivity, it comes on the second rank after the Canny edge detector. For the noisy image in Fig. 5, the presented work produced a sensitivity of 99.99 % exactly like the edge detection. On the other hand, the specificity was higher in this case. For the active contours, the measures were less than the quantum technique. Finally, the normalized cuts method had a slightly higher specificity but lower sensitivity. The normalized cuts method was affected by the higher noise level in Figs. 6 and 7 as it is clear from the sensitivity of less than 75 %. The quantum approach had higher sensitivity and specificity than the active contours. The edge detection method was slightly higher in sensitivity. Also, it is clear by comparing the results for natural images shown in Figs. 8 and 9 that the framework had done a better job in capturing the object. This is evident from the higher values of sensitivity. This shows that the presented framework achieves higher or nearly equal results compared to other algorithms. However, the superior average accuracy over the other algorithms indicates the ability of the system to handle different types of images more efficiently using a single Hamiltonian form. Moreover, the results could be further enhanced by considering the design of the Hamiltonian.

6 Conclusion and future work

The proposed framework provides a generic formulation for applying the powerful tools of quantum mechanics to the fields of signal and image processing. All the design aspects of the framework were considered in a quantum-based algorithm developed for image segmentation by mapping each pixel in an image to a quantum system. The algorithm showed high accuracy in segmenting different types of images including noisy images. The probability that the system reaches a certain final state gives information about the degree of confidence in the results. Various image features could be used to guide the evolution of the algorithm making the system highly adaptable to segment different types of objects. A form of the Hamiltonian operator was selected for image segmentation that follows the rules of quantum mechanics and that resulted in a closed-form solution of the differential equation governing the evolution. Thus, the need for a numerical solution was eliminated and the execution time of the system was highly reduced. The comparison results with three of the existing segmentation methods demonstrated the overall higher segmentation accuracy when applied to different types of images. The segmentation results were obtained using a proposed form of the Hamiltonian for all images. Nevertheless, different forms of the Hamiltonian could be used and optimized according to the nature and the problem under consideration. This is an issue that needs further research. The two-state quantum system was suitable for images containing single object. However, multi-state quantum system could be incorporated to deal with multi-class image processing problems and will be considered in the future. Moreover, this work could be extended to other areas in signal and image processing such as recognition or classification in a straightforward manner. Another topic that we will investigate next is the concept of coupling between quantum systems, or equivalently image element systems, to integrate the connectivity information and prior knowledge in the framework to improve the results.