1 Introduction

Feynman had first presented quantum computer [1]. However the researchers were attention to quantum computation since Shor’s quantum integer factoring algorithm [2] was proposed. Quantum computation had proved its computation ability base on parallelism. In recent years, many quantum computation model were appeared, include quantum circuits, adiabatic quantum computing and ion trap quantum computing, and so on. Quantum image processing(QIP) is the new branch of quantum computing, has achieved abundant research work in recent years. There are many quantum image representations, such as qubit lattice [3], real ket [4], entangled image [5], FRQI [6], MCQI [7], NEQR [8], GQIR [9] and etc. Based on these quantum image representations, quantum image scrambling [10,11,12], quantum watermarking [13,14,15,16], quantum geometric transformation [17,18,19], quantum image color transformation [20], quantum image segmentation [21], quantum image matching [22, 23], quantum image image encryption [24, 25] and etc have been presented. Quantum speech processing(QSP) is an emerging interdisciplinary research area at the intersection of quantum computing and speech processing, and it is similar to quantum image processing, but QSP’ research work is still in the bud. Quantum speech processing can use the advantages of quantum computing in order to improve classical methods of speech processing. The first quantum representation of digital audio [26] had been given. Some quantum speech processing algorithms have been realized. In the next step, various quantum speech processing algorithms, such as speech recognition, speech synthesis and so on, need to be presented.

Speech is the most general way to change information with people. It is easy to understand by people, also it is necessary to do between person and machine. Automatic speech recognition (ASR) [27] technology can transform speech into text, then the text could be deal with machines. So Speech recognition is important for many applications, it is the base for intelligent human-computer interactions. In recent years, many methods were applied to deal with speech processing, include deep learning and others. The most challenges are the accuracy and efficiency in trillions bytes of speech data and training models. Hence new computation approaches are interesting to researchers now. So it is valuable work to research quantum speech algorithms, which will promote the research work of speech processing.

This paper presents a quantum endpoint detection algorithm, also gives a n-bits quantum comparator at the same time. The detection of the presence of speech embedded in various types of non-speech events and background noise is called endpoint detection, speech detection, or speech activity detection. The endpoint detection of speech signal is the basic processing of speech preprocessing. To understand speech, it needs to distinguish the boundaries of speech, otherwise this is difficult with noise signal. Hence the endpoint detection algorithm could help to distinguish the endpoint of speech. In generally, there has two kinds of endpoint detection methods, the spectrum entropy method and the double threshold comparing method. There are many endpoint detection approaches [28,29,30], such as threshold method, spectrum analysis, cepstral analysis, zero-crossing rate and so on. In order to improve the performance for speech recognition in the noisy environment, the multiple features of speech are used in many improvement algorithms. Computational complexity is the key factor for endpoint detection algorithm. In the threshold algorithm, thresholds are used to detect the starting and ending of speech. The proposed quantum algorithm was base on the QRDA(quantum representation of digital audio) [26], which used two entangled qubit sequences to store audio data.

The paper is organized as follows: In Section 2, it introduces the related principles. Section 3 presents the quantum endpoint detection algorithm and a n-bits quantum comparator. The analysis of the proposed quantum algorithm and quantum comparator are done in this section. Finally, the conclusion is given in Section 4.

2 Background

In this section, the brief introductions of related techniques are given.

2.1 Speech Pre-processing

The first important step of ASR is endpoint detection, which could detect the location of speech regions from the noise. This pre-processing has been shown to improve the accurate of isolated word recognition [27], and it could reduce the computation amount of speech processing.

The procedure of speech pre-processing is shown as Fig. 1. The first step is windowing, the speech signals are segmented into frames. The second step is feature extraction, which can finish to compute multiple feature base on frames. The third step finishes to classification speech and noise by adjustment of threshold. At last step, it could compute the starting and ending location of speech.

Fig. 1
figure 1

The procedure of endpoint detection

2.2 The Short-Term Energy

Speech has the property of time-changed, but it is unchanged in the short time. Hence this property is called short-term of speech. So the speech data is always divided into many frames. The frame is part of voice signal which is the basic processing unit.There are many features of short-term speech, the short-term energy, the high band energy and zero crossing rate.

The energy value of one frame information is called short-term energy. E n denotes the number n frame’s short-term energy of voice signal x n (m). The equation is shown as below:

$$ E_{n}= \sum\limits_{m=1}^{N-1}{{x_{n}^{2}}(m)} $$
(1)

In the equation, sign n denotes integer and sign N denotes the length of signal frame.

2.3 QRDA

The audio data A = [a 0, a 1, ⋯ ,a L−1], where L is the length of the audio and a T ∈{0, 1, ⋯ ,2q − 1}. QRDA [27] uses a q-bit binary sequence \(D_{T}={D_{T}^{0}}{D_{T}^{1}}{\cdots } D_{T}^{q-2}D_{T}^{q-1}\) to encode the amplitude value a T as below.

$$ a_{T}={D_{T}^{0}}{D_{T}^{1}}{\cdots} D_{T}^{q-2}D_{T}^{q-1},{D_{T}^{i}}\in\{0,1\},i=0,1,\cdots,q-1. $$
(2)

where T = 0,1,⋯ ,L − 1 is the time information.

Hence, an QRDA audio can be written as below.

$$\begin{array}{@{}rcl@{}} |A\rangle &=& \frac{1}{\sqrt2^{l}}\sum\limits_{T=0}^{L-1}|D_{T} \rangle\otimes |T\rangle\\ |T\rangle &=& |t_{0} t_{1} {\ldots} t_{l-1}\rangle, t_{i}\in\{0,1\}\\ |D_{T}\rangle &=& |{D_{T}^{0}}{D_{T}^{1}}{\cdots} D_{T}^{q-2}D_{T}^{q-1}\rangle, {D_{T}^{i}}\in\{0,1\}, \end{array} $$
(3)

where

$$ l=\left\{ \begin{array}{ll} \lceil \log_{2} L \rceil, & L > 1 \\ 1, & L=1 \end{array}, \right. $$
(4)

where, ⊗ is the tensor product notation, |D T 〉 = |D T0D T1⋯D T q−2D T q−1〉 is the binary sequence of the sample amplitude value, |T〉 = |t 0 t 1t l−1〉 is the corresponding time information in the audio. Therefore, QRDA needs (q + l) qubits to represent a L audio with amplitude range 2q.

2.4 Plain Adder

Vlatko et al. [31] has prompted the plain adder to imply the addition operation. There are two register a and b. This operation rewrites the computation result into the register b, which |a, b〉→|a, a + b〉. The full addition quantum circuit network is shown in Fig. 2, and the basic sum and carry operations are illustrated in Fig. 3. The plain adder network will be used in our quantum endpoint detection algorithm in Section 3.

Fig. 2
figure 2

The plain adder

Fig. 3
figure 3

Basic operations of Sum and Carry

3 A Quantum Endpoint Detection Algorithm

In this section, the quantum comparator circuits are given firstly. Then the quantum endpoint detection algorithm is presented.

3.1 Quantum N-bits Comparator

Quantum comparator is useful to many applications, such as quantum search algorithms. Many works had prompted, such as Oliveira and Ramos [32], Khan [33], Wang et al. [34].

In generally, there are three results of comparison between element x and element y, x > y, x = y, x < y. In this paper, we only use two comparison results: xy, x < y. Hence it needs one quantum bit z to store the result of comparison. After comparing operation, we could measure z bit, if the value is 1, then xy, otherwise x < y. Figure 4 shows the comparator, the comparator has two inputs and one output, and the output is |1〉 when xy, |0〉 when x < y.

Fig. 4
figure 4

The comparator

3.1.1 Quantum Comparator

In this section, we will present one new quantum comparator. The truth table of comparison between one bitx and y shows in Table 1. The one bit comparator circuit is given in Fig. 5. In order to compare the two values, one auxiliary bit z is used to store the result of comparison. The auxiliary bit z is first set to |0〉, when { (x, y)} = { (1,1),(1,0),(0,0)}, the z is set to |1〉. Thus if the auxiliary bit z is |1〉, we could know that x is greater than or equal to y.

Table 1 Truth table of one bit comparison
Fig. 5
figure 5

One bit comparison circuit

Two bits comparison’s truth table shows in Table 2. When the bits sequence x 1 x 0 y 1 y 0 are 0000,0100,0101,1000,1001,1010,1100,1101,1110,1111 that z is changed to |1〉. Also the circuit of Figs. 5 and 6 can be inversed.

Table 2 Truth table of two bit comparison
Fig. 6
figure 6

Two bits comparison circuit

The number of combination gates are 2 and 7 in one-bit comparator circuit and two-bits comparator. Base on the method of designing these two comparator circuits, the n-bits comparator could be given. The n-bits quantum comparator’s complexity is based on the number of quantum gates. In Theorem 1, we gave the number of quantum gates in the n-bits quantum comparator.

Theorem 1

The n-bits quantum comparator has \(\frac {2^{n}(2^{n}-1)}{2}+1\) combination gats.

Proof

From Tables 1 and 2 we can conclude, if each bit of y n−1 y n−2...y 0 are set all zero to all one then there need N comparison circuits for comparsion.Where the N = 2n + (2n − 1) + ... + 1.

To simplify formula of N = 2n+(2n−1)+...+1 = 2n+(2n−1)+...+(2n−(2n−1)) = 2n2n−(1+2+...+(2n−1)) = 22n−2n−1(2n−1).

When y n−1 y n−2...y 0 are all zero, the comparison circuits can be compressed, it needs only one comparison circuit. Hence there save 2n − 1 comparison circuits. Then

$$Sum = N-(2^{n}-1) = 2^{2n}-2^{n-1}(2^{n}-1)-(2^{n}-1) = \frac{2^{n}(2^{n}-1)}{2}+1. $$

3.1.2 Improvement of Quantum N-bits Comparator

In Section 3.1.1, the n-bits quantum comparator has \(\frac {2^{n}(2^{n}-1)}{2}+1\) combination gates. Hence we will improve the quantum comparator to simplify complexity.

Firstly, the new two bits comparator is given in Fig. 7. The two bits comparator is based on the one bit comparison circuit which is shown in Fig. 5. There have three auxiliary bits z, z 0, z 1, and the initial states are |0〉. The auxiliary bit z 0 indicates the relation between x 0, y 0, and the auxiliary bit z 1 indicates the relation between x 1, y 1. The auxiliary bit z is used to indicate the relation between x 1 x 0 and y 1 y 0.

Fig. 7
figure 7

Improvement of two bits comparator

To expand this new two bits comparator, the n-bits quantum comparator is shown in Fig. 8. The |x n−1, x n−2,...,x 1, x 0〉 and |y n−1, y n−2,...,y 1, y 0〉 compare from the most significant bit to least significant bit. If one bit of these two quantum states aren’t same, then these could dedicate the relation of |x n−1, x n−2,...,x 1, x 0〉 and |y n−1, y n−2,...,y 1, y 0〉 . For example, while |x n−1〉 is equal to |y n−1〉, if |x n−2〉 is bigger than |y n−2〉, then |x n−1, x n−2,...,x 1, x 0〉 must be bigger than |y n−1, y n−2,...,y 1, y 0〉. The auxiliary bit |z a 〉 dedicates the relation between |x a 〉 and |y a 〉. The |z〉 dedicates the relation between |x n−1, x n−2,...,x 1, x 0〉 and |y n−1, y n−2,...,y 1, y 0〉 .

Fig. 8
figure 8

Improvement of n bits comparator

Theorem 2

The complexity of a n-bits improvement quantum comparator is based on the number of combination gats, it has 4n combination gates.

Proof

In Fig. 8, there have many shadows, and each shadow includes one bit comparator circuit. There has n shadows in a n-bits improvement quantum comparator. For example, The (n − 1)th shadow is the comparator circuit of |x n−1〉 and |y n−1〉, that |z n−1〉 and |z〉 are the auxiliary bits to dedicate the relation of two bits. Shown as Fig. 9, it obviously has four combination gates. Hence a n-bits improvement quantum comparator, n shadows, has 4n combination gates. □

Fig. 9
figure 9

Equivalent circuit of comparator circuit

3.1.3 Analysis of Quantum Comparator

In this section, we realize the analysis for our quantum comparator and other comparators. Our n-bits quantum comparator needs n + 1 auxiliary qubits and has 4n combination gates. Oliveira and Ramos [32] uses about 2n − 1 auxiliary qubits, and the cost of construct a n-qubits comparator is about 4n + 1 combination gates. Khan [33] needs n + 1 auxiliary qubits and 14n − 2 combination gates. Wang et al. [34] uses 2n − 2 auxiliary qubits and about 4n combination gates. From the comparison, our quantum comparator is simpler than other comparators.

3.2 Quantum Endpoint Algorithm

We proposed one quantum endpoint detection algorithm by energy feature. The parameters include energy windows shifting and data sampling rate. To do quantum endpoint detection needs three phases: Firstly, it represents the frame data base QRDA, so it facilitates to compute. The secondly, it gives the quantum algorithm for detecting the endpoint. Finally, it gets the outcome by the measurement.

Based on QRDA, the definition of |A〉 is same with (3). The given audio data has L sampling data, and each of sampling data is in the range [0,2q − 1], q denotes that it uses q qubits to storage the sample’s binary value. |A〉 represented the quantum voice data.

In the first phase, the sampling data is divided into frames. Each frame includes N samples, and the adjacent frames have the same M samples. The definition of i th frame is below:

$$ |A_{i}\rangle = \frac{1}{\sqrt2^{l}}\sum\limits_{T=0}^{N-1}|D_{T} \rangle\otimes |T\rangle\ $$
(5)

where

$$ l=\left\{ \begin{array}{ll} \lceil \log_{2} N \rceil, & N > 1 \\ 1, & N=1 \end{array}, \right. $$
(6)

The i is frame number, which could use to calculate every frames’s samples. The i th frame has samples from [N ∗ (i − 1) − M ∗ (i − 1)] to [NiM ∗ (i − 1)]. In order to facilitate the calculation, there has one rule which is the L = [NiM ∗ (i − 1)]. For example, suppose the N is 4, M is 2, then L = [4i − 2(i − 1)]. If i is 2, then L is 6. So there are two frames in this example, and have 6 samples. Each sample’s value is between (0, 2q−1 ). The preparation and retrieving of |A i 〉 had discussed in QRDA [26].

In the second phase,it calculates the short-term energy for every frames. Then the short-term energy will compare with the threshold to determine the result. The quantum end-point detection algorithm is one unitary transformation in the 2n- dimensional Hilbert space of n qubits. Our work is to present this unitary transformation. There have four steps in our quantum endpoint algorithm, which are shown in Fig. 10.

Fig. 10
figure 10

The algorithm

Step 1: Squaring operation

All samples are squared in the first step. These samples are stored in q bits of qubit. As shown in Fig. 11b, \({D_{T}^{0}}\) to \(D_{T}^{q-1}\) are used to storage all samples of the frame. The \({D_{T}^{q}}\) is the new auxiliary qubit, that help to do squaring operation. The squaring operation uses the Swap gates, and there are q gates to be used. After these operations, we could get \({D_{T}^{0}}...{D_{T}^{q}}\) in the Little-Endian model. The Fig. 11a shows the universal circuit of squaring, and the \({D_{T}^{q}}\) is also the new auxiliary qubit for computing operation that is the (q + 1)th qubit. Then the |A i 〉 is changed into |B i 〉:

$$ |B_{i}\rangle = \frac{1}{\sqrt2^{l}}\sum\limits_{T=0}^{N-1}|D_{T} \rangle\otimes |T\rangle\ $$
(7)
Fig. 11
figure 11

The squaring operation

Where \(|D_{T}\rangle =|{D_{T}^{0}}{D_{T}^{1}}{\cdots } D_{T}^{q-2}D_{T}^{q-1}{D_{T}^{q}}\rangle \) is the binary sequence of the squaring sample, and \(|{D_{T}^{q}}\rangle \) is the new auxiliary qubit.

Step 2: Shiftiing operation

For the feature of superposition, all samples of one frame are stored in q qubits, hence it is easy to do squaring in the first operation. But how to add up all values of samples is difficult. The shifting operation helps to get every samples’ value for the next adding operation. Based on the circuit of shifting operation in Fig. 12, the |B i 〉 changes into |C i 〉 after shifting operation.

$$ |C_{i}\rangle = \frac{1}{\sqrt2^{l}}\sum\limits_{V=1}^{q}\sum\limits_{T=0}^{N-1}|{D_{T}^{V}} \rangle $$
(8)
Fig. 12
figure 12

The shifting operation

The V represents the number of bit for data in the equation 8. In Fig. 12b, there have n shadow modules that each module can get one sample data out of superposition. Hence the n samples are stored into \({D^{q}_{0}}, {D^{q}_{1}}, ..., {D^{q}_{n}}\).

Step 3: Adding operation

In this step, the plain adders [31] are used to calculate the sum of samples, such as in Fig. 13. Because the plain adder has only two inputs, hence there need many plain adders to add the samples. If there have 2n samples, then it needs 2n − 1 plain adders and it is n + 1 layers of adder network.

Fig. 13
figure 13

The adder network

Step 4: Adjudging operation

Base on the last step, the comparator is used to adjudge if the energy is the higher than threshold. So we could know if this frame is endpoint.

The circuit of quantum endpoint algorithm includes four steps, and it is given in Fig. 14. In this algorithm, the inputs have data of audio, time, threshold and auxiliary qubits. This algorithm is used to detect endpoint of audio, hence we only need to measure Z qubit, shown in Fig. 14, to get result wether the inputs frame data is the endpoint. Because our algorithm only needs to measure one qubit, so it’s one effective quantum algorithm for easily getting output.

Fig. 14
figure 14

The endpoint algorithm

3.3 Analysis of Quantum Endpoint Detection Algorithm

Our quantum endpoint algorithm uses (n + 2)q + n + 1 qubits. However, the network complexity of quantum algorithm is usually in according with the number of elementary quantum gates. In [35], one tc o n t r o lN O T g a t e is equivalent to (2(t − 1)T o f f o l i g a t e s + 1C N O T g a t e), and one T o f f o l i g a t e is equivalent to 6 C N O T g a t e s, then one tc o n t r o lN O T g a t e is equivalent to (12t − 11)C N O T g a t e s.

In Fig. 10, the quantum circuit network can be divided into four layers. So we would calculate the number of quantum gates for each layer, then get the sum of quantum gates for four layers. We consider that the C o n t r o lN O T g a t e is the basic unit.

In U1 layer, it needs (q + 1) qbits to storage the one frame, and has q swap gates. One swap gate could be implemented by three C N O T g a t e s [36]. The network complexity of U1 is 3q C N O T g a t e s. In U2 layer, there have N sub-shifting operations, and each sub-shifting operation has (q + 1) 3 − c o n t r o lN O T g a t e s. Then the network complexity of U2 is N(q + 1) 3 − c o n t r o lN O T g a t e s, it is equivalent to 25N(q + 1)C N O T g a t e s. In U3 layer, there have (N − 1) plain adders, the network complexity of the plain adder is 28n − 12 as it contains 2n − 1 carries, n sums, and one Control-NOT gate, thus the network complexity of U3 is (N − 1)28(q + 1) − 12. According to theorem 2 and that one tc o n t r o lN O T g a t e is equivalent to (12t − 11)C N O T g a t e s, the network complexity of U4 layer is 4q combination gates, and which is equal to (6q 2 − 5q)C N O T g a t e s. To sum up this four layers’ complexity, hence network complexity of this quantum endpoint detection algorithm is 6q 2 + (53N − 30)q + 53N − 1. Because the q represents the number of bits in a sample. Normally, the q value is 8, so the network complexity of the algorithm is 477N − 57.

4 Conclusion

Quantum speech processing is the new branch of quantum information processing, and it is still in the bud. In this paper, a n-bits quantum comparator is given firstly, it could be used in many fields, such as the quantum searching algorithm. This n-bits quantum comparator only uses n + 1 auxiliary qubits which is the simpler than other comparators. The first quantum endpoint detection algorithm is presented which is based on the quantum audio representation of QRDA and the n-bits quantum comparator, then the algorithm circuits are designed and analysed their network complexity.