Hybrid scheme for safe speech transmission based on multiple chaotic maps, watermarking and Arnold scrambling algorithm

Ouguissi, Hadda; Saadi, Slami; Merrad, Ahmed; Kious, Mecheri

doi:10.1007/s11042-022-13301-4

Published: 06 June 2022

Volume 82, pages 327–346, (2023)

Multimedia Tools and Applications

Hadda Ouguissi¹,
Slami Saadi ORCID: orcid.org/0000-0001-8091-5232²,
Ahmed Merrad² &
…
Mecheri Kious¹

Abstract

In this paper, we present a novel scheme for enhancing the security of speech information in communication systems. The scheme is a hybrid of three approaches. First, chaotic logistic and tent maps generate a pseudo-random vector from a few initial values, and this vector is mixed with the original speech signal. Second, a watermark image is embedded within the encrypted signal so that the decryption process can verify that the signal is authentic and has not suffered from attacks. Third, an Arnold scrambling (cat map) step spreads the signal samples according to a secret key, and the original signal cannot be recovered from the samples without this key. The correlation value obtained between the original and encrypted signals is close to zero, which shows that the two signals are essentially uncorrelated. Moreover, the original speech is recovered without degrading its quality. Numerical results for the signal-to-noise ratio (SNR) and correlation coefficient (CC), together with a comparison against seven recently published works, indicate that the proposed scheme performs favorably and can be considered among the best of the recent approaches compared.

1 Introduction

Speech security systems are widely used in many applications. It is now very important to protect speech transmitted over communication systems with fast and safe cryptosystems. As speech communication becomes more widespread and more sensitive, offering a high level of security is of great significance. To date, various speech encryption methods have been suggested. Speech watermarking is a strong means of hiding, and hence protecting, information from intended or unintended misuse during communication. Speech watermarking types and applications, as well as the topics of robustness, capacity and imperceptibility, are detailed in [21]. The authors of [5] present an efficient secure communication system based on a speech watermarking approach that permits automatic recognition of the speaker, with the whole system optimized to boost its performance.

We proposed in [24] a new design for blind watermarking of speech and audio signals, in which we applied the discrete wavelet transform (DWT) and the discrete cosine transform (DCT) after segmenting the signal. For security, we applied the Arnold transform to the watermark. In addition, we presented in [18, 19] a robust blind speech watermarking method using DCT and DWT with signal sub-sampling. To obtain high imperceptibility, the fusion is designed to withstand various attacks such as re-quantization, cropping, echo amplification and additive white Gaussian noise (AWGN).

Chaos is a characteristic behavior of nonlinear dynamic systems. It is marked by high sensitivity to parameters and initial conditions, which can serve as encryption keys, and is described mathematically as apparent randomness governed by deterministic laws. This behavior makes chaotic signals useful in several applications, among which secure communication has received considerable attention. The chapter in [11] explains chaotic systems and illustrates their suitability for protecting communicated information. One of the major motivations for chaos-based secure communication is the broadband nature of chaotic signals, which permits efficient spectral masking of the message by the chaotic carrier. The authors of [29] discuss the problem of chaotic secure communication: a new double-channel diffusion system is given and used in a protected communication design, and channel-switching procedures are then adopted to further increase the safety of message transmission. Paper [20] introduces a low-complexity, low-delay and highly secure speech encryption method based on permuting speech segments with the chaotic Baker map and substituting masks in both the time and transform domains to fill the unvoiced periods in speech conversation. Chaotic shift keying-based speech encryption and decryption approaches are presented in [25], where the input speech signal is sampled and its values are segmented into four levels that are permuted using four chaotic generators. A novel speech encryption method using fractional chaotic systems is given in [28], where a two-channel transmission process is used and the original speech is encoded by a nonlinear function of the chaotic states. The work in [23] presents modules for improving the security of speaker authentication by inserting the watermark in the detail coefficients of the speech signal after applying the wavelet transform, based on energy computation. The speaker is identified by the speech and by the watermark extracted from the watermarked speech. The authors of [4] study and implement the effect of using different floating-point representations on the performance of chaotic systems for speech security; numerical simulations for all discussed chaotic systems show good results in terms of MSE, entropy and correlation coefficient, and pass the NIST test.

In this work, we combine watermarking with a chaotic approach in a hybrid scheme based on the Arnold scrambling algorithm, with the aim of enhancing the security of speech in communication systems. Our contribution is to replace the discrete wavelet transform (DWT), the discrete cosine transform (DCT) and the segmentation of the speech signal with a novel chaotic representation, after spreading and Arnold scrambling with a secret key. Numerical comparisons with other published works show the strength of the proposed scheme and validate our design as being among the best of the recent approaches compared. In addition, this work highlights the efficiency of the designed scheme in providing copyright protection for data ownership and in authenticating people by using speech as a biometric tool.

2 Chaotic generator (tent map, logistic map)

2.1 Logistic map

One of the simplest discrete chaotic systems used recently in cryptography applications is the logistic map. The logistic map function is expressed as:

$$ {\mathrm{x}}_{\mathrm{n}+1}=r.{\mathrm{x}}_{\mathrm{n}}.\left(1-{\mathrm{x}}_{\mathrm{n}}\right) $$

where x_n takes values in the interval (0, 1) and the parameter r is a positive constant that takes values up to four. The value of r determines the behavior of the logistic map. From about r = 3.57 the iterations become chaotic and lend themselves to encryption. A high value of r is therefore selected to obtain a highly chaotic yet deterministic discrete-time signal [20, 25, 28]. The initial value x₀ and the parameter r are considered as the secret key.
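As an illustration of the recurrence and of its sensitivity to the initial value, the following is a minimal Python sketch. The function name `logistic_sequence` and the values x₀ = 0.3 and r = 4.0 are assumptions chosen for the example; they are not the settings of Table 2.

```python
def logistic_sequence(x0, r, n):
    """Return n iterates of x_{k+1} = r * x_k * (1 - x_k), starting from x0."""
    if not 0.0 < x0 < 1.0:
        raise ValueError("x0 must lie in (0, 1)")
    if not 0.0 < r <= 4.0:
        raise ValueError("r must lie in (0, 4]")
    values = []
    x = x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        values.append(x)
    return values


# Illustrative key values (assumed, not taken from the paper).
seq_a = logistic_sequence(0.3, 4.0, 100)
# Same r, initial value perturbed by 1e-10.
seq_b = logistic_sequence(0.3 + 1e-10, 4.0, 100)

# Largest sample-wise gap between the two sequences.
max_gap = max(abs(a - b) for a, b in zip(seq_a, seq_b))
print("largest gap between the two sequences:", max_gap)
```

A tiny change in x₀ leads to a clearly different sequence within a few dozen iterations, which is why x₀ and r can act as key material.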

2.2 Tent map

The chaotic behavior of the tent map (a piecewise linear, continuous map with a single maximum) has been studied analytically throughout its chaotic region in terms of the invariant density and the power spectrum. As the height of the maximum is lowered, successive band-splitting transitions occur in the chaotic region and accumulate at the transition point into the non-chaotic region. The time-correlation function of non-periodic orbits and their power spectrum are computed exactly at the band-splitting points and in the vicinity of these points. For u = 2, the tent map is topologically conjugate to the logistic map with r = 4, so the two maps behave identically under iteration in this sense. The chaotic tent map is defined by:

$$ {\mathrm{x}}_{\mathrm{i}+1}=f\left({\mathrm{x}}_{\mathrm{i}},u\right)=\left\{\begin{array}{ll}u\,{\mathrm{x}}_{\mathrm{i}}, & \mathrm{if}\ {\mathrm{x}}_{\mathrm{i}}<0.5\\ {}u\left(1-{\mathrm{x}}_{\mathrm{i}}\right), & \mathrm{otherwise}\end{array}\right. $$

where x_i ∈ [0, 1] for i ≥ 0.

This map sends the interval [0, 1] into itself and has only one control parameter u, where u ∈ [0, 2], and x₀ is the initial value of the system. The set of real values x₀, x₁, …, x_n is called the orbit of the system. Depending on the control parameter u, the system exhibits a variety of dynamical behaviors ranging from predictable to chaotic [7, 13, 26].
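The piecewise definition can be sketched in a few lines of Python. The function name `tent_sequence` and the values x₀ = 0.4 and u = 1.99 are assumptions chosen for illustration only.

```python
def tent_sequence(x0, u, n):
    """Return n iterates of the tent map with control parameter u, starting from x0."""
    if not 0.0 <= x0 <= 1.0:
        raise ValueError("x0 must lie in [0, 1]")
    if not 0.0 <= u <= 2.0:
        raise ValueError("u must lie in [0, 2]")
    values = []
    x = x0
    for _ in range(n):
        # Rising branch below 0.5, falling branch otherwise.
        x = u * x if x < 0.5 else u * (1.0 - x)
        values.append(x)
    return values


# Illustrative values (assumed, not taken from the paper).
orbit = tent_sequence(0.4, 1.99, 50)
print(orbit[:5])
```

The example uses u slightly below 2 on purpose: in binary floating point, u = 2 exactly shifts one mantissa bit out per iteration, so the computed orbit tends to collapse to zero after a few dozen steps.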

3 Watermarking

Digital watermark retrieval is more robust if the original un-watermarked signal is available. However, access to the original host signal cannot be assumed in all real-world circumstances [16]. In many applications, the detection algorithm can use the original audio signal to extract the watermark from the watermarked signal [3]. This often improves detector performance significantly, because the watermark information is extracted by subtracting the original signal from the watermarked signal. If, however, the detection algorithm does not have access to the original signal, this inability considerably decreases the amount of information that can be hidden in the host signal. The full process of watermark insertion and extraction is modeled as a communication channel in which the watermark is distorted by strong interference in addition to the channel properties.

4 Arnold scrambling algorithm

The K × K matrix W is transformed into W′ by the Arnold transformation to decrease the autocorrelation coefficient of the image, which strengthens the confidentiality of the watermark [14]. The Arnold transformation is periodic: when it is iterated a sufficient number of times, the original signal is recovered. The Arnold scrambling algorithm [10] is simple and periodic, so it is widely used to provide an extra stage of protection. The Arnold transform is also known as the cat map transform. It can be applied to speech signals by dividing the signal into vectors, each of which is converted to an N × N matrix that is then used to scramble the signal.

Because the Arnold transform is cyclic, the signal decryption depends on the scrambling key, which can be used as a secret key and identifies the number of times the signal has been scrambled.
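The scrambling and its inverse can be sketched as follows. The text does not state which form of the cat map is used, so this sketch assumes the common form (x, y) → ((x + y) mod N, (x + 2y) mod N). The function names and the key value 5 are hypothetical.

```python
def arnold_scramble(matrix, iterations):
    """Apply the assumed cat map (x, y) -> (x + y, x + 2y) mod n, `iterations` times."""
    n = len(matrix)
    current = [row[:] for row in matrix]
    for _ in range(iterations):
        nxt = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                nxt[(x + y) % n][(x + 2 * y) % n] = current[x][y]
        current = nxt
    return current


def arnold_unscramble(matrix, iterations):
    """Undo arnold_scramble using the inverse map (x, y) -> (2x - y, y - x) mod n."""
    n = len(matrix)
    current = [row[:] for row in matrix]
    for _ in range(iterations):
        prev = [[None] * n for _ in range(n)]
        for x in range(n):
            for y in range(n):
                prev[(2 * x - y) % n][(y - x) % n] = current[x][y]
        current = prev
    return current


def arnold_period(n):
    """Smallest k >= 1 such that k scrambles restore an n x n matrix."""
    original = [[r * n + c for c in range(n)] for r in range(n)]
    current = arnold_scramble(original, 1)
    k = 1
    while current != original:
        current = arnold_scramble(current, 1)
        k += 1
    return k


block = [[r * 8 + c for c in range(8)] for r in range(8)]
key2 = 5  # hypothetical scrambling key: the number of iterations
scrambled = arnold_scramble(block, key2)
restored = arnold_unscramble(scrambled, key2)
print("restored equals original:", restored == block)
print("period for N = 8:", arnold_period(8))
```

Under this assumed form of the map, an 8 × 8 block returns to its original arrangement after 6 iterations, which illustrates the cyclic property described above.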

5 The proposed hybrid chaotic watermarking architecture

5.1 Emitter side

We construct a pseudo-random chaotic signal using the tent map and the logistic map, and we fuse the watermark with the original speech signal. The resulting watermarked signal is combined with the chaotic signal using a chaotic key to generate an encrypted signal, which is then transmitted (Figs. 1, 2 and 3).

The encryption process starts by reading a speech signal stored on the hard disk using a Matlab function; Matlab represents the speech samples in the range [−1, 1]. The watermark file is also read. The steps are then as follows:

1) The user inputs a key (key1) to embed the watermark securely within the original speech signal. The embedding scheme is the one presented in [24]: the watermark is embedded in the DCT and DWT domain using a sub-sampling technique. This method allows the trade-off between transparency and robustness of the watermark to be controlled with a shifting value (∆). The result of this step is a speech signal marked with secret information (the watermark), named Wtr_Sp.
2) The logistic map and the tent map generate two chaotic signals from the initial values input by the user; these signals are named Lg_S and Tn_S, respectively.
3) Using the formulas below, the three signals Lg_S, Tn_S and Wtr_Sp are mixed to produce a new signal named Mx_Sg:

$$ Mx\_{Sg}_i=\left\{\begin{array}{ll}\left(\mathrm{Tn}\_{\mathrm{S}}_{\mathrm{i}}\times \mathrm{Wtr}\_{\mathrm{S}\mathrm{p}}_{\mathrm{i}}+\left(1-\mathrm{Tn}\_{\mathrm{S}}_{\mathrm{i}}\right)\mathrm{Lg}\_{\mathrm{S}}_{\mathrm{i}}\right)-1, & \mathrm{Wtr}\_{\mathrm{S}\mathrm{p}}_{\mathrm{i}}\ge 0\\ {}\left(\mathrm{Tn}\_{\mathrm{S}}_{\mathrm{i}}\times \mathrm{Wtr}\_{\mathrm{S}\mathrm{p}}_{\mathrm{i}}+\left(1-\mathrm{Tn}\_{\mathrm{S}}_{\mathrm{i}}\right)\mathrm{Lg}\_{\mathrm{S}}_{\mathrm{i}}\right)+1, & \mathrm{Wtr}\_{\mathrm{S}\mathrm{p}}_{\mathrm{i}}<0\end{array}\right. $$

(1)

where i is the sample index.

4) Decompose Mx_Sg into segments, where each segment length is a square number.
5) Before applying the Arnold transform, reshape each segment into a 2D matrix (N × N elements).
6) The user inserts another key (key2), and the encryption process uses that key on each matrix to scramble its elements with the Arnold transform.
7) Reshape each scrambled matrix into a 1D vector of length N².
8) To obtain the final encrypted speech signal, concatenate the segments.
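Steps 3 to 5 can be sketched in Python as below. The function names are hypothetical. The text does not say how samples left over after the last full N² segment are handled, so returning them unchanged as a tail is an assumption of this sketch.

```python
def mix_sample(w, t, l):
    """Eq. (1): mix watermarked-speech sample w with tent value t and logistic value l."""
    base = t * w + (1.0 - t) * l
    return base - 1.0 if w >= 0 else base + 1.0


def split_into_square_blocks(signal, n):
    """Cut a 1D signal into n x n matrices (steps 4 and 5).

    Returns (blocks, tail); tail holds the samples that do not fill a whole
    block. Passing the tail through untouched is an assumption of this sketch.
    """
    block_len = n * n
    usable = len(signal) - len(signal) % block_len
    blocks = []
    for start in range(0, usable, block_len):
        segment = signal[start:start + block_len]
        blocks.append([segment[r * n:(r + 1) * n] for r in range(n)])
    return blocks, signal[usable:]


def flatten_block(block):
    """Reshape an n x n matrix back into a 1D list of length n * n (step 7)."""
    return [value for row in block for value in row]


# Toy inputs standing in for Wtr_Sp, Tn_S and Lg_S (illustrative values only).
wtr_sp = [0.5, -0.5, 0.1, -0.2, 0.0, 0.3, -0.7, 0.9, 0.4, -0.1]
tn_s = [0.5, 0.5, 0.2, 0.8, 0.6, 0.3, 0.9, 0.4, 0.7, 0.1]
lg_s = [0.5, 0.5, 0.9, 0.1, 0.3, 0.6, 0.2, 0.8, 0.4, 0.7]

mx_sg = [mix_sample(w, t, l) for w, t, l in zip(wtr_sp, tn_s, lg_s)]
blocks, tail = split_into_square_blocks(mx_sg, 2)
print(len(blocks), "blocks of 2 x 2,", len(tail), "leftover samples")
```

Each matrix in `blocks` would then be scrambled with the Arnold transform (step 6) before being flattened and concatenated.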

5.2 Receiver side

The inverse process is performed here. Using the same chaotic key, the received signal is decrypted by removing the chaotic signal generated on the transmitter side. We then extract the watermark from the decrypted signal and compare it with the original to confirm that the signal is authentic and not degraded (Figs. 2 and 3).

1) Steps 4 and 5 of the encryption process are applied to the encrypted speech signal.
2) The inverse Arnold transform is then applied to each 2D matrix using the same key (key2) employed previously.
3) Reshape each retrieved matrix into a 1D vector of length N².
4) Concatenate the retrieved segments to produce $ Mx\_{Sg}_i^{\prime } $.
5) The second step of the encryption process is repeated with the same initial values.
6) The decrypted speech samples are separated according to the following:

$$ \mathrm{Wtr}\_{\mathrm{S}\mathrm{p}}_{\mathrm{i}}^{\prime }=\left\{\begin{array}{ll}\frac{ Mx\_{Sg}_i^{\prime }+1-\left(1-\mathrm{Tn}\_{\mathrm{S}}_{\mathrm{i}}\right)\mathrm{Lg}\_{\mathrm{S}}_{\mathrm{i}}}{\mathrm{Tn}\_{\mathrm{S}}_{\mathrm{i}}}, & Mx\_{Sg}_i^{\prime }<0\\ {}\frac{ Mx\_{Sg}_i^{\prime }-1-\left(1-\mathrm{Tn}\_{\mathrm{S}}_{\mathrm{i}}\right)\mathrm{Lg}\_{\mathrm{S}}_{\mathrm{i}}}{\mathrm{Tn}\_{\mathrm{S}}_{\mathrm{i}}}, & Mx\_{Sg}_i^{\prime }\ge 0\end{array}\right. $$

(2)

Where: $ \mathrm{Wtr}\_{\mathrm{Sp}}_{\mathrm{i}}^{\prime } $ is the decrypted speech signal and $ Mx\_{Sg}_i^{\prime } $ results from the fourth step.
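The separation step can be checked numerically with a short round trip. In this sketch the inverse is obtained by solving Eq. (1) for Wtr_Sp_i in each branch. The function names and sample values are hypothetical, and the tent values are assumed to be non-zero so that the division is defined.

```python
def mix_sample(w, t, l):
    """Eq. (1): mix watermarked-speech sample w with tent value t and logistic value l."""
    base = t * w + (1.0 - t) * l
    return base - 1.0 if w >= 0 else base + 1.0


def unmix_sample(m, t, l):
    """Recover w from a mixed sample m by solving Eq. (1) for w (requires t != 0)."""
    if t == 0:
        raise ValueError("tent value must be non-zero to invert the mixing")
    if m < 0:
        # Branch produced by w >= 0: m = t*w + (1 - t)*l - 1
        return (m + 1.0 - (1.0 - t) * l) / t
    # Branch produced by w < 0: m = t*w + (1 - t)*l + 1
    return (m - 1.0 - (1.0 - t) * l) / t


# Toy samples in [-1, 1] and chaotic values in (0, 1); illustrative only.
speech = [0.25, -0.6, 0.0, 0.9, -0.05]
tent = [0.7, 0.2, 0.55, 0.91, 0.33]
logistic = [0.4, 0.8, 0.1, 0.65, 0.97]

mixed = [mix_sample(w, t, l) for w, t, l in zip(speech, tent, logistic)]
recovered = [unmix_sample(m, t, l) for m, t, l in zip(mixed, tent, logistic)]
worst_error = max(abs(a - b) for a, b in zip(speech, recovered))
print("largest round-trip error:", worst_error)
```

The sign of the mixed sample identifies which branch of Eq. (1) produced it, because non-negative speech samples map to values at or below zero and negative ones map to positive values when the chaotic values lie in (0, 1).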

Up to this step the speech signal is decrypted but not yet authenticated. To verify that the speech signal is intact and was sent from an authentic source, the decryption process continues with these steps:

1) Extract the watermark embedded within the encrypted speech signal, using the same key (key1) in the extraction process presented in [24].
2) Authenticate the decrypted speech signal by checking the similarity between the extracted and original watermarks: the more similar the two are, the more trustworthy the decrypted speech.

6 Performance evaluation metrics

6.1 Correlation coefficient

The correlation coefficient, usually denoted by ‘r’, measures the strength of the linear relationship between two variables [22]. In our case the variables are the original, encrypted and decrypted speech signals. If two variables are strongly related, the correlation coefficient is close to 1. If the coefficient is close to 0, the two variables are unrelated and neither predicts the other.

The correlation coefficient ‘r’ is calculated with the following formula [6]:

$$ {r}_{S_1{S}_2}=\frac{\frac{1}{L}{\sum}_{i=1}^L\left({S}_{1,i}-E\left({S}_1\right)\right)\left({S}_{2,i}-E\left({S}_2\right)\right)}{\sqrt{\frac{1}{L}{\sum}_{i=1}^L{\left({S}_{1,i}-E\left({S}_1\right)\right)}^2}\times \sqrt{\frac{1}{L}{\sum}_{i=1}^L{\left({S}_{2,i}-E\left({S}_2\right)\right)}^2}},\kern1em \mathrm{where}\ E(S)=\frac{1}{L}{\sum}_{i=1}^L{S}_i $$

L is the length of the speech signals (number of samples), and S₁ and S₂ are the pair of signals being compared: (original, encrypted) or (original, decrypted).
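A direct transcription of this formula in plain Python might look as follows. The function name is hypothetical, and raising an error for constant signals (zero variance) is a choice made for this sketch.

```python
import math


def correlation_coefficient(s1, s2):
    """Correlation coefficient between two equal-length signals."""
    if len(s1) != len(s2) or not s1:
        raise ValueError("signals must be non-empty and of equal length")
    length = len(s1)
    mean1 = sum(s1) / length
    mean2 = sum(s2) / length
    covariance = sum((a - mean1) * (b - mean2) for a, b in zip(s1, s2)) / length
    variance1 = sum((a - mean1) ** 2 for a in s1) / length
    variance2 = sum((b - mean2) ** 2 for b in s2) / length
    if variance1 == 0 or variance2 == 0:
        raise ValueError("correlation is undefined for a constant signal")
    return covariance / (math.sqrt(variance1) * math.sqrt(variance2))


signal = [0.1, -0.4, 0.3, 0.8, -0.2]
scaled_copy = [2 * v + 3 for v in signal]   # linear relation with positive slope
inverted = [-v for v in signal]             # linear relation with negative slope
print(correlation_coefficient(signal, scaled_copy))
print(correlation_coefficient(signal, inverted))
```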

6.2 Signal to noise ratio (SNR)

To assess the performance of digital speech encryption schemes the SNR is calculated; it measures the noise content in the encrypted speech signal. The encryption stage aims to increase the noise content in the encrypted signal so as to minimize its information content, while the decryption stage aims to reduce the noise content in the decrypted signal. The signal-to-noise ratio quantifies how much the signal is corrupted by noise and is calculated by the equation below [8]:

$$ \mathrm{SNR}=10\times {\log}_{10}\frac{\sum_{\mathrm{i}=1}^{\mathrm{L}}{\mathrm{S}}_{1,\mathrm{i}}^2}{\sum_{\mathrm{i}=1}^{\mathrm{L}}{\left({\mathrm{S}}_{1,\mathrm{i}}-{\mathrm{S}}_{2,\mathrm{i}}\right)}^2} $$

S_{1,i} and S_{2,i} represent the i-th samples of the (original, ciphered) or (original, deciphered) speech signals, respectively, and L represents the length of the speech signals.
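The SNR formula translates directly into code. In this sketch the function name is hypothetical, and returning infinity for identical signals and rejecting an all-zero reference are choices made for the example.

```python
import math


def snr_db(reference, processed):
    """SNR in dB of `processed` relative to `reference`, per the formula above."""
    if len(reference) != len(processed) or not reference:
        raise ValueError("signals must be non-empty and of equal length")
    signal_power = sum(a * a for a in reference)
    noise_power = sum((a - b) ** 2 for a, b in zip(reference, processed))
    if signal_power == 0:
        raise ValueError("reference signal has zero energy")
    if noise_power == 0:
        return math.inf  # identical signals: no noise at all
    return 10.0 * math.log10(signal_power / noise_power)


original = [1.0, 1.0, 1.0, 1.0]
slightly_off = [0.9, 0.9, 0.9, 0.9]   # every sample off by 0.1
print(snr_db(original, slightly_off))
```

In the toy example the error energy is one hundredth of the signal energy, which corresponds to about 20 dB.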

6.3 Bit error rate (BER)

To authenticate that the received encrypted speech signal was sent from a trusted source, we examine the BER. The BER measures the similarity between the original and the extracted watermark images. A BER of zero means that the watermark is unaffected and the extraction is successful, which indicates that the received speech signal was sent from an authenticated source. The BER is expressed by the following formula [24]:

$$ \mathrm{BER}=\frac{{\mathrm{B}}_{\mathrm{ERR}}}{\mathrm{N}}\times 100\% $$

where B_ERR is the number of erroneous bits and N is the total number of bits (the size of the watermark).
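As a small worked example, the BER of a 16 × 16 binary watermark with a few flipped bits can be computed as below. The function name and the toy watermark pattern are assumptions for illustration.

```python
def bit_error_rate(original_bits, extracted_bits):
    """Percentage of positions where the extracted watermark differs from the original."""
    if len(original_bits) != len(extracted_bits) or not original_bits:
        raise ValueError("watermarks must be non-empty and of equal size")
    errors = sum(1 for a, b in zip(original_bits, extracted_bits) if a != b)
    return 100.0 * errors / len(original_bits)


# Toy 16 x 16 watermark stored as a flat list of 256 bits (illustrative pattern).
watermark = [(i // 16 + i % 16) % 2 for i in range(256)]

# Simulate an extraction with 4 erroneous bits.
extracted = watermark[:]
for position in (3, 77, 130, 255):
    extracted[position] ^= 1

print(bit_error_rate(watermark, extracted))
```

Four wrong bits out of 256 give 4 / 256 × 100% = 1.5625%.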

6.4 NSCR and UACI

In our proposal, the unified average changing intensity (UACI) and the number of sample change rate (NSCR) between two encrypted speech signals are computed to measure the degree of variation when the key is modified slightly, in other words to evaluate the key sensitivity. The NSCR and UACI of the two encrypted speech signals are calculated using the equations below [26]:

$$ {\displaystyle \begin{array}{c}\mathrm{NSCR}={\sum}_{\mathrm{i}=1}^{\mathrm{L}}\frac{{\mathrm{d}}_{\mathrm{i}}}{\mathrm{L}}\times 100\%,\kern0.5em \mathrm{where}\ {\mathrm{d}}_{\mathrm{i}}=\left\{\begin{array}{ll}1, & {\mathrm{S}}_{\mathrm{i},1}^{\prime}\ne {\mathrm{S}}_{\mathrm{i},2}^{\prime}\\ {}0, & \mathrm{otherwise}\end{array}\right.\\ {}\mathrm{UACI}=\frac{1}{\mathrm{L}}\left[{\sum}_{\mathrm{i}=1}^{\mathrm{L}}\frac{\left|{\mathrm{S}}_{\mathrm{i},1}^{\prime }-{\mathrm{S}}_{\mathrm{i},2}^{\prime }\right|}{\operatorname{Max}}\right]\end{array}} $$

$ {\mathrm{S}}_{\mathrm{i},1}^{\prime}\ \mathrm{and}\ {\mathrm{S}}_{\mathrm{i},2}^{\prime } $ are the i-th samples of the two speech signals obtained with a slight difference in the key.

L is the length of the speech vector.

Max depends on the sample representation [2, 15]: when each sample of a speech or audio signal takes an integer value in the range [0, 65,535], Max = 65,535; in the Matlab environment, digital speech is normalized to the range [−1, 1], so Max = 2.
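Both metrics can be sketched in a few lines. Two points in this sketch are assumptions: d_i is taken as 1 when the two samples differ, as the NSCR values near 100% discussed in the key sensitivity analysis imply, and the UACI uses the absolute sample difference, following the common definition. The function names are hypothetical, and Max defaults to 2 for signals normalized to [−1, 1].

```python
def nscr(signal_a, signal_b):
    """Percentage of sample positions at which the two signals differ (assumed d_i)."""
    if len(signal_a) != len(signal_b) or not signal_a:
        raise ValueError("signals must be non-empty and of equal length")
    changed = sum(1 for a, b in zip(signal_a, signal_b) if a != b)
    return 100.0 * changed / len(signal_a)


def uaci(signal_a, signal_b, max_range=2.0):
    """Mean absolute sample difference, normalized by Max (2 for signals in [-1, 1])."""
    if len(signal_a) != len(signal_b) or not signal_a:
        raise ValueError("signals must be non-empty and of equal length")
    total = sum(abs(a - b) / max_range for a, b in zip(signal_a, signal_b))
    return total / len(signal_a)


# Two toy signals standing in for outputs obtained with slightly different keys.
first = [0.0, 0.5, -0.5, 1.0]
second = [0.0, -0.5, -0.5, -1.0]
print(nscr(first, second), uaci(first, second))
```

In the toy example two of the four samples differ, and the normalized differences are 0.5 and 1.0, so the NSCR is 50% and the UACI is 0.375.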

7 Experimental results

In this part, we assess the proposed scheme experimentally using two PCs running 32-bit Windows 7 with dual-core CPUs: the first has 2 GB of RAM and runs MATLAB 7.1, and the second has 4 GB of RAM and runs MATLAB 8.1. We used the second machine to measure the elapsed execution time.

All experiments use 20 speech files including male and female voices with different durations. These mono voice samples are selected randomly, with 16 bits per sample. Table 1 lists the speech signals used, with their durations and gender identification, taken from the well-known TIMIT voice database. Table 2 presents the initial values of the two chaotic maps (tent and logistic) with which all results are obtained. The watermark image used (16 × 16 bits) is given in Fig. 4.

Table 1 The used speech signals with duration taken from the famous voices database TIMIT

Table 2 The initial statics data values of the two Chaotic maps

7.1 Key space

The total number of different keys that can be used in the encryption system is called the key space [13]. A good encryption system needs to offer a large key space, both to compensate for the degradation of the dynamics when implemented on a PC and to prevent attackers from decrypting the original data even after investing large amounts of resources and time [9]. Even without the logistic and tent maps, Arnold scrambling alone can give a wide key space: the number of possible permutations of the positions in an M × M matrix is (M × M)!. For example, for an 8 × 8 matrix, (8 × 8)! ≈ 1.27 × 10⁸⁹, and the number is far larger for matrices with sizes in the hundreds.

According to [17], for a cryptosystem to resist brute-force attack, the size of the key space should be larger than 2¹²⁸ (≈ 3.40 × 10³⁸). On this basis we conclude that the proposed cryptosystem can resist brute-force attack sufficiently for reliable practical use.
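The two magnitudes quoted above can be checked with exact integer arithmetic. This is only an arithmetic check of the figures in the text; the variable names are hypothetical.

```python
import math

# Number of ways to arrange the positions of an 8 x 8 matrix, as in the text.
arnold_positions = math.factorial(8 * 8)

# Brute-force resistance threshold quoted from [17].
threshold = 2 ** 128

print("(8 x 8)! has", len(str(arnold_positions)), "decimal digits")
print("2^128    has", len(str(threshold)), "decimal digits")
print("(8 x 8)! exceeds 2^128:", arnold_positions > threshold)
```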

7.2 Keys sensitivity analysis

All the initial values (a₀ and r) of the logistic map, (b₀ and u) of the tent map, and K of the Arnold scrambling are keys, so we tested the sensitivity of the encryption algorithm by changing one key or several keys. Table 3 shows the values of the correlation coefficient, NSCR and UACI obtained by encrypting one of the selected speech signals with one set of keys and then attempting to decrypt it using different sets of keys. To examine the importance of the keys, the speech signal is first encrypted, and the metrics mentioned previously are then calculated between the speech signals decrypted with the true keys and with wrong keys. The NSCR shows that the two decrypted speech signals differ in nearly 100% of their samples. The UACI confirms that the sample intensities of the decrypted speech signals diverge. Finally, the correlation coefficient confirms that the relationship between them is weak, especially when the Arnold key is changed. From these results we conclude that even small changes in the encryption key values during decryption lead to wrong decryption results.

Table 3 Keys Sensitivity

7.3 SNR and correlation coefficient

7.3.1 Encryption process effect

The encryption is considered better when the correlation coefficient is close to zero and when the SNR is low. From the data gathered in Table 4, we observe that the SNR values are very small and the correlation coefficient values are close to zero, some of them negative. This shows that the encrypted signal is very far from the original speech signal and indicates that the characteristics of the original signal are completely hidden.

Table 4 Numerical results of the Signal to Noise Ratio (SNR) and Correlation Coefficient (CC) between the original and the encrypted signals

7.3.2 Decryption process effect

The quality of the speech signal recovered from the encrypted signal is an essential characteristic; without it, the encryption process is of no use. We therefore assess the quality of the decrypted signal using the two previous metrics (correlation and SNR). Table 5 gives these metrics for all speech signals. The values are excellent: the smallest correlation coefficient is 0.99943, which is close to 1 and signifies that there is almost no difference between the original and the decrypted speech signals. The SNR values are also significant. The variations in SNR are due to differences in speech signal duration and energy (see Table 5). From this discussion, we conclude that the proposed scheme largely preserves the quality of the speech signal when it is decrypted.

Table 5 Numerical results of the Signal to Noise Ratio (SNR) and Correlation Coefficient (CC) between the original and the decrypted signals

7.4 Waveforms review

7.4.1 Original and encrypted speech signals

Waveform A in Figs. 5, 6, 7 and 8 shows the original speech signals SI770, SI839, SI943 and SI1217, respectively. Waveform B shows the encrypted signal; for clarity, the last waveform B is illustrated in two parts. These figures show no similarity between the original speech signal (A) and its encrypted version (B), which appears uniform and bears no relation to the variations of the original waveform (A).

7.4.2 Original and decrypted speech signals

The speech signals SI1715, SI2194, SI2303 and SX29 are shown in the first waveform of Figs. 9, 10, 11 and 12, respectively, and the decrypted speech signals are presented in the second waveform of the same figures. The third waveform illustrates the difference between the original speech signal and the decrypted one. Even on close inspection, the original and decrypted waveforms cannot be told apart; the difference is visible only in the difference waveform. This difference waveform has a very small amplitude (within about ±0.01), and from this tiny difference we conclude that the original and decrypted speech signals are very close to each other.

7.5 Watermark control and authentication

The proposed scheme adds a watermark to the original speech signal during encryption and extracts it during decryption. The purpose is to further enhance security and credibility: successfully extracting the watermark during decryption confirms that the received signal is authentic and was transmitted from an authenticated source without alteration in the transmission medium. Table 6 provides results when the speech signal is attacked by additive white Gaussian noise (AWGN). The data in this table show that the watermark is extracted successfully in the presence of small noise, but when the noise increases considerably it affects the watermark, which indicates that the speech signal is affected. This appears in the BER values, which imply that the transmitted encrypted signal has suffered an attack. The strength of the watermark embedding can be controlled, so the sensitivity of the watermark to attacks can be increased or decreased simply by varying the value of ∆.

Table 6 BER values variation after speech Signals AWGN attacks

Reversible watermarking is typically used for medical images: a watermark is inserted into the image, the watermarked image is transmitted, and the watermark must be completely removed on the recipient’s side so that the original image is restored unchanged. In our case, since the quality of the decrypted speech signal is acceptable and the SNR is greater than the required value (20 dB), the watermark removal does not need to be reversible. The only condition on the watermark is that it does not affect the encrypted speech signal.

7.6 Time complexity analysis

Table 7 presents the elapsed time to accomplish the encryption/decryption operations using the proposed scheme on some speech signals.

Table 7 Elapsed times for encryption/decryption operations

We observe from Table 7 that the time taken by the proposed algorithm to complete the encryption/decryption process is less than the duration of the speech signal (see Table 1), so the proposed scheme can work in real time. This is explained by its efficient, low-cost use of the computing machine's resources. Figure 13 shows the speech signal durations together with two curves in different colors representing the time taken by the two operations (encryption and decryption). The figure shows that execution time grows roughly in proportion to the length of the speech signal while remaining within real time.

7.7 Comparisons

The previous results confirm that the proposed scheme offers excellent and reliable results. To further substantiate that our design merits interest and may be considered among the best methods, we compare it with other recently published strong approaches.

Based on the results in Table 8, the proposed method is better than the other methods on many measures and very close on the others. The correlation coefficient between the original and the encrypted speech signal in the proposed approach ranks second in closeness to zero, just behind the method proposed in [26]. The remaining values are also close to zero, indicating good-quality encryption.

Table 8 Comparison between the proposed approach and seven published methods

The correlation coefficient between the original and the decrypted speech signal extracted at the receiver, however, is the best in the proposed scheme, with a value of one, which means that there is no measurable difference between the original and the decrypted speech signal. In addition, the proposed approach has the advantage of the value farthest from one compared with the other methods. The very robust methods presented in [12, 15] show high SNR values between the original and the decrypted speech signal; the proposed scheme comes next, with an SNR of 34.08 dB. The SNR between the original and the encrypted signal, however, is the smallest in the proposed scheme, which demonstrates that the encrypted signal is farther from the original speech signal than in the other methods.

8 Conclusion

In this work, we presented a novel scheme for securing speech signals using three approaches. First, a chaotic generator (tent and logistic maps) produces a pseudo-random vector from a few initial values, which is merged with the original speech samples. Second, a watermark is embedded inside the encrypted signal so that the decryption process can verify that the signal is authentic and has not undergone external attacks. Third, Arnold scrambling (cat map) disperses the signal samples with a secret key, and recovering the original signal from the samples is not achievable without this key. A larger key space is a measure of better encryption, and the correlation value obtained in the proposed scheme is close to zero, which shows that the original and encrypted signals are essentially uncorrelated. In addition, the original speech is recovered without affecting its quality.

References

Abdelfatah RI (2020) Audio encryption scheme using self-adaptive bit scrambling and two multi chaotic-based dynamic DNA computations. IEEE Access 8:69894–69907. https://doi.org/10.1109/ACCESS.2020.2987197
Al-Hooti M, Ahmad T, Djanali S (2019) Developing audio data hiding scheme using random sample bits with logical operators. Indonesian J Electr Eng Comput Sci. https://doi.org/10.11591/ijeecs.v13.i1.pp147-154
Dhar PK, Shimamura T (2015) Advances in audio watermarking based on singular value decomposition. Springer Briefs in Electrical and Computer Engineering. https://doi.org/10.1007/978-3-319-14800-7
Elsafty AH, Tolba MF, Said LA, Madian AH, Radwan AG (2020) Enhanced hardware implementation of a mixed-order nonlinear chaotic system and speech encryption application. Int J Electron Commun 125:153347. https://doi.org/10.1016/j.aeue.2020.153347
Fantacci R, Menci S, Micciullo L, Pierucci L (2009) A secure radio communication system based on an efficient speech watermarking approach. Security and Communication Networks 2:305–314. https://doi.org/10.1002/sec.70
Farsana FJ, Gopakumar AK (2016) A novel approach for speech encryption: Zaslavsky map as pseudo random number generator. 6th International Conference on Advances in Computing & Communications (Procedia Computer Science 2016). https://doi.org/10.1016/j.procs.2016.07.302
Farsana FJ, Gopakumar K (2020) Speech encryption algorithm based on nonorthogonal quantum state with hyperchaotic keystreams. Hindawi Advances in Mathematical Physics. https://doi.org/10.1155/2020/8050934
Farsana FJ, Devi VR, Gopakumar K (2020) An audio encryption scheme based on fast Walsh Hadamard transform and mixed chaotic keystreams. Appl Comput Inf. https://doi.org/10.1016/j.aci.2019.10.001
Huang CK, Nien HH (2009) Multi chaotic systems based pixel shuffle for image encryption. Opt Commun 282:2123–2127. https://doi.org/10.1016/j.optcom.2009.02.044
Joshi Subir Dr Amit M (2016) DWT-DCT based blind audio watermarking using Arnold scrambling and cyclic codes. 3rd International Conference on Signal Processing and Integrated Networks. https://doi.org/10.1109/SPIN.2016.7566666
Jovic B (2011) Chaotic Signals and Their Use in Secure Communications. In: Synchronization Techniques for Chaotic Communication Systems. Signals and Communication Technology. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21849-1_2
Kaur G, Singh K, Gill HS (2021) Chaos-based joint speech encryption scheme using SHA-1. Multimed Tools Appl. https://doi.org/10.1007/s11042-020-10223-x
Khanzadi H, Eshghi M, Borujeni SE (2014) Image encryption using random bit sequence based on chaotic maps. Arab J Sci Eng. https://doi.org/10.1007/s13369-013-0713-z
LalithaCh NV, JayaSree SRPVY (2013) DWT-Arnold transform based audio watermarking. IEEE Asia Pacific Conference on Postgraduate Research in Microelectronics and Electronics. https://doi.org/10.1109/PrimeAsia.2013.6731204
Lima JB, da Silva Neto EF (2015) Audio encryption based on the cosine number transform. Multimedia Tools Appl. https://doi.org/10.1007/s11042-015-2755-6
Lin Y, Abdulla WH (2015) Audio watermark: a comprehensive foundation using MATLAB. Springer, Cham. https://doi.org/10.1007/978-3-319-07974-5
Liu H, Zhao B, Huang L (2019) Quantum image encryption scheme using Arnold transform and S-box scrambling. Entropy. https://doi.org/10.3390/e21040343
Merrad A, Saadi S (2018) Blind speech watermarking using hybrid scheme based on DWT/DCT and sub-sampling. Multimed Tools Appl. https://doi.org/10.1007/s11042-018-5939-z
Merrad A, Saadi S, Benziane A, Hafaifa A (2018) Robust blind approach for digital speech watermarking. In: 2nd International Conference on Natural Language and Speech Processing. https://doi.org/10.1109/ICNLSP.2018.8374366
Mosa E, Messiha N, Zahran O, Abd El-Samie FE (2011) Chaotic encryption of speech signals. Int J Speech Technol. https://doi.org/10.1007/s10772-011-9103-7
Nematollahi MA, Al-Haddad SAR (2013) An overview of digital speech watermarking. Int J Speech Technol. https://doi.org/10.1007/s10772-013-9192-6
Ratner B (2009) The correlation coefficient: its values range between + 1 / − 1, or do they ? J Target Meas Anal Mark 17:139–142. https://doi.org/10.1057/jt.2009.5
Revathi A, Sasikaladevi N, Jeyalakshmi C (2018) Digital speech watermarking to enhance the security using speech as a biometric for person authentication. International Journal of Speech Technology 21:1021–1031. https://doi.org/10.1007/s10772-018-09563-9
Saadi S, Merrad A, Benziane A (2019) Novel secured scheme for blind audio/speech norm-space watermarking by Arnold algorithm. Signal Proc. https://doi.org/10.1016/j.sigpro.2018.08.011
Sathiyamurthi P, Ramakrishnan S (2017) Speech encryption using chaotic shift keying for secured speech communication. EURASIP Journal on Audio, Speech, and Music Processing. https://doi.org/10.1186/s13636-017-0118-0
Sathiyamurthi P, Ramakrishnan S (2020) Speech encryption algorithm using FFT and 3D-Lorenz–logistic chaotic map. Multimed Tools Appl. https://doi.org/10.1007/s11042-020-08729-5
Shah D, Shah T, Ahamad I, Haider MI, Khalid I (2021) A three-dimensional chaotic map and their applications to digital audio security. Multimed Tools Appl. https://doi.org/10.1007/s11042-021-10697-3
Sheu LJ (2011) A speech encryption using fractional chaotic systems. Nonlinear Dyn. https://doi.org/10.1007/s11071-010-9877-1
Wang B, Dong XC (2016) On the novel chaotic secure communication scheme design. Commun Nonlinear Sci Numer Simul 39:108–117. https://doi.org/10.1016/j.cnsns.2016.02.035

Author information

Authors and Affiliations

Faculty of Technology, Materials, Energetic Systems, Renewable Energies, and Energy Management Laboratory (LMSEERGE), Ammar Thelidji University of Laghouat, Laghouat, Algeria
Hadda Ouguissi & Mecheri Kious
Faculty of Exact Sciences & Informatics, Ziane Achour University of Djelfa (UZAD), Djelfa, Algeria
Slami Saadi & Ahmed Merrad

Corresponding author

Correspondence to Slami Saadi.

Ethics declarations

Conflict of interest/Competing interests

The authors declare no conflicts of interest.

Participants

There were no research participants other than the authors, all of whom consent.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Ouguissi, H., Saadi, S., Merrad, A. et al. Hybrid scheme for safe speech transmission based on multiple chaotic maps, watermarking and Arnold scrambling algorithm. Multimed Tools Appl 82, 327–346 (2023). https://doi.org/10.1007/s11042-022-13301-4

Received: 15 April 2021
Revised: 21 January 2022
Accepted: 30 May 2022
Published: 06 June 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s11042-022-13301-4
