A Voice Morphing Model Based on the Gaussian Mixture Model and Generative Topographic Mapping

Rassam, Murad A.; Almekhlafi, Rasha; Alosaily, Eman; Hassan, Haneen; Hassan, Reem; Saeed, Eman; Alqershi, Elham

doi:10.1007/978-3-030-33582-3_38

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1073)

Included in the following conference series:

International Conference of Reliable Information and Communication Technology

Abstract

In this paper, a new model for voice morphing is proposed. The spectral characteristics of a source speaker's speech are transformed so that the speech sounds as if it had been spoken by a designated target speaker. The proposed model performs phoneme segmentation of the voice signal and then transforms the spectral characteristics of each segment using a Linear Prediction model. The spectral features, extracted using Linear Predictive Coding (LPC), are aligned using Dynamic Time Warping (DTW). The Generative Topographic Mapping (GTM) method is used to model the LPC features, and the transformation is then achieved using the Gaussian Mixture Model (GMM). The transformed codebooks are finally converted to prediction coefficients, and the excitation signal is filtered in order to synthesize the speech. A correlation test performed between the source and target signals showed a high correlation. The results suggest that the proposed model is promising in terms of recognizing full sentences in addition to individual words.
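
The abstract names LPC feature extraction and DTW alignment without giving details, so the following is only a minimal sketch of those two steps, not the authors' implementation. It assumes the autocorrelation method with the Levinson-Durbin recursion for LPC and a Euclidean local distance for DTW; the function names (autocorrelation, lpc, dtw) and the toy inputs are hypothetical and chosen for illustration.

```python
import math


def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one frame."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns (a, err), where the predictor is
    x_hat[n] = sum(a[k - 1] * x[n - k] for k in 1..order)
    and err is the remaining prediction-error energy.
    """
    r = autocorrelation(frame, order)
    a = [0.0] * order
    err = r[0]
    for i in range(order):
        if err <= 0.0:
            break  # silent or perfectly predicted frame
        # Reflection coefficient for model order i + 1.
        k = (r[i + 1] - sum(a[j] * r[i - j] for j in range(i))) / err
        prev = a[:]
        a[i] = k
        for j in range(i):
            a[j] = prev[j] - k * prev[i - 1 - j]
        err *= 1.0 - k * k
    return a, err


def dtw(seq_a, seq_b):
    """Align two sequences of feature vectors; return (total_cost, path)."""
    def dist(u, v):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))

    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end to recover which frames were paired.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return cost[n][m], path


if __name__ == "__main__":
    # Toy frame: a decaying exponential, which a first-order predictor fits.
    coeffs, residual = lpc([0.9 ** n for n in range(200)], order=2)
    print("LPC coefficients:", coeffs)
    # Toy feature sequences of different lengths.
    total, pairs = dtw([[0.0], [1.0], [2.0]], [[0.0], [1.0], [1.0], [2.0]])
    print("DTW cost:", total, "path:", pairs)
```

In a pipeline like the one described, each phoneme segment would be framed, lpc would be applied per frame for source and target speakers, and dtw would pair the resulting frames so that a mapping (here, GTM and GMM) could be trained on aligned features. The frame length, model order, and distance measure used in the paper are not stated in the abstract.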

Author information

Authors and Affiliations

Information Technology Department, College of Computer, Qassim University, Buraidah, Kingdom of Saudi Arabia
Murad A. Rassam
Faculty of Engineering and Information Technology, Taiz University, 6803, Taiz, Yemen
Murad A. Rassam, Rasha Almekhlafi, Eman Alosaily, Haneen Hassan, Reem Hassan, Eman Saeed & Elham Alqershi

Corresponding author

Correspondence to Murad A. Rassam.

Editor information

Editors and Affiliations

College of Computer Science and Engineering, Taibah University, Medina, Saudi Arabia
Faisal Saeed
School of Computing, Universiti Utara Malaysia (UUM), Sintok, Kedah Darul Aman, Malaysia
Fathey Mohammed
Management of Information Systems Department, College of Business Administration, Taibah University, Yanbu, Saudi Arabia
Nadhmi Gazem

Cite this paper

Rassam, M.A. et al. (2020). A Voice Morphing Model Based on the Gaussian Mixture Model and Generative Topographic Mapping. In: Saeed, F., Mohammed, F., Gazem, N. (eds) Emerging Trends in Intelligent Computing and Informatics. IRICT 2019. Advances in Intelligent Systems and Computing, vol 1073. Springer, Cham. https://doi.org/10.1007/978-3-030-33582-3_38

DOI: https://doi.org/10.1007/978-3-030-33582-3_38
Published: 02 November 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-33581-6
Online ISBN: 978-3-030-33582-3
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)
