Abstract
With the growing development of machines capable of understanding complex behavior and solving problems that once required human effort, automatic text generation (ATG) has attracted wide attention. Language modeling, or text generation, is the task of predicting the next character or word in a sequence by analyzing the input data. ATG enables machines to write, reducing human effort, and it supports the understanding and analysis of languages by providing techniques that allow machines to exchange information in natural language. Text data are created everywhere at large scale (WhatsApp, Facebook, tweets, etc.) and are freely available online, so an effective system is needed to automate the text-generation process and to analyze text data for extracting meaningful information. In this work, a case study is presented on how to develop a text generation model for English using a hybrid recurrent neural network (HRNN). The explored model learns the dependencies between characters and the conditional probabilities of characters in sequences from the input text, and generates wholly new character sequences that resemble human writing (correct in meaning, spelling, and sentence structure). A comprehensive comparison among the models LSTM, deep LSTM, GRU, and HRNN is also presented. Previously, RNN models were used for text prediction and automatic text generation, but they suffer from the vanishing-gradient problem (short memory) when processing long texts; the GRU and LSTM models were created to solve this problem. However, the text generated by GRU and LSTM still contains many spelling errors and incorrect sentence structures; to fill this gap, the HRNN model is explored. The HRNN model is the combination of an LSTM layer, a GRU layer, and a dense layer. The experiments were performed on the Penn Treebank, Shakespeare, and Nietzsche datasets.
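The abstract describes the HRNN as an LSTM layer feeding a GRU layer feeding a dense output layer over the character vocabulary. A minimal character-level sketch of such a stack follows; the layer sizes, the PyTorch framing, and the class name are illustrative assumptions, not the authors' exact configuration:

```python
import torch
import torch.nn as nn

class HRNN(nn.Module):
    """Hybrid RNN sketch: embedding -> LSTM -> GRU -> dense layer.

    The dense layer emits logits over the character vocabulary, from which
    the conditional probability of the next character can be obtained with
    a softmax.
    """
    def __init__(self, vocab_size: int, embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.gru = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.dense = nn.Linear(hidden_dim, vocab_size)

    def forward(self, x):
        h, _ = self.lstm(self.embed(x))   # (batch, seq, hidden)
        h, _ = self.gru(h)                # (batch, seq, hidden)
        return self.dense(h)              # (batch, seq, vocab) logits

model = HRNN(vocab_size=50)
logits = model(torch.randint(0, 50, (2, 16)))  # batch of 2 sequences, 16 chars each
```

Trained with cross-entropy against the next character at each position, such a stack can then generate text autoregressively by sampling one character at a time.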
The HRNN model achieves a perplexity of 3.27, a bits-per-character (BPC) of 1.18, and an average word-prediction accuracy of 0.63. Compared with the baseline work and the previous models (LSTM, deep LSTM, and GRU), the perplexity and bits per character of our HRNN model are lower. The texts generated by the HRNN contain fewer spelling errors and sentence-structure mistakes. A closer analysis of the explored models' performance and efficiency is presented with graph plots and with texts generated from sample input strings; these graphs explain the performance of each model.
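Both perplexity and bits per character are monotone transforms of the model's average cross-entropy on held-out text, which is how they are typically computed for character-level language models. A small sketch of that relationship (the loss value below is a made-up placeholder, not a figure from this chapter):

```python
import math

# Average per-character cross-entropy in nats (placeholder value for illustration)
avg_nll_nats = 1.0

# Perplexity is the exponential of the average cross-entropy,
# and bits per character is the same quantity expressed in base 2.
perplexity = math.exp(avg_nll_nats)
bits_per_char = avg_nll_nats / math.log(2)
```

Lower values of either metric mean the model assigns higher probability to the held-out text.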
Abbreviations
- ATG: Auto text generation
- BRNN: Bidirectional recurrent neural networks
- ELIZA: First computer program for text generation
- GLOVE: Global vectors
- GRU: Gated recurrent unit
- HRNN: Hybrid recurrent neural network
- LDA: Latent Dirichlet allocation
- LM: Language modeling
- LSTM: Long short-term memory
- NN: Neural network
- PTB: Penn Treebank dataset
- NLP: Natural language processing
- RNN: Recurrent neural networks
- SLM: Statistical language modeling
- TG: Text generation
Copyright information
© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG
Samreen, Iqbal, M.J., Ahmad, I., Khan, S., Khan, R. (2021). Language Modeling and Text Generation Using Hybrid Recurrent Neural Network. In: Koubaa, A., Azar, A.T. (eds) Deep Learning for Unmanned Systems. Studies in Computational Intelligence, vol 984. Springer, Cham. https://doi.org/10.1007/978-3-030-77939-9_19
DOI: https://doi.org/10.1007/978-3-030-77939-9_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-77938-2
Online ISBN: 978-3-030-77939-9
eBook Packages: Intelligent Technologies and Robotics (R0)