
Language Modeling and Text Generation Using Hybrid Recurrent Neural Network

Chapter in: Deep Learning for Unmanned Systems

Part of the book series: Studies in Computational Intelligence (SCI, volume 984)

Abstract

With the growing development of machines that can understand complex behavior and reduce the need for human involvement, automatic text generation (ATG) has received wide attention. Language modeling, or text generation, is the task of predicting the next character or word in a sequence by analyzing the input data. ATG enables machines to write, reducing human effort, and it also supports the understanding and analysis of languages by providing techniques that let machines exchange information in natural language. Text data are created everywhere at large scale (WhatsApp, Facebook, tweets, etc.) and are freely available online, so an effective system is needed to automate the text generation process and to analyze text data for extracting meaningful information. In this work, a case study is presented on how to develop a text generation model for English using a hybrid recurrent neural network (HRNN). The explored model learns the dependencies between characters and the conditional probabilities of characters in sequences from the available input text, and it generates wholly new character sequences that resemble human writing (correct in meaning, spelling, and sentence structure). A comprehensive comparison of the models, namely LSTM, deep LSTM, GRU, and HRNN, is also presented. RNN models were previously used for text prediction and automatic text generation, but they suffer from the vanishing gradient problem (short memory) when processing long text; GRU and LSTM models were developed to address this. However, the text generated by GRU and LSTM still contains many spelling errors and incorrect sentence structures; to fill this gap, the HRNN model is explored. The HRNN model is a combination of an LSTM layer, a GRU layer, and a dense layer. Experiments were performed on the Penn Treebank, Shakespeare, and Nietzsche datasets. The HRNN model achieves a perplexity of 3.27, a bits-per-character value of 1.18, and an average word prediction accuracy of 0.63. Compared with the baseline work and previous models (LSTM, deep LSTM, and GRU), the HRNN has lower perplexity and bits per character, and the text it generates has fewer spelling and sentence-structure mistakes. A closer analysis of the explored models' performance and efficiency is presented with the help of graph plots and of texts generated from sample input strings.
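The chapter does not include code, but the architecture described above (an LSTM layer, a GRU layer, and a dense softmax output for character-level language modeling) can be sketched roughly as follows. This is a minimal, hypothetical Keras sketch: the sequence length, vocabulary size, and hidden-unit widths are assumptions for illustration, not the authors' reported configuration.

```python
# Hypothetical sketch of the HRNN idea described above: an LSTM layer, a GRU
# layer, and a dense softmax layer predicting the next character. The sizes
# below (sequence length, vocabulary, hidden units) are illustrative only.
from tensorflow import keras
from tensorflow.keras import layers

SEQ_LEN = 100     # characters of context per training example (assumed)
VOCAB_SIZE = 65   # distinct characters in the corpus (assumed)

model = keras.Sequential([
    keras.Input(shape=(SEQ_LEN, VOCAB_SIZE)),        # one-hot character windows
    layers.LSTM(256, return_sequences=True),         # captures long-range dependencies
    layers.GRU(256),                                  # lighter recurrent layer on top
    layers.Dense(VOCAB_SIZE, activation="softmax"),   # P(next character | context)
])
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()
```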
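Likewise, a minimal sampling loop illustrates how such a model could extend a seed string one character at a time, together with the usual conversions from an average cross-entropy loss to bits per character and perplexity. The lookup tables char_to_idx and idx_to_char and the temperature parameter are hypothetical helpers, and the chapter may compute its reported metrics with different conventions.

```python
# Minimal, hypothetical character-sampling loop for a model like the sketch
# above, plus common conversions from average cross-entropy to bits per
# character and perplexity. char_to_idx / idx_to_char are assumed lookup
# tables built from the training corpus.
import math
import numpy as np

SEQ_LEN = 100     # must match the trained model (assumed)
VOCAB_SIZE = 65   # must match the trained model (assumed)

def generate(model, seed, char_to_idx, idx_to_char, length=200, temperature=1.0):
    """Extend a seed string one predicted character at a time."""
    text = seed
    for _ in range(length):
        window = text[-SEQ_LEN:]
        x = np.zeros((1, SEQ_LEN, VOCAB_SIZE), dtype=np.float32)
        for t, ch in enumerate(window):               # right-align the one-hot window
            x[0, SEQ_LEN - len(window) + t, char_to_idx[ch]] = 1.0
        probs = model.predict(x, verbose=0)[0]
        logits = np.log(probs + 1e-9) / temperature   # temperature-scaled sampling
        probs = np.exp(logits) / np.sum(np.exp(logits))
        text += idx_to_char[int(np.random.choice(VOCAB_SIZE, p=probs))]
    return text

def bpc_and_perplexity(avg_loss_nats):
    # Common definitions: BPC = loss / ln 2 and perplexity = exp(loss) for a
    # per-character cross-entropy in nats (the chapter may use other conventions).
    return avg_loss_nats / math.log(2), math.exp(avg_loss_nats)
```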


Abbreviations

ATG: Auto text generation

BRNN: Bidirectional recurrent neural networks

ELIZA: First computer program for text generation

GLOVE: Global vectors for word representation

GRU: Gated recurrent unit

HRNN: Hybrid recurrent neural network

LDA: Latent Dirichlet allocation

LM: Language modeling

LSTM: Long short-term memory

NN: Neural network

PTB: Penn Treebank dataset

NLP: Natural language processing

RNN: Recurrent neural networks

SLM: Statistical language modeling

TG: Text generation



Copyright information

© 2021 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this chapter


Cite this chapter

Samreen, Iqbal, M.J., Ahmad, I., Khan, S., Khan, R. (2021). Language Modeling and Text Generation Using Hybrid Recurrent Neural Network. In: Koubaa, A., Azar, A.T. (eds) Deep Learning for Unmanned Systems. Studies in Computational Intelligence, vol 984. Springer, Cham. https://doi.org/10.1007/978-3-030-77939-9_19
