Abstract
Long short-term memory (LSTM) was designed to avoid the vanishing and exploding gradient problems of recurrent neural networks. Over the last twenty years, various modifications of the original LSTM cell have been proposed. This chapter gives an overview of basic LSTM cell structures and demonstrates forward and backward propagation within the most widely used configuration, the traditional LSTM cell. In addition, LSTM neural network configurations are described.
Chapter Highlights
- Long short-term memory (LSTM) is a special type of recurrent neural network (RNN).
- An LSTM unit has a memory cell and multiple weighted gates; therefore it does not suffer from the vanishing or exploding gradient problems of plain RNNs and can process sequences of arbitrary length.
- The original LSTM unit has no forget gate (NFG).
- Traditional LSTM configuration:
  $$ \begin{pmatrix} g_{t} \\ i_{t} \\ f_{t} \\ o_{t} \end{pmatrix} = \begin{pmatrix} \tanh \\ \sigma \\ \sigma \\ \sigma \end{pmatrix} \cdot \left( \begin{pmatrix} W^{(g)} & U^{(g)} \\ W^{(i)} & U^{(i)} \\ W^{(f)} & U^{(f)} \\ W^{(o)} & U^{(o)} \end{pmatrix} \cdot \begin{pmatrix} x_{t} \\ h_{t-1} \end{pmatrix} \right); $$
  $$ C_{t} = f_{t} \odot C_{t-1} + i_{t} \odot g_{t}; \qquad h_{t} = o_{t} \odot \tanh(C_{t}). $$
- Traditional LSTM with peephole connections is distinguished by its precise timing and is often referred to as 'Vanilla' LSTM.
- ConvLSTM is effective in spatiotemporal sequence problems.
- Updates in Phased LSTM occur at irregularly sampled time points \(t_{j}\), which can be controlled.
- Depending on the lengths of the input and output sequences, the following LSTM models are distinguished: 'One-to-One', 'One-to-Many', 'Many-to-One', 'Many-to-Many'.
- LSTM architectures can differ in directionality, dimensionality, and combinations of the two.
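The traditional LSTM update in the highlights above can be sketched as a single NumPy step. This is a minimal illustration, not the chapter's implementation: the bias terms `b` are an assumption (the compact matrix formula omits them), and the weights here are random placeholders rather than trained parameters.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, C_prev, W, U, b):
    """One forward step of a traditional LSTM cell.

    W, U, b are dicts keyed by gate name ('g', 'i', 'f', 'o'),
    mirroring the W^(g), U^(g), ... notation in the formula above.
    """
    g = np.tanh(W['g'] @ x + U['g'] @ h_prev + b['g'])   # candidate cell input
    i = sigmoid(W['i'] @ x + U['i'] @ h_prev + b['i'])   # input gate
    f = sigmoid(W['f'] @ x + U['f'] @ h_prev + b['f'])   # forget gate
    o = sigmoid(W['o'] @ x + U['o'] @ h_prev + b['o'])   # output gate
    C = f * C_prev + i * g        # new cell state (elementwise products)
    h = o * np.tanh(C)            # new hidden state
    return h, C

# Tiny usage example with random weights (input size 2, hidden size 3).
rng = np.random.default_rng(0)
n_h, n_x = 3, 2
W = {k: rng.standard_normal((n_h, n_x)) for k in 'gifo'}
U = {k: rng.standard_normal((n_h, n_h)) for k in 'gifo'}
b = {k: np.zeros(n_h) for k in 'gifo'}
h, C = lstm_step(rng.standard_normal(n_x), np.zeros(n_h), np.zeros(n_h), W, U, b)
```

Because the output gate and \(\tanh(C_t)\) are both bounded, every component of the resulting hidden state lies strictly inside \((-1, 1)\).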
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this chapter
Smagulova, K., James, A.P. (2020). Overview of Long Short-Term Memory Neural Networks. In: James, A. (eds) Deep Learning Classifiers with Memristive Networks. Modeling and Optimization in Science and Technologies, vol 14. Springer, Cham. https://doi.org/10.1007/978-3-030-14524-8_11
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14522-4
Online ISBN: 978-3-030-14524-8
eBook Packages: Intelligent Technologies and Robotics