ResNet: Solving Vanishing Gradient in Deep Networks

Borawar, Lokesh; Kaur, Ravinder

doi:10.1007/978-981-19-8825-7_21


Part of the book series: Lecture Notes in Networks and Systems (LNNS, volume 600)


Abstract

Training a neural network is easier when the number of layers is limited, but the situation changes rapidly as more layers are added and a deeper architecture is built. The vanishing gradient problem and growing complexity make deeper neural networks harder to train, and training becomes more time consuming and resource intensive. When residual blocks are added to a neural network, training becomes more effective even with a more complex architecture. Skip connections linked to the layers of the network improve the efficiency of the residual network (ResNet); without them, training was a time-consuming procedure. The implementation of residual networks, their operation, their formulae, and their solution to the vanishing gradient problem are the topics of this study. It is observed that with ResNet the model obtains good accuracy on an image recognition task and is easier to optimize. In this study, a ResNet with a depth of 34 layers, which is both denser than VGG nets and less complicated, is tested on the CIFAR-10 dataset. After 80 epochs of training, this architecture achieves error rates of up to 20% on the CIFAR-10 test set; more epochs can decrease the error further. The outcomes of ResNet and its corresponding convolutional network (ConvNet) without skip connections are compared. The findings indicate that ResNet offers higher accuracy but is more prone to overfitting. To improve accuracy, overfitting prevention techniques have been used, including stochastic augmentation of the training dataset and the addition of dropout layers to the network.
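The effect of a skip connection on gradient flow can be illustrated with a toy calculation. The sketch below is not the 34-layer convolutional model described in the abstract; it assumes a stack of scalar layers of the form tanh(w * x), with the weight 0.5, the input 0.5, and the function names all chosen purely for illustration. By the chain rule, the gradient at the input is the product of the per-layer derivatives. A plain layer contributes a factor below 1, while a residual layer x + tanh(w * x) contributes 1 plus that factor, so the product no longer shrinks with depth.

```python
import math


def plain_layer(x, w):
    """One scalar layer tanh(w * x); returns (output, d_output/d_input)."""
    y = math.tanh(w * x)
    return y, w * (1.0 - y * y)


def residual_layer(x, w):
    """The same layer wrapped in a skip connection: x + tanh(w * x)."""
    f, df = plain_layer(x, w)
    # The identity path adds 1 to the local derivative.
    return x + f, 1.0 + df


def input_gradient(layer, x, w, depth):
    """Chain rule through `depth` stacked layers: product of local derivatives."""
    grad = 1.0
    for _ in range(depth):
        x, d = layer(x, w)
        grad *= d
    return grad


# Illustrative values only; depth 34 mirrors the depth mentioned in the abstract.
plain_grad = input_gradient(plain_layer, 0.5, 0.5, 34)
residual_grad = input_gradient(residual_layer, 0.5, 0.5, 34)
print(f"plain stack:    {plain_grad:.3e}")
print(f"residual stack: {residual_grad:.3e}")
```

In this toy setting the plain stack's gradient is bounded above by 0.5 raised to the 34th power, whereas every factor in the residual stack is at least 1. Real networks use weight matrices and convolutions rather than scalars, so this only conveys the intuition behind the identity path.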



Author information

Authors and Affiliations

Department of Computer Science and Engineering, Chandigarh University, Mohali, Punjab, India
Lokesh Borawar & Ravinder Kaur


Corresponding author

Correspondence to Ravinder Kaur.

Editor information

Editors and Affiliations

SRM Institute of Science and Technology, Ghaziabad, Uttar Pradesh, India
Rajendra Prasad Mahapatra
Department of Computer Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, India
Sateesh K. Peddoju
Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Sudip Roy
SRM Institute of Science and Technology, Ghaziabad, Uttar Pradesh, India
Pritee Parwekar


Cite this paper

Borawar, L., Kaur, R. (2023). ResNet: Solving Vanishing Gradient in Deep Networks. In: Mahapatra, R.P., Peddoju, S.K., Roy, S., Parwekar, P. (eds) Proceedings of International Conference on Recent Trends in Computing. Lecture Notes in Networks and Systems, vol 600. Springer, Singapore. https://doi.org/10.1007/978-981-19-8825-7_21


DOI: https://doi.org/10.1007/978-981-19-8825-7_21
Published: 21 March 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-19-8824-0
Online ISBN: 978-981-19-8825-7
eBook Packages: Intelligent Technologies and Robotics (R0)
