Abstract
This work presents and discusses optimization methods for solving finite-sum minimization problems, which are pervasive in applications, including image processing. The procedures analyzed employ first-order models of the objective function and stochastic gradient approximations based on subsampling. Among the variety of methods in the literature, the focus is on selected algorithms that can be cast into two groups: algorithms using gradient estimates evaluated on samples of very small size, and algorithms combining gradient estimates with machinery from standard globally convergent optimization procedures. Neural networks and convolutional neural networks widely used for image processing tasks are considered, and an image classification problem is solved with some of the methods presented.
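As an illustrative sketch (not taken from the chapter), the subsampling idea for finite-sum minimization can be demonstrated on a simple least-squares objective: at each iteration, the full gradient over all N terms is replaced by an unbiased estimate computed on a small random sample. The problem, step size, and batch size below are assumptions chosen for the example.

```python
import numpy as np

# Minimal sketch of subsampled stochastic gradient descent on the finite sum
#   min_w (1/N) * sum_i (a_i^T w - b_i)^2,
# where each iteration estimates the gradient from a small random subsample.
rng = np.random.default_rng(0)

N, d = 1000, 5                       # number of terms, problem dimension
A = rng.standard_normal((N, d))
w_true = rng.standard_normal(d)
b = A @ w_true + 0.01 * rng.standard_normal(N)

def subsampled_gradient(w, batch_size):
    """Unbiased gradient estimate built from a random subsample of the N terms."""
    idx = rng.choice(N, size=batch_size, replace=False)
    residual = A[idx] @ w - b[idx]
    return 2.0 * A[idx].T @ residual / batch_size

w = np.zeros(d)
step = 0.01                          # fixed step size, for illustration only
for k in range(2000):
    w -= step * subsampled_gradient(w, batch_size=32)

print(np.linalg.norm(w - w_true))    # distance to the true parameters
```

Methods of the first group discussed in the chapter work with very small batch sizes throughout, while the second group couples such estimates with safeguards (line searches, trust regions, adaptive sample sizes) from classical globally convergent schemes.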
Acknowledgements
The financial support of INdAM-GNCS Projects 2019 and 2020 is gratefully acknowledged by the first and fourth authors. Thanks are due to the referee, whose comments improved the presentation of this paper.
Copyright information
© 2023 Springer Nature Switzerland AG
Cite this entry
Bellavia, S., Bianconcini, T., Krejić, N., Morini, B. (2023). Subsampled First-Order Optimization Methods with Applications in Imaging. In: Chen, K., Schönlieb, CB., Tai, XC., Younes, L. (eds) Handbook of Mathematical Models and Algorithms in Computer Vision and Imaging. Springer, Cham. https://doi.org/10.1007/978-3-030-98661-2_78
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-98660-5
Online ISBN: 978-3-030-98661-2
eBook Packages: Mathematics and Statistics; Reference Module Computer Science and Engineering