Abstract
We develop a unified model, known as MgNet, that simultaneously recovers some convolutional neural networks (CNNs) for image classification and multigrid (MG) methods for solving discretized partial differential equations (PDEs). The model is based on close connections that we have observed and uncovered between the CNN and MG methodologies. For example, the pooling operation and feature extraction in CNNs correspond directly to the restriction operation and iterative smoothers in MG, respectively. As the solution space is often the dual of the data space in PDEs, the analogous concepts of feature space and data space (which are dual to each other) are introduced for CNNs. With these connections and new concepts, the functions of the various convolution and pooling operations used in CNNs can be better understood within the unified model. As a result, we develop modified CNN models (with fewer weights and hyperparameters) that exhibit competitive, and sometimes better, performance than existing CNN models on the CIFAR-10 and CIFAR-100 data sets.
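The correspondence the abstract describes can be illustrated with a minimal single-channel NumPy sketch: a feature-extraction step read as an MG smoother, u ← u + relu(B ∗ relu(f − A ∗ u)), where f lives in the data space, u in the feature space, and A and B are convolution kernels, followed by average pooling acting as the restriction to a coarser grid. The specific kernel sizes, the random kernels, and the exact update form here are illustrative assumptions, not the paper's precise formulation.

```python
import numpy as np

def conv2d(x, k):
    """'Same' 2D correlation (zero padding) of a single-channel array x with a 3x3 kernel k."""
    H, W = x.shape
    xp = np.pad(x, 1)
    out = np.zeros_like(x)
    for i in range(3):
        for j in range(3):
            out += k[i, j] * xp[i:i + H, j:j + W]
    return out

def relu(x):
    return np.maximum(x, 0.0)

def smooth(u, f, A, B, steps=2):
    """Feature extraction read as an MG smoother: u <- u + relu(B * relu(f - A * u))."""
    for _ in range(steps):
        r = relu(f - conv2d(u, A))    # residual computed in the data space
        u = u + relu(conv2d(r, B))    # correction applied in the feature space
    return u

def restrict(x):
    """2x2 average pooling, playing the role of the MG restriction operator."""
    H, W = x.shape
    return x.reshape(H // 2, 2, W // 2, 2).mean(axis=(1, 3))

rng = np.random.default_rng(0)
f = rng.standard_normal((8, 8))        # "data" (image) on the fine grid
u = np.zeros_like(f)                   # "feature" initialized to zero
A = rng.standard_normal((3, 3)) * 0.1  # data-feature mapping kernel (learned in practice)
B = rng.standard_normal((3, 3)) * 0.1  # smoother kernel (learned in practice)

u = smooth(u, f, A, B)                 # extract features on the fine grid
u_c, f_c = restrict(u), restrict(f)    # pooling = restriction to the coarse grid
```

One MgNet-style level thus consists of a few smoothing (feature-extraction) steps followed by a restriction of both data and features, after which the same structure repeats on the coarser grid.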
Acknowledgements
The first author was supported by the Elite Program of Computational and Applied Mathematics for PhD Candidates of Peking University. The second author was supported in part by the National Science Foundation of USA (Grant No. DMS-1819157) and the US Department of Energy Office of Science, Office of Advanced Scientific Computing Research, Applied Mathematics Program (Grant No. DE-SC0014400). The authors thank Xiaodong Jia for his help with the numerical experiments.
He, J., Xu, J. MgNet: A unified framework of multigrid and convolutional neural network. Sci. China Math. 62, 1331–1354 (2019). https://doi.org/10.1007/s11425-019-9547-2