Abstract
We present a method for succinctly structuring neural networks that have a few thousand weights. Here, structuring means weight sharing: the weights in a network are divided into clusters, and weights within the same cluster are constrained to have the same value. Our method employs a newly developed weight sharing technique called bidirectional clustering of weights (BCW), together with second-order optimal criteria for both cluster merge and split. In our experiments on two artificial data sets, the BCW method found a succinct network structure from an original network having about two thousand weights, in both regression and classification problems.
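The weight-sharing constraint described in the abstract can be illustrated with a minimal sketch. This is not the BCW algorithm: the abstract does not give the second-order merge and split criteria, so the sketch simply sets each shared value to its cluster mean and merges two clusters chosen by hand. The helper names `share_weights` and `merge_clusters` and the toy weights are assumptions made for illustration only.

```python
import numpy as np

def share_weights(weights, labels):
    """Constrain weights in the same cluster to one shared value.

    Assumption: the shared value is the cluster mean. The paper uses
    second-order criteria, which are not reproduced here.
    """
    weights = np.asarray(weights, dtype=float)
    labels = np.asarray(labels)
    n_clusters = labels.max() + 1
    sums = np.bincount(labels, weights=weights, minlength=n_clusters)
    counts = np.bincount(labels, minlength=n_clusters)
    centers = sums / counts  # assumes every cluster id is used at least once
    return centers[labels], centers

def merge_clusters(labels, a, b):
    """Merge cluster b into cluster a, then relabel ids to 0..K-1."""
    labels = np.asarray(labels).copy()
    labels[labels == b] = a
    _, relabeled = np.unique(labels, return_inverse=True)
    return relabeled

# Toy example: five weights in three clusters (hypothetical values).
w = np.array([0.9, 1.1, -0.5, -0.4, 2.0])
labels = np.array([0, 0, 1, 1, 2])
shared, centers = share_weights(w, labels)

# One hand-picked merge step: cluster 2 is absorbed into cluster 0.
merged_labels = merge_clusters(labels, 0, 2)
merged_shared, merged_centers = share_weights(w, merged_labels)
```

After sharing, the network is described by one value per cluster plus the cluster assignment, which is what makes the structure succinct. A split step would go the other way, dividing one cluster into two.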
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Saito, K., Nakano, R. (2002). Structuring Neural Networks through Bidirectional Clustering of Weights. In: Lange, S., Satoh, K., Smith, C.H. (eds) Discovery Science. DS 2002. Lecture Notes in Computer Science, vol 2534. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36182-0_19
DOI: https://doi.org/10.1007/3-540-36182-0_19
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00188-1
Online ISBN: 978-3-540-36182-4
eBook Packages: Springer Book Archive