Abstract
Stochastic matrices are nonnegative matrices whose entries are discrete probabilities, with all or part of the entries summing to one. They are widely used in techniques such as Markov chains and probabilistic latent semantic analysis. Conventional multiplicative updates, which have been widely used for nonnegative learning, cannot accommodate the stochasticity constraint, and simply normalizing the nonnegative matrix at each step of the learning may adversely affect the convergence of the optimization algorithm. Here we discuss and compare two alternative ways of developing multiplicative update rules for stochastic matrices. The first reparameterizes the matrices before applying the multiplicative update principle; the second employs relaxation with Lagrangian multipliers so that the updates jointly optimize the objective and steer the estimate towards the constraint manifold. We compare the new methods against the conventional normalization approach on two applications: parameter estimation of hidden Markov models and information-theoretic clustering. Empirical studies on both synthetic and real-world datasets demonstrate that algorithms using the new methods perform more stably and efficiently than the conventional ones.
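To make the "conventional normalization approach" concrete, the following is a minimal NumPy sketch (not taken from the paper) of the baseline being critiqued: standard Lee–Seung multiplicative updates for least-squares NMF, with the stochasticity constraint enforced afterwards by renormalizing each column of W. The matrix sizes and iteration count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem: factorize nonnegative V ≈ W H, where the columns of W
# are constrained to be stochastic (nonnegative and summing to one).
V = rng.random((8, 5))
W = rng.random((8, 3))
H = rng.random((3, 5))
W /= W.sum(axis=0, keepdims=True)  # start on the constraint manifold

for _ in range(200):
    # Standard Lee–Seung multiplicative updates for squared Euclidean NMF;
    # the small epsilon guards against division by zero.
    H *= (W.T @ V) / (W.T @ W @ H + 1e-12)
    W *= (V @ H.T) / (W @ H @ H.T + 1e-12)
    # Conventional approach: project back onto the stochasticity
    # constraint by renormalizing each column after the update.
    W /= W.sum(axis=0, keepdims=True)

# Multiplicative updates preserve nonnegativity; normalization restores
# the sum-to-one constraint, but the extra projection step is what may
# interfere with the monotone convergence of the unconstrained updates.
print(W.sum(axis=0))
```

The paper's alternatives (reparameterization and Lagrangian relaxation) replace this update-then-normalize loop with update rules whose fixed points already lie on the constraint manifold.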
© 2013 Springer-Verlag Berlin Heidelberg
Cite this paper
Zhu, Z., Yang, Z., Oja, E. (2013). Multiplicative Updates for Learning with Stochastic Matrices. In: Kämäräinen, JK., Koskela, M. (eds) Image Analysis. SCIA 2013. Lecture Notes in Computer Science, vol 7944. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-38886-6_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-38885-9
Online ISBN: 978-3-642-38886-6
eBook Packages: Computer Science (R0)