Abstract
Most of the existing analysis methods for tensors (or multi-way arrays) only assume that tensors to be completed are of low rank. However, for example, when they are applied to tensor completion problems, their prediction accuracy tends to be significantly worse when only a limited number of entries are observed. In this paper, we propose to use relationships among data as auxiliary information in addition to the low-rank assumption to improve the quality of tensor decomposition. We introduce two regularization approaches using graph Laplacians induced from the relationships, one for moderately sparse cases and the other for extremely sparse cases. We also give present two kinds of iterative algorithms for approximate solutions: one based on an EM-like algorithms which is stable but not so scalable, and the other based on gradient-based optimization which is applicable to large scale datasets. Numerical experiments on tensor completion using synthetic and benchmark datasets show that the use of auxiliary information improves completion accuracy over the existing methods based only on the low-rank assumption, especially when observations are sparse.
Article PDF
Similar content being viewed by others
Explore related subjects
Discover the latest articles, news and stories from top researchers in related subjects.Avoid common mistakes on your manuscript.
References
Acar E, Dunlavy DM, Kolda TG, Mørup M (2010) Scalable tensor factorizations with missing data. In: Proceedings of the 2010 SIAM international conference on data mining, pp 701–712
Adams RP, Dahl GE, Murray I (2010) Incorporating side information into probabilistic matrix factorization using Gaussian processes. In: Grünwald P, Spirtes P (eds) Proceedings of the 26th conference on uncertainty in artificial intelligence, Catalina Island, California, pp 1–9
Banerjee A, Basu S, Merugu S (2007) Multi-way clustering on relation graphs. In: Proceedings of the 2007 SIAM international conference on data mining
Cai D, He X, Han J, Huang TS (2010) Graph regularized non-negative matrix factorization for data representation. IEEE Trans Pattern Anal Mach Intell 33(10): 2026–2038
Cai JF, Candes EJ, She Z (2010) A singular value thresholding algorithm for matrix completion. SIAM J Optim 20(4): 1956–1982
Candes EJ, Tao T (2010) The power of convex relaxation: near-optimal matrix completion. IEEE Trans Inf Theory 56(5): 2053–2080
Chu W, Ghahramani Z (2009) Probabilistic models for incomplete multi-dimensional arrays. In: Proceedings of the 12th international conference on artificial intelligence and statistics
Collins M, Dasgupta S, Schapire RE (2002) A generalization of principal components analysis to the exponential family. In: Dietterich TG, Becker S, Ghahramani Z (eds) Advances in neural information processing systems, vol 14. MIT Press, Cambridge
Dunlavy DM, Kolda TG, Acar E (2011) Temporal link prediction using matrix and tensor factorizations. ACM Trans Knowl Discov Data 5: 10–11027
Gandy S, Recht B, Yamada I (2010) Tensor completion and low-n-rank tensor recovery via convex optimization. Inverse Problems 27(2): 025010
Harshman RA (1970) Foundations of the PARAFAC procedure: models and conditions for an “explanatory” multi-modal factor analysis. UCLA Work Pap Phonetics 16(1): 84
Hayashi K, Takenouchi T, Shibata T, Kamiya Y, Kato D, Kunieda K, Yamada K, Ikeda K (2010) Exponential family tensor factorization for missing-values prediction and anomaly detection. In: Proceedings of the 10th IEEE international conference on data mining, pp 216 –225
Kashima H, Kato T, Yamanishi Y, Sugiyama M, Tsuda K (2009) Link propagation: a fast semi-supervised algorithm for link prediction. In: Proceedings of the 2009 SIAM international conference on data mining
Kolda TG, Bader BW (2009) Tensor decompositions and applications. SIAM Rev 51(3): 455–500
Li WJ, Yeung DY (2009) Relation regularized matrix factorization. In: Proceedings of the 21st international joint conference on artificial intelligence, pp 1126–1131
Lu Z, Agarwal D, Dhillon IS (2009) A spatio-temporal approach to collaborative filtering. In: Proceedings of the 3rd ACM conference on recommender systems, pp 13–20
Nishimori Y, Akaho S (2010) Learning algorithms utilizing quasi-geodesic flows on the Stiefel manifold. Neurocomputing 67: 106–135
Plumbley MD (2005) Geometrical methods for non-negative ICA: manifolds, Lie groups and toral subalgebras. Neurocomputing 67: 161–197
Porteous I, Asuncion A, Welling M (2010) Bayesian matrix factorization with side information and Dirichlet process mixtures. In: Proceedings of the 24th AAAI conference on artificial intelligence, pp 563–568
Rendle S, Thieme LS (2010) Pairwise interaction tensor factorization for personalized tag recommendation. In: Proceedings of the 3rd ACM international conference on web search and data mining, pp 81–90
Salakhutdinov R, Mnih A (2008) Probabilistic matrix factorization. In: Platt JC, Koller D, Singer Y, Roweis S (eds.) Advances in neural information processing systems, vol 20. MIT Press, Cambridge
Shashua A, Hazan T (2005) Non-negative tensor factorization with applications to statistics and computer vision. In: Proceedings of the 22nd international conference on machine learning, pp 792–799
Srebro N (2004) Learning with matrix factorizations. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge
Srebro N, Rennie J, Jaakkola T (2005) Maximum-margin matrix factorization. In: Advances in neural information processing systems, vol 17. MIT Press, Cambridge
Tomioka R, Suzuki T, Hayashi K, Kashima H (2012) Statistical performance of convex tensor decomposition. In: Advances in Neural Information Processing Systems, vol 24. MIT Press, Cambridge
Tucker L (1966) Some mathematical notes on three-mode factor analysis. Psychometrika 31(3): 279–311
Walczak B (2001) Dealing with missing data. Part I. Chemom Intell Lab Syst 58(1): 15–27
Yu K, Lafferty J, Zhu S, Gong Y (2009) Large-scale collaborative prediction using a nonparametric random effects model. In: Proceedings of the 26th international conference on machine learning, pp 1185–1192
Acknowledgments
The authors would like to thank the anonymous reviewers of ECML PKDD 2011 for their valuable comments and suggestions to improve the quality of the paper.
Open Access
This article is distributed under the terms of the Creative Commons Attribution License which permits any use, distribution, and reproduction in any medium, provided the original author(s) and the source are credited.
Author information
Authors and Affiliations
Corresponding author
Additional information
Responsible editor: Dimitrios Gunopulos, Donato Malerba and Michalis Vazirgiannis.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 2.0 International License (https://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
About this article
Cite this article
Narita, A., Hayashi, K., Tomioka, R. et al. Tensor factorization using auxiliary information. Data Min Knowl Disc 25, 298–324 (2012). https://doi.org/10.1007/s10618-012-0280-z
Received:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s10618-012-0280-z