Abstract
Background
Restricted Boltzmann machines (RBMs) have the universal power to model (binary) joint distributions. Moreover, owing to their constrained network structure, training RBMs presents fewer difficulties with approximation and inference than training general Boltzmann machines. However, little work has fully exploited the capacity of these models for analyzing cancer data, e.g., cancer genomic, transcriptomic, proteomic and epigenomic data. On the other hand, in cancer data analysis the number of features/predictors is usually much larger than the sample size, which is known as the “p ≫ N” problem and is also ubiquitous in other bioinformatics and computational biology fields. The “p ≫ N” problem makes the bias-variance trade-off even more critical when designing statistical learning methods. However, to date, few RBM models have been designed specifically to address this issue.
Methods
We propose a novel RBM model, called elastic restricted Boltzmann machines (eRBMs), which incorporates an elastic regularization term into the likelihood function to balance model complexity and sensitivity. Building on the classic contrastive divergence (CD) algorithm, we develop the elastic contrastive divergence (eCD) algorithm, which trains eRBMs efficiently.
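The update described above can be sketched as follows: a standard CD-1 step whose weight gradient is shrunk by the subgradient of an elastic (L1 + L2) penalty. This is a minimal illustrative sketch, not the authors' implementation; the function name `ecd_step`, the penalty weights `lam1`/`lam2`, and the learning rate are assumptions for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def ecd_step(W, b, c, v0, lr=0.1, lam1=1e-3, lam2=1e-3, rng=None):
    """One CD-1 update with an elastic (L1 + L2) penalty on the weights.

    W: (n_visible, n_hidden) weight matrix; b, c: visible/hidden biases;
    v0: (batch, n_visible) binary data batch.
    """
    rng = rng or np.random.default_rng(0)
    # Positive phase: hidden probabilities conditioned on the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step to get the model's reconstruction.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    n = v0.shape[0]
    # CD-1 gradient estimate minus the elastic penalty's subgradient:
    # lam1 * sign(W) for the L1 part, lam2 * W for the L2 part.
    dW = (v0.T @ ph0 - v1.T @ ph1) / n - lam1 * np.sign(W) - lam2 * W
    db = (v0 - v1).mean(axis=0)
    dc = (ph0 - ph1).mean(axis=0)
    return W + lr * dW, b + lr * db, c + lr * dc
```

The L1 term drives small weights toward zero (feature sparsity, useful when p ≫ N), while the L2 term keeps the remaining weights small and the updates stable.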
Results
We obtain several theoretical results on the rationality and properties of our model. We further evaluate its power on a challenging task: predicting dichotomized survival time from the molecular profiles of tumors. The test results show that eRBMs substantially outperform state-of-the-art methods in prediction.
Conclusions
The proposed eRBMs are capable of dealing with the “p ≫ N” problem and offer superior modeling performance over traditional methods. Our novel model is a promising method for future cancer data analysis.
Acknowledgments
This work was supported in part by the National Basic Research Program of China (Nos. 2011CBA00300 and 2011CBA00301), the National Natural Science Foundation of China (Nos. 61033001, 61361136003 and 61472205), China’s Youth 1000-Talent Program, and the Beijing Advanced Innovation Center for Structural Biology.
Cite this article
Zhang, S., Liang, M., Zhou, Z. et al. Elastic restricted Boltzmann machines for cancer data analysis. Quant Biol 5, 159–172 (2017). https://doi.org/10.1007/s40484-017-0092-7