Abstract
Big Data provides an opportunity to interrogate some of the deepest scientific mysteries, such as how the brain works, and to develop new technologies, such as driverless cars, which until very recently belonged more to the realm of science fiction than to reality. However, Big Data as an entity in its own right creates several computational and statistical challenges in algorithm, systems, and machine learning design that need to be addressed. In this paper, we survey the Big Data landscape and map out the hurdles that must be overcome and the opportunities that can be exploited in this paradigm-shifting phenomenon.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Agrawal, D., Chawla, S. (2015). The Big Data Landscape: Hurdles and Opportunities. In: Chu, W., Kikuchi, S., Bhalla, S. (eds) Databases in Networked Information Systems. DNIS 2015. Lecture Notes in Computer Science, vol 8999. Springer, Cham. https://doi.org/10.1007/978-3-319-16313-0_1
DOI: https://doi.org/10.1007/978-3-319-16313-0_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16312-3
Online ISBN: 978-3-319-16313-0
eBook Packages: Computer Science (R0)