Abstract
As a machine learning method, the extreme learning machine (ELM) offers fast learning speed and high accuracy. With the explosive growth of data volume, running machine learning algorithms on distributed computing platforms has become an unavoidable trend. Apache Flink is an open-source, stream-based distributed platform for massive data processing with good scalability, high throughput, and fault tolerance. In this paper, we first study the characteristics of ELM and of distributed computing platforms, and then propose a distributed ELM framework based on Flink (FL-ELM). We evaluate this framework with synthetic data on a 5-node distributed cluster. In summary, the advantages of the proposed framework are as follows: (1) the training speed of FL-ELM is always faster than that of the Spark-based implementation; (2) the scalability of FL-ELM is better than that of the Spark-based implementation.
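The abstract does not describe how FL-ELM distributes training, so the following is background only and not the authors' method. It is a minimal single-process sketch of basic ELM training in which the output weights are obtained from the sums of H^T H and H^T T accumulated partition by partition, which is one common way to make ELM training parallelizable. All function names, the sigmoid activation, the scalar targets, and the `ridge` regularization term are assumptions made for illustration.

```python
import math
import random


def random_hidden_params(n_inputs, n_hidden, seed=0):
    """Draw random input weights and biases; ELM never trains these."""
    rng = random.Random(seed)
    W = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    return W, b


def hidden_layer(X, W, b):
    """Sigmoid hidden-layer output matrix H, one row per sample."""
    return [
        [1.0 / (1.0 + math.exp(-(sum(wk * xk for wk, xk in zip(w, x)) + bj)))
         for w, bj in zip(W, b)]
        for x in X
    ]


def partial_sums(H, T):
    """Per-partition U = H^T H and V = H^T T for scalar targets T."""
    n = len(H[0])
    U = [[sum(h[i] * h[j] for h in H) for j in range(n)] for i in range(n)]
    V = [sum(h[i] * t for h, t in zip(H, T)) for i in range(n)]
    return U, V


def solve(A, y):
    """Solve A x = y by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [yi] for row, yi in zip(A, y)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def train_elm(partitions, W, b, ridge=1e-3):
    """Sum per-partition U and V, then solve (U + ridge * I) beta = V."""
    n = len(W)
    U = [[0.0] * n for _ in range(n)]
    V = [0.0] * n
    for X, T in partitions:  # each partition could be processed by a separate worker
        Up, Vp = partial_sums(hidden_layer(X, W, b), T)
        for i in range(n):
            V[i] += Vp[i]
            for j in range(n):
                U[i][j] += Up[i][j]
    for i in range(n):
        U[i][i] += ridge  # assumed regularization, keeps the system well posed
    return solve(U, V)


def predict(X, W, b, beta):
    """Network output: hidden-layer activations times output weights."""
    return [sum(hj * bj for hj, bj in zip(h, beta)) for h in hidden_layer(X, W, b)]
```

Because the per-partition sums simply add up, training on the whole dataset and training on any split of it give the same output weights up to floating-point rounding. That additivity is what makes this formulation a natural fit for a distributed platform.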
Acknowledgments
This research was partially supported by the National Key Research and Development Program of China under Grant No. 2018YFB1004402 and by the National Natural Science Foundation of China under Grant Nos. 61872072, U1401256, and 67132003.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ji, H., Wu, G., Wang, G. (2020). Accelerating ELM Training over Data Streams. In: Cao, J., Vong, C., Miche, Y., Lendasse, A. (eds) Proceedings of ELM 2018. ELM 2018. Proceedings in Adaptation, Learning and Optimization, vol 11. Springer, Cham. https://doi.org/10.1007/978-3-030-23307-5_20
DOI: https://doi.org/10.1007/978-3-030-23307-5_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-23306-8
Online ISBN: 978-3-030-23307-5
eBook Packages: Intelligent Technologies and Robotics (R0)