Abstract
Centralized learning solutions cannot meet the requirements of mining applications with massive training samples. This paper proposes a solution for distributed learning over massive XML documents. It consists of a MapReduce-based component that converts XML documents into a representation model in parallel, and a distributed learning component based on Extreme Learning Machine (ELM) for classification or clustering tasks. Within this framework, training samples are converted from raw XML datasets with better efficiency and information representation ability, and are then passed to distributed learning algorithms in the ELM feature space. Extensive experiments on massive XML document datasets verify the effectiveness and efficiency of the approach for both distributed classification and clustering.
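The conversion step described in the abstract can be pictured with a small, single-process sketch. It is not the representation model used in the paper, which the abstract does not specify: it assumes a plain TF-IDF weighting over element text, ignores XML structure, and uses Python's built-in `map` and `functools.reduce` as stand-ins for the map and reduce phases of a MapReduce job. The function names (`map_document`, `reduce_document_frequency`, `to_tfidf`, `convert`) and the two-document corpus are hypothetical and exist only for illustration.

```python
import math
import xml.etree.ElementTree as ET
from collections import Counter
from functools import reduce


def map_document(doc_id, xml_text):
    """Map step: parse one XML document and emit (doc_id, term counts)."""
    root = ET.fromstring(xml_text)
    terms = Counter()
    for element in root.iter():
        # Assumption: only element text is used; tags and nesting are ignored.
        for token in (element.text or "").lower().split():
            terms[token] += 1
    return doc_id, terms


def reduce_document_frequency(df_so_far, mapped):
    """Reduce step: count, for each term, how many documents contain it."""
    _, terms = mapped
    df_so_far.update(terms.keys())  # each document contributes 1 per term
    return df_so_far


def to_tfidf(mapped, doc_freq, n_docs):
    """Second map step: weight each raw count by log(N / document frequency)."""
    doc_id, terms = mapped
    weights = {t: c * math.log(n_docs / doc_freq[t]) for t, c in terms.items()}
    return doc_id, weights


def convert(corpus):
    """Convert {doc_id: xml_text} into {doc_id: sparse TF-IDF vector}."""
    # In a real MapReduce job each map call would run on a separate worker;
    # the built-in map() stands in for that parallelism here.
    mapped = list(map(lambda item: map_document(*item), corpus.items()))
    doc_freq = reduce(reduce_document_frequency, mapped, Counter())
    return dict(to_tfidf(m, doc_freq, len(mapped)) for m in mapped)


# Hypothetical toy corpus.
corpus = {
    "d1": "<paper><title>extreme learning machine</title></paper>",
    "d2": "<paper><title>learning over xml</title><body>xml documents</body></paper>",
}
vectors = convert(corpus)
```

The resulting sparse vectors are the kind of training samples that a learning component could consume. A term that occurs in every document, such as "learning" here, receives weight zero, which is the usual behavior of this weighting.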
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Bi, X., Zhao, X., Wang, G., Zhang, Z., Chen, S. (2015). Distributed Learning over Massive XML Documents in ELM Feature Space. In: Cao, J., Mao, K., Cambria, E., Man, Z., Toh, KA. (eds) Proceedings of ELM-2014 Volume 2. Proceedings in Adaptation, Learning and Optimization, vol 4. Springer, Cham. https://doi.org/10.1007/978-3-319-14066-7_27
DOI: https://doi.org/10.1007/978-3-319-14066-7_27
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-14065-0
Online ISBN: 978-3-319-14066-7
eBook Packages: Engineering, Engineering (R0)