Abstract
Processing information at scale relies on crawler technology to collect data from the network. This paper studies the application of web crawler technology in the context of big data in order to provide suggestions for optimizing it. It first outlines the types and development trends of crawler technology, then examines in detail how crawlers are applied in practice in the big data field. In the method described, the program simulates a microblog login and captures the names of the objects of interest specified by the user. The program parses the critical path, performs a breadth-first traversal, matches the names of people who meet the specified conditions, and captures the related content. Finally, the program is further optimized and improved; the results show that a crawl takes only 12 s, one quarter of the time required by the traditional method. Based on these experimental results, this paper hopes to promote the application and popularization of crawler technology.
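The traversal and name-matching steps named in the abstract can be illustrated with a minimal Python sketch. It is not the author's program: the simulated microblog login and the critical-path parsing are left out, and the `crawl_breadth_first` function, the injected `fetch` callable, and the in-memory `SITE` dictionary are all hypothetical stand-ins chosen so the example runs without network access.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Set, Tuple

# Hypothetical page representation: (page text, outgoing links).
Page = Tuple[str, List[str]]


def crawl_breadth_first(
    start_url: str,
    fetch: Callable[[str], Page],
    names_of_interest: Iterable[str],
    max_pages: int = 100,
) -> Dict[str, List[str]]:
    """Visit pages level by level and record which pages mention each name."""
    names = list(names_of_interest)
    hits: Dict[str, List[str]] = {name: [] for name in names}
    seen: Set[str] = {start_url}   # URLs already queued, to avoid revisits
    queue = deque([start_url])     # FIFO queue gives breadth-first order
    visited = 0

    while queue and visited < max_pages:
        url = queue.popleft()
        text, links = fetch(url)
        visited += 1
        # Plain substring match; a real crawler would parse the page first.
        for name in names:
            if name in text:
                hits[name].append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return hits


# Toy in-memory "site" standing in for real microblog pages (assumption).
SITE: Dict[str, Page] = {
    "/home": ("Welcome", ["/a", "/b"]),
    "/a": ("Post by Alice", ["/c"]),
    "/b": ("Post by Bob", ["/c"]),
    "/c": ("Alice replies to Bob", []),
}

result = crawl_breadth_first("/home", SITE.__getitem__, ["Alice", "Bob"])
print(result)
```

Passing `fetch` as a parameter keeps the traversal logic separate from how pages are retrieved, so an authenticated HTTP session could be substituted for the dictionary lookup without changing the loop.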
Acknowledgements
This work was supported by the following projects: Anhui University Humanities and Social Sciences Key Research Project “Comparative Research on Factors Restricting Vocational College Students’ Learning Performance under the Background of Big Data” (SK2019A0920); Key Project of the University’s Excellent Young Talent Support Program “Network under the Background of Big Data Related Research on Reptile Technology” (gxyqZD2018131); and the 2020 Anhui Provincial Quality Engineering College Online Major Special Needs Projects “Innovative Research and Practice of ‘Five in One’ Online Teaching Mode” (2020zdxsjg029) and “Research on ‘Two Ends’ Resonance System of Online Teaching” (2020zdxsjg027).
Copyright information
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Guo, L. (2021). Web Crawler Technology Under the Background of Big Data. In: Abawajy, J., Choo, KK., Xu, Z., Atiquzzaman, M. (eds) 2020 International Conference on Applications and Techniques in Cyber Intelligence. ATCI 2020. Advances in Intelligent Systems and Computing, vol 1244. Springer, Cham. https://doi.org/10.1007/978-3-030-53980-1_71
DOI: https://doi.org/10.1007/978-3-030-53980-1_71
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-53979-5
Online ISBN: 978-3-030-53980-1
eBook Packages: Intelligent Technologies and Robotics (R0)