Web Crawler Technology Under the Background of Big Data

  • Conference paper
2020 International Conference on Applications and Techniques in Cyber Intelligence (ATCI 2020)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1244)

Abstract

Processing information at scale depends on crawler technology to collect data from the network. This paper studies the application of web crawler technology in the context of big data in order to offer suggestions for its optimization. It first describes the types and development trends of crawler technology, then examines in detail how crawlers are applied in practice in the big data field. In the method presented, the program simulates a microblog login and captures the names of the objects of interest set by a specified user. It parses the critical path, traverses breadth-first, matches the names of people who meet the specified conditions, and captures the related content. Finally, the program is further optimized and improved; the results show that capture takes only 12 s, one quarter of the time the traditional method needs. Based on these experimental results, the paper aims to promote the application and wider adoption of crawler technology.
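
The breadth traversal and name matching mentioned in the abstract can be illustrated with a minimal sketch. This is not the paper's program: the function `crawl_breadth_first`, the `FOLLOWS` graph, and the `li_` name condition are hypothetical, and an in-memory dictionary stands in for the pages a crawler would fetch after a simulated login.

```python
from collections import deque
from typing import Callable, Dict, Iterable, List


def crawl_breadth_first(
    start: str,
    get_links: Callable[[str], Iterable[str]],
    matches: Callable[[str], bool],
    max_pages: int = 100,
) -> List[str]:
    """Visit nodes level by level from `start`; return those that match."""
    queue = deque([start])
    seen = {start}          # nodes already queued, so each is visited once
    found: List[str] = []
    visited = 0
    while queue and visited < max_pages:
        page = queue.popleft()      # FIFO order gives breadth-first traversal
        visited += 1
        if matches(page):
            found.append(page)
        for link in get_links(page):
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return found


# Hypothetical follow graph (assumption): name -> names reachable from it.
FOLLOWS: Dict[str, List[str]] = {
    "user": ["li_wei", "zhang_min"],
    "li_wei": ["li_na", "user"],
    "zhang_min": ["li_na", "wang_fang"],
    "li_na": [],
    "wang_fang": [],
}

result = crawl_breadth_first(
    "user",
    get_links=lambda name: FOLLOWS.get(name, []),
    matches=lambda name: name.startswith("li_"),  # assumed matching condition
)
print(result)  # ['li_wei', 'li_na']
```

In a real crawler, `get_links` would fetch and parse a page using an authenticated session, and `max_pages` bounds the crawl so it cannot run indefinitely on a large graph.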

Acknowledgements

This work was supported by the following projects: the Anhui University Humanities and Social Sciences Key Research Project “Comparative Research on Factors Restricting Vocational College Students’ Learning Performance under the Background of Big Data” (SK2019A0920); the Key Project of the University’s Excellent Young Talent Support Program “Research on Web Crawler Technology under the Background of Big Data” (gxyqZD2018131); and the 2020 Anhui Provincial Quality Engineering College Online Major Special Needs Projects “Innovative Research and Practice of the ‘Five in One’ Online Teaching Mode” (2020zdxsjg029) and “Research on the ‘Two Ends’ Resonance System of Online Teaching” (2020zdxsjg027).

Author information

Corresponding author

Correspondence to Li Guo.

Copyright information

© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Switzerland AG

Cite this paper

Guo, L. (2021). Web Crawler Technology Under the Background of Big Data. In: Abawajy, J., Choo, KK., Xu, Z., Atiquzzaman, M. (eds) 2020 International Conference on Applications and Techniques in Cyber Intelligence. ATCI 2020. Advances in Intelligent Systems and Computing, vol 1244. Springer, Cham. https://doi.org/10.1007/978-3-030-53980-1_71
