The Advance and Performance Analysis of MapReduce

Han, Rongpei; Wang, Yiting

doi:10.1007/978-981-99-4554-2_20

Rongpei Han &
Yiting Wang

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 1063)

Included in the following conference series:

International Conference on Artificial Intelligence, Robotics, and Communication


Abstract

Cloud computing is valued for its high data reliability, low cost, and nearly unlimited storage. The MapReduce distributed computing model is prevalent in cloud computing projects. The model is built around two functions, Map and Reduce. The Map function, acting as the mapper, divides a job (such as a set of uploaded files) into many small tasks that are executed separately; the Reduce function, acting as the reducer, summarizes the results of those tasks. MapReduce is a scalable and fault-tolerant data processing tool that can process very large volumes of data in parallel on many low-end computing nodes. This paper implements a wordcount program on the MapReduce framework and tests it with different data-dividing methods and data sizes. Common faults of the MapReduce framework also emerged during the experiment. The paper proposes schemes to improve the efficiency of the framework: building an index, or using a machine learning model to alleviate data skew. It also recommends that an application system be a hybrid system in which different modules process different kinds of tasks.
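The Map, shuffle, and Reduce steps described in the abstract can be illustrated with a minimal single-process sketch of wordcount. This is not the authors' implementation, which is not shown on this page; the function names (`map_phase`, `partition`, `shuffle`, `reduce_phase`, `word_count`), the toy byte-sum partitioner, and the per-reducer load counter are all assumptions made for illustration. The load counter only hints at how uneven key distributions can produce the data skew the abstract mentions.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(chunk: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input split."""
    for word in chunk.lower().split():
        yield word, 1


def partition(word: str, num_reducers: int) -> int:
    """Pick a reducer for a key. Toy deterministic hash (an assumption)."""
    return sum(word.encode("utf-8")) % num_reducers


def shuffle(
    pairs: Iterable[Tuple[str, int]], num_reducers: int
) -> List[Dict[str, List[int]]]:
    """Shuffle: group values by key inside each reducer's partition."""
    partitions: List[Dict[str, List[int]]] = [
        defaultdict(list) for _ in range(num_reducers)
    ]
    for word, count in pairs:
        partitions[partition(word, num_reducers)][word].append(count)
    return partitions


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(
    chunks: List[str], num_reducers: int = 2
) -> Tuple[Dict[str, int], List[int]]:
    """Run map, shuffle, and reduce in sequence.

    Returns the word totals and the number of pairs each reducer
    received, so an uneven load (data skew) is visible.
    """
    pairs = [pair for chunk in chunks for pair in map_phase(chunk)]
    partitions = shuffle(pairs, num_reducers)
    totals: Dict[str, int] = {}
    loads: List[int] = []
    for grouped in partitions:
        loads.append(sum(len(values) for values in grouped.values()))
        totals.update(reduce_phase(grouped))
    return totals, loads


if __name__ == "__main__":
    totals, loads = word_count(["the cat sat", "the dog sat on the mat"])
    print(totals)
    print("pairs per reducer:", loads)
```

In a real cluster each call to `map_phase` and `reduce_phase` would run on a separate node; here they run sequentially so the data flow stays easy to follow.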

Rongpei Han and Yiting Wang contributed equally.



Author information

Authors and Affiliations

FedUni Information Engineering Institute, Hebei University of Science and Technology, Shijiazhuang, 050000, China
Rongpei Han
Chengdu University of Technology, Oxford Brookes University, Chengdu, 610000, China
Yiting Wang


Corresponding author

Correspondence to Rongpei Han.

Editor information

Editors and Affiliations

Faculty of Physical Sciences, CSIR-NPL, New Delhi, India
Sanjay Yadav
Dept. of Mechanical Engineering, National Institute of Technology Delhi, Delhi, India
Harish Kumar
Department of Mechanical Engineering, Indian Institute of Technology Indore, Indore, Madhya Pradesh, India
Pavan Kumar Kankar
Nanjing University, Nanjing, China
Wanyang Dai
Fuzhou Economic Technology development zone, Yango University, Fuzhou, China
Fenghua Huang


Cite this paper

Han, R., Wang, Y. (2023). The Advance and Performance Analysis of MapReduce. In: Yadav, S., Kumar, H., Kankar, P.K., Dai, W., Huang, F. (eds) Proceedings of 2nd International Conference on Artificial Intelligence, Robotics, and Communication. ICAIRC 2022. Lecture Notes in Electrical Engineering, vol 1063. Springer, Singapore. https://doi.org/10.1007/978-981-99-4554-2_20


DOI: https://doi.org/10.1007/978-981-99-4554-2_20
Published: 01 October 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-4553-5
Online ISBN: 978-981-99-4554-2
eBook Packages: Intelligent Technologies and Robotics (R0)
