Efficient Processing of Multi-way Joins Using MapReduce

Ding, Linlin; Liu, Siping; Liu, Yu; Liu, Aili; Song, Baoyan

doi:10.1007/978-3-662-46248-5_10


Part of the book series: Communications in Computer and Information Science (CCIS, volume 503)

Included in the following conference series:

International Conference of Young Computer Scientists, Engineers and Educators


Abstract

Multi-way joins are critical for many big data applications such as data mining and knowledge discovery. Although much research has been devoted to processing multi-way joins using MapReduce, several general problems remain, such as the transfer of large amounts of unpromising intermediate data and the lack of effective coordination mechanisms. This work proposes an efficient model for processing multi-way joins using MapReduce, named Sharing-Coordination-MapReduce (SC-MapReduce), which adds sharing and coordination functions. The SC-MapReduce model uses its sharing mechanism to filter out much of the unpromising intermediate data and optimizes the coordination of the multiple tasks in a multi-way join. Extensive experiments show that the proposed model is efficient, robust and scalable.
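As a rough illustration of what filtering unpromising intermediate data before the shuffle can mean in a multi-way join, the sketch below simulates a chain join R(a, b) ⋈ S(b, c) ⋈ T(c, d) in plain Python. It is not the SC-MapReduce algorithm, whose details are in the full chapter. The toy relations, the `shared_keys`, `map_phase`, `reduce_join` and `chain_join` helpers, and the use of in-memory key sets as the "shared" information are all assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical toy relations for the chain join R(a, b) |x| S(b, c) |x| T(c, d).
R = [(1, "x"), (2, "y"), (3, "z")]
S = [("x", 10), ("y", 20), ("w", 30)]
T = [(10, "p"), (40, "q")]


def shared_keys(rows, col):
    """Collect the distinct join-key values of one relation (an assumed 'shared' summary)."""
    return {row[col] for row in rows}


def map_phase(rows, tag, key_col, allowed):
    """Emit (key, (tag, row)) pairs, dropping rows whose key cannot join."""
    emitted = []
    for row in rows:
        if row[key_col] in allowed:  # filter before the simulated shuffle
            emitted.append((row[key_col], (tag, row)))
    return emitted


def reduce_join(pairs, left_tag):
    """Group pairs by key and combine every left row with every right row."""
    groups = defaultdict(lambda: ([], []))
    for key, (tag, row) in pairs:
        groups[key][0 if tag == left_tag else 1].append(row)
    out = []
    for left, right in groups.values():
        for l in left:
            for r in right:
                out.append(l + r[1:])  # right rows carry the join key in column 0
    return out


def chain_join(R, S, T):
    """Run two simulated join rounds and count the pairs that were shuffled."""
    # Drop S rows that can never reach T, using T's shared key set.
    t_keys = shared_keys(T, 0)
    s_live = [s for s in S if s[1] in t_keys]

    # Round 1: R |x| S on b, shuffling only keys present on both sides.
    b_keys = shared_keys(R, 1) & shared_keys(s_live, 0)
    pairs1 = map_phase(R, "R", 1, b_keys) + map_phase(s_live, "S", 0, b_keys)
    rs = reduce_join(pairs1, "R")

    # Round 2: (R |x| S) |x| T on c, with the same kind of filter.
    c_keys = shared_keys(rs, 2) & t_keys
    pairs2 = map_phase(rs, "RS", 2, c_keys) + map_phase(T, "T", 0, c_keys)
    return reduce_join(pairs2, "RS"), len(pairs1) + len(pairs2)


result, emitted = chain_join(R, S, T)
print(result, emitted)  # prints [(1, 'x', 10, 'p')] 4
```

In this toy run only four key-value pairs reach the simulated shuffle, because rows with no possible join partner are discarded in the map phase. A real deployment would need a compact, distributable summary of the keys rather than exact in-memory sets.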



Author information

Authors and Affiliations

School of Information, Liaoning University, Shenyang, Liaoning, P.R. China
Linlin Ding, Siping Liu, Yu Liu, Aili Liu & Baoyan Song


Editor information

Editors and Affiliations

Harbin Institute of Technology, Harbin, China
Hongzhi Wang & Wanxiang Che
School of Computer Science and Technology, Heilongjiang Institute of Technology, Harbin, China
Haoliang Qi & Zhongyuan Han
Northeast Forestry University, Harbin, China
Zhaowen Qiu
Heilongjiang Institute of Technology, Harbin, China
Leilei Kong
Harbin Engineering University, China
Junyu Lin
Zhongkeyunhai Company, Harbin, China
Zeguang Lu

About this paper

Cite this paper

Ding, L., Liu, S., Liu, Y., Liu, A., Song, B. (2015). Efficient Processing of Multi-way Joins Using MapReduce. In: Wang, H., et al. Intelligent Computation in Big Data Era. ICYCSEE 2015. Communications in Computer and Information Science, vol 503. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-46248-5_10


DOI: https://doi.org/10.1007/978-3-662-46248-5_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-46247-8
Online ISBN: 978-3-662-46248-5
eBook Packages: Computer Science, Computer Science (R0)
