Giraphx: Parallel Yet Serializable Large-Scale Graph Processing

Tasci, Serafettin; Demirbas, Murat

doi:10.1007/978-3-642-40047-6_47

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 8097)

Included in the following conference series:

European Conference on Parallel Processing

Abstract

Bulk Synchronous Parallelism (BSP) provides a good model for parallel processing of many large-scale graph applications. However, it is unsuitable or inefficient for graph applications that require coordination, such as graph coloring, subcoloring, and clustering. To address this problem, we present an efficient modification to the BSP model that implements serializability (sequential consistency) without reducing the highly parallel nature of BSP. Our modification bypasses the message queues in BSP and reads directly from the worker’s memory for internal vertex executions. To ensure serializability, coordination (implemented via dining philosophers or token ring) is performed only for border vertices partitioned across workers. We implement our modifications to BSP on Giraph, an open-source clone of Google’s Pregel. We show through a graph-coloring application that our modified framework, Giraphx, provides much better performance than implementing the application using dining philosophers over Giraph. In fact, Giraphx outperforms Giraph even for embarrassingly parallel applications that do not require coordination, e.g., PageRank.
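
The split between internal and border vertices described in the abstract can be illustrated with a small single-process sketch. This is not the Giraphx implementation and does not use the Giraph API. The names `smallest_free_color` and `color_graph`, the dictionary-based graph representation, and the two-phase schedule are assumptions made for illustration only. The sketch assumes that a border vertex is one with at least one neighbor on another worker, that internal vertices read neighbor state directly from memory, and that border vertices update only while their worker holds a token passed around a ring.

```python
def smallest_free_color(vertex, adjacency, colors):
    """Return the smallest color not used by any already-colored neighbor."""
    used = {colors[n] for n in adjacency[vertex] if n in colors}
    color = 0
    while color in used:
        color += 1
    return color


def color_graph(adjacency, partition):
    """Greedy coloring with token-ring coordination for border vertices.

    adjacency: dict mapping each vertex to the set of its neighbors.
    partition: dict mapping each vertex to a worker id (hypothetical layout).
    Workers are simulated sequentially in one process.
    """
    workers = sorted(set(partition.values()))

    # Assumption: a border vertex has at least one neighbor on another worker.
    border = {
        v for v, neighbors in adjacency.items()
        if any(partition[n] != partition[v] for n in neighbors)
    }

    colors = {}  # stands in for vertex state held in worker memory

    # Phase 1: internal vertices. All of their neighbors live on the same
    # worker, so the worker reads neighbor colors directly from memory
    # instead of waiting for messages. Internal vertices of different
    # workers are never adjacent, so real workers could run this phase
    # concurrently without conflicts.
    for worker in workers:
        for v in sorted(adjacency):
            if partition[v] == worker and v not in border:
                colors[v] = smallest_free_color(v, adjacency, colors)

    # Phase 2: border vertices. Only the worker holding the token updates
    # its border vertices, so no two neighboring border vertices on
    # different workers choose a color at the same time.
    for token_holder in workers:
        for v in sorted(border):
            if partition[v] == token_holder:
                colors[v] = smallest_free_color(v, adjacency, colors)

    return colors


# Usage: a 6-vertex ring split across two hypothetical workers.
ring = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
halves = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
ring_colors = color_graph(ring, halves)
print(ring_colors)
```

Because every vertex picks its color only after reading the committed colors of its neighbors, the result is equivalent to some sequential execution order, which is the property the coordination is meant to preserve. The dining-philosophers alternative mentioned in the abstract is not shown here.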

Author information

Authors and Affiliations

Computer Science & Engineering Department, University at Buffalo, SUNY, USA
Serafettin Tasci & Murat Demirbas

Editor information

Editors and Affiliations

German Research School for Simulation Sciences, RWTH Aachen, Schinkelstr. 2a, 52062, Aachen, Germany
Felix Wolf
Jülich Supercomputing Centre, Forschungszentrum Jülich GmbH, Station 22, 52425, Jülich, Germany
Bernd Mohr
Center for Computing and Communication, RWTH Aachen, Seffenter Weg 23, 52074, Aachen, Germany
Dieter an Mey

Cite this paper

Tasci, S., Demirbas, M. (2013). Giraphx: Parallel Yet Serializable Large-Scale Graph Processing. In: Wolf, F., Mohr, B., an Mey, D. (eds) Euro-Par 2013 Parallel Processing. Euro-Par 2013. Lecture Notes in Computer Science, vol 8097. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-40047-6_47

DOI: https://doi.org/10.1007/978-3-642-40047-6_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-40046-9
Online ISBN: 978-3-642-40047-6
eBook Packages: Computer Science, Computer Science (R0)
