Abstract
Collective communication in high-performance computing is traditionally implemented as a sequence of point-to-point communication operations. For example, in MPI a broadcast is often implemented using a linear or binomial tree algorithm. These algorithms are inherently unaware of any underlying network heterogeneity. Integrating topology awareness into the algorithms is the traditional way to address this heterogeneity, and it has been demonstrated to greatly optimize tree-based collectives. However, recent research in distributed computing shows that in highly heterogeneous networks an alternative class of collective algorithms - BitTorrent-based multicasts - has the potential to outperform topology-aware tree-based collective algorithms. In this work, we experimentally compare the performance of BitTorrent and tree-based large-message broadcast algorithms in a typical heterogeneous computational cluster. We address the following question: Can the dynamic data exchange in BitTorrent be faster than the static data distribution via trees even in the context of high-performance computing? We find that both classes of algorithms have a justification of use for different settings. While on single switch clusters linear tree algorithms are optimal, once multiple switches and a bottleneck link are introduced, BitTorrent broadcasts – which utilize the network in a more adaptive way – outperform the tree-based MPI implementations.
Chapter PDF
Similar content being viewed by others
References
Beaumont, O., Marchal, L., Robert, Y.: Broadcast trees for heterogeneous platforms. In: Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium, p. 80b (April 2005)
Bhat, P., Raghavendra, C., Prasanna, V.: Efficient collective communication in distributed heterogeneous systems. In: Proceedings of the 19th IEEE International Conference on Distributed Computing Systems, pp. 15–24 (1999)
den Burger, M.: High-throughput multicast communication for grid applications. Ph.D. thesis, Vrije Universiteit Amsterdam (2009)
den Burger, M., Kielmann, T.: Collective receiver-initiated multicast for grid applications. IEEE Transactions on Parallel and Distributed Systems 22(2), 231–244 (2011)
Cohen, B.: Incentives build robustness in BitTorrent (2003)
Dichev, K., Reid, F., Lastovetsky, A.: Efficient and reliable network tomography in heterogeneous networks using BitTorrent broadcasts and clustering algorithms. In: SC 2012 (2012)
Dichev, K., Rychkov, V.M., Lastovetsky, A.: Two Algorithms of Irregular Scatter/Gather Operations for Heterogeneous Platforms. In: Keller, R., Gabriel, E., Resch, M., Dongarra, J. (eds.) EuroMPI 2010. LNCS (LNAI), vol. 2536, pp. 289–293. Springer, Heidelberg (2010)
Gabriel, E., Fagg, G.E., Bosilca, G., Angskun, T., Dongarra, J.J., Squyres, J.M., Sahay, V., Kambadur, P., Barrett, B., Lumsdaine, A., Castain, R.H., Daniel, D.J., Graham, R.L., Woodall, T.S.: Open MPI: Goals, Concept, and Design of a Next Generation MPI Implementation. In: Kranzlmüller, D., Kacsuk, P., Dongarra, J. (eds.) EuroPVM/MPI 2004. LNCS, vol. 3241, pp. 97–104. Springer, Heidelberg (2004)
Hatta, J., Shibusawa, S.: Scheduling algorithms for efficient gather operations in distributed heterogeneous systems. In: Proc. Int Parallel Processing Workshops, pp. 173–180 (2000)
Izal, M., Urvoy-Keller, G., Biersack, E.W., Felber, P., Al Hamra, A., Garcés-Erice, L.: Dissecting BitTorrent: Five Months in a Torrent’s Lifetime. In: Barakat, C., Pratt, I. (eds.) PAM 2004. LNCS, vol. 3015, pp. 1–11. Springer, Heidelberg (2004), http://dx.doi.org/10.1007/978-3-540-24668-8_1
Keller, R., Bosilca, G., Fagg, G., Resch, M., Dongarra, J.: Implementation and Usage of the PERUSE-Interface in Open MPI. In: Mohr, B., Träff, J.L., Worringen, J., Dongarra, J. (eds.) PVM/MPI 2006. LNCS, vol. 4192, pp. 347–355. Springer, Heidelberg (2006), http://dx.doi.org/10.1007/11846802_48
Kielmann, T., Bal, H.E., Gorlatch, S.: Bandwidth-efficient collective communication for clustered wide area systems. In: Proc. 14th Int. Parallel and Distributed Processing Symp., IPDPS 2000, pp. 492–499 (2000)
Patarasuk, P., Faraj, A., Yuan, X.: Pipelined broadcast on Ethernet switched clusters. In: 20th International Parallel and Distributed Processing Symposium, IPDPS 2006, p. 10 (April 2006)
Rodriguez, P., Biersack, E.W.: Dynamic parallel access to replicated content in the Internet. IEEE/ACM Trans. Netw. 10, 455–465 (2002), http://dx.doi.org/10.1109/TNET.2002.801413
Wadsworth, D.M., Chen, Z.: Performance of MPI broadcast algorithms. In: Proc. IEEE Int. Symp. Parallel and Distributed Processing, IPDPS 2008, pp. 1–7 (2008)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Dichev, K., Lastovetsky, A. (2013). MPI vs. BitTorrent: Switching between Large-Message Broadcast Algorithms in the Presence of Bottleneck Links. In: Caragiannis, I., et al. Euro-Par 2012: Parallel Processing Workshops. Euro-Par 2012. Lecture Notes in Computer Science, vol 7640. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-36949-0_21
Download citation
DOI: https://doi.org/10.1007/978-3-642-36949-0_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-36948-3
Online ISBN: 978-3-642-36949-0
eBook Packages: Computer ScienceComputer Science (R0)