Abstract
This paper presents an analysis of a parallel formulation of depth-first search. At the heart of this parallel formulation is a dynamic work-distribution scheme that divides the work among the processors. The effectiveness of the parallel formulation is strongly influenced by the work-distribution scheme and the target architecture. We introduce the concept of an isoefficiency function to characterize the effectiveness of different architectures and work-distribution schemes. Many researchers have considered the ring architecture to be quite suitable for parallel depth-first search; our analytical and experimental results show that hypercube and shared-memory architectures are significantly better. The analysis of previously known work-distribution schemes motivated the design of substantially improved schemes for ring and shared-memory architectures. In particular, we present a work-distribution algorithm that guarantees close-to-optimal performance on a shared-memory architecture with an ω-network with message combining (e.g., the RP3). Much of the analysis presented in this paper applies to other parallel algorithms in which work is shared dynamically among processors (e.g., parallel divide-and-conquer algorithms). The concept of isoefficiency is useful in characterizing the scalability of a variety of parallel algorithms.
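The kind of dynamic work distribution described in the abstract can be illustrated with a small sketch. The following Python fragment is hypothetical (threads stand in for processors, and a lock-protected shared stack stands in for a shared-memory work pool); it is not the authors' algorithm, but it shows the essential pattern: every worker draws nodes from a common pool, expands them, and returns newly generated children to the pool, so that work generated by one processor is picked up by whichever processor next falls idle.

```python
import threading

def parallel_dfs_count(num_workers=4, max_depth=6, fanout=3):
    """Count the nodes of an implicit complete tree by parallel DFS.

    Illustrative only: threads model processors, and the shared
    lock-protected stack models a shared-memory work pool from which
    idle workers dynamically obtain work.
    """
    stack = [0]          # each work item is a node, encoded by its depth
    outstanding = [1]    # nodes generated but not yet expanded
    visited = [0]
    cv = threading.Condition()

    def worker():
        while True:
            with cv:
                # Idle workers wait until work appears or the search ends.
                while not stack and outstanding[0] > 0:
                    cv.wait()
                if outstanding[0] == 0:      # global termination
                    cv.notify_all()
                    return
                depth = stack.pop()
            kids = fanout if depth < max_depth else 0
            with cv:
                visited[0] += 1
                stack.extend([depth + 1] * kids)
                outstanding[0] += kids - 1   # one node expanded, kids added
                cv.notify_all()              # wake idle workers

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return visited[0]
```

For fanout 3 and depth 6 the tree has (3^7 - 1)/2 = 1093 nodes, and the search visits each exactly once regardless of how the work happens to be divided among the workers. The `outstanding` counter is a simple termination-detection device: the search is over exactly when no node remains unexpanded.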
Additional information
This work was supported by Army Research Office Grant No. DAAG29-84-K-0060 to the Artificial Intelligence Laboratory, and Office of Naval Research Grant N00014-86-K-0763 to the Computer Science Department at the University of Texas at Austin.
Kumar, V., Rao, V.N. Parallel depth first search. Part II. Analysis. Int J Parallel Prog 16, 501–519 (1987). https://doi.org/10.1007/BF01389001