Abstract
Current GPU computational power enables the execution of complex, parallel algorithms, such as ray tracing supported by kD-trees for rendering 3D scenes in real time. This work describes in detail the study and implementation of eight kD-tree traversal algorithms using the NVIDIA Compute Unified Device Architecture (CUDA) parallel framework, in order to point out their pros and cons regarding performance, memory consumption, branch divergence, and scalability across multiple GPUs. In addition, based on this analysis, the authors propose two new algorithms aimed at improving performance. Both achieve speedups of up to 3× over recent, optimized parallel traversal implementations. As a consequence, interactive frame rates are possible for scenes with 3.6 million primitives rendered at a resolution of 1,408 × 768 pixels.
Cite this article
Santos, A., Teixeira, J.M., Farias, T. et al. Understanding the Efficiency of kD-tree Ray-Traversal Techniques over a GPGPU Architecture. Int J Parallel Prog 40, 331–352 (2012). https://doi.org/10.1007/s10766-011-0186-1
DOI: https://doi.org/10.1007/s10766-011-0186-1