Abstract
In this paper, we propose a method for computing the convolution of large 3-D images composed of real-valued signals. The convolution is performed in the frequency domain using the convolution theorem. Owing to the properties of real signals, the algorithm can be optimized so that both the computation time and the memory consumption are halved compared to complex signals of the same size. The convolution is decomposed in the frequency domain using the decimation-in-frequency (DIF) algorithm. The algorithm is accelerated on graphics hardware by means of the CUDA parallel computing model, achieving up to a 10× speedup with a single GPU over an optimized implementation on a quad-core CPU.
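The decomposition idea can be illustrated with a minimal 1-D sketch in plain Python. It is not the authors' CUDA implementation: the function names (`dif_split`, `convolve_dif`, `is_hermitian`) are hypothetical, a naive O(N²) DFT stands in for an FFT, and the convolution is assumed to be circular over even-length real inputs. The sketch splits each signal into the two half-size parts used by one DIF step, convolves the parts independently through the convolution theorem, and recombines them. The helper `is_hermitian` checks the spectral symmetry of real signals, which is the property that lets about half of the spectrum be skipped.

```python
import cmath


def dft(x):
    """Naive O(N^2) forward DFT (stand-in for an FFT)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive O(N^2) inverse DFT."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def is_hermitian(spectrum, tol=1e-9):
    """True if X[N-k] == conj(X[k]) for all k, as holds for real input."""
    n = len(spectrum)
    return all(abs(spectrum[k] - spectrum[-k % n].conjugate()) < tol
               for k in range(n))


def circular_convolve(f, g):
    """Direct circular convolution, used only as a reference."""
    n = len(f)
    return [sum(f[m] * g[(t - m) % n] for m in range(n)) for t in range(n)]


def dif_split(x):
    """One DIF step: the half-size DFTs of the two returned parts are the
    even-indexed and odd-indexed bins of the full DFT of x."""
    n = len(x)
    half = n // 2
    even = [x[t] + x[t + half] for t in range(half)]
    odd = [(x[t] - x[t + half]) * cmath.exp(-2j * cmath.pi * t / n)
           for t in range(half)]
    return even, odd


def convolve_half(a, b):
    """Convolution theorem on one half-size part."""
    return idft([p * q for p, q in zip(dft(a), dft(b))])


def convolve_dif(f, g):
    """Circular convolution of two real, even-length signals computed as
    two independent half-size frequency-domain convolutions."""
    n = len(f)
    if n != len(g) or n % 2 != 0:
        raise ValueError("inputs must have the same even length")
    half = n // 2
    f_even, f_odd = dif_split(f)
    g_even, g_odd = dif_split(g)
    h_even = convolve_half(f_even, g_even)  # equals h[t] + h[t + half]
    h_odd = convolve_half(f_odd, g_odd)     # equals (h[t] - h[t + half]) * twiddle
    out = [0.0] * n
    for t in range(half):
        untwiddled = h_odd[t] * cmath.exp(2j * cmath.pi * t / n)
        # Inputs are assumed real, so the imaginary parts are rounding noise.
        out[t] = ((h_even[t] + untwiddled) / 2).real
        out[t + half] = ((h_even[t] - untwiddled) / 2).real
    return out
```

Because the two half-size convolutions do not depend on each other, each part can be processed separately, which is what makes such a decomposition attractive when the full data set does not fit into device memory at once. In 3-D the same split would be applied along one axis of the volume.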
Keywords
- Graphic Process Unit
- Discrete Fourier Transform
- Point Spread Function
- Canny Edge Detection
- Graphic Process Unit Implementation
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Karas, P., Svoboda, D., Zemčík, P. (2012). GPU Optimization of Convolution for Large 3-D Real Images. In: Blanc-Talon, J., Philips, W., Popescu, D., Scheunders, P., Zemčík, P. (eds) Advanced Concepts for Intelligent Vision Systems. ACIVS 2012. Lecture Notes in Computer Science, vol 7517. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-33140-4_6
DOI: https://doi.org/10.1007/978-3-642-33140-4_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-33139-8
Online ISBN: 978-3-642-33140-4
eBook Packages: Computer Science (R0)