Optimizing the Exploitation of Multicore Processors and GPUs with OpenMP and OpenCL

Ferrer, Roger; Planas, Judit; Bellens, Pieter; Duran, Alejandro; Gonzalez, Marc; Martorell, Xavier; Badia, Rosa M.; Ayguade, Eduard; Labarta, Jesus

doi:10.1007/978-3-642-19595-2_15

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6548)

Included in the following conference series:

International Workshop on Languages and Compilers for Parallel Computing

Abstract

In this paper, we present OMPSs, a programming model based on OpenMP and StarSs that can also incorporate OpenCL or CUDA kernels. We evaluate the proposal on three different architectures (SMP, Cell/B.E., and GPUs), which shows how widely the approach applies. The evaluation uses four benchmarks: Matrix Multiply, BlackScholes, Perlin Noise, and Julia Set. We compare the results with those of the same benchmarks written in OpenCL and run on the same architectures. The results show that OMPSs greatly outperforms the OpenCL environment. It is also more flexible in exploiting multiple accelerators and, thanks to the simplicity of its annotations, it increases programmer productivity.
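To give a feel for the annotation-based, task-parallel style the abstract refers to, the following is a minimal sketch of a blocked Matrix Multiply. It is an assumption-laden illustration, not the authors' code: it uses the standard OpenMP 4.0 `task` construct with `depend` clauses as a stand-in for the paper's own annotations, it targets the host CPU only (no OpenCL or CUDA kernel is offloaded), and the names `tile_gemm` and `blocked_matmul` and the tiling scheme are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Accumulate one bs x bs tile: C_tile += A_tile * B_tile.
 * All matrices are row-major with leading dimension n. */
static void tile_gemm(const float *a, const float *b, float *c,
                      size_t n, size_t bs)
{
    for (size_t i = 0; i < bs; ++i)
        for (size_t k = 0; k < bs; ++k)
            for (size_t j = 0; j < bs; ++j)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}

/* Hypothetical driver: C += A * B for n x n matrices, one task per tile
 * product. The first element of each tile stands in for the whole tile
 * in the dependence clauses. */
void blocked_matmul(const float *A, const float *B, float *C,
                    size_t n, size_t bs)
{
    assert(bs > 0 && n % bs == 0); /* sketch assumes bs divides n */

    #pragma omp parallel
    #pragma omp single
    {
        for (size_t i = 0; i < n; i += bs)
            for (size_t j = 0; j < n; j += bs)
                for (size_t k = 0; k < n; k += bs) {
                    /* Tasks that update the same C tile are ordered by the
                     * inout dependence; tasks on different C tiles may run
                     * concurrently. */
                    #pragma omp task depend(in: A[i * n + k], B[k * n + j]) \
                                     depend(inout: C[i * n + j])
                    tile_gemm(&A[i * n + k], &B[k * n + j],
                              &C[i * n + j], n, bs);
                }
    } /* implicit barrier: all tasks have finished here */
}
```

Compiled without OpenMP support, the pragmas are ignored and the code runs sequentially with the same result, which is the property that makes this annotation style attractive: the data-flow information is layered on top of ordinary C rather than replacing it.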

Author information

Authors and Affiliations

Barcelona Supercomputing Center, Jordi Girona, 29, Barcelona, Spain
Roger Ferrer, Judit Planas, Pieter Bellens, Alejandro Duran, Marc Gonzalez, Xavier Martorell, Rosa M. Badia, Eduard Ayguade & Jesus Labarta
Departament d’Arquitectura de Computadors, Univ. Politècnica de Catalunya, Jordi Girona, 1–3, Barcelona, Spain
Marc Gonzalez, Xavier Martorell, Eduard Ayguade & Jesus Labarta
IIIA, Artificial Intelligence Research Institute, CSIC, Spanish National Research Council, Spain
Rosa M. Badia

Editor information

Editors and Affiliations

Department of Computer Science, Rice University, 6100 Main Street, 77005-1892, Houston, TX, USA
Keith Cooper, John Mellor-Crummey & Vivek Sarkar

About this paper

Cite this paper

Ferrer, R. et al. (2011). Optimizing the Exploitation of Multicore Processors and GPUs with OpenMP and OpenCL. In: Cooper, K., Mellor-Crummey, J., Sarkar, V. (eds) Languages and Compilers for Parallel Computing. LCPC 2010. Lecture Notes in Computer Science, vol 6548. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19595-2_15

DOI: https://doi.org/10.1007/978-3-642-19595-2_15
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19594-5
Online ISBN: 978-3-642-19595-2
eBook Packages: Computer Science, Computer Science (R0)
