Abstract
The Cell BE processor provides both scalable computational power and flexibility, and it is already being adopted for many computationally intensive applications in areas such as aerospace, defense, medical imaging and gaming. Despite its merits, it also presents many challenges: it is now widely known that programming the Cell BE efficiently is very difficult. Hence, the creation of an efficient software development framework is becoming the key challenge for this computational platform.
We have developed a novel software toolkit, called Cellflow, which enables developers to quickly build multi-task applications for Cell-based platforms. We support programmers from the initial stage of their work, through a development-time software infrastructure, to the final stage of application development, proposing a safe and easy-to-use explicit parallel programming model.
A fundamental component of the software toolkit is the off-line allocator and scheduler, which manages hardware resources while optimizing performance metrics such as execution time, allocation costs and power. The optimization engine receives as input a task graph representing an application, together with a description of the hardware resources, and produces an optimal allocation and schedule. We have developed various approaches, based either on decomposition [5] or on pure Constraint Programming; the latter is the core of this paper. We have identified instance features that guide the choice of the best solver for the instance at hand.
Experimental results show that Constraint Programming (possibly combined with Integer Programming) is a suitable tool for this kind of application, achieving very good performance.
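To make the input and output of the optimization engine concrete, the following is a minimal sketch of the allocation and scheduling problem on a toy instance. It is not the Constraint Programming model of the paper: it replaces the solver with exhaustive enumeration, minimizes only the makespan, and ignores communication and memory costs. The function name `best_allocation_and_schedule`, the task names and the durations are hypothetical and chosen only for illustration.

```python
from itertools import permutations, product


def best_allocation_and_schedule(durations, edges, n_procs):
    """Exhaustively search for a minimum-makespan allocation and schedule.

    durations: dict mapping task name -> execution time (assumed positive)
    edges:     list of (u, v) pairs meaning task u must finish before v starts
    n_procs:   number of identical processing elements (an assumption)
    Returns (makespan, allocation, start_times).
    """
    tasks = list(durations)
    preds = {t: [u for (u, v) in edges if v == t] for t in tasks}
    best = None
    for order in permutations(tasks):
        # Keep only orders that respect every precedence edge.
        pos = {t: i for i, t in enumerate(order)}
        if any(pos[u] > pos[v] for u, v in edges):
            continue
        for procs in product(range(n_procs), repeat=len(tasks)):
            alloc = dict(zip(tasks, procs))
            free = [0] * n_procs  # time at which each processor becomes idle
            start, end = {}, {}
            for t in order:
                # A task starts once its predecessors and its processor are done.
                ready = max((end[u] for u in preds[t]), default=0)
                start[t] = max(ready, free[alloc[t]])
                end[t] = start[t] + durations[t]
                free[alloc[t]] = end[t]
            makespan = max(end.values())
            if best is None or makespan < best[0]:
                best = (makespan, alloc, start)
    return best


# Toy task graph: a fork-join with four tasks.
durations = {"a": 2, "b": 3, "c": 3, "d": 1}
edges = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
makespan, alloc, start = best_allocation_and_schedule(durations, edges, n_procs=2)
print(makespan, alloc, start)
```

Enumeration is only viable for a handful of tasks, since the search space grows with the number of task orderings times the number of allocations. That growth is the reason a complete solver with propagation, or a decomposition into an allocation and a scheduling subproblem, is needed for realistic task graphs.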
References
Policella, N., Cesta, A., Oddi, A., Smith, S.F.: From precedence constraint posting to partial order schedules: A CSP approach to Robust Scheduling. AI Communications 20(3), 163–180 (2007)
Laborie, P.: Complete MCS-Based Search: Application to Resource Constrained Project Scheduling. In: Proc. of IJCAI 2005, pp. 181–186 (2005)
Benini, L., Bertozzi, D., Guerri, A., Milano, M.: Allocation and scheduling for MPSOCs via decomposition and no-good generation. In: van Beek, P. (ed.) CP 2005. LNCS, vol. 3709, pp. 107–121. Springer, Heidelberg (2005)
Benini, L., Bertozzi, D., Guerri, A., Milano, M.: Allocation, Scheduling and Voltage Scaling on Energy Aware MPSoCs. In: Beck, J.C., Smith, B.M. (eds.) CPAIOR 2006. LNCS, vol. 3990, pp. 44–58. Springer, Heidelberg (2006)
Benini, L., Lombardi, M., Mantovani, M., Milano, M., Ruggiero, M.: Multi-stage Benders Decomposition for Optimizing Multicore Architectures. In: Perron, L., Trick, M.A. (eds.) CPAIOR 2008. LNCS, vol. 5015, pp. 36–50. Springer, Heidelberg (2008)
Bockmayr, A., Pisaruk, N.: Detecting infeasibility and generating cuts for MIP using CP. In: Int. Workshop Integration AI OR Techniques Constraint Programming Combin. Optim. Problems CP-AI-OR 2003, Montreal, Canada (2003)
Grossmann, I.E., Jain, V.: Algorithms for hybrid milp/cp models for a class of optimization problems. INFORMS Journal on Computing 13, 258–276 (2001)
Hooker, J.N., Ottosson, G.: Logic-based benders decomposition. Mathematical Programming 96, 33–60 (2003)
Hooker, J.N.: A hybrid method for planning and scheduling. In: Wallace, M. (ed.) CP 2004. LNCS, vol. 3258, pp. 305–316. Springer, Heidelberg (2004)
Hooker, J.N.: Planning and scheduling to minimize tardiness. In: van Beek, P. (ed.) CP 2005. LNCS, vol. 3709, pp. 314–327. Springer, Heidelberg (2005)
Sadykov, R., Wolsey, L.A.: Integer Programming and Constraint Programming in Solving a Multimachine Assignment Scheduling Problem with Deadlines and Release Dates. INFORMS Journal on Computing 18(2), 209–217 (2006)
IBM Cell Broadband Engine software development kit, http://www.alphaworks.ibm.com/tech/cellsw/download
Laborie, P.: Algorithms for propagating resource constraints in AI planning and scheduling: Existing approaches and new results. Journal of Artificial Intelligence 143, 151–188 (2003)
Bellens, P., Perez, J.M., Badia, R.M., Labarta, J.: Cellss: a programming model for the cell be architecture. In: SC 2006: Proceedings of the 2006 ACM/IEEE conference on Supercomputing, p. 86. ACM Press, New York (2006)
Chen, T., Raghavan, R., Dale, J., Iwata, E.: Cell broadband engine architecture and its first implementation. In: IBM White paper (2005)
Chatha, K.S., Vemuri, R.: Hardware-software partitioning and pipelined scheduling of transformative applications, vol. 10, pp. 193–208 (2002)
Fohler, G., Ramamritham, K.: Static scheduling of pipelined periodic tasks in distributed real-time systems. In: Procs. of the 9th EUROMICRO Workshop on Real-Time Systems - EUROMICRO-RTS 1997, Toledo, Spain, pp. 128–135. IEEE, Los Alamitos (1997)
Bakshi, S., Gajski, D.D.: A scheduling and pipelining algorithm for hardware/software systems. In: Proceedings of the 10th international symposium on System synthesis - ISSS 1997, Washington, DC, USA, pp. 113–118. IEEE Computer Society, Los Alamitos (1997)
Eichenberger, A., et al.: Optimizing compiler for the cell processor. In: PACT 2005: Proceedings of the 14th International Conference on Parallel Architectures and Compilation Techniques, Washington, DC, USA, pp. 161–172. IEEE Computer Society, Los Alamitos (2005)
Eichenberger, A.E., et al.: Using advanced compiler technology to exploit the performance of the Cell Broadband Engine™ architecture. IBM Syst. J. 45(1), 59–84 (2006)
Axelsson, J.: Architecture synthesis and partitioning of real-time synthesis: a comparison of 3 heuristic search strategies. In: Procs. of the 5th Intern. Workshop on Hardware/Software Codesign (CODES/CASHE 1997), Braunschweig, Germany, pp. 161–166. IEEE, Los Alamitos (1997)
Eles, P., Peng, Z., Kuchcinski, K., Doboli, A.: System level hardware/software partitioning based on simulated annealing and tabu search. Design Automation for Embedded Systems 2, 5–32 (1997)
Kodase, S., Wang, S., Gu, Z., Shin, K.: Improving scalability of task allocation and scheduling in large distributed real-time systems using shared buffers. In: Procs. of the 9th IEEE Real-Time and Embedded Technology and Applications Symposium (RTAS 2003), Toronto, Canada, pp. 181–188. IEEE, Los Alamitos (2003)
Eles, P., Peng, Z., Kuchcinski, K., Doboli, A., Pop, P.: Scheduling of conditional process graphs for the synthesis of embedded systems, Paris, France, pp. 132–139 (1998)
Kuchcinski, K., Szymanek, R.: A constructive algorithm for memory-aware task assignment and scheduling. In: Procs of the Ninth International Symposium on Hardware/Software Codesign - CODES 2001, Copenhagen, Denmark, pp. 147–152. ACM Press, New York (2001)
Kuchcinski, K.: Embedded system synthesis by timing constraint solving. IEEE Transactions on CAD 13, 537–551 (1994)
Flachs, B., et al.: A streaming processing unit for a cell processor. In: IEEE International Solid-State Circuits Conference, 2005 (ISSCC 2005). Digest of Technical Papers, pp. 134–135 (2005)
Hofstee, H.: Cell broadband engine architecture from 20,000 feet. In: IBM White paper (2005)
Kistler, M., Perrone, M., Petrini, F.: Cell multiprocessor communication network: Built for speed. IEEE Micro. 26(3), 10–23 (2006)
Maeda, S., Asano, S., Shimada, T., Awazu, K., Tago, H.: A real-time software platform for the cell processor. IEEE Micro. 25(5), 20–29 (2005)
Palazzari, P., Baldini, L., Coli, M.: Synthesis of pipelined systems for the contemporaneous execution of periodic and aperiodic tasks with hard real-time constraints. In: 18th International Parallel and Distributed Processing Symposium - IPDPS 2004, pp. 121–128 (2004)
Pham, D., et al.: The design and implementation of a first-generation cell processor. In: IEEE International Solid-State Circuits Conference ISSCC 2005, vol. 1, pp. 184–592 (2005)
Ohara, M., Inoue, H., Sohda, Y., Komatsu, H., Nakatani, T.: MPI microtask for programming the Cell Broadband Engine processor. IBM System Journal 45(1) (2006)
Zhang, D., Li, Q.J., Rabbah, R., Amarasinghe, S.: A Lightweight Streaming Layer for Multicore Execution. In: Proceedings of Workshop on Design, Architecture and Simulation of Chip Multi-Processors, dasCMP 2007 (2007)
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Benini, L., Lombardi, M., Milano, M., Ruggiero, M. (2008). A Constraint Programming Approach for Allocation and Scheduling on the CELL Broadband Engine. In: Stuckey, P.J. (eds) Principles and Practice of Constraint Programming. CP 2008. Lecture Notes in Computer Science, vol 5202. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-85958-1_2
DOI: https://doi.org/10.1007/978-3-540-85958-1_2
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-85957-4
Online ISBN: 978-3-540-85958-1
eBook Packages: Computer Science, Computer Science (R0)