Abstract
Verification has grown to dominate the cost of electronic system design, consuming about 60% of design effort. Among the available verification techniques, logic simulation remains the dominant one, so speeding it up yields substantial savings and shorter time-to-market. We parallelize logic simulation using Graphics Processing Units (GPUs). In the past, GPUs were special-purpose accelerators suitable only for conventional graphics applications. Newer generations of GPU architecture provide easier programmability and increased generality while maintaining the tremendous memory bandwidth and computational power of traditional GPUs. We develop a parallel cycle-based logic simulation algorithm that uses And-Inverter Graphs (AIGs) as the design representation. AIGs have proven to be an effective representation for various design automation applications, and we obtain similar benefits for speeding up logic simulation. We develop two clustering algorithms that partition the gates of a design into independent blocks. Our algorithms exploit the massively parallel GPU architecture, which features thousands of concurrent threads, fast memory, and memory coalescing. We demonstrate speedups of up to 5x and 21x on several benchmarks using our simulation system with the first and second clustering algorithms, respectively. Our work ultimately results in a significant reduction in the overall design cycle.
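As a rough illustration of why AIGs suit parallel cycle-based simulation, the sketch below evaluates a small combinational AIG level by level in plain Python. It is not the clustering or GPU algorithm of the paper: the function names (`levelize`, `simulate_cycle`), the AIGER-style literal encoding (literal = 2 × variable + inversion bit), the 64-pattern word width, and the XOR example circuit are all assumptions chosen for the example. The point it shows is that gates on the same level have no mutual dependencies, so each could in principle be handed to a separate GPU thread.

```python
from collections import defaultdict

MASK = (1 << 64) - 1  # assumption: each word carries 64 independent input patterns


def lit_value(values, lit):
    """Value of an AIGER-style literal: variable lit >> 1, inverted when lit & 1."""
    word = values[lit >> 1]
    return (~word & MASK) if (lit & 1) else word


def levelize(num_inputs, ands):
    """Group AND gates by logic level; gates sharing a level are independent."""
    level = {v: 0 for v in range(num_inputs + 1)}  # constant 0 and primary inputs
    groups = defaultdict(list)
    for out, a, b in ands:  # assumes gates are listed in topological order
        level[out] = 1 + max(level[a >> 1], level[b >> 1])
        groups[level[out]].append((out, a, b))
    return [groups[k] for k in sorted(groups)]


def simulate_cycle(num_inputs, ands, input_words):
    """Evaluate every AND gate once for one cycle and return all variable values."""
    values = {0: 0}  # variable 0 is the constant false
    for var, word in enumerate(input_words, start=1):
        values[var] = word & MASK
    for gates in levelize(num_inputs, ands):
        # The gates in this inner loop do not depend on one another;
        # a GPU version could assign one thread per gate (or per block of gates).
        for out, a, b in gates:
            values[out] = lit_value(values, a) & lit_value(values, b)
    return values


# Hypothetical example circuit: XOR of inputs x1 (literal 2) and x2 (literal 4).
# Each gate is (output variable, input literal, input literal).
XOR_ANDS = [
    (3, 2, 5),  # g3 = x1 AND NOT x2
    (4, 3, 4),  # g4 = NOT x1 AND x2
    (5, 7, 9),  # g5 = NOT g3 AND NOT g4
]
XOR_OUTPUT_LIT = 11  # NOT g5

values = simulate_cycle(2, XOR_ANDS, [0b1100, 0b1010])
xor_word = lit_value(values, XOR_OUTPUT_LIT)
```

Here the four low bits of `xor_word` hold the XOR result for four input patterns simulated at once; the sketch handles only combinational logic and leaves out latches, clustering, and memory layout.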
References
ABC web site. http://www.eecs.berkeley.edu/~alanmi/abc/
AIGER Format web site. http://fmv.jku.at/aiger/
Alpert C.J., Kahng A.B.: Recent directions in netlist partitioning: a survey. Int. VLSI J. 19(1–2), 1–81 (1995)
Bailey M.L., Briner J.V. Jr, Chamberlain R.D.: Parallel logic simulation of VLSI systems. ACM Comput. Surv. 26(3), 255–294 (1994)
Bergeron J.: Writing Testbenches—Functional Verification of HDL Models. Springer, Berlin (2003)
Brayton, R., Mishchenko, A.: ABC: an academic industrial-strength verification tool. In: Proceedings of the International Conference on Computer-Aided Verification (CAV) (2010)
Catanzaro, B., Keutzer, K., Su, B.Y.: Parallelizing CAD: a timely research agenda for EDA. In: Proceedings of the Design Automation Conference (DAC), pp. 12–17. ACM (2008)
Chatterjee, D., DeOrio, A., Bertacco, V.: Event-driven gate-level simulation with GP-GPUs. In: Proceedings of the Design Automation Conference (DAC), pp. 557–562 (2009)
Chatterjee, D., DeOrio, A., Bertacco, V.: GCS: High-performance gate-level simulation with GPGPUs. In: Proceedings of the Conference on Design Automation and Test in Europe (DATE), pp. 1332–1337 (2009)
Chatterjee S., Mishchenko A., Brayton R.K., Wang X., Kam T.: Reducing structural bias in technology mapping. IEEE TCAD 25(12), 2894–2903 (2006)
Che S., Boyer M., Meng J., Tarjan D., Sheaffer J.W., Skadron K.: A performance study of general-purpose applications on graphics processors using CUDA. J. Parallel Distrib. Comput. 68(10), 1370–1380 (2008)
Croix, J.F., Khatri, S.P.: Introduction to GPU programming for EDA. In: Proceedings of the International Conference on Computer Aided Design (ICCAD), pp. 276–280. ACM (2009)
NVIDIA CUDA web site. http://www.nvidia.com/CUDA
Deng, Y.S., Wang, B.D., Mu, S.: Taming irregular EDA applications on GPUs. In: Proceedings of the International Conference on Computer Aided Design (ICCAD), pp. 539–546. ACM (2009)
Hering, K.: A Parallel LCC Simulation System. In: Proceedings of the International Parallel and Distributed Processing Symposium (IPDPS) (2002)
Hering, K., Reilein, R., Trautmann, S.: Cone clustering principles for parallel logic simulation. In: International Workshop on Modeling, Analysis, and Simulation of Computer and Telecommunication Systems, pp. 93–100 (2002)
IWLS 2005 Benchmarks. http://www.iwls.org/iwls2005/benchmarks.html
Johannes, F.M.: Partitioning of VLSI circuits and systems. In: Proceedings of the Design Automation Conference (DAC), pp. 83–87. ACM (1996)
Meister G.: A survey on parallel logic simulation. Technical report, Department of Computer Engineering, University of Saarland, Saarland (1993)
Mishchenko, A., Brayton, R., Jang, S.: Global delay optimization using structural choices. In: FPGA ’10: Proceedings of the 18th Annual ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 181–184. ACM (2010)
Mishchenko, A., Chatterjee, S., Brayton, R.: DAG-aware AIG rewriting: a fresh look at combinational logic synthesis. In: Proceedings of the Design Automation Conference (DAC) (2006)
Nguyen H.: GPU Gems 3. Addison-Wesley Professional, Reading (2007)
OpenCL web site. http://www.khronos.org/opencl/
Opencores Benchmarks. http://www.opencores.org
Perinkulam, A.: Logic Simulation Using Graphics Processors. Master’s thesis, University of Massachusetts Amherst (2007)
Pfister, G.: The Yorktown simulation engine: introduction. In: Proceedings of the Design Automation Conference (DAC) (1982)
Sen, A., Aksanli, B., Bozkurt, M., Mert, M.: Parallel cycle based logic simulation using graphics processing units. In: Proceedings of the International Symposium on Parallel and Distributed Computing (ISPDC) (2010)
Zhu, Q., Kitchen, N., Kuehlmann, A., Sangiovanni-Vincentelli, A.: SAT sweeping with local observability don’t-cares. In: Proceedings of the Design Automation Conference (DAC) (2006)
Additional information
This article is an extended version of a conference paper that appeared at ISPDC 2010 [27].
Cite this article
Sen, A., Aksanli, B. & Bozkurt, M. Speeding Up Cycle Based Logic Simulation Using Graphics Processing Units. Int J Parallel Prog 39, 639–661 (2011). https://doi.org/10.1007/s10766-011-0164-7
DOI: https://doi.org/10.1007/s10766-011-0164-7