Coarse-Grained Reconfigurable Array (CGRA)

Li, Zhaoying; Wijerathne, Dhananjaya; Mitra, Tulika

doi:10.1007/978-981-15-6401-7_50-1

Abstract

Coarse-grained reconfigurable arrays (CGRAs) are a promising class of spatial accelerators that offer high performance, energy efficiency, and the flexibility to support a wide range of application domains. CGRAs can bridge the gap between efficient but inflexible domain-specific accelerators and flexible but inefficient general-purpose processors. A CGRA is essentially an array of word-level processing elements connected via an on-chip interconnect. Both the processing elements and the interconnect can be reconfigured every cycle according to the contents of the on-chip configuration memory. The compiler therefore needs to map the compute-intensive loop kernels of an application onto the CGRA in a spatio-temporal fashion by setting up the configuration memory. The simplicity and parallelism of the architecture, coupled with the efficacy of the compiler, enable the CGRA to reach the dual goal of hardware-like efficiency with software-like programmability. We present a comprehensive review of CGRAs, starting with the historical context, sketching the architectural landscape, and providing an extensive overview of the compilation approaches.
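
As a rough illustration of the per-cycle reconfiguration described in the abstract, the following Python sketch models a small mesh of processing elements driven by a configuration memory. It is a toy model under stated assumptions, not the architecture or compiler flow of the chapter: the names `OPS`, `is_reachable`, `simulate`, and `CONFIG`, the 4-neighbour mesh, and the one-register-per-PE model are all hypothetical choices made for this example. The hand-written `CONFIG` plays the role of a compiler-generated spatio-temporal mapping of the expression (a + b) * c.

```python
from typing import Callable, Dict, List, Tuple, Union

PE = Tuple[int, int]        # (row, col) position of a processing element
Source = Union[str, PE]     # external input name, or another PE's output register
# One configuration-memory entry: for each active PE, (operation, operand a, operand b).
Context = Dict[PE, Tuple[str, Source, Source]]

# Hypothetical word-level operations; a real CGRA defines its own operation set.
OPS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: a + b,
    "mul": lambda a, b: a * b,
    "pass": lambda a, b: a,
}


def is_reachable(dst: PE, src: PE) -> bool:
    """Assumed interconnect: a PE reads its own register or a 4-neighbour's."""
    return abs(dst[0] - src[0]) + abs(dst[1] - src[1]) <= 1


def simulate(config: List[Context], inputs: Dict[str, int], cycles: int) -> Dict[PE, int]:
    """Run the array for `cycles` cycles, cycling through the contexts in `config`."""
    regs: Dict[PE, int] = {}

    def read(dst: PE, src: Source, snapshot: Dict[PE, int]) -> int:
        if isinstance(src, str):
            return inputs[src]
        if not is_reachable(dst, src):
            raise ValueError(f"PE {dst} cannot read from PE {src}")
        return snapshot[src]

    for cycle in range(cycles):
        context = config[cycle % len(config)]   # configuration selected for this cycle
        snapshot = dict(regs)                   # PEs see values latched in the previous cycle
        for pe, (op, a, b) in context.items():
            regs[pe] = OPS[op](read(pe, a, snapshot), read(pe, b, snapshot))
    return regs


# Hand-written mapping of (a + b) * c: space is the PE coordinate, time is the list index.
CONFIG: List[Context] = [
    {(0, 0): ("add", "a", "b"), (0, 1): ("pass", "c", "c")},   # cycle 0
    {(0, 1): ("mul", (0, 0), (0, 1))},                          # cycle 1
]

result = simulate(CONFIG, {"a": 2, "b": 3, "c": 4}, cycles=2)
print(result[(0, 1)])  # (2 + 3) * 4 = 20
```

In this sketch the same PE at position (0, 1) performs a different operation in each cycle, which is the sense in which a mapping is both spatial (which PE) and temporal (which cycle). A real mapper would also have to handle loop pipelining, routing resources, and register pressure, none of which are modelled here.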

Acknowledgements

This work is partially supported by the National Research Foundation, Singapore, under its Competitive Research Programme Award NRF-CRP23-2019-0003.

Author information

Authors and Affiliations

National University of Singapore, Singapore, Singapore
Zhaoying Li, Dhananjaya Wijerathne & Tulika Mitra

Corresponding author

Correspondence to Tulika Mitra.

Editor information

Editors and Affiliations

School of Computer Science & Engineering, Nanyang Technological University, Singapore, Singapore
Anupam Chattopadhyay

Section Editor information

Computer Science, KAUST, 4700 King Abdullah University of Science and Technology, 23955, Thuwal, Saudi Arabia
Suhaib Fahmy

About this entry

Cite this entry

Li, Z., Wijerathne, D., Mitra, T. (2023). Coarse-Grained Reconfigurable Array (CGRA). In: Chattopadhyay, A. (eds) Handbook of Computer Architecture. Springer, Singapore. https://doi.org/10.1007/978-981-15-6401-7_50-1

DOI: https://doi.org/10.1007/978-981-15-6401-7_50-1
Published: 25 November 2022
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-6401-7
Online ISBN: 978-981-15-6401-7
eBook Packages: Springer Reference Engineering, Reference Module Computer Science and Engineering