Coarse-Grained Reconfigurable Array (CGRA)

  • Living reference work entry
Handbook of Computer Architecture

Abstract

The coarse-grained reconfigurable array (CGRA) is a promising class of spatial accelerator that offers high performance, energy efficiency, and the flexibility to support a wide range of application domains. CGRAs can bridge the gap between efficient but inflexible domain-specific accelerators and flexible but inefficient general-purpose processors. A CGRA is essentially an array of word-level processing elements connected via an on-chip interconnect. Both the processing elements and the interconnect can be reconfigured every cycle according to the contents of the on-chip configuration memory. The compiler therefore maps the compute-intensive loop kernels of an application onto the CGRA in a spatio-temporal fashion by setting up the configuration memory. The simplicity and parallelism of the architecture, coupled with the efficacy of the compiler, enable the CGRA to reach the dual goal of hardware-like efficiency with software-like programmability. We present a comprehensive review of CGRAs, starting with the historical context, sketching the architectural landscape, and providing an extensive overview of the compilation approaches.
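
To make the idea of spatio-temporal mapping concrete, the following is a minimal Python sketch, not taken from the chapter. It assumes a hypothetical 2x2 mesh in which each processing element (PE) has one output register and may read only its own register or that of an adjacent PE. The names `Slot`, `run`, and `CONFIG` are illustrative inventions. The configuration memory is modelled as a list of contexts, one per cycle, and the hand-written mapping computes (a + b) * (c - d).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union

PE = Tuple[int, int]      # (row, col) position of a processing element
Source = Union[PE, str]   # a PE's output register, or a named external input

# Word-level operations a PE can be configured to perform (illustrative set).
OPS: Dict[str, Callable[[int, int], int]] = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "mul": lambda a, b: a * b,
}


@dataclass(frozen=True)
class Slot:
    """One configuration word: the operation and operand routing of a PE for one cycle."""
    op: str
    src_a: Source
    src_b: Source


def reachable(src: PE, dst: PE) -> bool:
    """Assumed mesh interconnect: a PE reads itself or a PE at Manhattan distance 1."""
    return abs(src[0] - dst[0]) + abs(src[1] - dst[1]) <= 1


def run(config: List[Dict[PE, Slot]], inputs: Dict[str, int]) -> Dict[PE, int]:
    """Execute one context per cycle and return the final PE output registers."""
    regs: Dict[PE, int] = {}
    for context in config:                 # temporal dimension: one context per cycle
        next_regs = dict(regs)
        for pe, slot in context.items():   # spatial dimension: PEs active in this cycle
            operands = []
            for src in (slot.src_a, slot.src_b):
                if isinstance(src, str):
                    operands.append(inputs[src])
                elif reachable(src, pe):
                    operands.append(regs[src])   # value produced in an earlier cycle
                else:
                    raise ValueError(f"{src} is not routable to {pe}")
            next_regs[pe] = OPS[slot.op](*operands)
        regs = next_regs
    return regs


# Hand-written mapping of (a + b) * (c - d):
# cycle 0 runs the add and the sub in parallel on two PEs,
# cycle 1 reconfigures PE (0, 0) to multiply the two partial results.
CONFIG: List[Dict[PE, Slot]] = [
    {(0, 0): Slot("add", "a", "b"), (0, 1): Slot("sub", "c", "d")},
    {(0, 0): Slot("mul", (0, 0), (0, 1))},
]

if __name__ == "__main__":
    result = run(CONFIG, {"a": 1, "b": 2, "c": 7, "d": 3})
    print(result[(0, 0)])  # (1 + 2) * (7 - 3) = 12
```

In this toy model the mapping is written by hand. A CGRA compiler would have to derive such a configuration automatically while respecting the routing constraint that `reachable` stands in for.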

Acknowledgements

This work is partially supported by the National Research Foundation, Singapore, under its Competitive Research Programme Award NRF-CRP23-2019-0003.

Author information

Corresponding author

Correspondence to Tulika Mitra.

Copyright information

© 2023 Springer Nature Singapore Pte Ltd.

About this entry

Cite this entry

Li, Z., Wijerathne, D., Mitra, T. (2023). Coarse-Grained Reconfigurable Array (CGRA). In: Chattopadhyay, A. (eds) Handbook of Computer Architecture. Springer, Singapore. https://doi.org/10.1007/978-981-15-6401-7_50-1

  • DOI: https://doi.org/10.1007/978-981-15-6401-7_50-1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-6401-7

  • Online ISBN: 978-981-15-6401-7

  • eBook Packages: Springer Reference Engineering, Reference Module Computer Science and Engineering
