
Compute-in-Memory Architecture

  • Living reference work entry
Handbook of Computer Architecture

Abstract

In the era of big data and artificial intelligence, hardware advances in throughput and energy efficiency are essential for both cloud and edge computing. By merging the data storage and computing units, compute-in-memory (CIM) has become a desirable choice for data-centric applications, as it mitigates the memory wall bottleneck of the von Neumann architecture. In this chapter, recent architectural designs and the underlying circuit/device technologies for compute-in-memory are surveyed. The related design challenges and prospects are also discussed to provide an in-depth understanding of the interactions between algorithms/architectures and circuits/devices. The chapter is organized hierarchically: an overview of the field (Introduction section); the principle of compute-in-memory (section “DNN Basics and Corresponding CIM Principle”); the latest architecture and algorithm techniques, including network models, data flow, pipeline design, and quantization approaches (section “Architecture and Algorithm Techniques for CIM”); the related hardware support, including embedded memory technologies such as static random access memory and emerging nonvolatile memories, as well as peripheral circuit designs with a focus on analog-to-digital converters (section “Hardware Implementations for CIM Architecture”); and a summary and outlook for compute-in-memory architecture (Conclusion section).
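
The compute-in-memory principle that the chapter builds on is to perform a DNN layer's multiply-and-accumulate (MAC) operations inside the memory array itself: weights are stored as cell states (e.g., conductances), input activations drive the rows, and the currents accumulated on each column form the dot products, which analog-to-digital converters then digitize. As a rough illustration of this flow, the Python sketch below models one crossbar sub-array behaviorally; the function names, bit widths, and array sizes are hypothetical choices for illustration, not taken from the chapter.

```python
import numpy as np

def quantize(x, n_bits, x_max):
    """Uniformly quantize x to 2**n_bits levels over [0, x_max]."""
    levels = 2 ** n_bits - 1
    return np.round(np.clip(x, 0.0, x_max) / x_max * levels) / levels * x_max

def cim_mac(activations, weights, w_bits=4, adc_bits=6):
    """Behavioral model of one CIM crossbar sub-array computing MACs."""
    # Program weights into discrete conductance levels (w_bits per cell).
    g = quantize(weights, w_bits, weights.max())
    # Analog accumulation: each column current sums the input * conductance
    # contributions along its rows (Kirchhoff's current law on the bitline).
    column_currents = activations @ g
    # The ADC digitizes the analog partial sums with limited resolution.
    return quantize(column_currents, adc_bits, column_currents.max())

rng = np.random.default_rng(0)
x = rng.random(128)            # activations applied to 128 rows (wordlines)
w = rng.random((128, 64))      # a 128 x 64 weight sub-array
print(cim_mac(x, w)[:4])       # digitized partial sums of the first 4 columns
```

A real macro would additionally handle signed weights, multi-bit input encoding (e.g., bit-serial pulses), and partial-sum accumulation across sub-arrays, which the chapter's architecture and hardware sections discuss.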

Author information

Corresponding author

Correspondence to Shimeng Yu.

Copyright information

© 2023 Springer Nature Singapore Pte Ltd.

About this entry

Cite this entry

Jiang, H., Huang, S., Yu, S. (2023). Compute-in-Memory Architecture. In: Chattopadhyay, A. (eds) Handbook of Computer Architecture. Springer, Singapore. https://doi.org/10.1007/978-981-15-6401-7_62-1

  • DOI: https://doi.org/10.1007/978-981-15-6401-7_62-1

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-6401-7

  • Online ISBN: 978-981-15-6401-7

  • eBook Packages: Springer Reference Engineering, Reference Module Computer Science and Engineering
