Abstract
In the era of big data and artificial intelligence, advances in hardware throughput and energy efficiency are essential for both cloud and edge computation. By merging data storage and computing units, compute-in-memory (CIM) has become a desirable choice for data-centric applications, as it mitigates the memory-wall bottleneck of the von Neumann architecture. This chapter surveys recent architectural designs and the underlying circuit/device technologies for compute-in-memory. The related design challenges and prospects are also discussed to provide an in-depth understanding of the interactions between algorithms/architectures and circuits/devices. The chapter is organized hierarchically: an overview of the field (Introduction section); the principle of compute-in-memory (section “DNN Basics and Corresponding CIM Principle”); the latest architecture and algorithm techniques, including network models, data flow, pipeline design, and quantization approaches (section “Architecture and Algorithm Techniques for CIM”); the related hardware support, including embedded memory technologies such as static random access memories and emerging nonvolatile memories, as well as peripheral circuit designs with a focus on analog-to-digital converters (section “Hardware Implementations for CIM Architecture”); and a summary and outlook for the compute-in-memory architecture (Conclusion section).
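The crossbar-based multiply-and-accumulate principle that underlies most CIM designs (weights stored as memory-cell conductances, inputs applied as wordline voltages, bitline currents summed in place and then digitized by peripheral ADCs) can be sketched numerically. The following is a minimal illustrative model only: the array size, normalized conductance range, and 4-bit ADC resolution are assumptions for the sketch, not parameters taken from the chapter.

```python
import numpy as np

# Sketch of the analog CIM multiply-and-accumulate (MAC) principle:
# a weight matrix is stored as a crossbar of cell conductances G,
# input activations drive the wordlines as voltages V, and by
# Kirchhoff's law each bitline collects I_j = sum_i V_i * G[i, j],
# i.e., one MAC per column, computed inside the memory array.

rng = np.random.default_rng(0)

G = rng.uniform(0.0, 1.0, size=(4, 3))   # 4x3 crossbar, normalized conductances
V = rng.uniform(0.0, 1.0, size=4)        # normalized wordline input voltages

I = V @ G                                # analog bitline currents (one MAC per column)

# A peripheral ADC quantizes each bitline current; 4 bits is assumed here.
adc_bits = 4
levels = 2 ** adc_bits
i_max = G.shape[0] * 1.0                 # worst-case full-scale bitline current
digital_out = np.round(I / i_max * (levels - 1)).astype(int)

print(digital_out)                       # quantized MAC results, one per column
```

In a real macro the ADC resolution trades off accuracy against area and energy, which is why the chapter singles out analog-to-digital converter design among the peripheral circuits.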
Copyright information
© 2023 Springer Nature Singapore Pte Ltd.
About this entry
Cite this entry
Jiang, H., Huang, S., Yu, S. (2023). Compute-in-Memory Architecture. In: Chattopadhyay, A. (eds) Handbook of Computer Architecture. Springer, Singapore. https://doi.org/10.1007/978-981-15-6401-7_62-1
Print ISBN: 978-981-15-6401-7
Online ISBN: 978-981-15-6401-7
eBook Packages: Springer Reference Engineering, Reference Module Computer Science and Engineering