Abstract
In this paper, a neuromorphic crossbar circuit with binary memristors is proposed for speech recognition. Binary memristors, which are based on the filamentary-switching mechanism, are more common and easier to fabricate than analog memristors, which are available in only a few materials and need a more complicated fabrication process. Thus, we develop a neuromorphic crossbar circuit using filamentary-switching binary memristors rather than interface-switching analog memristors. The proposed binary memristor crossbar can recognize five vowels from 64 input channels with 4-bit resolution. The proposed crossbar is tested with 2,500 speech samples and is verified to recognize 89.2% of the tested samples. From the statistical simulation, the recognition rate of the binary memristor crossbar is estimated to degrade only from 89.2% to 80% when the percentage variation in memristance is increased from 0% to 15%. In contrast, the recognition rate of the analog memristor crossbar drops significantly from 96% to 9% for the same percentage variation in memristance.
Background
Memristors, which were mathematically predicted by Leon O. Chua in 1971 as the fourth basic circuit element [1], were experimentally found in 2008 [2]. Since that first prediction, memristors have been regarded as a potential candidate for future neuromorphic computing systems. Among the many advantages of memristors, the nonlinear charge-flux relationship is particularly important in mimicking the synaptic plasticity of biological neuronal systems such as the human brain [3–7].
In realizing memristor-based synaptic systems, a crossbar circuit made of only passive memristors can be regarded as the densest and simplest architecture among the various synaptic circuits developed previously. If a crossbar circuit is made of both memristors and selectors such as transistors and diodes, this kind of hybrid crossbar circuit is difficult to stack layer by layer. Thus, the pure crossbar circuit with only passive memristors can be a key element in implementing the densest and simplest three-dimensional architecture of neuromorphic systems.
A conceptual diagram of a neuromorphic speech-recognition system is shown in Figure 1. In Figure 1, a voice signal enters the cochlea first. In the cochlea, the voice input is divided into many different channels according to its frequency content. Basically, the cochlea is modeled as a group of band-pass filters, where the voice input is divided and filtered by a band-pass filter array covering the frequency range from 20 Hz to 20 kHz [8, 9]. Each channel in the band-pass filter array delivers a different band signal to the crossbar circuit as shown in Figure 1. Here, we assume that our goal is recognizing five vowels, ‘a’, ‘i’, ‘u’, ‘e’, and ‘o’, from the input of a human voice. To do so, the voice input is filtered and sampled as the cochlea does. Then, the filtered and sampled signals go into the memristor crossbar circuit as shown in Figure 1, where the voice input is compared with the previously trained patterns of the five vowels, which are already stored in the memristor crossbar array. By doing so, we can decide which of the five vowels is the best match for the voice input to the crossbar array.
In realizing a memristor crossbar circuit, we can use either analog memristors [10, 11] or binary memristors [12–17] as shown in Figure 2a,b. For the analog memristors in Figure 2a, the memristance value can be changed gradually rather than abruptly owing to the interface-switching mechanism. In interface switching, the interface between the low-resistance region and the high-resistance region can be controlled precisely according to an applied voltage or current. As a result, we can store not only binary data but also analog data on interface-switching memristors with high accuracy. However, materials that show interface-switching behavior are not common, and the accuracy in controlling the memristance value is still considered a big concern. Also, even a small amount of memristance variation can severely degrade the overall accuracy of analog-memristor-based neuromorphic systems. In contrast, most memristors are known to be based on the filamentary-switching mechanism. In filamentary switching, memristors can have either a high resistance state (HRS) or a low resistance state (LRS) as represented in Figure 2b. Consequently, we can store only ‘1’ or ‘0’ on filamentary-switching binary memristors.
In addition to the wide availability of filamentary-switching materials, binary memristors can be much more tolerant of statistical variations than analog memristors. This is because HRS can still be much higher than LRS, in spite of a large amount of statistical variation in LRS and HRS.
In this paper, we propose a binary memristor crossbar circuit for recognizing five different vowels. The block diagram and the detailed circuit schematic are shown and explained in the following section. In addition, the circuit simulation and statistical simulation are performed, and the simulation results are discussed and finally summarized in this paper [18].
Methods
Figure 3 shows a block diagram of the binary memristor crossbar circuit for recognizing five vowels: ‘a’, ‘i’, ‘u’, ‘e’, and ‘o’. The voice input is divided into 64 channels according to its frequency content. The magnitude of each channel is sampled and digitized with 4 bits. The band-pass filtering, sampling, and digitization of the voice input are implemented by MATLAB simulation in this paper. The 4-bit 64-channel inputs obtained by MATLAB simulation are applied to the binary memristor crossbar array as shown in Figure 3. For recognizing five vowels, we need not only the 4-bit 64-channel inputs but also their inverted values. Thus, the total number of channel inputs is 128, with 64 channels of true signals and 64 channels of inverted signals. Each channel is composed of 4-bit binary values. In Figure 3, Ia,0 is the current of the ‘x1’ column in the crossbar array for recognizing ‘a’, and Ia,1 is the current of the ‘x2’ column in the same array. Similarly, Ia,2 and Ia,3 are the currents of the ‘x4’ and ‘x8’ columns in the ‘a’ crossbar array. Here, ‘x1’ means that the weight of this column current is 1. Likewise, ‘x2’, ‘x4’, and ‘x8’ mean that the weights are 2, 4, and 8, respectively, for the corresponding columns in the ‘a’ crossbar array. Ia is calculated as the weighted summation 8Ia,3 + 4Ia,2 + 2Ia,1 + Ia,0. Similarly, Iu is the weighted summation 8Iu,3 + 4Iu,2 + 2Iu,1 + Iu,0 for recognizing ‘u’, and Io is the weighted summation 8Io,3 + 4Io,2 + 2Io,1 + Io,0 for recognizing ‘o’. The currents Ia, Ii, Iu, Ie, and Io are compared with each other in the winner-take-all circuit [19] to decide which vowel is the best match for the voice input as shown in Figure 3. Outputa, Outputi, Outputu, Outpute, and Outputo are the output signals of the winner-take-all circuit.
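As a rough illustration of the input coding and the weighted column summation described above, the following Python sketch splits one 4-bit channel value into true and inverted bits and computes 8I3 + 4I2 + 2I1 + I0 for one vowel array. The function names and the example currents are hypothetical and are not taken from the original design; how the bits are routed to the crossbar rows is not modeled here.

```python
# Minimal behavioral sketch; names and example values are assumptions.

BIT_WEIGHTS = (1, 2, 4, 8)  # weights of the 'x1', 'x2', 'x4', 'x8' columns


def encode_channel(value):
    """Split one 4-bit channel value into true and inverted bits (LSB first)."""
    if not 0 <= value <= 15:
        raise ValueError("channel value must fit in 4 bits")
    true_bits = [(value >> k) & 1 for k in range(4)]
    inverted_bits = [1 - b for b in true_bits]
    return true_bits, inverted_bits


def weighted_sum(column_currents):
    """Return I0 + 2*I1 + 4*I2 + 8*I3 for currents ordered (I0, I1, I2, I3)."""
    if len(column_currents) != len(BIT_WEIGHTS):
        raise ValueError("expected four column currents")
    return sum(w * i for w, i in zip(BIT_WEIGHTS, column_currents))


# Hypothetical column currents (in microamperes) for two vowel arrays.
currents = {"a": (3.0, 2.0, 4.0, 5.0), "i": (5.0, 1.0, 1.0, 2.0)}
totals = {vowel: weighted_sum(c) for vowel, c in currents.items()}
print(totals)
```

In this sketch the vowel array with the larger total would be the better match, which is the quantity the winner-take-all circuit compares.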
Figure 4a shows the detailed schematic of the binary memristor crossbar circuit. Here, 64 input channels are applied to the crossbar circuit. Each channel has 4-bit binary values and each binary value is divided into true and inverted signals as shown in Figure 4a. M1,0, M1,1, M1,2, and M1,3 are memristors of the ‘x1’ column, ‘x2’ column, ‘x4’ column, and ‘x8’ column, respectively, for the crossbar array of vowel ‘a’. These four memristors are connected to the true signal of channel 1. Similarly, M2,0, M2,1, M2,2, and M2,3 are memristors of the ‘x1’ column, ‘x2’ column, ‘x4’ column, and ‘x8’ column, respectively, which are connected to the inverted signal of channel 1.
The weighted summation Ia is calculated as 8Ia,3 + 4Ia,2 + 2Ia,1 + Ia,0, as explained earlier. The weighted summation is implemented by current mirror circuits as shown in Figure 4a. For example, to realize the weight of 1, we use the current mirror composed of M7 and M8. Here, M7 and M8 should have the same size so that Ia,0 of M7 is copied to M8. For the weight of 2, M6 should be twice as large as M5, so that the current of M6 is twice Ia,1. For the weight of 4, M4 should be four times as large as M3. For the weight of 8, M2 should be eight times as large as M1. The currents of M2, M4, M6, and M8 are summed according to Kirchhoff's current law. The capacitor Ca is discharged by the weighted summation Ia, which comes from M2, M4, M6, and M8. If Ia is large, Ca is discharged to GND very fast. Here, GND means the ground potential. If Ia is small, it takes a longer time to discharge Ca to GND. M9 is the precharge PMOS, which turns on when the clock (CLK) signal is low. When M9 is on, the VCa node is precharged to VDD. When the CLK signal is high, M9 is off. At this time, VCa is discharged by the weighted summation Ia that comes from M2, M4, M6, and M8.
Figure 4b shows the winner-take-all circuit that decides which of the five capacitors Ca, Ci, Cu, Ce, and Co is discharged the fastest. The five capacitors correspond to the five vowels ‘a’, ‘i’, ‘u’, ‘e’, and ‘o’, respectively. Using the winner-take-all circuit, we can determine that the vowel corresponding to the fastest-discharged capacitor is the best match for the input human voice. VCa, VCi, VCu, VCe, and VCo are the voltages on capacitors Ca, Ci, Cu, Ce, and Co, respectively. Here, I1, I2, I3, I4, and I5 are comparators, and VREF is the reference voltage for the comparators. I1 compares VCa with VREF. If VCa becomes lower than VREF, Da becomes high. Similarly, I2, I3, I4, and I5 compare VCi, VCu, VCe, and VCo with VREF, and Di, Du, De, and Do become high when VCi, VCu, VCe, and VCo are lower than VREF. I6, I7, and I8 are OR gates. I9 and I10 with the delay line of τ constitute a pulse generator circuit. FF1, FF2, FF3, FF4, and FF5 are D flip-flop circuits. Outputa, Outputi, Outputu, Outpute, and Outputo are the output signals of the five D flip-flops from FF1 to FF5.
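The relation between the weighted current and the discharge speed of Ca can be sketched with a simple constant-current model, in which the time to fall from VDD to a threshold is C(VDD − Vth)/I. This is only a first-order approximation that treats the current mirrors as ideal current sources; the capacitance, supply voltage, and threshold used below are hypothetical values, not parameters from the original circuit.

```python
import math


def discharge_time(current_a, cap_f=1e-12, vdd=1.2, v_threshold=0.6):
    """Time in seconds for a capacitor precharged to vdd to fall to v_threshold.

    Assumes an ideal constant discharge current (hypothetical simplification).
    """
    if current_a <= 0:
        return math.inf  # no discharge current, the node never falls
    return cap_f * (vdd - v_threshold) / current_a


# Hypothetical weighted currents in amperes: a larger current discharges faster.
t_large = discharge_time(20e-6)
t_small = discharge_time(10e-6)
print(t_large, t_small)
```

Under these assumptions, doubling the weighted current halves the discharge time, which is why the best-matching vowel array reaches the threshold first.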
In Figure 4a, there may be a concern that the reverse current through LRS and HRS degrades the recognition rate. To examine this reverse current in more detail, we consider two cases of the memristor crossbar circuit, matched and unmatched, as shown in Figure 5a,b, respectively. In Figure 5a, Vi,0 and Vi,1 are 0 and 1, respectively. These inputs match the stored memristance values of M1, M2, M3, and M4. Here, HRS means the high resistance state and LRS means the low resistance state. The current summation Ia can be calculated as Ia = I2,a + I3,a − I1,a − I4,a. I2,a and I3,a are the forward currents through M2 and M3, which are in LRS. I1,a and I4,a are the reverse currents through M1 and M4, which are in HRS. Because HRS is much larger than LRS, the reverse currents I1,a and I4,a are much smaller than the forward currents I2,a and I3,a, and Ia can be expressed simply as Ia ≈ I2,a + I3,a. It follows that the reverse current through HRS affects Ia very little.
Now, we consider Figure 5b, where the input voltages Vi,0 and Vi,1 do not match the stored memristance of M5, M6, M7, and M8. The current summation Ib in Figure 5b can be expressed as Ib = I2,b + I3,b − I1,b − I4,b. Here, I2,b and I3,b are the forward currents through HRS, and I1,b and I4,b are the reverse currents through LRS. Comparing the matched column's current Ia in Figure 5a with the unmatched column's current Ib, Ia is much larger than Ib. Thus, the reverse current does not degrade the recognition rate.
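The comparison between the matched and unmatched columns can be checked numerically with a simple resistive model. The sketch below assumes two forward-biased and two reverse-biased cells per column, as in Figure 5, and uses hypothetical values for LRS, HRS, and the forward and reverse voltages across the cells; none of these numbers come from the original work, and the actual value of Ib depends on the bias voltages in the real circuit.

```python
def column_current(r_forward, r_reverse, v_forward=1.0, v_reverse=0.1):
    """Net column current: two forward-biased cells minus two reverse-biased cells.

    v_forward and v_reverse are assumed voltage magnitudes across the cells.
    """
    forward = 2 * v_forward / r_forward
    reverse = 2 * v_reverse / r_reverse
    return forward - reverse


LRS = 10e3  # hypothetical low resistance state, in ohms
HRS = 1e6   # hypothetical high resistance state, in ohms

# Matched column (Figure 5a): forward through LRS, reverse through HRS.
i_matched = column_current(r_forward=LRS, r_reverse=HRS)
# Unmatched column (Figure 5b): forward through HRS, reverse through LRS.
i_unmatched = column_current(r_forward=HRS, r_reverse=LRS)

# Error made by ignoring the reverse currents in the matched case.
forward_only = 2 * 1.0 / LRS
rel_error = abs(forward_only - i_matched) / forward_only
print(i_matched, i_unmatched, rel_error)
```

With these assumed values, neglecting the reverse currents changes the matched current by well under 1%, and the matched current remains far above the unmatched one.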
The simulated waveforms of VCa, VCi, VCu, VCe, and VCo are shown in Figure 6. Here, VCa is discharged to GND faster than the other capacitor nodes VCi, VCu, VCe, and VCo. This means that the voice input matches the vowel ‘a’ better than the other vowels. The timing diagram of the important signals in Figure 4a,b is shown in Figure 7. When the CLK signal is low, all the capacitor nodes VCa, VCi, VCu, VCe, and VCo are precharged to VDD. At this time, VCa, VCi, VCu, VCe, and VCo are higher than VREF; thus, Da, Di, Du, De, and Do are low. When CLK becomes high, the five capacitors Ca, Ci, Cu, Ce, and Co are discharged by Ia, Ii, Iu, Ie, and Io, respectively. If Ia is the largest among Ia, Ii, Iu, Ie, and Io, VCa is discharged to GND faster than VCi, VCu, VCe, and VCo. When VCa becomes lower than VREF, Da becomes high. Because VCa is the fastest falling node among the five capacitive nodes, Da is also the fastest rising signal among Da, Di, Du, De, and Do. The fastest rising signal Da generates the locking pulse that is used as the clock signal of the D flip-flop circuits FF1, FF2, FF3, FF4, and FF5. By doing so, we can decide which vowel is the best match for the voice input. The first-rising signal Da makes Outputa high, as shown in Figure 7. The other output signals, Outputi, Outputu, Outpute, and Outputo, are prevented from rising from low to high by the locking pulse that is generated by the first-rising signal Da.
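The decision sequence above can be summarized in a small behavioral sketch: the first node to cross VREF triggers the locking pulse, and the flip-flops sample the comparator outputs at that moment. The function name, the crossing times, and the `pulse_delay` parameter below are illustrative assumptions, and the model ignores comparator and gate delays other than that single optional delay.

```python
def winner_take_all(crossing_times, pulse_delay=0.0):
    """Return one latched output bit per vowel.

    crossing_times maps each vowel to the time at which its capacitor
    voltage falls below VREF. The flip-flops are assumed to sample the
    comparator outputs at the first crossing time plus pulse_delay.
    """
    if not crossing_times:
        raise ValueError("at least one crossing time is required")
    sample_time = min(crossing_times.values()) + pulse_delay
    return {v: int(t <= sample_time) for v, t in crossing_times.items()}


# Hypothetical crossing times in nanoseconds; 'a' falls below VREF first.
times_ns = {"a": 60.0, "i": 140.0, "u": 95.0, "e": 120.0, "o": 110.0}
outputs = winner_take_all(times_ns)
print(outputs)
```

In this model, two nodes that cross VREF within `pulse_delay` of each other would both be latched, so the sketch also hints at why the pulse timing matters when two vowels match almost equally well.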
Results and discussion
In this work, the memristor-CMOS hybrid circuits were simulated with Cadence Spectre. Here, memristors were modeled in Verilog-A [20, 21], and CMOS SPICE parameters were obtained from Samsung's 0.13-μm CMOS technology. The training and recalling processes of the memristor crossbar array are shown in Figure 8a. In this paper, we used 100 samples for training a crossbar array to learn the vowel ‘a’. Similarly, we used 400 samples for the crossbar arrays to learn the four vowels ‘i’, ‘u’, ‘e’, and ‘o’. Through the training process, we can find the memristance values of the crossbar array that maximize the recognition rate of the five vowels ‘a’, ‘i’, ‘u’, ‘e’, and ‘o’ [18]. The memristance values found by the training process were written to the crossbar array circuit by the VDD/3 write scheme, which is known to mitigate the half-selected cell problem better than the VDD/2 write scheme [22].
For the training process, we have to convert the original speech signal to a 4-bit 64-channel digitized signal. In a biological system, the cochlea in the human ear performs this conversion. In this paper, we used MATLAB to perform the same conversion as the human cochlea. The cochlea function simulated in MATLAB is shown in Figure 8b. The function of the cochlea can be modeled by preprocessing, framing, windowing, discrete Fourier transform (DFT), band-pass filtering, and digitization [23]. In the digitization process, the 64 outputs from the 64 band-pass filters are converted to 4-bit binary signals and delivered to the rows of the memristor crossbar array. For the band-pass filtering, the nonlinear frequency scale known as the mel scale is used [23]. In the mel scale, the frequency scale is linear up to 1,000 Hz and logarithmic above 1,000 Hz [23].
Figure 9 shows the simulation results for the recognition rate of the proposed binary memristor crossbar circuit. In this case, we tested 2,500 input voices for recognizing the five vowels. Each vowel is tested with 500 different voices. The average recognition rate over the five vowels is estimated to be around 89.2%. Among the five vowels, the recognition rate of ‘u’ is the highest at 95.2%, while the vowel ‘e’ has the lowest recognition rate at 84%.
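As a rough illustration of the mel-scale band placement and the 4-bit digitization steps of the cochlea model, the following Python sketch uses one widely used mel approximation, mel = 2595 log10(1 + f/700), which is approximately linear below 1,000 Hz and logarithmic above it. The original preprocessing was done in MATLAB and its exact filter definitions are not given here, so the band layout, the peak normalization, and all function names below are assumptions.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mel using a common approximation."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def band_centers(num_bands=64, f_low=20.0, f_high=20000.0):
    """Center frequencies in Hz, equally spaced on the mel scale (assumed layout)."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (num_bands + 1)
    return [mel_to_hz(lo + step * (k + 1)) for k in range(num_bands)]


def quantize_4bit(magnitudes):
    """Map non-negative band magnitudes to integers 0..15, normalized to the peak."""
    peak = max(magnitudes)
    if peak <= 0:
        return [0] * len(magnitudes)
    return [max(0, min(15, int(m / peak * 15 + 0.5))) for m in magnitudes]


centers = band_centers()
print(len(centers), round(centers[0], 1), round(centers[-1], 1))
print(quantize_4bit([0.0, 0.5, 1.0]))
```

Each quantized value in this sketch would correspond to one 4-bit channel input of the crossbar; spacing the bands evenly in mel places more of the 64 channels at low frequencies.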
Figure 10a shows the statistical variation of memristance in HRS and LRS with a standard deviation (σ) of 10%. The statistical variation was obtained by Monte Carlo simulation, which was also performed with the Cadence software. This statistical simulation is very important because real memristors are susceptible to process variation. To analyze how tolerant the proposed binary memristor crossbar is of memristance variation, we tested various cases of memristance variation from 0% to 15%. In Figure 10b, we compare the proposed binary memristor crossbar circuit with the analog memristor crossbar circuit while increasing the percentage variation in memristance from 0% to 15%.
When the memristance variation is 0%, the recognition rate of the analog memristor array is 6.8 percentage points higher than that of the binary memristor array (96% versus 89.2%). This is because the proposed binary memristor crossbar has a 4-bit resolution and thus loses some accuracy compared to the analog memristor crossbar. As the percentage variation in memristance is increased, the recognition rate of the analog memristor crossbar degrades very rapidly. For example, when the percentage variation in memristance becomes 5%, the recognition rate of the analog crossbar decreases from 96% to 23%. In contrast, the binary memristor crossbar keeps almost the same recognition rate for the five vowels. For a percentage variation as severe as 15%, the analog crossbar shows a recognition rate as low as 9%. However, the binary crossbar still keeps a recognition rate as high as 80%, which is only 9.2 percentage points lower than at a percentage variation of 0%. This strong tolerance of the binary memristor crossbar arises because the information stored in binary memristors is affected very little by the percentage variation in memristance. The memristance of LRS remains much smaller than that of HRS, even when the percentage variation in LRS is very large. This is the reason why the binary memristor crossbar can maintain a recognition rate of about 80% or higher for percentage variations in memristance of up to 15%.
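The argument that LRS stays well below HRS under variation can be illustrated with a small Monte Carlo sketch. It is not the Cadence Monte Carlo simulation used in this work: the nominal LRS and HRS values, the Gaussian distribution, and the sample count are hypothetical choices made only for illustration.

```python
import random


def sample_resistances(nominal, sigma_fraction, count, rng):
    """Draw Gaussian-distributed resistances around a nominal value."""
    return [rng.gauss(nominal, sigma_fraction * nominal) for _ in range(count)]


def states_overlap(lrs_samples, hrs_samples):
    """Return True if any LRS sample is at least as large as some HRS sample."""
    return max(lrs_samples) >= min(hrs_samples)


LRS_NOMINAL = 10e3  # hypothetical, in ohms
HRS_NOMINAL = 1e6   # hypothetical, in ohms

rng = random.Random(0)  # fixed seed so the run is repeatable
for sigma in (0.05, 0.10, 0.15):
    lrs = sample_resistances(LRS_NOMINAL, sigma, 1000, rng)
    hrs = sample_resistances(HRS_NOMINAL, sigma, 1000, rng)
    print(sigma, states_overlap(lrs, hrs))
```

With a nominal HRS/LRS ratio of 100, as assumed here, the two distributions stay far apart even at σ = 15%, so a stored ‘1’ or ‘0’ remains distinguishable; an analog memristor has no such margin because every memristance value carries information.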
Conclusions
In this paper, a binary memristor crossbar circuit was proposed for the neuromorphic application of speech recognition. Compared with analog memristors, which are available in only a few materials and need a complicated fabrication process, binary memristors based on the filamentary-switching mechanism are more common and easier to fabricate. Thus, we developed the neuromorphic crossbar circuit using filamentary-switching binary memristors instead of interface-switching analog memristors. The proposed binary memristor crossbar could recognize five vowels with 64 input channels and a 4-bit resolution. The proposed crossbar array was tested with 2,500 speech samples and verified to recognize 89.2% of the tested samples. Moreover, the recognition rate of the binary memristor crossbar degrades only from 89.2% to 80% when the percentage statistical variation in memristance is increased from 0% to 15%. In contrast, the recognition rate of the analog memristor crossbar degrades significantly from 96% to 9% for the same percentage variation in memristance.
Authors’ information
SNT and SJH are Ph.D. and M.S. students, respectively, who are studying in the School of Electrical Engineering, Kookmin University, Seoul, South Korea. KSM is a professor in the School of Electrical Engineering, Kookmin University, Seoul, South Korea.
References
Chua LO: Memristor—the missing circuit element. IEEE Trans Circuit Theory 1971, CT-18(5):507–519.
Strukov DB, Snider GS, Stewart DR, Williams RS: The missing memristor found. Nature 2008, 453: 80–83. 10.1038/nature06932
Jo SH, Chang T, Ebong I, Bhadviya BB, Mazumder P, Lu W: Nanoscale memristor device as synapse in neuromorphic systems. Nano Lett 2010, 10: 1297–1301. 10.1021/nl904092h
Kim H, Pd Sad M, Yang C, Roska T, Chua LO: Neural synapse weighting with a pulse-based memristor circuit. IEEE Trans Circuit Syst 2012, 59(1):148–158.
Hu M, Li H, Wu Q, Rose GS, Chen Y: Memristor crossbar based hardware realization of BSB recall function. Int Joint Conf Neural Netw 2012, 1–7.
Adhikari SP, Yang C, Kim H, Chua LO: Memristor bridge synapse-based neural network and its learning. Trans Neural Netw Learn Syst 2012, 23(9):1426–1435.
Howard G, Gale E, Bull L, Costello BDL, Adamatzky A: Evolution of plastic learning in spiking networks via memristive connections. IEEE Trans Evol Comput 2012, 16(5):711–729.
Remus JJ, Collins LM: The effects of noise on speech recognition in cochlear implant subjects: predictions and analysis using acoustic models. EURASIP J Appl Signal Process 2005, 2005(18):2979–2990. 10.1155/ASP.2005.2979
Stakhovskaya O, Sridhar D, Bonham BH, Leake PA: Frequency map for the human cochlear spiral ganglion: implications for cochlear implants. J Assoc Res Otolaryngol 2007, 8: 220–233. 10.1007/s10162-007-0076-9
Park S, Kim H, Choo M, Noh J, Sheri A, Jung S, Seo K, Park J, Kim S, Lee W, Shin J, Lee D, Choi G, Woo J, Cha E, Jang J, Park C, Jeon M, Lee B, Lee BH, Hwang H: RRAM-based synapse for neuromorphic system with pattern recognition function. IEDM Tech Dig 2012, 2012: 10.2.1–10.2.4.
Park S, Sheri A, Kim J, Noh J, Jang J, Jeon M, Lee B, Lee BR, Lee BH, Hwang H: Neuromorphic speech systems using advanced ReRAM-based synapse. IEDM Tech Dig 2013, 2013: 25.6.1–25.6.4.
Yu S, Wong HSP: Modeling the switching dynamics of programmable-metallization-cell (PMC) memory and its application as synapse device for a neuromorphic computation system. IEDM Tech Dig 2010, 2010: 22.1.1–22.1.4.
Kuzum D, Jeyasingh RGD, Wong HSP: Energy efficient programming of nanoelectronic synaptic devices for large-scale implementation of associative and temporal sequence learning. IEDM Tech Dig 2011, 2011: 30.3.1–30.3.4.
Suri M, Bichler O, Querlioz D, Cueto O, Perniola L, Sousa V, Vuillaume D, Gamrat C, DeSalvo B: Phase change memory as synapse for ultra-dense neuromorphic systems: Application to complex visual pattern extraction. IEDM Tech Dig 2011, 2011: 4.4.1–4.4.4.
Yu S, Gao B, Fang Z, Yu H, Kang J, Wong HSP: A neuromorphic visual system using RRAM synaptic devices with Sub-pJ energy and tolerance to variability: Experimental characterization and large-scale modeling. IEDM Tech Dig 2012, 2012: 10.4.1–10.4.4.
Suri M, Bichler O, Querlioz D, Palma G, Vianello E, Vuillaume D, Gamrat C, DeSalvo B: CBRAM devices as binary synapses for low-power stochastic neuromorphic systems: Auditory (cochlea) and visual (retina) cognitive processing applications. IEDM Tech Dig 2012, 2012: 10.3.1–10.3.4.
Suri M, Querlioz D, Bichler O, Palma G, Vianello E, Vuillaume D, Gamrat C, DeSalvo B: Bio-inspired stochastic computing using binary CBRAM synapses. IEEE Trans Electron Devices 2013, 60(7):2402–2409.
Ham SJ, Shin SH, Min KS: Training and recalling of nanoscale memristor-based neuromorphic circuit for speech recognition. Collaborative Conference on 3D & Materials Research (CC3DMR): June 23–27 2014; Incheon/Seoul, South Korea 2014, 403.
Zhu X, Yang X, Wu C, Wu J, Yi X: Hamming network circuits based on CMOS/memristor hybrid design. IEICE Electron Express 2013, 10(12):1–9.
Choi JM, Sin SH, Min KS: Practical implementation of memristor emulator circuit on printed circuit board. J Inst Korean Electrical Electron Eng 2013, 17(3):324–331.
Truong SN, Min KS: New memristor-based crossbar array architecture with 50-% area reduction and 48-% power saving for matrix-vector multiplication of analog neuromorphic computing. J Semiconductor Technol Sci 2014, 14(3):356–363. 10.5573/JSTS.2014.14.3.356
Ham SJ, Mo HS, Min KS: Low-power VDD/3 write scheme with inversion coding circuit for complementary memristor array. IEEE Trans Nanotechnology 2013, 12(5):851–857.
Muda L, Begam M, Elamvazuthi I: Voice recognition algorithms using Mel frequency cepstral coefficient (MFCC) and dynamic time warping (DTW) techniques. J Comput 2010, 2(3):138–143.
Acknowledgements
The work was financially supported by NRF-2011-220-D00089, NRF-2011-0030228, NRF-2013K1A3A1A25038533, NRF-2013R1A1A2A10064812, and BK Plus with the Educational Research Team for Creative Engineers on Material-Device-Circuit Co-Design (Grant No: 22A20130000042), funded by the National Research Foundation of Korea (NRF), and by Global Scholarship Program for Foreign Graduate Students at Kookmin Univ. The CAD tools were supported by IC Design Education Center (IDEC), Daejeon, Korea.
Additional information
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
All authors have contributed to the submitted manuscript of the present work. KSM defined the research topic. SNT and SJH designed the circuit and performed the simulation. KSM wrote the paper. All authors read and approved the submitted manuscript.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder.
To view a copy of this licence, visit https://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Truong, S.N., Ham, SJ. & Min, KS. Neuromorphic crossbar circuit with nanoscale filamentary-switching binary memristors for speech recognition. Nanoscale Res Lett 9, 629 (2014). https://doi.org/10.1186/1556-276X-9-629