# Challenge of Nonvolatile Logic LSI Using MTJ-Based Logic-in-Memory Architecture

Takahiro Hanyu

## 1 Introduction

This chapter presents a new architecture, called "nonvolatile logic-in-memory (NV-LIM) architecture," which could overcome the performance wall and the power wall faced by present CMOS-only logic-LSI processors [1–3]. Figure [1a](#page-1-0) shows a conventional logic-LSI architecture, where logic and memory modules are implemented separately and connected to each other through global interconnections. Even if the device feature size is scaled down in accordance with the semiconductor technology roadmap, the global interconnections do not become shorter; rather, they are getting longer, which results in longer delay and higher power dissipation in the on-chip wires. In addition, since on-chip memory modules are "volatile", they always consume static power to maintain the stored data.

On the other hand, several emerging storage devices are being developed to overcome the weak points of conventional semiconductor memories, namely dynamic random-access memory (DRAM) and static random-access memory (SRAM). In particular, magnetoresistive random-access memory (MRAM), which has already undergone a few incarnations, is now converging on a scheme that could upend the memory business. Spin-transfer torque (STT) MRAM promises speed and reliability comparable to those of SRAM, the quick-access memory embedded inside microprocessors, along with the "nonvolatility" of flash, the storage of smartphones and other portables. Since the magnetic tunnel junction (MTJ) device, the key element of MRAM, is easily distributed over a logic-circuit plane by using a three-dimensional (3D) stack structure as shown in Fig. [1b](#page-1-0), performance degradation due to intra-chip global wires could be drastically mitigated, which leads to a high-performance, ultra-low-power and highly reliable (or highly resilient) logic LSI.

W. Zhao, G. Prenat (eds.), Spintronics-based Computing, DOI 10.1007/978-3-319-15180-9\_5

T. Hanyu  $(\boxtimes)$ 

Research Institute of Electrical Communication, Tohoku University, Sendai, Japan e-mail: [hanyu@irec.tohoku.ac.jp](mailto:hanyu@irec.tohoku.ac.jp)

<sup>©</sup> Springer International Publishing Switzerland 2015

<span id="page-1-0"></span>

One of the most useful methods to cut off leakage power is power gating. Figure [2a](#page-2-0) shows a time chart of power dissipation in a conventional logic LSI without power gating. If power gating is applied to the conventional logic LSI, a part of the standby power can be eliminated, but two additional operations, the "back-up" and "boost-up" procedures, must be performed before and after power gating, as shown in Fig. [2b](#page-2-0); this overhead may discourage the use of the power-gating technique. In contrast, nonvolatility combines well with power gating and ideally eliminates the wasted power dissipation, as shown in Fig. [2c](#page-2-0).
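The trade-off among the three cases of Fig. 2 can be sketched with a simple energy model for one standby interval. This is only an illustration: the class `StandbyInterval`, the function names, and all numeric values are hypothetical and are not taken from the chapter, and the nonvolatile case is idealized as having no overhead.

```python
from dataclasses import dataclass


@dataclass
class StandbyInterval:
    """One idle interval of a logic block. All numbers are hypothetical."""
    leakage_power_w: float    # static power drawn while idle but powered
    duration_s: float         # how long the block stays idle
    backup_energy_j: float    # "back-up": save state before power-off
    boost_up_energy_j: float  # "boost-up": reload state after power-on


def energy_no_gating(iv: StandbyInterval) -> float:
    """Case (a): the block stays powered and leaks for the whole interval."""
    return iv.leakage_power_w * iv.duration_s


def energy_volatile_gating(iv: StandbyInterval) -> float:
    """Case (b): leakage is cut, but state must be saved and reloaded."""
    return iv.backup_energy_j + iv.boost_up_energy_j


def energy_nonvolatile_gating(iv: StandbyInterval) -> float:
    """Case (c), idealized: state already resides in nonvolatile devices."""
    return 0.0


def break_even_duration(iv: StandbyInterval) -> float:
    """Idle time above which volatile power gating starts to save energy."""
    return (iv.backup_energy_j + iv.boost_up_energy_j) / iv.leakage_power_w


# Hypothetical short idle period: 1 mW leakage for 1 ms, 2 uJ per transfer.
short_idle = StandbyInterval(leakage_power_w=1e-3, duration_s=1e-3,
                             backup_energy_j=2e-6, boost_up_energy_j=2e-6)

for name, fn in [("no gating", energy_no_gating),
                 ("volatile + gating", energy_volatile_gating),
                 ("nonvolatile + gating", energy_nonvolatile_gating)]:
    print(f"{name:22s} {fn(short_idle):.2e} J")
print(f"break-even idle time   {break_even_duration(short_idle):.2e} s")
```

With these assumed numbers the idle period is shorter than the break-even time, so volatile power gating costs more energy than it saves, which is the situation that may discourage its use; the idealized nonvolatile case has no such threshold.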

Figure [3a](#page-3-0) shows a nonvolatile VLSI processor architecture, where high-density, high-speed MRAMs and nonvolatile flip-flops are used to realize a nonvolatile logic LSI in a simple manner [\[4](#page-17-0), [5](#page-17-0)]. If a part of the nonvolatile on-chip memory could be merged into the logic-circuit modules as shown in Fig. [3b](#page-3-0), it would be possible to improve the performance of the nonvolatile logic LSI. In the following sections, some concrete design examples using the MTJ-based NV-LIM architecture, such as a nonvolatile field-programmable gate array (FPGA) [6–14], a nonvolatile ternary content-addressable memory (TCAM) [[15–22\]](#page-17-0), and a nonvolatile random-access logic-LSI unit (MCU) [[23,](#page-17-0) [24\]](#page-18-0), are demonstrated and their usefulness is discussed.

<span id="page-2-0"></span>


Fig. 2 Combination of power-gating and nonvolatile logic techniques; (a) Conventional CPU without power gating, (b) Conventional CPU with power gating, (c) NV-LIM CPU with power gating

<span id="page-3-0"></span>

Fig. 3 Configuration of nonvolatile logic LSIs; (a) 1st-generation nonvolatile logic-LSI architecture, (b) 2nd-generation nonvolatile logic-LSI architecture

## 2 Design Example of NV-LIM-Based FPGA

A field-programmable gate array (FPGA) is a key device for quickly realizing prototype systems, because its specification and function are directly programmable by users. However, power consumption as well as hardware cost is a serious problem in expanding the application fields of FPGAs, especially mobile and portable applications [25]. The use of MTJ devices could solve the power-dissipation problem. Figure [4](#page-4-0) shows the overall structure of a nonvolatile FPGA, where each lookup table (LUT) circuit in the configurable logic block (CLB) stores its logical configuration data in MTJ devices. Therefore, whenever an LUT circuit is in standby mode, its power supply can be shut down, which completely eliminates the wasted standby power dissipation.

Although the use of MTJ devices makes the LUT circuit nonvolatile, its hardware cost actually increases when conventional SRAM-cell-based volatile storage elements are simply replaced with nonvolatile ones, because every MTJ-based nonvolatile storage element generally requires a sense amplifier (SA), as shown in Fig. [5a](#page-4-0). In order to reduce this hardware overhead, MTJ devices are merged into the combinational logic circuit as shown in Fig. [5b](#page-4-0); this technique is the circuit-level NV-LIM architecture [[6](#page-17-0)]. As a result, the LUT circuit becomes compact, because only a single SA is required in the proposed LUT circuit. Figure [6a](#page-5-0) shows the circuit diagram of the MTJ-based two-input nonvolatile LUT circuit, and Fig. [6b](#page-5-0) shows a fabricated two-input nonvolatile LUT-circuit test chip and its features. The immediate wakeup behavior of the nonvolatile LUT circuit has been confirmed by the measured waveforms shown in Fig. [7](#page-5-0).
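The functional behavior described above (configuration data held in MTJs, power shut off in standby, immediate wakeup without reloading) can be illustrated with a purely behavioral sketch. The class `NonvolatileLUT` and its methods are hypothetical names introduced here; the model says nothing about the actual circuit, the sense amplifier, or timing.

```python
class NonvolatileLUT:
    """Behavioral sketch of a k-input LUT whose truth table lives in MTJs."""

    def __init__(self, truth_table):
        n = len(truth_table)
        if n < 2 or n & (n - 1):
            raise ValueError("truth table length must be a power of two")
        # One configuration bit per MTJ; retained regardless of power state.
        self._mtj_bits = [int(bool(b)) for b in truth_table]
        self.num_inputs = n.bit_length() - 1
        self.powered = True

    def power_off(self):
        """Standby: supply is cut, the MTJ states are left untouched."""
        self.powered = False

    def power_on(self):
        """Wakeup: no reload step, the stored configuration is still there."""
        self.powered = True

    def evaluate(self, *inputs):
        """Select the configuration bit addressed by the input variables."""
        if not self.powered:
            raise RuntimeError("LUT is power-gated")
        if len(inputs) != self.num_inputs:
            raise ValueError(f"expected {self.num_inputs} inputs")
        index = 0
        for bit in inputs:
            index = (index << 1) | int(bool(bit))
        return self._mtj_bits[index]


# Two-input LUT configured as XOR (entries for inputs 00, 01, 10, 11).
xor_lut = NonvolatileLUT([0, 1, 1, 0])
before = [xor_lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)]
xor_lut.power_off()
xor_lut.power_on()
after = [xor_lut.evaluate(a, b) for a in (0, 1) for b in (0, 1)]
print(before, after)
```

The point of the sketch is only that the function computed after a power cycle is identical to the one before it, with no configuration reload in between.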

<span id="page-4-0"></span>

Fig. 5 Design philosophy of a compact nonvolatile LUT circuit; (a) Conventional approach, (b) proposed NV-LIM architecture-based approach

<span id="page-5-0"></span>

Fig. 7 Immediate wakeup behavior of the 2-input nonvolatile LUT circuit

In a practical FPGA, the number of input variables of the LUT function must be four or more. In that case, the variation of the resistance values of MTJ devices becomes critical, because a multi-input LUT circuit requires many MTJ devices and MOS transistors connected in series. For stable and reliable operation of the multi-input LUT circuit, we insert "redundant" MTJ devices to adjust the operating point of the LUT circuit. Figure 8 shows a design example of the multi-input nonvolatile LUT circuit, where twice the number of MTJs and three additional MTJs are inserted into the LUT-selection tree and the LUT-reference tree, respectively [\[7](#page-17-0), [8](#page-17-0)]. Figure [9](#page-7-0) summarizes the comparison of multi-input LUT circuits and clearly demonstrates the implementation of the proposed NV-LIM-based NV-LUT circuit.

Fig. 8 Resistance-variation compensation technique using redundant MTJ devices

Not only the LUT circuit but also the other components of an FPGA chip, the switch block (SB) and the connection block (CB), are efficiently implemented by using the circuit-level NV-LIM architecture. Since the write-current characteristic of the MTJ device is left-right asymmetric, as shown in Fig. [10a](#page-7-0), only a single MOS transistor with a large width is shared by every NV latch, while each NV latch has only MOS transistors with a small width, as shown in Fig. [10b](#page-7-0), which greatly reduces the effective chip area of routers [[9](#page-17-0)]. Figure [11](#page-8-0) shows a fabricated nonvolatile FPGA chip, where almost 1,000 tiles (each tile consists of a logic element (LE), a CB and an SB, that is, the minimum set of basic components in an FPGA chip) are integrated in an area of 3.4 × 2.0 mm<sup>2</sup> under 90 nm CMOS/perpendicular-MTJ technologies. This high-density integration of a nonvolatile FPGA chip was first achieved by using the NV-LIM architecture [[10,](#page-17-0) [11](#page-17-0)].

As a future prospect, it is also important to design nonvolatile logic LSIs using three-terminal MTJ (3T-MTJ) devices [[12–14](#page-17-0)], because the write-current path is separated from the read-current path in the 3T-MTJ device [\[26](#page-18-0), [27](#page-18-0)], which greatly relaxes the circuit-design restrictions of nonvolatile logic LSIs. Figure [12](#page-8-0) shows a 3T-MTJ-based nonvolatile LE. The use of 3T-MTJ devices makes the LE compact and improves its switching speed, because the read-current level can be determined appropriately, independent of the write-current level.

<span id="page-7-0"></span>

Fig. 9 Comparison of multi-input LUT circuits

Fig. 10 Compact nonvolatile routing switch sharing write driver

<span id="page-8-0"></span>

Fig. 11 Fabricated nonvolatile FPGA chip



Fig. 12 Design example of a 3-T MTJ-based logic element using NV-LIM architecture

## 3 Design Example of NV-LIM-Based TCAM

<span id="page-9-0"></span>

As a typical example of a nonvolatile special-purpose logic LSI using the NV-LIM structure, MTJ-based nonvolatile ternary content-addressable memories (TCAMs) have been designed and fabricated [\[15–22](#page-17-0)]. A TCAM is a functional memory for high-speed data retrieval that performs a fully parallel search and fully parallel comparison between an input key and stored words. Currently, its high bit cost and high power dissipation, both higher than those of standard semiconductor memories such as static random-access memory, limit the fields to which TCAM can be applied. Figure 13 shows the truth table of the TCAM cell function. Its rich functionality makes data search powerful and flexible, but a conventional CMOS realization incurs the cost of a complicated logic circuit with two-bit storage elements. Figure 14 shows the design philosophy of realizing the TCAM cell circuit compactly with a nonvolatile storage capability. For both the conventional volatile TCAM cell structure and the conventional nonvolatile TCAM cell structure without the LIM architecture, shown in Fig. 14a, b, respectively, the bit cost is high. In contrast, when the two-bit storage elements are merged into the logic-circuit part by using the LIM architecture as shown in Fig. [14c](#page-9-0), the proposed TCAM cell structure becomes compact and nonvolatile, as in Fig. [14d](#page-9-0). Figure 15a, b compare a conventional volatile TCAM cell circuit and the proposed nonvolatile one, respectively. The conventional CMOS-based volatile TCAM cell circuit uses 12 MOS transistors (12T-TCAM circuit structure), while the proposed one takes just 4 MOS transistors with two MTJ devices (4T-2MTJ circuit structure) [[15–20\]](#page-17-0). Note that the MTJs do not affect the total TCAM cell-circuit area, because the MTJs are fabricated on top of the CMOS plane. The compact realization due to the NV-LIM architecture also makes it possible to improve the performance of the circuit by inserting a driver, as shown in Fig. 15c.

Fig. 13 Truth table of the TCAM cell function

Fig. 14 Design philosophy of making a compact TCAM cell circuit; (a) conventional TCAM cell structure, (b) conventional NV-TCAM cell structure, (c) MTJ device merging storage and logic functions, (d) proposed NV-TCAM cell structure

Fig. 15 TCAM cell-circuit design; (a) conventional volatile TCAM cell circuit, (b) proposed 4T-2MTJ NV-TCAM cell circuit, (c) proposed 7T-2MTJ NV-TCAM cell circuit

<span id="page-11-0"></span>

Fig. 17 Power-gating-oriented TCAM search schemes; (a) fully parallel search scheme, (b) series-parallel search scheme, (c) bit-serial search scheme

Figure [16](#page-11-0) summarizes the comparison of TCAM word circuits with 144 cells. By dividing the TCAM word circuit appropriately, the activation ratio of the TCAM can be minimized. Figure [17](#page-11-0) shows a variety of segment-based TCAM word-circuit structures. In the case of the 3-segment-based NV-TCAM word-circuit structure, where the first segment, the second segment and the rest consist of 3-bit, 7-bit and 134-bit cells, respectively, the average activation ratio becomes as low as 2.8 %, which indicates that 97 % or more of the TCAM cells can be in standby mode on average under fine-grained power gating [\[15–20](#page-17-0)]. Figure [18](#page-12-0) shows the fabricated nonvolatile TCAM test chip under 90-nm CMOS/MTJ technologies, which is used as a high-speed index search engine [[17\]](#page-17-0).
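The ternary matching and the segment-based search described above can be sketched as follows. The chapter does not state the model behind the 2.8 % figure; this sketch assumes that a segment is activated only if all earlier segments matched, and that each cell matches independently with probability 1/2 (random binary data, no don't-care cells). The function names and the string encoding of stored words (`'0'`, `'1'`, `'X'`) are hypothetical.

```python
def cell_match(stored: str, key_bit: str) -> bool:
    """Ternary cell: 'X' (don't care) matches anything, else bits must agree."""
    return stored == "X" or stored == key_bit


def segmented_search(word: str, key: str, segment_sizes):
    """Search one stored word segment by segment.

    Returns (matched, cells_activated). Later segments stay power-gated
    as soon as an earlier segment reports a mismatch.
    """
    if not (sum(segment_sizes) == len(word) == len(key)):
        raise ValueError("segment sizes must cover the whole word and key")
    activated = 0
    start = 0
    for size in segment_sizes:
        activated += size
        segment = range(start, start + size)
        if not all(cell_match(word[i], key[i]) for i in segment):
            return False, activated
        start += size
    return True, activated


def average_activation_ratio(segment_sizes, p_match=0.5):
    """Expected fraction of active cells under the independence assumption."""
    expected_active = 0.0
    p_reach = 1.0  # probability that this segment is activated at all
    for size in segment_sizes:
        expected_active += p_reach * size
        p_reach *= p_match ** size
    return expected_active / sum(segment_sizes)


ratio = average_activation_ratio([3, 7, 134])
print(f"average activation ratio: {ratio * 100:.1f} %")
print(segmented_search("1X0", "110", [1, 2]))  # all segments examined
print(segmented_search("0X0", "110", [1, 2]))  # stops after first segment
```

Under these assumptions the 3/7/134 split of a 144-cell word evaluates to about 2.8 %, which is consistent with the value quoted above: the first 3 cells are always active, the next 7 are needed for one word in eight, and the remaining 134 only for roughly one word in a thousand.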

Robustness against soft errors due to particle strikes is becoming an important factor in practical applications. An MTJ device stores one-bit information as a resistance whose value is robust against alpha-particle and atmospheric-neutron strikes, which significantly lowers the probability of single-event upsets (SEUs). The TCAM also becomes robust against delay variations caused by single-event transients (SETs), as it is designed based on four-phase dual-rail encoding realized using complementary NAND-type and NOR-type word circuits, as shown in Fig. [19](#page-12-0) [[21,](#page-17-0) [22\]](#page-17-0). The dual-rail TCAM cell is compactly designed using 20 transistors (20T) and 4 MTJ devices stacked on a CMOS layer, as opposed to a single-rail 24T TCAM cell that consists of soft-error-tolerant storage elements. In addition, soft errors can be detected using the dual-rail signals. As a design example, a 256-word × 64-bit TCAM is designed under a 90-nm CMOS/MTJ technology and is evaluated with a collected charge caused by a particle strike, which induces the SET and hence the delay variation. Figure 20 summarizes the performance comparison of TCAMs. The proposed TCAM operates properly under the delay variation while achieving performance comparable to that of a synchronous single-rail TCAM, in which a timing error of up to 25 % occurs.

<span id="page-12-0"></span>

Fig. 18 Fabricated nonvolatile TCAM test chip and its features

Fig. 19 Design of asynchronous dual-rail nonvolatile TCAM word circuit with soft-error tolerance

|                               | Synchronous (CMOS) | Extension of ASYNC'13 (CMOS) | Proposed (CMOS/MTJ) |
|-------------------------------|--------------------|------------------------------|---------------------|
| Cycle time [ns]               | 3.398              | N/A                          | 3.410               |
| (search delay [ns])           | 1.699              | N/A                          | 2.330 (data)        |
| (precharge delay [ns])        | 1.699              | N/A                          | 1.060 (spacer)      |
| Energy metric [fJ/bit/search] | 0.580              | N/A                          | 0.686               |
| TCAM cell                     | 24T                | 48T (2× 24T)                 | 20T-4MTJ            |
| SEU tolerant in cell          | Yes                | Yes                          | Yes                 |
| SEU free in cell              | No                 | No                           | Yes ($<10^{-50}$)   |
| Delay-variation tolerant      | No                 | Yes                          | Yes                 |
| Soft-error detection          | No                 | No                           | Yes                 |

Fig. 20 Comparison of TCAMs with soft-error tolerance
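The four-phase dual-rail encoding mentioned in the soft-error discussion above can be illustrated with a short sketch. It follows the commonly used convention in which each bit travels on two wires, a spacer separates successive data words, and the codeword with both rails high is never produced by a valid sender. The rail ordering `(true_rail, false_rail)` and all function names are assumptions made here; the chapter does not give the codeword assignment or the detection circuit.

```python
SPACER = (0, 0)  # both rails low: "no data", separates successive values


def encode(bit: int):
    """Map one bit to a dual-rail codeword (true_rail, false_rail)."""
    return (1, 0) if bit else (0, 1)


def decode(rails):
    """Classify a sampled rail pair."""
    if rails == SPACER:
        return "spacer"
    if rails == (1, 0):
        return "1"
    if rails == (0, 1):
        return "0"
    return "error"  # (1, 1) cannot come from a valid sender


def four_phase_stream(bits):
    """Alternate data and spacer phases, as in a four-phase handshake."""
    stream = []
    for bit in bits:
        stream.append(encode(bit))
        stream.append(SPACER)
    return stream


stream = four_phase_stream([1, 0, 1])
print([decode(r) for r in stream])

# A transient that raises the idle rail during a data phase becomes visible.
corrupted = (1, 1)
print(decode(corrupted))
```

Because validity is carried by the data itself rather than by a clock edge, a delayed transition only postpones the moment at which a codeword becomes valid, and an illegal rail combination can be flagged instead of being silently sampled.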

## 4 Design Example of Nonvolatile Random-Access Logic LSI

In order to design and implement an MTJ-based nonvolatile random-access logic circuit, we must build a basic gate family using the NV-LIM architecture. We have employed a nonvolatile full adder circuit to demonstrate a circuit based on the logic-in-memory architecture [\[28](#page-18-0)]. Figure [21](#page-14-0) shows the circuit diagram of the full adder. It consists of SUM-circuit and CARRY-circuit parts, where A (A', the complement of A) and Ci (Ci') are the external inputs and B (B') is a stored input. The use of a dynamic logic style [\[29](#page-18-0)] controlled by the clock signals CLK and CLK' cuts off the steady current flow from the supply voltage VDD to GND, which reduces the dynamic power dissipation of the circuit (a pre-charged sense amplifier [[30\]](#page-18-0) has also been presented as a high-speed, highly stable and low-power logic style for nonvolatile logic). The stored data are programmed by controlling external signals. The complementary stored inputs, B and B', are programmed by using individual current-flow paths, which are selectable by the word lines WL1, WL2, WL3, and WL4 and the bit lines BL and BL'. For example, when storing B = 0 into the corresponding MTJ in the SUM circuit, the word line WL1 is set to the supply voltage VDD, and BL and BL' are set to GND and VDD, respectively, which sets up the current-flow path through the MTJ as shown in Fig. 21. All the external inputs and the complementary clock signals are turned off during this write operation.

<span id="page-14-0"></span>

Fig. 21 Circuit diagram of a nonvolatile full adder with MTJ-based logic-in-memory architecture

Figure [22](#page-15-0) shows the measured waveforms of the SUM circuit chip, where the stored inputs, B and B', are fixed to "0" and "1", respectively, and periodic 1.0-V peak-to-peak voltage signals are applied to CLK, CLK', A, A', Ci, and Ci' while VDD = 1.0 V is periodically turned on and off. It can be clearly seen in the traces of Fig. 22 that the output S<sub>after</sub> (S right after power-on) is the same as S<sub>before</sub> (S just before power-off), which means that the stored data remain intact even if VDD is shut down and turned on again. It should be noted that the nonvolatile storage function of the present circuit is realized without employing complex reload/write-back from/into an off-chip nonvolatile storage device.
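The logical behavior of the full adder with a stored operand, and the S<sub>before</sub> = S<sub>after</sub> observation, can be mimicked by a small behavioral model. The class `NonvolatileFullAdder` and its method names are hypothetical; the model captures only the Boolean function and the retention of B across a power cycle, not the dynamic-logic timing or the MTJ write currents.

```python
class NonvolatileFullAdder:
    """Behavioral sketch: A and Ci are external inputs, B is held in MTJs."""

    def __init__(self, stored_b: int = 0):
        self._b = int(bool(stored_b))  # survives power-off in this model
        self.vdd_on = True

    def write_b(self, value: int):
        """Program the stored input (the complement B' is implied)."""
        self._b = int(bool(value))

    def power_off(self):
        self.vdd_on = False  # stored B is deliberately left untouched

    def power_on(self):
        self.vdd_on = True   # no reload from off-chip storage is modeled

    def evaluate(self, a: int, ci: int):
        """Return (S, Co) for external inputs a, ci and the stored b."""
        if not self.vdd_on:
            raise RuntimeError("VDD is off")
        b = self._b
        s = a ^ b ^ ci
        co = (a & b) | (b & ci) | (a & ci)
        return s, co


adder = NonvolatileFullAdder(stored_b=0)  # B = "0", B' = "1" as in Fig. 22
inputs = [(a, ci) for a in (0, 1) for ci in (0, 1)]

s_before = [adder.evaluate(a, ci)[0] for a, ci in inputs]
adder.power_off()
adder.power_on()
s_after = [adder.evaluate(a, ci)[0] for a, ci in inputs]
print(s_before == s_after)
```

With B fixed to 0 the SUM output reduces to A XOR Ci, so identical input sequences before and after the power cycle yield identical S traces as long as the stored bit is retained.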

In order to design practical-scale MTJ-based NV-LIM LSIs, it is important to establish a (semi-)automated design flow. We have developed such a flow by combining de facto standard electronic design automation (EDA) tools with new supplementary design tools for precise simulation of MTJ device characteristics, as shown in Fig. [23](#page-15-0) [\[31](#page-18-0)]. By using the proposed flow, various MTJ-based NV-LIM circuits can be designed in HDL, and the corresponding layout, including MOS and MTJ/MOS-hybrid cells, can be synthesized automatically, as shown in Fig. [24](#page-16-0), where the layout validity can be completely verified through DRC and LVS.

As a typical example of a nonvolatile random-access logic LSI, we have developed a motion-vector prediction unit [[23,](#page-17-0) [24\]](#page-18-0). Figure [25](#page-16-0) shows a test-chip photomicrograph of the motion-vector prediction unit, fabricated using a 90 nm MTJ/MOS process on a 300 mm wafer fabrication line. Twenty-five processing elements (PEs) are arranged in a 5 × 5 grid, which reduces the power dissipation to one-fourth. The number of MOS transistors is about 0.5 million and that of MTJ devices is about 13,000.

<span id="page-15-0"></span>

Fig. 22 Measured waveforms of the SUM circuit chip with the proposed NV-LIM architecture

Fig. 23 STT-MTJ device model built in SPICE simulator; (a) example of a netlist, (b) corresponding equivalent circuit, (c) simulated waveforms

<span id="page-16-0"></span>

Fig. 24 Layout-design example of an NV-LIM-based random logic circuit





<span id="page-17-0"></span>Acknowledgement The author thanks M. Natsui, A. Mochizuki, S. Matsunaga, D. Suzuki, N. Onizawa, S. Ikeda, T. Endoh, and H. Ohno for their great contribution concerning technical support and chip-fabrication support. A part of this research was supported by JSPS FIRST Program.

#### References

- 1. H. Ohno, T. Endoh, T. Hanyu, N. Kasai, S. Ikeda, in International Electron Device Meeting (IEDM), 2010, p. 9.4
- 2. T. Hanyu, SPIN 3(4), 1340014 (2013)
- 3. T. Hanyu, D. Suzuki, A. Mochizuki, M. Natsui, N. Onizawa, T. Sugibayashi, S. Ikeda, T. Endoh, H. Ohno, in International Electron Device Meeting (IEDM), 2014, in press
- 4. N. Sakimura, Y. Tsuji, R. Nebashi, H. Honjo, A. Morioka, K. Ishihara, K. Kinoshita, S. Fukami, S. Miura, N. Kasai, T. Endoh, H. Ohno, T. Hanyu, T. Sugibayashi, in IEEE International Solid-State Circuits Conf. (ISSCC), 2014, pp. 184–185
- 5. H. Koike, T. Ohsawa, N. Sakimura, R. Nebashi, Y. Tsuji, A. Morioka, K. Miura, H. Honjo, T. Sugibayashi, S. Ikeda, T. Hanyu, H. Ohno, T. Endoh, in IEEE Asian Solid-State Circuits Conference (ASSCC 2013), 2013, pp. 317–320
- 6. D. Suzuki, M. Natsui, S. Ikeda, H. Hasegawa, K. Miura, J. Hayakawa, T. Endoh, H. Ohno, T. Hanyu, in IEEE Symposium on VLSI Circuits, Dig. Tech. Papers, 2009, p. 80
- 7. D. Suzuki, M. Natsui, T. Hanyu, Trans IEICE-C, J92-C(7), 233–240, (2009) (Japanese).
- 8. D. Suzuki, M. Natsui, T. Endoh, H. Ohno, T. Hanyu, J. Appl. Phys. 111, 07E318 (2012)
- 9. D. Suzuki et al., J. Appl. Phys. 115, 17B742 (2014)
- 10. D. Suzuki et al., IEICE ELEX 10, 20130772 (2013)
- 11. D. Suzuki, et al., IEEE Trans Magn (2014, in press).
- 12. D. Suzuki, Y. Lin, M. Natsui, T. Hanyu, Jpn. J. Appl. Phys. 52(4), 04CM04 (2013)
- 13. D. Suzuki, M. Natsui, A. Mochizuki, T. Hanyu, Jpn. J. Appl. Phys. 53(4S), 04EM03 (2014)
- 14. D. Suzuki, N. Sakimura, M. Natsui, A. Mochizuki, T. Sugibayashi, T. Endoh, H. Ohno, T. Hanyu, IEICE Electronics Express (2014, in press).
- 15. S. Matsunaga, A. Katsumata, M. Natsui, S. Fukami, T. Endoh, H. Ohno, T. Hanyu, in IEEE Symp VLSI Circuits, 2011, pp. 298–299
- 16. S. Matsunaga, S. Miura, H. Honjou, K. Kinoshita, S. Ikeda, T. Endoh, H. Ohno, T. Hanyu, in IEEE Symp VLSI Circuits, 2012, pp. 44–45
- 17. S. Matsunaga, N. Sakimura, R. Nebashi, Y. Tsuji, A. Morioka, T. Sugibayashi, S. Miura, H. Honjo, K. Kinoshita, H. Sato, S. Fukami, M. Natsui, A. Mochizuki, S. Ikeda, T. Endoh, H. Ohno, T. Hanyu, in IEEE Symp VLSI Circuits, 2013, pp. C106–C107
- 18. S. Matsunaga, A. Katsumata, M. Natsui, T. Endoh, H. Ohno, T. Hanyu, Jpn. J. Appl. Phys. 51 (2), 02BM06 (2012)
- 19. S. Matsunaga, A. Katsumata, M. Natsui, T. Endoh, H. Ohno, T. Hanyu, J. Appl. Phys. 111(7), 07E336 (2012)
- 20. S. Matsunaga, M. Natsui, S. Ikeda, K. Miura, T. Endoh, H. Ohno, T. Hanyu, in Proc. Asia and South Pacific Design Automation Conf. (ASP-DAC), 2012, pp. 475–476
- 21. S. Matsunaga, A. Mochizuki, T. Endoh, H. Ohno, T. Hanyu, IEICE Electron. Express 11(3), 20131006 (2014)
- 22. N. Onizawa, S. Matsunaga, and T. Hanyu, in 20th IEEE International Symposium on Asynchronous Circuits and Systems (ASYNC 2014), pp. 1–8, 2014
- 23. M. Natsui, D. Suzuki, N. Sakimura, R. Nebashi, Y. Tsuji, A. Morioka, T. Sugibayashi, S. Miura, H. Honjo, K. Kinoshita, S. Ikeda, T. Endoh, H. Ohno, T. Hanyu, in IEEE International Solid-State Circuits Conf. (ISSCC), Dig. Tech. Papers, pp. 194–195, 2013
- <span id="page-18-0"></span>24. M. Natsui, D. Suzuki, N. Sakimura, R. Nebashi, Y. Tsuji, A. Morioka, T. Sugibayashi, S. Miura, H. Honjo, K. Kinoshita, S. Ikeda, T. Endoh, H. Ohno, T. Hanyu, in IEEE Journal of Solid-State Circuits, 2015, in press
- 25. S. Brown, J. Rose, IEEE Des. Test Comput. 13, 42 (1996)
- 26. S. Fukami et al., IEEE TMAG 48, 2152 (2012)
- 27. S. Fukami et al., IEDM Tech. Dig. 72 (2013)
- 28. S. Matsunaga, J. Hayakawa, S. Ikeda, K. Miura, H. Hasegawa, T. Endoh, H. Ohno, T. Hanyu, Appl. Phys. Express 1, 091301 (2008)
- 29. A. Mochizuki, H. Kimura, M. Ibuki, T. Hanyu, IEICE Trans. Fundam. E88-A, 1408–1415 (2005)
- 30. W. Zhao, C. Chappert, V. Javerliac, J.-P. Noziere, IEEE Trans. Magn. 45, 3784–3787 (2009)
- 31. N. Sakimura, R. Nebashi, Y. Tsuji, H. Honjo, T. Sugibayashi, H. Koike, T. Ohsawa, S. Fukami, T. Hanyu, H. Ohno, T. Endoh, in IEEE International Symposium on Circuits and Systems (ISCAS), 2012, pp. 1971–1974.