### TECHNICAL PAPER



# Design of SRAM cell for low power portable healthcare applications

Soumitra Pal<sup>1</sup> · Subhankar Bose<sup>2</sup> · Aminul Islam<sup>2</sup>

Received: 9 September 2018 / Accepted: 9 March 2020 / Published online: 14 March 2020 - Springer-Verlag GmbH Germany, part of Springer Nature 2020

### Abstract

Biomedical applications such as body area networks (BANs) require power-efficient SRAM cells to extend the battery life of BAN sensor nodes. In this work, we propose a robust, low-power single-ended 9T (SE9T) bitcell that supports bit-interleaving. The design metrics of our bitcell are compared with those of several bitcells such as the 7T, FD8T and SEDF9T cells. The proposed cell shows $2.87\times/3.36\times$ higher RSNM than that of 7T/FD8T and $1.05\times/1.5\times/7.0\times$ higher WSNM than that of 7T/FD8T/SEDF9T, and $1.15\times/1.06\times$ and $1.54\times/1.38\times$ narrower spread in $T_{RA}$ and $I_{READ}$, respectively, compared to 7T/FD8T. In addition, the proposed cell shows $1.15\times/1.22\times$ shorter $T_{WA}$ when compared to SEDF9T/7T. Furthermore, the SE9T cell consumes $10.80\times/17.81\times$ lower write power than that of SEDF9T/FD8T and $1.52\times/18.37\times$ lower read power than that of 7T/FD8T. It also exhibits $1.04\times/2.92\times$ lower leakage power dissipation than that of FD8T/7T. All these improvements are obtained at a cost of $2.5\times$ longer $T_{WA}$ and $1.73\times/1.73\times$ longer $T_{RA}$ when compared to FD8T and 7T/FD8T, respectively, and $1.64\times/1.06\times$ higher write power/read power than 7T/SEDF9T @ $V_{\text{DD}} = 700 \text{ mV}$.

# 1 Introduction

Rapid advancements in technology have made it possible to continuously monitor the health of a patient with the help of small and precise sensors attached to their body. The detection of blood pressure and low oxygen levels are some of its potential applications (Izumi et al. [2015](#page-10-0)). Such a collection of sensors capable of wireless communication forms a wireless network system, which is popularly known as a wireless body area network (WBAN) (Dautov and Tsouri [2016](#page-10-0)).

WBAN nodes generally have a very small form factor $(< 1 \text{ cm}^3)$, which limits their on-sensor battery energy and constrains them to a very low power budget (Sharma et al. [2012\)](#page-11-0). However, the data collected by the sensors must be transferred wirelessly to a processing unit, which is a power-hungry process. Thus, to reduce power consumption, the amount of transmitted data must be minimized (Kwong and Chandrakasan [2011](#page-10-0)). This is realized by employing signal processing on the wireless sensor nodes, which requires a high-density SRAM (static RAM) along with an intelligent processor (Sharma et al. [2012](#page-11-0)). WBAN nodes are mostly "ON" as they continuously collect, process and transfer data for the real-time diagnosis of patients. Consequently, the dynamic power consumption of SRAM-based cache memories must be considerably low to ensure extended battery lives of WBAN nodes (Sharma et al. [2012\)](#page-11-0).

✉ Aminul Islam aminulislam@bitmesra.ac.in · Soumitra Pal spal@connect.ust.hk

<sup>1</sup> Department of Electronic and Computer Engineering, HKUST, Clear Water Bay, Hong Kong

<sup>2</sup> Department of ECE, BIT, Mesra, Ranchi 835215, India

Since dynamic (active) power is a quadratic function of the supply voltage $(V_{DD})$, it can be significantly reduced by downscaling $V_{\text{DD}}$ (Morifuji et al. [2006\)](#page-11-0). Moreover, a reduction in supply voltage also decreases static power consumption, which accounts for a considerable portion of the total power consumption and is linearly dependent on $V_{DD}$ (Gupta et al. [2018](#page-10-0)). Thus, downscaling $V_{\text{DD}}$ decreases the overall power consumption.
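As a rough illustration of this scaling argument, consider the standard first-order expressions for switching and leakage power. The activity factor $\alpha$, switched capacitance $C$, operating frequency $f$, leakage current $I_{\text{leak}}$ and the 1.0 V reference supply are assumptions introduced here for illustration only; they are not values reported in this work.

$$
P_{\text{dynamic}} = \alpha\, C\, V_{DD}^{2}\, f, \qquad P_{\text{static}} = I_{\text{leak}}\, V_{DD}
$$

$$
\frac{P_{\text{dynamic}}(0.7\ \text{V})}{P_{\text{dynamic}}(1.0\ \text{V})} = \left(\frac{0.7}{1.0}\right)^{2} = 0.49, \qquad \frac{P_{\text{static}}(0.7\ \text{V})}{P_{\text{static}}(1.0\ \text{V})} = \frac{0.7}{1.0} = 0.7
$$

Under these assumptions, and with $\alpha$, $C$, $f$ and $I_{\text{leak}}$ held constant, lowering the supply from 1.0 V to 0.7 V roughly halves the dynamic power and reduces the static power by 30%. In practice $I_{\text{leak}}$ itself tends to fall with $V_{DD}$, so the static saving is likely somewhat larger than this estimate.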

However, as the supply voltage decreases, the operational delay increases, which causes substantial energy consumption in every read/write cycle (Nabavi and Sachdev [2018](#page-11-0)). Moreover, as $V_{\text{DD}}$ reduces, $(V_{\text{DD}} - V_t)$ also reduces, which severely deteriorates the noise margin and renders the circuit unreliable (Maroof and Kong [2017\)](#page-10-0). In addition, the stability of the SRAM cell may degrade further due to the extensive presence of <span id="page-1-0"></span>PVT variations and random dopant fluctuations (RDF) in submicron technologies (Nayak et al. [2017\)](#page-11-0).

At low supply voltages, the read static noise margin (RSNM) of the conventional 6T bitcell is considerably degraded and hence it is highly susceptible to read upsets (Pal and Islam [2016a\)](#page-11-0). Similarly, the 6T cell cannot maintain the drivability of its transistors at low $V_{\text{DD}}$, which leads to a high probability of write failure, as the cell may be unable to flip the stored data. Moreover, the half-select issue is another major problem in the conventional 6T cell, which leads to erroneous writes in half-selected cells.

Over the past couple of decades, several modifications to the conventional design have been proposed to overcome its limitations. The fully differential 8T cell (FD8T) in Anh-Tuan et al. ([2011](#page-10-0)) employs an additional decoupling inverter to prevent half-select disturbance. However, it is essentially a conventional 6T cell and suffers from read upsets. The read-decoupled SRAM cells proposed in Pal and Islam [\(2016a,](#page-11-0) [b](#page-11-0)), Islam and Hasan [\(2012a\)](#page-10-0), Chiu and Hu [\(2014](#page-10-0)), Sharma et al. ([2018\)](#page-11-0) and Ensan et al. [\(2018](#page-10-0)) isolate storage nodes from bit lines. Consequently, they exhibit significant improvements in RSNM. Moreover, by employing an additional tail transistor in their core cell, the leakage power dissipation of LP9T (Pal and Islam [2016a](#page-11-0)), LP10T (Islam and Hasan [2012a](#page-10-0)) and LP11T (Pal and Islam [2016b](#page-11-0)) is considerably curtailed.

However, such improvements are achieved at the expense of very high dynamic power consumption. The most effective way of curtailing dynamic power is the use of single-ended structures as their bitline activity factor is below 0.5 (Aly and Bayoumi [2007\)](#page-10-0). However, in the absence of any write assist mechanism, such single-ended cells are incapable of writing '1' (Tu et al. [2010](#page-11-0)).

Therefore, several distinct write assist schemes have been employed by various single-ended cells, such as those in Aly and Bayoumi [\(2007](#page-10-0)), Tu et al. ([2010,](#page-11-0) [2012](#page-11-0)), Pal et al. [\(2019a,](#page-11-0) [b](#page-11-0), [2020a\)](#page-11-0), Farkhani et al. [\(2014](#page-10-0)), Kushwah et al. [\(2017](#page-10-0)), Takeda et al. [\(2006](#page-11-0)) and Tawfik and Kursun ([2008\)](#page-11-0), to successfully complete the '1' writing operation. For example, by using a feedback-cutting transistor inside the core cell, the 7T cell in Aly and Bayoumi [\(2007](#page-10-0)) and the 9T cell in Pal et al. [\(2019a](#page-11-0)) exhibit significant improvements in write static noise margin (WSNM), i.e., write ability. However, in the absence of decoupling techniques, they are prone to frequent read upsets. Although the write ability of the cells in Tu et al. [\(2010](#page-11-0)) and Farkhani et al. ([2014](#page-10-0)) is enhanced due to the use of a core cell with asymmetrical inverter sizing, they are highly susceptible to PVT variations. The authors in Tu et al. ([2012](#page-11-0)) have suggested a 9T SRAM cell (SEDF9T) which exhibits an improved write ability by employing a negative bitline scheme at the expense of a high $V_{\text{DD,min}}$ when subjected to PVT variations. Read-disturb free 7T cells in Kushwah et al. [\(2017](#page-10-0)) and Takeda et al. ([2006\)](#page-11-0) employ an additional transistor in one of the inverters at the cost of a severely degraded hold stability. Although the dual $V_t$ SRAM cell in Tawfik and Kursun [\(2008](#page-11-0)) can write '1' successfully, it increases the fabrication complexity considerably. In order to enhance both read stability and write ability without one hampering the other, a write-assist low power 11T (WALP11T) cell (Pal et al. [2019c](#page-11-0)), a data aware power cut off (DAPC) SRAM cell (Chiu and Hu [2014](#page-10-0)), a Schmitt trigger based SRAM cell (Kulkarni et al. [2007](#page-10-0)) and a 12T DWA12T cell that cuts the loop dynamically (Pal et al. [2020b](#page-11-0)) have been previously proposed, which also exhibit robust behaviour when subjected to severe process variations. For all their advantages, these designs consume excessive active power and incur a considerable area overhead.

In order to address the various issues faced by the aforementioned designs, we propose a single-ended 9T (SE9T) SRAM cell (see Fig. 1) in this paper, which not only enhances the read stability and write ability individually, but also minimizes both active and static power consumption. Moreover, it is free of half-select disturbance.

The SE9T bitcell is described in detail in Sect. [2](#page-2-0). We compare our cell with cells from the literature in Sect. [3](#page-2-0), and a summary of the comparison is provided in Sect. [4](#page-9-0). We conclude the paper in Sect. [5](#page-10-0).



Fig. 1 Sketch of single-ended 9T (SE9T) static random-access memory bit cell, which is proposed in this work

# <span id="page-2-0"></span>2 The SE9T cell and its operation

The core cell of SE9T (see Fig. [1\)](#page-1-0) comprises the inverters INV1/INV2, formed by MP1/MP2 and MN1/MN2. The transistor MN6, placed between INV1 and INV2, is used as the feedback-cutting FET. CSL is the column selection control line; it turns ON the access transistor MN4, while the row-based word lines WL and WLB turn ON the write access transmission gate (TG) composed of MN3 and MP3. MN4 is connected to a single column-based bitline (BL). The read decoupling transistor MN5 has its gate connected to node QB and is connected between nodes X1 and X2. Figure 2 provides a simplified architecture of the proposed cell. MN5 is connected to GND via the FET RDT. RDT is a larger transistor that is shared by all cells in a row and is activated by the row-based signal RWL.

### 2.1 Feedback cutting write operation

At the beginning of the write operation, the column-based CSL and the row-based WL are set to $V_{DD}$ (with WLB at GND) to turn ON MN4 and the TG, respectively. The RDT is kept in the non-conducting state by grounding RWL. The row-based write enable signal WE is set to GND to turn OFF the feedback-cutting MN6, which cuts off the feedback path; with the feedback path intact, it would not be possible to complete the '1' writing operation. Consequently, the inverter adjacent to the BL drives the other inverter and the write operation is eventually completed. BL is driven by a write driver (not shown) to $V_{DD}$ for writing '1' or to GND for writing '0' to storage node Q.

Let us take the case of writing '1' with the assumption that QB holds '1' and Q holds '0' prior to this write operation. For this purpose, BL is kept at $V_{\text{DD}}$. As a result, the voltage on BL is passed on to node Q2. As the voltage at Q2 rises, the output of INV1 (QB) rapidly falls from $V_{\text{DD}}$. As $V_{\text{QB}}$ decreases, the input voltage of INV2 decreases. This turns MP2 ON and MN2 OFF, which causes $V_{\text{Q}}$, the output of INV2, to rise. Thus, logic '1' is stored at storage node Q while logic '0' is stored at storage node QB. Similarly, '0' is written in a complementary fashion.

Fig. 2 Simplified array-based architecture for SE9T SRAM cell

### 2.2 Single ended decoupled-read operation

During each read operation, WE is set to $V_{\text{DD}}$ to turn ON MN6 and ensure that the feedback path between the two inverters exists. The nodes Q and QB are physically separated from the pre-charged BL by setting the row-based WL/WLB to GND/$V_{DD}$, which turns the TG OFF.

At the beginning of read operation, CSL and RWL are activated to turn ON access transistor MN4 and shared read-discharge transistor RDT, respectively. Since the gate of MN5 is connected to storage node QB, BL either discharges or remains charged depending upon the data stored in QB.

A sense amplifier (not shown) is used to sense a 50 mV fall in voltage of BL with respect to a reference voltage, completing a read operation.

### 2.3 Hold operation

During the hold operation, CSL and WL/WLB are maintained at GND and GND/$V_{DD}$, respectively, to turn OFF the access transistor MN4 and the TG while BL remains precharged. To retain data through the feedback path, WE is maintained at $V_{\text{DD}}$ to keep MN6 ON. Since RWL is set to GND, the RDT is maintained in the OFF state.
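The write, read and hold modes described in this section can be summarized as a control-signal table. The following Python sketch is only an illustrative restatement of the text: the dictionary layout and the helper name `device_states` are our own, and WLB is assumed to be the complement of WL during a write.

```python
# Logic levels of the SE9T control lines in each mode (1 = V_DD, 0 = GND),
# transcribed from the operation descriptions in this section.
# Assumption: WLB is the complement of WL during a write.
SE9T_MODES = {
    "write": {"CSL": 1, "WL": 1, "WLB": 0, "WE": 0, "RWL": 0},
    "read":  {"CSL": 1, "WL": 0, "WLB": 1, "WE": 1, "RWL": 1},
    "hold":  {"CSL": 0, "WL": 0, "WLB": 1, "WE": 1, "RWL": 0},
}


def device_states(mode):
    """Return which devices conduct (True = ON) in the given mode."""
    s = SE9T_MODES[mode]
    return {
        "MN4": s["CSL"] == 1,                  # column-select access NMOS
        "TG": s["WL"] == 1 and s["WLB"] == 0,  # MN3/MP3 transmission gate
        "MN6": s["WE"] == 1,                   # feedback path between inverters
        "RDT": s["RWL"] == 1,                  # shared read-discharge transistor
    }


for mode in SE9T_MODES:
    print(mode, device_states(mode))
```

Reading the output row by row shows the intent of each mode: the feedback path (MN6) is open only while writing, and the read-discharge path (RDT) conducts only while reading.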

# 3 Simulation setup and results

SPICE and the 16-nm PTM ([http://ptm.asu.edu/](http://ptm.asu.edu/)) have been used in this work. We have compared our design with the existing 7T (Aly and Bayoumi [2007\)](#page-10-0) (Fig. [3](#page-3-0)), SEDF9T (Tu et al. [2012](#page-11-0)) (Fig. [4](#page-3-0)) and FD8T (Anh-Tuan et al. [2011](#page-10-0)) (Fig. [5](#page-3-0)) cells to determine its effectiveness. In addition to the prevalent read–write conflicts, the widespread effects of process variations are also instrumental in determining the sizing of SRAM cells. The RDF-induced $V_t$ shift is related to device dimensions and is given by

$$
\sigma_{V_t} \propto \frac{1}{\sqrt{\text{Width} \times \text{Length}}}.\tag{1}
$$

<span id="page-3-0"></span>

Fig. 3 Schematic of 7T SRAM cell (Aly and Bayoumi [2007](#page-10-0))



Fig. 4 Schematic of SEDF9T SRAM cell (Tu et al. [2012](#page-11-0))



Fig. 5 Transistor-level circuit diagram of FD8T static RAM bitcell (Anh-Tuan et al. [2011\)](#page-10-0)

Thus, Eq. ([1\)](#page-2-0) signifies that $V_t$ variations decrease considerably with an increase in device area (Islam and Hasan [2012b](#page-10-0)). Consequently, transistors that occupy large areas are highly tolerant of PVT variations. However, it is also necessary to address the read–write conflict existing in conventional 6T cells to obtain optimum transistor sizing. FD8T is basically a 6T bitcell with an additional NOT gate. Hence, it is susceptible to flipping of data while reading. The access transistors need to be weaker than the driver transistors to deal with this issue. Hence, the ratio of the driver width to the access width, given by $\beta_{\text{ratio}}$, must be larger than 1. On the other hand, the node storing '1' must be discharged by the access transistor during the write mode, while the PMOSFET keeps trying to hold the storage node high. As a result, a 'fight' arises between the two devices. The $\gamma_{\text{ratio}}$ (PMOSFET to access-NMOSFET strength ratio) must be chosen appropriately in order to ensure a successful write operation. Therefore, the $\beta_{\text{ratio}}$ should be maintained between 1.2 and 3 (Pal and Islam [2016a](#page-11-0)) while the $\gamma_{\text{ratio}}$ should be kept below 1.8 (Pal and Islam [2016b](#page-11-0)) to obtain optimum read and write operation in the conventional 6T cell.

By taking aforementioned constraints into consideration, the PD and PU transistors of the FD8T cell have been assigned a width of 160-nm and 64-nm, respectively, in the core cell, while a 64-nm width has been assigned to all other transistors (see Fig. 5). Thus, a  $\beta_{\text{ratio}}$  of 2.5 and a  $\gamma_{\text{ratio}}$ of 1 are maintained. All other cells used in this work have been apportioned suitable sizing (see Figs. [1,](#page-1-0) 3, 4) to ensure a fair comparison. Since the row-based RDT, employed by the proposed cell, is shared by each cell in a row, a relatively larger width (160-nm) has been assigned to it.
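The sizing constraints above can be checked with a few lines of arithmetic. The sketch below is a minimal illustration, assuming that the ratios are taken as plain width ratios (equal channel lengths, no mobility correction); the function names are hypothetical and not part of the original work.

```python
import math

# Widths quoted in the text for the FD8T core cell (in nm).
W_PD_NM = 160.0      # pull-down (driver) transistors
W_PU_NM = 64.0       # pull-up transistors
W_ACCESS_NM = 64.0   # access transistors ("all other transistors")


def beta_ratio(w_pd, w_access):
    """Driver-to-access width ratio (read stability constraint)."""
    return w_pd / w_access


def gamma_ratio(w_pu, w_access):
    """Pull-up-to-access width ratio (write ability constraint)."""
    return w_pu / w_access


def sigma_vt_relative(w_a, w_b, l_a=1.0, l_b=1.0):
    """sigma_Vt of device a relative to device b, using Eq. (1):
    sigma_Vt is proportional to 1 / sqrt(Width * Length)."""
    return math.sqrt((w_b * l_b) / (w_a * l_a))


beta = beta_ratio(W_PD_NM, W_ACCESS_NM)
gamma = gamma_ratio(W_PU_NM, W_ACCESS_NM)
print(f"beta_ratio  = {beta:.2f} (target range 1.2 to 3)")
print(f"gamma_ratio = {gamma:.2f} (target below 1.8)")
# Relative V_t spread of a 64-nm device versus a 160-nm device of equal length.
print(f"sigma_Vt(64 nm) / sigma_Vt(160 nm) = {sigma_vt_relative(64.0, 160.0):.2f}")
```

Under these assumptions the chosen widths satisfy both constraints, and Eq. (1) suggests that the narrower 64-nm devices see roughly 1.6 times the threshold-voltage spread of the 160-nm devices.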

Process and device parameters are becoming less predictable because of aggressive technology scaling. Therefore, the design metrics of an SRAM cell are also becoming unpredictable, and the influence of process variations on different design metrics needs to be investigated (Islam and Hasan [2012a](#page-10-0)). MOSFET parameters (such as L, W, NDEP, $t_{OX}$, etc.) and environmental parameters (such as temperature and supply voltage) have been given a 10% Gaussian variation at the $3\sigma$ level to generate MOSFET model parameters for 5000 Monte Carlo samples (Pal and Islam [2016b](#page-11-0)).
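The sampling step of such a Monte Carlo analysis can be sketched as follows. This is not the SPICE flow used in this work: it only illustrates how parameter sets might be drawn, assuming that "10% variation with 3σ" means the 3σ point lies at ±10% of the nominal value. The nominal values in `NOMINAL` and the function name `draw_samples` are placeholders chosen for illustration.

```python
import random
import statistics

N_SAMPLES = 5000              # number of Monte Carlo samples quoted in the text
THREE_SIGMA_FRACTION = 0.10   # assumption: 3*sigma equals 10% of nominal

# Hypothetical nominal values, used only to make the sketch runnable.
NOMINAL = {"L_nm": 16.0, "W_nm": 64.0, "VDD_V": 0.7}


def draw_samples(nominal, n=N_SAMPLES, seed=0):
    """Draw n independent Gaussian parameter sets around the nominal values."""
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        sample = {
            name: rng.gauss(value, value * THREE_SIGMA_FRACTION / 3.0)
            for name, value in nominal.items()
        }
        samples.append(sample)
    return samples


samples = draw_samples(NOMINAL)
vdd_values = [s["VDD_V"] for s in samples]
print(len(samples), statistics.mean(vdd_values), statistics.stdev(vdd_values))
```

Each sampled parameter set would then be passed to the circuit simulator, and the spread of the resulting design metrics (for example $T_{RA}$ or $I_{READ}$) would be collected over all samples.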

Capacitance plays a very important role in determining various design parameters of an SRAM bitcell, such as read/write access time and power. Therefore, by assuming an array size of $256 \times 16$, the capacitance associated with the BL, WL, etc. of every cell has been estimated for the simulations.

### 3.1 Read stability

Read SNM is the smallest magnitude of noise voltage which is capable of flipping data stored in a bitcell while reading (Nayak et al. [2017\)](#page-11-0). Consequently, it is an estimate of the cell's stability during read operation.

<span id="page-4-0"></span>

Fig. 6 a RSNM of various cells and b butterfly curves of row-half-selected SE9T cell during hold, read and write operations @ $V_{\text{DD}} = 0.7 \text{ V}$



Fig. 7 RSNM values of comparison cells at different  $V_{DD}$ 

Read SNM is estimated as given in Pal et al. ([2019b\)](#page-11-0) (see Fig. 6a). The read stabilities of various comparison cells at different  $V_{\text{DD}}$  are shown in Fig. 7. FD8T cell exhibits the least RSNM. This is because the FD8T cell is prone to frequent read upsets as it is basically a traditional 6T cell having an extra not gate (Pal and Islam [2016b](#page-11-0)). 7T exhibits slightly enhanced RSNM when compared to FD8T (Pal et al. [2019a\)](#page-11-0).

Amongst all comparison cells, the SEDF9T and SE9T cells show the highest RSNMs. This can be attributed to the read decoupling technique employed by these cells, which physically isolates the storage nodes from the bitline to prevent any capacitive noise coupling from it and consequently eliminates the possibility of read upsets (Sharma et al. [2018;](#page-11-0) Tu et al. [2012\)](#page-11-0). Thus, from Table [1](#page-5-0), which provides RSNM values of the comparison bitcells at $V_{\text{DD}} = 700 \text{ mV}$, the SE9T cell is found to exhibit $3.36\times/2.87\times$ higher RSNM than FD8T and 7T, respectively.

### 3.2 Read access time $(T_{RA})$ and read current $(I_{READ})$

The read delay  $(T_{RA})$  for cells employing differential reading schemes is estimated as mentioned in Pal et al. [\(2019d](#page-11-0), [e,](#page-11-0) [f,](#page-11-0) [g](#page-11-0)) while the same for single-ended reading cells is estimated as mentioned in Pal et al. [\(2019b,](#page-11-0) [2020a](#page-11-0)).

The $T_{RA}$ values of the comparison bitcells at various $V_{DD}$ values are illustrated in Fig. [8,](#page-5-0) from which it can be seen that the differential-reading FD8T and 7T achieve the shortest $T_{RA}$. However, this is obtained at the expense of degraded read stability, as detailed in the previous subsection.

In contrast, the $T_{RA}$ of single-ended cells like SEDF9T and SE9T is relatively longer. This is because they have more transistors in their read path. In addition, the read delay is further lengthened as their read buffers experience a stronger body effect.

Consider the situation during read mode where 'Q' and 'QB' hold '0' and '1', respectively, so that MN5, in the read path, becomes conductive. The transistor RDT is turned ON by the active RWL (refer to Fig. [1\)](#page-1-0). Since BL is high, a positive voltage builds up at the intermediate node ('X1') between MN4 and MN5. The voltage at X1 initially rises to 173 mV and then gradually falls to 160 mV at the point of reading, when the bitline voltage reaches ($V_{\text{DD}}$ – 50 mV) (see Fig. [9\)](#page-5-0). Thus, the $V_{\rm BS}$ (body-to-source voltage) of transistor MN4 becomes negative. Consequently, the drivability of the device is diminished and BL discharges slowly. On the contrary, the node QB of the FD8T cell records a lower increase in voltage, from 0 to 103 mV, due to the voltage division effect. Therefore, its BL discharges at a faster rate and the corresponding $T_{RA}$ is shorter than that of SE9T. SE9T, in turn, exhibits a shorter $T_{RA}$ than SEDF9T.

The read current $(I_{READ})$ of the comparison cells at different $V_{\text{DD}}$ is compared in Fig. [10.](#page-5-0) Given that $T_{\text{RA}}$ is inversely proportional to the read current (Pal et al. [2019b](#page-11-0)), the proposed cell concedes a penalty in read delay due to its relatively small read current. For the same reason, the SEDF9T, which has a longer $T_{RA}$, exhibits a smaller $I_{\text{READ}}$ than the SE9T cell.
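The inverse relation between read delay and read current can be illustrated with a first-order bitline discharge estimate, $I \approx C_{BL}\,\Delta V / T_{RA}$. The sketch below is a back-of-the-envelope calculation, not a result of this work: it combines the 17 fF bitline capacitance and the 50 mV sensing margin quoted in this paper with the SE9T $T_{RA}$ listed in Table 1, and it assumes that the entire delay is spent discharging the bitline at a constant current. The function name is hypothetical.

```python
def average_discharge_current(c_bl_farad, delta_v_volt, t_ra_second):
    """Constant current that would discharge c_bl by delta_v in t_ra."""
    return c_bl_farad * delta_v_volt / t_ra_second


C_BL = 17e-15     # estimated bitline capacitance of the SE9T cell (17 fF)
DELTA_V = 50e-3   # bitline drop sensed by the sense amplifier (50 mV)
T_RA = 232e-12    # SE9T read access time at 0.7 V from Table 1 (232 ps)

i_avg = average_discharge_current(C_BL, DELTA_V, T_RA)
print(f"Implied average discharge current: {i_avg * 1e6:.2f} uA")

# Halving the current doubles the time needed for the same 50 mV drop.
t_slow = C_BL * DELTA_V / (i_avg / 2)
print(f"Delay at half the current: {t_slow * 1e12:.0f} ps")
```

Because the measured $T_{RA}$ also includes control-signal assertion delay, the implied current is only an order-of-magnitude indication and should not be read as the simulated $I_{READ}$.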

| Design metrics | FC7T (Ensan et al. 2019) | 7T (Aly and Bayoumi 2007) | FD8T (Anh-Tuan et al. 2011) | SEDF9T (Tu et al. 2012) | FC11T (Ensan et al. 2018) | ULP9T (Moghaddam et al. 2016) | This work |
|---------------------------|-------|-------|-------|-------|-------|-------|-------|
| $T_{WA}$ (ps)             | 69    | 69    | 22.6  | 63    | 59    | 31.7  | 69    |
| $W_{\text{PWR}}$ ($\mu$W) | 16.4  | 12.8  | 376   | 228   | 267   | 323   | 21.1  |
| $T_{RA}$ (ps)             | 125   | 120   | 120   | 253   | 253   | 179   | 232   |
| $R_{\text{PWR}}$ ($\mu$W) | 10.8  | 16.8  | 215   | 11    | 11.7  | 8.2   | 11.7  |
| $H_{\text{PWR}}$ ($\mu$W) | 0.762 | 0.631 | 0.226 | 0.221 | 0.547 | 0.202 | 0.216 |
| WSNM (mV)                 | 300   | 300   | 210   | 45    | 305   | 230   | 300   |
| RSNM (mV)                 | 52    | 61    | 52    | 175   | 175   | 60    | 175   |
| HSNM (mV)                 | 175   | 173   | 175   | 175   | 175   | 60    | 175   |
| Area ($\mu m^2$)          | 15.9  | 15.6  | 21.8  | 18.6  | 23.5  | 22.2  | 19.2  |
| Normalized SAPR           | 0.84  | 0.86  | 0.004 | 0.013 | 0.06  | 0.02  | 1     |

<span id="page-5-0"></span>Table 1 Comparison among different SRAM cells @ $V_{\text{DD}} = 0.7 \text{ V}$
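The improvement factors quoted in the text are simple ratios of Table 1 entries. The following sketch recomputes a few of them (RSNM, write power and hold power) from the tabulated values; the dictionary layout and the helper name `improvement_over` are our own, and only a subset of the rows and columns is transcribed.

```python
# Selected Table 1 entries at V_DD = 0.7 V.
TABLE1 = {
    "7T":     {"W_PWR_uW": 12.8, "H_PWR_uW": 0.631, "RSNM_mV": 61},
    "FD8T":   {"W_PWR_uW": 376,  "H_PWR_uW": 0.226, "RSNM_mV": 52},
    "SEDF9T": {"W_PWR_uW": 228,  "H_PWR_uW": 0.221, "RSNM_mV": 175},
    "SE9T":   {"W_PWR_uW": 21.1, "H_PWR_uW": 0.216, "RSNM_mV": 175},
}


def improvement_over(other, metric, higher_is_better, proposed="SE9T"):
    """Factor by which the proposed cell improves on `other` for `metric`."""
    ours = TABLE1[proposed][metric]
    theirs = TABLE1[other][metric]
    return ours / theirs if higher_is_better else theirs / ours


checks = [
    ("FD8T", "RSNM_mV", True),
    ("7T", "RSNM_mV", True),
    ("SEDF9T", "W_PWR_uW", False),
    ("FD8T", "W_PWR_uW", False),
    ("7T", "H_PWR_uW", False),
    ("FD8T", "H_PWR_uW", False),
]
for other, metric, higher in checks:
    factor = improvement_over(other, metric, higher)
    print(f"SE9T vs {other:7s} {metric:9s}: {factor:.2f}x")
```

The printed factors can be compared directly with the RSNM, write power and leakage power figures quoted in the abstract and in Sects. 3.1, 3.5 and 3.6.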



Fig. 8  $T_{RA}$  values of comparison cells at different  $V_{DD}$ 



Fig. 9 Voltage at node X1 of SE9T cell during read operation

With downscaling of the supply voltage, the performance of submicron SRAM cells is severely limited by increasing process variations (Pal et al. [2018](#page-11-0); Ahmad et al. [2016](#page-10-0)). Therefore, it is necessary to ensure that the cell operates robustly when subjected to harsh conditions. As the SE9T and SEDF9T cells are read-decoupled in nature, they show significantly lower variability in terms of both $T_{RA}$ and $I_{\text{READ}}$ when compared to the 7T and FD8T cells.

Fig. 10 Read current values of comparison cells at different $V_{DD}$

From Fig. [11,](#page-6-0) which shows the $T_{RA}$ distribution plots of SE9T and 7T @ $V_{\text{DD}}$ = 700 mV, it can be seen that SE9T exhibits a $1.15\times$ tighter spread in $T_{RA}$ compared to 7T. Furthermore, SE9T also shows a $1.06\times$ narrower spread in $T_{RA}$ than the FD8T cell (see Fig. [12\)](#page-6-0). From Fig. [13,](#page-6-0) which compares the variability in the $I_{\text{READ}}$ distributions of SE9T and FD8T at $V_{\text{DD}} = 700 \text{ mV}$, it can be observed that our bitcell exhibits a $1.38\times$ tighter spread in $I_{\text{READ}}$ than the FD8T cell, as well as a $1.54\times$ narrower spread in $I_{\text{READ}}$ than the 7T cell (see Fig. [14](#page-6-0)). Hence, the SE9T cell operates robustly under widespread PVT variations.

### 3.3 Analysis of writing capability

The capability of an SRAM bitcell to complete a write operation successfully is estimated by the WSNM. In other words, it signifies the ability of the bitcell to pull the '1' storing node below the $V_M$ (switching voltage) of the inverter that holds '0', so as to flip the saved value (Pal and Islam [2016b](#page-11-0)).

<span id="page-6-0"></span>

Fig. 11 $T_{RA}$ distribution plot of SE9T and 7T @ 700 mV supply voltage

Fig. 12 Variability in $T_{RA}$ of different cells at various $V_{DD}$ values

Fig. 13 Read current distribution plot of SE9T and FD8T @ 700 mV supply voltage

Fig. 14 Variability in $I_{\text{READ}}$ of different cells at various $V_{\text{DD}}$ values

Fig. 15 Estimation of write ability of various cells @ $V_{\text{DD}} = 700 \text{ mV}$

Write SNM is calculated graphically as described in Pal et al. [\(2019b](#page-11-0)). Figure 15 shows the WSNM of various cells at $V_{\text{DD}} = 700$ mV. As is evident, the single-ended SEDF9T shows the lowest WSNM, as also reported in Pal et al. ([2019b\)](#page-11-0). Since the FD8T employs a differential writing scheme and its write path consists of fewer transistors, it exhibits a relatively higher WSNM than SEDF9T. Although single-ended in nature, the SE9T and 7T cells exhibit considerably higher write ability than FD8T due to the use of the feedback-cutting technique. Since the proposed cell uses a TG as one of its access devices in addition to MN4, there is no voltage drop across the TG during the write operation. Consequently, even though the 7T has a single transistor in its access path, the SE9T and the 7T cells exhibit equal WSNM. Overall, our bitcell exhibits $7.00\times/1.50\times$ higher WSNM than SEDF9T/FD8T (see Table [1\)](#page-5-0).

### 3.4 Write access time  $(T_{WA})$

FD8T shows the shortest $T_{WA}$ (see Fig. 16). This is due to its dual-bitline writing scheme. The write delay of single-ended cells depends on whether '0' or '1' is being written. Writing '1' is particularly difficult and takes more time to complete (Pal et al. [2019b\)](#page-11-0). Consequently, SEDF9T shows a longer delay than FD8T.

The write delay of the 7T cell for writing '1' to storage node Q is considerably lengthened, as compared with the SEDF9T cell, owing to the use of the feedback-loop-cutting method (Pal et al. [2019b](#page-11-0)). Since the proposed cell employs a feedback-cutting mechanism similar to 7T and has a TG in its access path, it shows the same write delay as 7T. From Table [1](#page-5-0), which provides the $T_{WA}$ of various cells @ supply voltage = 0.7 V, the SE9T cell shows $1.12\times$ shorter and $2.5\times$ longer $T_{WA}$ than that of SEDF9T and FD8T, respectively.

### 3.5 Dynamic or active power consumption

The power consumed by an SRAM cell due to charging/discharging of capacitance is defined as $P_{\text{DYNAMIC}}$ (dynamic power) (Morifuji et al. [2006](#page-11-0)). The overall $P_{\text{DYNAMIC}}$ consumption is estimated as the sum of the power dissipated due to assertion of the various control signals and the power dissipated due to charging/discharging of bitlines. It is known that $P_{\text{DYNAMIC}}$ is directly proportional to the effective capacitance. Thus, a higher $P_{\text{DYNAMIC}}$ is required to drive a control line if the capacitance associated with it is larger. Accordingly, an array size of $256 \times 16$ has been assumed and the approximate capacitance associated with the BL, WL, etc. of every cell has been estimated for simulation purposes. An estimated capacitance of 17 fF has been assigned to the BL of the SE9T cell, while the row-based WL/WE/WEB/RWL and column-based CSL signals have been assigned estimated capacitances of 0.5 fF/0.7 fF/0.7 fF/0.2 fF and 10 fF, respectively.
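To make the role of these capacitances concrete, the sketch below estimates the energy drawn from the supply when each line is charged once from GND to $V_{DD}$, using the first-order relation $E = C V_{DD}^2$. This is only an illustrative calculation under that assumption; the reported power figures come from SPICE simulation, and the function name is hypothetical.

```python
VDD = 0.7  # supply voltage in volts

# Estimated line capacitances of the SE9T cell quoted in the text (in fF).
LINE_CAP_FF = {
    "BL": 17.0,
    "WL": 0.5,
    "WE": 0.7,
    "WEB": 0.7,
    "RWL": 0.2,
    "CSL": 10.0,
}


def switching_energy_fj(cap_ff, vdd=VDD):
    """Energy (fJ) drawn from the supply to charge cap_ff (fF) from 0 V to vdd."""
    return cap_ff * vdd ** 2  # fF * V^2 = fJ


for line, cap in LINE_CAP_FF.items():
    print(f"{line:3s}: {cap:5.1f} fF -> {switching_energy_fj(cap):.3f} fJ per transition")

row_lines_fj = sum(switching_energy_fj(LINE_CAP_FF[name]) for name in ("WL", "WE", "WEB", "RWL"))
print(f"All row-based lines together: {row_lines_fj:.3f} fJ")
```

Under this simple model the bitline and the column-based CSL dominate the switching energy, while all row-based lines together contribute only about 1 fJ, which is in line with the qualitative argument of this subsection.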

It is seen from Fig. 17 that the 7T and SE9T cells consume significantly less power than FD8T. This is because of the single-ended writing scheme, which implies that the BL does not need to be discharged for every write operation, so their $\alpha_{\text{SWITCHING}}$ is maintained below 0.5 (Aly and Bayoumi [2007](#page-10-0)). Given that the majority of dynamic power consumption is constituted by the charging/discharging of bitlines (Wang et al. [2016\)](#page-11-0), the write power consumed by 7T and SE9T is lower than that of cells like FD8T, which employ dual-bitline structures. The WE, WEB and WL control signal lines are row-based. Hence, they are shared by fewer cells (16 cells) than the column-based signal lines (256 cells). This results in their lower capacitance and, consequently, lower power consumption.

On the contrary, the CSL signal is column-based, which implies that the capacitance associated with it is larger. This higher capacitance gives rise to a higher $P_{\text{WRITE}}$ (write power consumption) for the SE9T cell when compared to 7T, which uses only row-based control signal lines. The write operation of SEDF9T requires discharging of BL to GND for both write '1' and write '0' operations. As a result, it consumes higher power than the proposed cell. Thus, SE9T consumes considerably lower write power than FD8T and SEDF9T while consuming slightly higher write power than 7T (see Table [1\)](#page-5-0).

Figure [18](#page-8-0) shows the read power $(P_{\text{READ}})$ consumption of different cells at various supply voltages. It is clear from Fig. [18](#page-8-0) that the FD8T and 7T cells, which use a double-ended reading scheme, consume higher $P_{\text{READ}}$ than SEDF9T and SE9T, which use a single-ended reading scheme.

Since control signals of 7T are row-based, capacitance associated with them are smaller. This results in lower  $P_{\text{READ}}$  than that of FD8T cell because FD8T's CSL is



Fig. 16 Write access time of different cells at various supply voltage



Fig. 17 Write power consumed by different cells at various supply voltages

<span id="page-8-0"></span>

Fig. 18 Read power consumed by different cells at various supply voltages

column-based and hence has a larger capacitance. Owing to having only a single bitline, the SE9T and SEDF9T cells consume lower $P_{\text{READ}}$. SEDF9T consumes slightly lower read power than SE9T, in which multiple row-based control signals such as WL, RWL, WE and WEB are asserted.

### 3.6 Static power dissipation

Since leakage power dissipation is a significant concern in submicron technologies, its reduction is one of the major aims of any SRAM design (Chiu and Hu [2014](#page-10-0)). The hold power or leakage power dissipation of various cells at different $V_{\text{DD}}$ is illustrated in Fig. 19. The 7T bitcell exhibits the highest leakage power dissipation, as mentioned in Pal et al. ([2019b](#page-11-0)).

The leakage power dissipation exhibited by single-ended cells like SEDF9T and SE9T is significantly lower than that of the dual-ended FD8T and 7T (Pal et al. [2019b](#page-11-0)). Further reduction in bitline leakage is obtained in these cells because of transistor stacking in the read path.

From Table [1](#page-5-0), which shows static power consumption at a supply voltage of 700 mV, we see that the static power consumption of the SEDF9T and SE9T cells is nearly equal. However, the proposed SE9T cell consumes 2.92 times and 1.04 times lower hold power than the 7T and FD8T cells, respectively.



Fig. 19 Hold power consumed by different cells at various supply voltages

#### 3.7 Mitigation of half-select disturbance

Figure [2](#page-2-0) shows the memory architecture of the proposed SE9T cell. If '1' needs to be stored at 'Q' of the bitcell at the top left of the memory architecture, that cell is fully selected by setting the control lines as specified in Sect. [2.1](#page-2-0), and the write operation is successfully completed. All other cells in the same row share the row-based WL\_0/WLB\_0 and WE\_0, set at $V_{DD}/GND$ and GND respectively, with the selected cell. Consequently, their access TG is turned ON while MN6 is turned OFF. Thus, these cells are row half-selected. However, since CSL is columnar in nature, none of the row half-selected cells shares $CSL_0$, set at $V_{DD}$, with the selected cell. Consequently, their respective CSL signals, kept at GND, maintain the access transistor MN4 in the OFF state and isolate the storage nodes holding logic '0' or '1' from the bitlines, which prevents miswriting in row half-selected cells.

Due to severe leakage in submicron technologies, one may think that miswriting may still take place. Figure 20 shows the simulated results (Monte Carlo simulations with 5000 samples) of various node voltages of a row half-selected cell for both cases, i.e., during write '1' at 'Q' (Fig. 20a) and write '0' at 'Q' (Fig. 20b), for a much longer



Fig. 20 Simulated node voltages of row half-selected SE9T cell while (a) writing '1' and (b) writing '0' to 'Q'

<span id="page-9-0"></span>time than $T_{WA}$. It can be seen that the 'Q2' voltage does not rise or fall to the switching threshold, $V_M$, of the inverter. Therefore, the stored data are preserved. This is further confirmed by the butterfly curve shown in Fig. [6](#page-4-0)b, which is obtained for a cell that is row half-selected during a write '1' operation. The figure illustrates that the cell exhibits a considerable SNM to resist the flipping of stored data.

Cells that are neither in the same row nor in the same column as the selected cell are unselected, as their respective WL/WLB and CSL signals are deactivated. All the cells in the same column as the selected cell share $CSL_0$, set at $V_{DD}$, with it. This turns their access transistor MN4 ON and, as a result, these cells are column half-selected. However, given that each of the column half-selected cells is located in a different row, they do not share WL\_0/WLB\_0, set at $V_{DD}/GND$, with the selected cell. Consequently, miswriting is prevented in the absence of any write path, as the access transistor MN5 is turned OFF by the respective WL signals, set at GND, of the column half-selected cells.

Similarly, during the read operation, misreading in half-selected cells is prevented by the row-based RWL and columnar CSL. This is reflected by the butterfly curve of the row half-selected cell (see Fig. [6b](#page-4-0)), which exhibits a significant SNM. During the hold operation, the butterfly curve of the row half-selected cell shows that it is capable of preserving the stored data (see Fig. [6](#page-4-0)b). Thus, the proposed SE9T cell is free of half-select disturbance.
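The selection logic described in this section can be summarized with a small behavioral sketch. It is a purely logical abstraction, assuming that a write path exists only when a cell sees both the asserted row-based WL/WLB and the asserted column-based CSL. The names `CellState`, `classify` and `has_write_path` and the 4 × 4 array size are hypothetical and are not part of the proposed design or its simulation setup.

```python
from enum import Enum


class CellState(Enum):
    SELECTED = "selected"
    ROW_HALF_SELECTED = "row half-selected"
    COLUMN_HALF_SELECTED = "column half-selected"
    UNSELECTED = "unselected"


def classify(row: int, col: int, sel_row: int, sel_col: int) -> CellState:
    """Classify a cell by which of the selected cell's control lines it shares."""
    wl_active = row == sel_row   # row-based WL/WLB is asserted for this row
    csl_active = col == sel_col  # column-based CSL is asserted for this column
    if wl_active and csl_active:
        return CellState.SELECTED
    if wl_active:
        return CellState.ROW_HALF_SELECTED     # CSL at GND keeps MN4 OFF
    if csl_active:
        return CellState.COLUMN_HALF_SELECTED  # WL at GND keeps the path open
    return CellState.UNSELECTED


def has_write_path(state: CellState) -> bool:
    """A write path exists only when both WL/WLB and CSL are asserted."""
    return state is CellState.SELECTED


ROWS, COLS = 4, 4  # small hypothetical array; the selected cell is the top-left one
states = {(r, c): classify(r, c, 0, 0) for r in range(ROWS) for c in range(COLS)}
writable = [pos for pos, state in states.items() if has_write_path(state)]
print(writable)
```

In this abstraction only the cell at the intersection of the asserted row and column is writable, which mirrors the argument that row and column half-selected cells are protected. Analog effects such as leakage are not captured here and are addressed by the Monte Carlo results of Fig. 20.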

### 3.8 Layout area

The layout views of the SE9T, 7T, FD8T and SEDF9T cells are displayed in Fig. 21. They have been designed as mentioned in Pal et al. ([2019b](#page-11-0)). All the cell areas have been normalized with respect to the proposed SE9T, where the row-based RDT has been excluded



Fig. 21 Layout view of SE9T

and a higher metal layer is used (not shown) to connect node X2 to the transistor. However, the RDT, if placed within the row pitch at the leftmost side of that row, causes negligible area overhead because all the cells of that row share it. The areas consumed by the different cells considered for comparison are listed in Table [1](#page-5-0). The 7T cell consumes marginally less area ($0.81\times$) than SE9T because it has fewer transistors. For SEDF9T, the transistors that decouple the storage node during the read operation and those in the read path fit exactly in the space left by the upsized PD, so it consumes less area than SE9T. An extra inverter causes FD8T to consume more area, and the TG causes SE9T to consume a larger area than the other cells. However, the extra PMOS (MP3) in FD8T, which is not directly connected to the core inverter and requires a separate (larger) n-well, results in relatively more area, whereas for SE9T the extra PMOS (MP3) is directly connected to the internal cross-coupled inverter.

### 4 Comparison summary

The comparison of SE9T with the previously discussed FD8T, 7T and SEDF9T cells, along with three additional designs, namely feedback-cutting 7T (FC7T) (Ensan et al. [2019](#page-10-0)), ultra-low-power 9T (ULP9T) (Moghaddam et al. [2016](#page-11-0)) and feedback-cutting 11T (FC11T) (Ensan et al. [2018](#page-10-0)), is reported in Table [1](#page-5-0). For a fair comparison of simulation results, all the cells were assigned appropriate sizing and capacitances. As is evident from Table [1](#page-5-0), the SE9T cell shows higher WSNM than most comparison cells due to the use of the feedback-breaking method. In addition, our proposed cell exhibits considerably higher RSNM than FC7T, 7T and FD8T, and the same RSNM as other read-decoupled cells like SEDF9T and FC11T. Although the ULP9T is read-decoupled in nature, depending on the data stored in its storage nodes, the stacked PMOS in its core cell may be turned OFF, which disconnects the cross-coupled inverters from $V_{\text{DD}}$ and deteriorates its ability to retain the stored data. Consequently, it exhibits a poor RSNM. Since the SE9T cell employs a single-ended writing scheme, it exhibits longer $T_{WA}$ than FD8T and ULP9T. The proposed bitcell exhibits a marginally longer write time as compared with SEDF9T owing to the use of the feedback-cutting mechanism. However, owing to the presence of a TG in its access path, it shows a delay similar to that of other single-ended writing cells such as FC7T, 7T and FC11T, which employ feedback-cutting techniques as well. Although SE9T exhibits a longer $T_{RA}$ than FC7T, 7T and FD8T, it shows significantly shorter $T_{RA}$ when compared to the read-decoupled SEDF9T and FC11T, while exhibiting slightly longer $T_{RA}$ than ULP9T.

<span id="page-10-0"></span>The single-ended writing FC7T, 7T and SE9T cells consume considerably lower write power than differential writing cells such as FD8T and ULP9T. SE9T's column-based CSL, when asserted, causes higher write power than in the FC7T and 7T cells. Moreover, the proposed cell consumes significantly lower read power than differential reading cells like 7T and FD8T. In addition, its leakage power dissipation is also lower than that of most of the comparison cells. Although the ULP9T exhibits lower leakage power due to power gating of the stacked PMOS transistor in its core cell, this improvement is obtained at the expense of severely degraded hold stability (HSNM), as its cross-coupled inverters may be disconnected from $V_{DD}$ if the stacked PMOS is turned OFF based on the data stored in its storage nodes.

The power delay product (PDP) is an important design metric which reflects the combined effect of delay and power consumption during read/write operations of an SRAM cell, and the lower it is, the better. On the other hand, the stability of the cell is quantified by its RSNM, WSNM and HSNM, which should be as high as possible. In addition, the effective design of an SRAM cell requires efficiency in terms of layout area as well. Therefore, to comprehensively assess the performance of the different cells used in this work, a design metric called SNM per unit area to PDP ratio (SAPR) has been used, as specified in Ahmad et al. (2016). It is given by:

$$
SAPR = \frac{RSNM \times WSNM \times HSNM}{R_{PDP} \times W_{PDP} \times Area}
$$
 (2)

where $R_{\text{PDP}}$ and $W_{\text{PDP}}$ are the PDPs obtained during read and write operations, respectively. The SAPR of various cells, normalized to that of SE9T at $V_{DD} = 0.7$ V, is reported in Table [1](#page-5-0). As is evident, the SE9T, 7T and FC7T cells exhibit considerably higher SAPR than most other cells owing to their single-ended nature, which reduces overall power consumption; the use of the feedback-cutting technique, which enhances write ability; and their lower area consumption. However, the proposed cell exhibits the highest SAPR due to its read-decoupling technique, which enhances read stability; the feedback-cutting mechanism, which enhances write ability; and transistor stacking in the read path, which reduces leakage power dissipation significantly.
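As a minimal sketch of how Eq. (2) and the normalization could be evaluated, the snippet below computes the PDP as the product of power and access time, forms the SAPR and normalizes it to a chosen reference cell. The `CellMetrics` container, the function names and all numeric values are hypothetical placeholders and are not data from Table 1.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class CellMetrics:
    rsnm: float         # read static noise margin
    wsnm: float         # write static noise margin
    hsnm: float         # hold static noise margin
    read_power: float   # power consumed during read
    read_delay: float   # read access time, T_RA
    write_power: float  # power consumed during write
    write_delay: float  # write access time, T_WA
    area: float         # layout area


def pdp(power: float, delay: float) -> float:
    """Power delay product of one operation."""
    return power * delay


def sapr(m: CellMetrics) -> float:
    """SNM per unit area to PDP ratio, following Eq. (2)."""
    r_pdp = pdp(m.read_power, m.read_delay)
    w_pdp = pdp(m.write_power, m.write_delay)
    return (m.rsnm * m.wsnm * m.hsnm) / (r_pdp * w_pdp * m.area)


def normalized_sapr(cells: dict, reference: str) -> dict:
    """Express the SAPR of every cell relative to the reference cell."""
    ref_value = sapr(cells[reference])
    return {name: sapr(m) / ref_value for name, m in cells.items()}


# Placeholder unit values, chosen only to make the arithmetic easy to follow.
reference_cell = CellMetrics(1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
# Same cell but with twice the read power, so its read PDP doubles.
variant_cell = replace(reference_cell, read_power=2.0)

normalized = normalized_sapr(
    {"reference_cell": reference_cell, "variant_cell": variant_cell},
    reference="reference_cell",
)
print(normalized)
```

Because the metric multiplies three noise margins and divides by two PDPs and the area, doubling any single cost term halves the SAPR. Normalizing to one cell, as done in Table 1, removes the dependence on the units chosen for each quantity.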

# 5 Conclusion

We propose a power-aware, half-select disturbance free 9T (SE9T) cell. It exhibits improved read stability owing to the use of a decoupled single-ended read operation, while the feedback-cutting technique enhances write ability. A reduction in $P_{\text{DYN}}$ consumption is achieved due to the reduced activity factor of bitline switching, as the cell is single-ended. Leakage power dissipation is also curtailed due to the stacking of transistors in the read path. The proposed circuit exhibits robust behavior even when subjected to severe process variations. Thus, our proposed SE9T can be chosen for low power SRAM design for BAN sensor nodes.

# References

- Ahmad S, Gupta MK, Alam N, Hasan M (2016) Single-ended schmitt-trigger-based robust low-power SRAM cell. IEEE Trans Very Large Scale Integr Syst 24(8):2634–2642
- Aly RE, Bayoumi MA (2007) Low-power cache design using 7T SRAM cell. IEEE Trans Circuits Syst II Express Briefs 54(4):318–322
- Anh-Tuan D, Low JYS, Low JYL, Kong ZH, Tan X, Yeo KS (2011) An 8T differential SRAM with improved noise margin for bit-interleaving in 65 nm CMOS. IEEE Trans Circuits Syst I Regul Pap 58(6):1252–1263
- Chiu YW, Hu YH (2014) 40 nm bit-interleaving 12T subthreshold SRAM with data-aware write-assist. IEEE Trans Circuits Syst I Regul Pap 61(9):2578–2585
- Dautov R, Tsouri GR (2016) Securing while sampling in wireless body area networks with application to electrocardiography. IEEE J Biomed Health Inform 20(1):135–142
- Ensan SS, Moaiyeri MH, Hessabi S (2018) A robust and low-power near-threshold SRAM in 10-nm FinFET technology. Analog Integr Circuits Signal Process 94(3):497–506
- Ensan SS, Moaiyeri MH, Moghaddam M, Hessabi S (2019) A low-power single-ended SRAM in FinFET technology. AEU Int J Electron Commun 99:361–368
- Farkhani H, Peiravi A, Moradi F (2014) A new asymmetric 6T SRAM cell with a write assist technique in 65 nm CMOS technology. Microelectron J 45(11):1556–1565
- Gupta S, Gupta K, Pandey N (2018) Pentavariate  $V_{min}$  analysis of a subthreshold 10T SRAM bit cell with variation tolerant write and divided bit-line read. IEEE Trans Circuits Syst I Regul Pap 65(10):3326–3337
- Islam A, Hasan M (2012a) Leakage characterization of 10T SRAM cell. IEEE Trans Electron Devices 59(3):631–638
- Islam A, Hasan M (2012b) A technique to mitigate impact of process, voltage and temperature variations on design metrics of SRAM cell. Microelectron Reliab 52(2):405–411
- Izumi S et al (2015) A wearable healthcare system with a 13.7  $\mu$ A noise tolerant ECG processor. IEEE Trans Biomed Circuits Syst 9(5):733–742
- Kulkarni JP, Kim K, Roy K (2007) A 160 mV robust schmitt trigger based subthreshold SRAM. IEEE J Solid-State Circuits 42(10):2303–2313
- Kushwah CB, Vishvakarma SK, Dwivedi D (2017) A boostless write optimised single ended robust 7T SRAM cell for ultra-low power memory design. Int J Electron Lett 5(1):13–25
- Kwong J, Chandrakasan AP (2011) An energy-efficient biomedical signal processing platform. IEEE J Solid-State Circuits 46(7):1742–1753
- Maroof N, Kong B (2017) 10T SRAM using half-VDD precharge and row-wise dynamically powered read port for low switching power and ultralow RBL leakage. IEEE Trans Very Large Scale Integr Syst 25(4):1193–1203
- <span id="page-11-0"></span>Moghaddam M, Timarchi S, Moaiyeri MH, Eshghi M (2016) An ultra-low-power 9T SRAM cell based on threshold voltage techniques. Circuits Syst Signal Process 35(5):1437–1455
- Morifuji E, Yoshida T, Kanda M, Matsuda S, Yamada S, Matsuoka F (2006) Supply and threshold-voltage trends for scaled logic and SRAM MOSFETs. IEEE Trans Electron Devices 53(6):1427–1432
- Nabavi M, Sachdev M (2018) A 290-mV, 3.34-MHz, 6T SRAM with pMOS access transistors and boosted wordline in 65-nm CMOS technology. IEEE J Solid-State Circuits 53(2):656–667
- Nayak D, Acharya DP, Mahapatra K (2017) A read disturbance free differential read SRAM cell for low power and reliable cache in embedded processor. AEU Int J Electron Commun 74:192–197
- NIMO PTM model. <http://ptm.asu.edu/>
- Pal S, Islam A (2016a) Variation tolerant differential 8T SRAM cell for ultralow power applications. IEEE Trans Comput Des Integr Circuits Syst 35(4):549–558
- Pal S, Islam A (2016b) 9-T SRAM cell for reliable ultralow-power applications and solving multibit soft-error issue. IEEE Trans Device Mater Reliab 16(2):172–182
- Pal S, Gupta V, Islam A (2018) Variation resilient low-power memristor-based synchronous flip-flops: design and analysis. Microsyst Technol 6:1–14
- Pal S, Gupta V, Ki WH, Islam A (2019a) Transmission gate-based 9T SRAM cell for variation resilient low power and reliable internet of things applications. IET Circuits Devices Syst 13(5):584–595
- Pal S, Bose S, Ki W-H, Islam A (2019b) Characterization of half-select free write assist 9T SRAM cell. IEEE Trans Electron Devices 66(11):4745–4752
- Pal S, Bose S, Islam A (2019c) A reliable write assist low power SRAM cell for wireless sensor network applications. IET Circuits Devices Syst 14:137–147. <https://doi.org/10.1049/iet-cds.2019.0050>
- Pal S, Gupta V, Islam A (2019d) Design of CNFET based power- and variability-aware nonvolatile RRAM cell. Microelectron J 86:7–14
- Pal S, Gupta V, Ki WH, Islam A (2019e) Design and development of memristor-based RRAM. IET Circuits Devices Syst 13(4):548–557
- Pal S, Bose S, Ki W-H, Islam A (2019f) Design of power- and variability-aware nonvolatile RRAM cell using memristor as a memory element. IEEE J Electron Devices Soc 7:701–709
- Pal S, Bose S, Islam A (2019g) Design of memristor based low power and highly reliable ReRAM cell. Microsyst Technol 1:1–15
- Pal S, Bose S, Ki WH, Islam A (2020a) A highly stable reliable SRAM cell design for low power applications. Microelectron Reliab 105:113503
- Pal S, Bose S, Ki WH, Islam A (2020b) Half-select-free low-power dynamic loop-cutting write assist SRAM cell for space applications. IEEE Trans Electron Devices 67(1):80–89
- Sharma V, Cosemans S, Ashouie M, Huisken J, Catthoor F, Dehaene W (2012) Ultra low-energy SRAM design for smart ubiquitous sensors. IEEE Micro 32(5):10–24
- Sharma V, Gopal M, Singh P, Vishvakarma SK (2018) A 220 mV robust read-decoupled partial feedback cutting based low-leakage 9T SRAM for Internet of Things (IoT) applications. AEU Int J Electron Commun 87:144–157
- Takeda K et al (2006) A read-static-noise-margin-free SRAM cell for low-VDD and high-speed applications. IEEE J Solid-State Circuits 41(1):113–121
- Tawfik SA, Kursun V (2008) Low power and robust 7T dual-Vt SRAM circuit. In: Proceedings of IEEE Int. Symp. Circuits Syst., pp 1452–1455
- Tu MH, Lin JY, Tsai MC, Jou SJ, Te Chuang C (2010) Single-ended subthreshold SRAM with asymmetrical write/read-assist. IEEE Trans Circuits Syst I Regul Pap 57(12):3039–3047
- Tu MH, Lin JY, Tsai MC (2012) A single-ended disturb-free 9T subthreshold SRAM with cross-point data-aware write word-line structure, negative bit-line, and adaptive read operation timing tracing. IEEE J Solid-State Circuits 47(6):1469–1482
- Wang X, Zhang Y, Lu C, Mao Z (2016) Power efficient SRAM design with integrated bit line charge pump. AEU Int J Electron Commun 70(10):1395–1402
