1 Introduction

Rapid advancements in technology have made it possible to continuously monitor the health of a patient with the help of small and precise sensors attached to the body. The detection of blood pressure and low oxygen levels is one of its potential applications (Izumi et al. 2015). Such a collection of sensors capable of wireless communication forms a wireless network, popularly known as a wireless body area network (WBAN) (Dautov and Tsouri 2016).

WBAN nodes generally have a very small form factor (< 1 cm³), which limits their on-sensor battery energy and constrains them to a very low power budget (Sharma et al. 2012). However, the data collected by the sensors must be transferred wirelessly to a processing unit, which is a power-hungry process. Thus, to reduce power consumption, the amount of data must be minimized (Kwong and Chandrakasan 2011). This is realized by employing signal processing on the wireless sensor nodes, which requires a high-density SRAM (static RAM) along with an intelligent processor (Sharma et al. 2012). WBAN nodes are mostly “ON”, as they continuously collect, process and transfer data for the real-time diagnosis of patients. Consequently, the dynamic power consumption of SRAM-based cache memories must be considerably low to ensure extended battery life of WBAN nodes (Sharma et al. 2012).

Since dynamic (active) power is a quadratic function of the supply voltage (VDD), it can be significantly reduced by downscaling VDD (Morifuji et al. 2006). Moreover, a reduction in supply voltage also decreases static power consumption, which is responsible for a considerable portion of the cumulative power consumption and is linearly dependent on VDD (Gupta et al. 2018). Thus, downscaling VDD decreases the overall power consumption.
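As a rough illustration of the quadratic dependence described above, the sketch below uses the textbook switching-power relation P = α·C·VDD²·f. The function name and all numeric values (activity factor, capacitance, frequency, the 1.0 V starting point) are illustrative assumptions and are not taken from this work.

```python
def dynamic_power(alpha, c_farads, vdd_volts, freq_hz):
    """Textbook switching power: P = alpha * C * VDD^2 * f."""
    return alpha * c_farads * vdd_volts ** 2 * freq_hz


# Illustrative values only (assumed, not from the paper).
p_nominal = dynamic_power(alpha=0.5, c_farads=10e-15, vdd_volts=1.0, freq_hz=100e6)
p_scaled = dynamic_power(alpha=0.5, c_farads=10e-15, vdd_volts=0.7, freq_hz=100e6)

# With everything else fixed, only the VDD^2 term changes between the two cases.
ratio = p_scaled / p_nominal
```

With all other terms held constant, scaling VDD from 1.0 V to 0.7 V leaves (0.7)² = 49% of the original switching power.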

However, as the supply voltage decreases, the operational delay increases, which raises the energy consumed in every read/write cycle (Nabavi and Sachdev 2018). Moreover, as VDD reduces, (VDD − Vt) also reduces, which severely degrades the noise margin and renders the circuit unreliable (Maroof and Kong 2017). In addition, the stability of the SRAM cell may degrade further due to process, voltage and temperature (PVT) variations and random dopant fluctuations (RDF) in submicron technologies (Nayak et al. 2017).

At low supply voltages, the read static noise margin (RSNM) of the conventional 6T bitcell is considerably degraded and hence it is highly susceptible to read upsets (Pal and Islam 2016a). Similarly, the 6T cell fails to maintain the drivability of its transistors at low VDD, which leads to a high probability of write failure, as the cell may be unable to flip the stored data. Moreover, the half-select issue is another major problem in the conventional 6T cell, as it leads to miswriting in half-selected cells.

Over the past couple of decades, several modifications to the conventional design have been proposed to overcome its limitations. The fully differential 8T cell (FD8T) in Anh-Tuan et al. (2011) employs an additional decoupling inverter to prevent half-select disturbance. However, it is essentially a conventional 6T cell and suffers from read upsets. The read-decoupled SRAM cells proposed in Pal and Islam (2016a, b), Islam and Hasan (2012a), Chiu and Hu (2014), Sharma et al. (2018) and Ensan et al. (2018) isolate the storage nodes from the bitlines. Consequently, they exhibit significant improvements in RSNM. Moreover, by employing an additional tail transistor in their core cell, the leakage power dissipation of LP9T (Pal and Islam 2016a), LP10T (Islam and Hasan 2012a) and LP11T (Pal and Islam 2016b) is considerably curtailed.

However, such improvements are achieved at the expense of very high dynamic power consumption. The most effective way of curtailing dynamic power is the use of single-ended structures as their bitline activity factor is below 0.5 (Aly and Bayoumi 2007). However, in the absence of any write assist mechanism, such single-ended cells are incapable of writing ‘1’ (Tu et al. 2010).

Therefore, several distinct write assist schemes have been employed by various single-ended cells, such as those in Aly and Bayoumi (2007), Tu et al. (2010, 2012), Pal et al. (2019a, b, 2020a), Farkhani et al. (2014), Kushwah et al. (2017), Takeda et al. (2006) and Tawfik and Kursun (2008), to successfully complete the ‘1’ writing operation. For example, by using a feedback-cutting transistor inside the core cell, the 7T cell in Aly and Bayoumi (2007) and the 9T cell in Pal et al. (2019a) exhibit significant improvements in write static noise margin (WSNM), or write ability. However, in the absence of decoupling techniques, they are prone to frequent read upsets. Although the write ability of the cells in Tu et al. (2010) and Farkhani et al. (2014) is enhanced by a core cell with asymmetrical inverter sizing, they are highly susceptible to PVT variations. The authors of Tu et al. (2012) have suggested a 9T SRAM cell (SEDF9T) that improves write ability by employing a negative bitline scheme, at the expense of a high VDD,min when subjected to PVT variations. The read-disturb-free 7T cells in Kushwah et al. (2017) and Takeda et al. (2006) employ an additional transistor in one of the inverters at the cost of severely degraded hold stability. Although the dual-Vt SRAM cell in Tawfik and Kursun (2008) can write ‘1’ successfully, it increases the fabrication complexity considerably. In order to enhance both read stability and write ability without one hampering the other, a write-assist low power 11T (WALP11T) cell (Pal et al. 2019c), a data aware power cut off (DAPC) SRAM cell (Chiu and Hu 2014), a Schmitt trigger based SRAM cell (Kulkarni et al. 2007) and a 12T DWA12T cell that cuts the loop dynamically (Pal et al. 2020b) have been proposed previously, all of which also exhibit robust behaviour when subjected to severe process variations. However, for all their advantages, these designs consume excessive active power and incur a considerable area overhead.

In order to address the various issues faced by the aforementioned designs, we propose a single-ended 9T (SE9T) SRAM cell (see Fig. 1) in this paper, which not only enhances the read stability and write ability individually, but also minimizes both active and static power consumption. Moreover, it is free from half-select disturbance.

Fig. 1 Sketch of the single-ended 9T (SE9T) static random-access memory bitcell proposed in this work

The SE9T bitcell is described in detail in Sect. 2. Section 3 compares the proposed cell with cells from the literature, and Sect. 4 summarizes the comparison. Section 5 concludes the paper.

2 The SE9T cell and its operation

The core cell of SE9T (see Fig. 1) comprises the inverters INV1 and INV2, formed by MP1 and MN1, and by MP2 and MN2, respectively. The transistor MN6, placed between INV1 and INV2, is used as the feedback-cutting FET. CSL is the column selection control line and turns ON the access transistor MN4, while WL and WLB, which are row-based word lines, turn ON the write-access transmission gate (TG) composed of MN3 and MP3. MN4 is connected to a single columnar bitline (BL). The read decoupling transistor MN5 has its gate connected to node QB and is connected between nodes X1 and X2. Figure 2 provides a simplified architecture of the proposed cell. MN5 is connected to GND via the FET RDT. RDT is a larger transistor that is shared by every cell in a particular row and is activated by the row-based signal RWL.

Fig. 2 Simplified array-based architecture for SE9T SRAM cell

2.1 Feedback cutting write operation

At the beginning of the write operation, the column-based CSL and the row-based WL/WLB are set to VDD and VDD/GND to turn ON MN4 and the TG, respectively. The RDT is kept in the nonconducting state by grounding RWL. The row-based write enable signal WE is set to GND to turn OFF the feedback-cutting transistor MN6. This cuts the feedback path, in whose presence the ‘1’ writing operation cannot be completed. Consequently, the inverter adjacent to the BL drives the other inverter and the write operation is eventually completed. BL is driven by a write driver (not shown) to VDD for writing ‘1’ or to GND for writing ‘0’ to storage node Q.

Let us take the case of writing ‘1’, assuming that QB holds ‘1’ and Q holds ‘0’ prior to this write operation. For this purpose, BL is kept at VDD. As a result, the voltage on BL is passed on to node Q2. As the voltage at Q2 rises, the output of INV1 (QB) rapidly falls from VDD. As VQB decreases, the input voltage of INV2 decreases. This turns MP2 ON and MN2 OFF, which causes VQ, the output of INV2, to rise. Thus, logic ‘1’ is stored at storage node Q while logic ‘0’ is stored at storage node QB. Similarly, ‘0’ is written in a complementary fashion.

2.2 Single ended decoupled-read operation

During each read operation, WE is set to VDD to turn ON the intermediate MN6, which ensures that the feedback path between the two inverters exists. The nodes Q and QB are physically separated from the precharged BL by setting the row-based WL/WLB to GND/VDD, which turns the TG OFF.

At the beginning of read operation, CSL and RWL are activated to turn ON access transistor MN4 and shared read-discharge transistor RDT, respectively. Since the gate of MN5 is connected to storage node QB, BL either discharges or remains charged depending upon the data stored in QB.

A sense amplifier (not shown) is used to sense a 50 mV fall in voltage of BL with respect to a reference voltage, completing a read operation.

2.3 Hold operation

During the hold operation, CSL and WL/WLB are maintained at GND and GND/VDD, respectively, to turn OFF the access transistor MN4 and the TG, while BL remains precharged. To retain data through the feedback path, WE is maintained at VDD to keep MN6 ON. Since RWL is set to GND, the RDT is maintained in the OFF state.
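The control-line settings of Sects. 2.1 to 2.3 can be collected into a small lookup table. The following is a minimal behavioural sketch, not a circuit model: rail voltages are replaced by logic levels, and the dictionary layout and function names are hypothetical choices made for illustration.

```python
VDD, GND = 1, 0  # logic levels standing in for the rail voltages

# Control-line settings per operating mode, as described in Sects. 2.1-2.3.
MODES = {
    "write": {"CSL": VDD, "WL": VDD, "WLB": GND, "WE": GND, "RWL": GND},
    "read":  {"CSL": VDD, "WL": GND, "WLB": VDD, "WE": VDD, "RWL": VDD},
    "hold":  {"CSL": GND, "WL": GND, "WLB": VDD, "WE": VDD, "RWL": GND},
}


def device_states(signals):
    """Return which switches conduct for a given set of control levels."""
    return {
        "MN4": signals["CSL"] == VDD,                          # column access transistor
        "TG": signals["WL"] == VDD and signals["WLB"] == GND,  # MN3/MP3 write-access gate
        "MN6": signals["WE"] == VDD,                           # feedback path intact when ON
        "RDT": signals["RWL"] == VDD,                          # shared read-discharge transistor
    }


def bitline_discharges(qb):
    """During a read, BL discharges only if QB = 1 turns MN5 ON."""
    states = device_states(MODES["read"])
    return states["MN4"] and states["RDT"] and qb == 1
```

In this sketch, the write mode is the only one in which both MN4 and the TG conduct, and it is also the only one in which MN6 is OFF, which mirrors the feedback-cutting write described above.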

3 Simulation setup and results

SPICE and the 16-nm PTM (http://ptm.asu.edu/) have been used for this work. We have compared our design with the existing 7T (Aly and Bayoumi 2007) (Fig. 3), SEDF9T (Tu et al. 2012) (Fig. 4) and FD8T (Anh-Tuan et al. 2011) (Fig. 5) cells to determine its effectiveness. In addition to the prevalent read–write conflicts, the widespread effects of process variations are also instrumental in determining the sizing of SRAM cells. The RDF-induced Vt shift is related to device dimensions and is given by

$$ \sigma_{vt} \propto \frac{1}{\sqrt{Width \times Length}} \tag{1} $$

Thus, Eq. (1) signifies that Vt variations decrease considerably with an increase in device area (Islam and Hasan 2012b). Consequently, transistors that occupy large areas are highly tolerant of PVT variations. However, it is also necessary to address the read–write conflict existing in conventional 6T cells to obtain optimum transistor sizing. FD8T is basically a 6T bitcell with an additional NOT gate. Hence, it is susceptible to flipping of data while reading. To deal with this issue, the access transistors need to be weaker than the driver transistors. Hence, the ratio of the driver width to the access width, given by βratio, must be larger than 1. On the other hand, the ‘1’ storing node must be reliably discharged by the access transistor during the write mode. However, the PMOSFET keeps trying to maintain the storage node high. As a result, a ‘fight’ or conflict arises between the two devices. The γratio (the PMOSFET to access-NMOSFET strength ratio) must be chosen appropriately in order to ensure a successful write operation. Therefore, the βratio should be maintained between 1.2 and 3 (Pal and Islam 2016a) while the γratio should be kept below 1.8 (Pal and Islam 2016b) to obtain optimum read and write operations in the conventional 6T cell.
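To make the area dependence of Eq. (1) concrete, the sketch below computes σVt of one device relative to another. Only the proportionality is taken from Eq. (1); the function name is hypothetical, and the 16 nm channel length is an assumption made for illustration (the widths of 64 nm and 160 nm are the ones used later for sizing).

```python
import math


def relative_sigma_vt(width_nm, length_nm, ref_width_nm, ref_length_nm):
    """sigma_Vt of a device relative to a reference device, per Eq. (1)."""
    return math.sqrt((ref_width_nm * ref_length_nm) / (width_nm * length_nm))


# Widening a device from 64 nm to 160 nm at an assumed 16 nm channel length.
rel_wide = relative_sigma_vt(160, 16, 64, 16)

# Quadrupling the gate area halves sigma_Vt.
rel_quad = relative_sigma_vt(128, 32, 64, 16)
```

Under these assumptions, the 160 nm device shows about 0.63 times the σVt of the 64 nm device, which is why upsized transistors are more tolerant of RDF.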

Fig. 3 Schematic of 7T SRAM cell (Aly and Bayoumi 2007)

Fig. 4 Schematic of SEDF9T SRAM cell (Tu et al. 2012)

Fig. 5 Transistor-level circuit diagram of FD8T static RAM bitcell (Anh-Tuan et al. 2011)

Taking the aforementioned constraints into consideration, the PD and PU transistors in the core of the FD8T cell have been assigned widths of 160 nm and 64 nm, respectively, while a width of 64 nm has been assigned to all other transistors (see Fig. 5). Thus, a βratio of 2.5 and a γratio of 1 are maintained. All other cells used in this work have been apportioned suitable sizing (see Figs. 1, 3, 4) to ensure a fair comparison. Since the row-based RDT employed by the proposed cell is shared by every cell in a row, a relatively larger width (160 nm) has been assigned to it.
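The sizing check above amounts to two width ratios and two range tests. A minimal sketch follows; the function and variable names are hypothetical, and the ratios are taken as simple width ratios, which assumes equal channel lengths for all devices.

```python
def beta_ratio(w_pull_down_nm, w_access_nm):
    """Driver (pull-down) width over access width."""
    return w_pull_down_nm / w_access_nm


def gamma_ratio(w_pull_up_nm, w_access_nm):
    """Pull-up (PMOS) width over access width."""
    return w_pull_up_nm / w_access_nm


# FD8T widths quoted in the text (nm).
W_PD, W_PU, W_ACCESS = 160, 64, 64

beta = beta_ratio(W_PD, W_ACCESS)
gamma = gamma_ratio(W_PU, W_ACCESS)

# Ranges quoted in the text: 1.2 <= beta <= 3 and gamma < 1.8.
sizing_ok = 1.2 <= beta <= 3 and gamma < 1.8
```

For the quoted widths this gives β = 2.5 and γ = 1, both inside the recommended ranges.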

Process and device parameters are becoming less predictable because of aggressive technology scaling. Therefore, the design metrics of an SRAM cell are also becoming unpredictable, and the influence of process variations on the different design metrics needs to be investigated (Islam and Hasan 2012a). MOSFET parameters (such as L, W, NDEP, tOX, etc.) and environmental parameters (such as temperature and supply voltage) have been given a 10% Gaussian variation at 3σ to generate the MOSFET model parameters for 5000 Monte Carlo samples (Pal and Islam 2016b).
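The sampling step can be sketched as follows. This is only an illustration of how such samples might be drawn, not the SPICE setup used in this work: it assumes that "10% variation at 3σ" means 3σ equals 10% of the nominal value, and the function name, seed and use of Python's standard random module are all hypothetical choices.

```python
import random


def sample_parameter(nominal, rel_variation=0.10, n_sigma=3, n_samples=5000, seed=0):
    """Draw Gaussian samples whose n_sigma point equals rel_variation * nominal."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sigma = rel_variation * nominal / n_sigma
    return [rng.gauss(nominal, sigma) for _ in range(n_samples)]


# Example: 5000 samples of a 0.7 V supply voltage.
vdd_samples = sample_parameter(0.7)
vdd_mean = sum(vdd_samples) / len(vdd_samples)
```

Each varied parameter (L, W, NDEP, tOX, temperature, VDD) would be sampled in the same way and passed to the simulator for one Monte Carlo run per sample set.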

Capacitance plays a very important role in determining various design parameters of a static RAM bitcell, such as read/write access time and power. Therefore, assuming an array size of 256 × 16, the capacitance associated with the BLs, WL, etc. of every cell has been estimated for the simulations.

3.1 Read stability

Read SNM is the smallest magnitude of noise voltage which is capable of flipping data stored in a bitcell while reading (Nayak et al. 2017). Consequently, it is an estimate of the cell’s stability during read operation.

Read SNM is estimated as given in Pal et al. (2019b) (see Fig. 6a). The read stabilities of the comparison cells at different VDD are shown in Fig. 7. The FD8T cell exhibits the lowest RSNM. This is because the FD8T cell is prone to frequent read upsets, as it is basically a traditional 6T cell with an extra NOT gate (Pal and Islam 2016b). 7T exhibits a slightly higher RSNM than FD8T (Pal et al. 2019a).

Fig. 6 (a) RSNM of various cells and (b) butterfly curves of row-half-selected SE9T cell during hold, read and write operations @ VDD = 0.7 V

Fig. 7 RSNM values of comparison cells at different VDD

Amongst all comparison cells, the SEDF9T and SE9T cells show the highest RSNMs. This can be attributed to the read decoupling technique employed by these cells, which physically isolates the storage nodes from the bitline, prevents capacitive noise from coupling into them and consequently eliminates the possibility of read upsets (Sharma et al. 2018; Tu et al. 2012). Thus, from Table 1, which provides the RSNM values of the comparison bitcells at VDD = 700 mV, the SE9T cell is found to exhibit 3.36×/2.87× higher RSNM than FD8T and 7T, respectively.

Table 1 Comparison among different SRAM cells @ VDD = 0.7 V

3.2 Read access time (TRA) and read current (IREAD)

The read delay (TRA) for cells employing differential reading schemes is estimated as mentioned in Pal et al. (2019d, e, f, g) while the same for single-ended reading cells is estimated as mentioned in Pal et al. (2019b, 2020a).

The TRA of the comparison bitcells at various VDD values is illustrated in Fig. 8, from which it can be seen that the differential-reading FD8T and 7T achieve the shortest TRA. However, this is obtained at the expense of degraded read stability, as detailed in the previous subsection.

Fig. 8 TRA values of comparison cells at different VDD

In contrast, the TRA of single-ended cells like SEDF9T and SE9T is relatively longer. This is because they have more transistors in their read path. In addition, the read delay is further lengthened because their read buffers experience a stronger body effect.

Consider the read mode in which ‘Q’ and ‘QB’ hold ‘0’ and ‘1’, respectively. In this case MN5, in the read path, becomes conductive. The transistor RDT is turned ON by the active RWL (see Fig. 1). A positive voltage is built up at the intermediate node (‘X1’) between MN4 and MN5, since BL is high. The voltage at X1 initially rises to 173 mV and then gradually falls to 160 mV at the point of reading, when the bitline voltage reaches (VDD − 50 mV) (see Fig. 9). Thus, the VBS (body-to-source voltage) of transistor MN4 becomes negative. Consequently, the drivability of the device is diminished and BL discharges slowly. On the contrary, the node QB of the FD8T cell records a lower increase in voltage, from 0 to 103 mV, due to the voltage division effect. Therefore, BL discharges at a faster rate and the corresponding TRA is shorter than that of SE9T. SE9T exhibits a shorter TRA than SEDF9T.
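The loss of drivability caused by a raised source node can be illustrated with the classic long-channel body-effect expression Vt = Vt0 + γ(√(2φF + VSB) − √(2φF)). This is a textbook approximation and not the PTM model used in the simulations; the body-effect coefficient γ = 0.4 V^0.5 and surface potential 2φF = 0.8 V below are assumed, illustrative values, and the function name is hypothetical.

```python
import math


def vt_with_body_bias(vt0, v_sb, gamma=0.4, two_phi_f=0.8):
    """Long-channel body-effect model (assumed parameters):
    Vt = Vt0 + gamma * (sqrt(2*phi_F + V_SB) - sqrt(2*phi_F))."""
    return vt0 + gamma * (math.sqrt(two_phi_f + v_sb) - math.sqrt(two_phi_f))


# X1 sits at about 160 mV at the point of reading, so V_SB of MN4 is about 0.16 V.
# Passing vt0 = 0 returns only the threshold shift caused by the body bias.
vt_shift = vt_with_body_bias(0.0, 0.16)
```

Any positive shift of this kind raises the threshold voltage of MN4 and reduces its overdrive, which is consistent with the slower bitline discharge described above.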

Fig. 9 Voltage at node X1 of SE9T cell during read operation

The read current (IREAD) of the various cells at different VDD is compared in Fig. 10. Given that TRA is inversely proportional to the read current (Pal et al. 2019b), the proposed cell concedes a penalty in read delay due to its relatively small read current. For the same reason, the SEDF9T exhibits a smaller IREAD than the SE9T cell.
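The inverse relation between read delay and read current can be sketched with a first-order, constant-current estimate, t ≈ CBL·ΔV / IREAD. The bitline capacitance (17 fF) and the sensing margin (50 mV) are values quoted elsewhere in this paper; the 10 µA read current is a hypothetical placeholder, and the constant-current assumption ignores the variation of IREAD as BL discharges.

```python
def read_delay_estimate(c_bl_farads, delta_v_volts, i_read_amps):
    """First-order time for a constant current to discharge C_BL by delta_v."""
    return c_bl_farads * delta_v_volts / i_read_amps


# 17 fF bitline, 50 mV sensing margin, hypothetical 10 uA read current.
t_read = read_delay_estimate(17e-15, 50e-3, 10e-6)

# Doubling the read current halves the estimated delay.
t_read_fast = read_delay_estimate(17e-15, 50e-3, 20e-6)
```

With these numbers the estimate is 85 ps; the value is only meant to show the scaling and is not a result of this work.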

Fig. 10 Read current values of comparison cells at different VDD

With downscaling of the supply voltage, the performance of submicron SRAM cells is severely limited by increasing process variations (Pal et al. 2018; Ahmad et al. 2016). Therefore, it is necessary to ensure that the cell operates robustly under harsh conditions. As the SE9T and SEDF9T cells are read-decoupled in nature, they show significantly lower variability in terms of both TRA and IREAD than the 7T and FD8T cells.

From Fig. 11, which shows the TRA distribution plots of SE9T and 7T at VDD = 700 mV, it can be seen that SE9T exhibits a 1.15× tighter spread in TRA than 7T. Furthermore, SE9T also shows a 1.06× narrower spread in TRA than the FD8T cell (see Fig. 12). From Fig. 13, which compares the IREAD distributions of SE9T and FD8T at VDD = 700 mV, it can be observed that our bitcell exhibits a 1.38× tighter spread in IREAD than the FD8T cell, as well as a 1.54× narrower spread in IREAD than the 7T cell (see Fig. 14). So, the SE9T cell operates robustly under widespread PVT variations.
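A spread comparison of this kind can be computed from Monte Carlo samples as sketched below. The text does not define the variability metric explicitly, so the sketch assumes the common choice of standard deviation divided by mean (σ/μ); the function names and the small sample lists are hypothetical.

```python
import statistics


def variability(samples):
    """Coefficient of variation: sample standard deviation divided by mean."""
    return statistics.stdev(samples) / statistics.mean(samples)


def spread_ratio(samples_a, samples_b):
    """How many times tighter distribution b is than distribution a."""
    return variability(samples_a) / variability(samples_b)


# Hypothetical toy data: same mean, b has half the standard deviation of a.
ratio_ab = spread_ratio([8.0, 10.0, 12.0], [9.0, 10.0, 11.0])
```

In practice the two lists would hold the 5000 simulated TRA or IREAD values of the two cells being compared.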

Fig. 11 TRA distribution plot of SE9T and 7T @ 700 mV supply voltage

Fig. 12 Variability in TRA of different cells at various VDD values

Fig. 13 Read current distribution plot of SE9T and FD8T @ 700 mV supply voltage

Fig. 14 Variability in IREAD of different cells at various VDD values

3.3 Analysis of writing capability

The capability of a static random-access memory bitcell to complete a write operation successfully is estimated by the WSNM. In other words, it signifies the ability of the bitcell to pull the ‘1’ storing node below the switching voltage (VM) of the inverter that holds ‘0’, so that the saved value flips (Pal and Islam 2016b).

Write SNM is calculated graphically as mentioned in Pal et al. (2019b). Figure 15 shows the WSNM of the various cells at VDD = 700 mV. As is evident, the single-ended SEDF9T shows the lowest WSNM, as mentioned in Pal et al. (2019b). Since the FD8T employs a differential writing scheme and its write path consists of fewer transistors, it exhibits a relatively higher WSNM than SEDF9T. Although single-ended in nature, the SE9T and 7T cells exhibit considerably higher write ability than FD8T due to the use of the feedback-cutting technique. Since the proposed cell uses a TG as one of its access devices in addition to MN4, there is no voltage drop across the TG during the write operation. Consequently, even though the 7T has a single transistor in its access path, the SE9T and 7T cells exhibit equal WSNM. Therefore, our bitcell exhibits 7.00×/1.50× higher WSNM than SEDF9T/FD8T (see Table 1).

Fig. 15 Estimation of write ability of various cells @ VDD = 700 mV

3.4 Write access time (TWA)

FD8T shows the shortest TWA (see Fig. 16). This is due to its dual-bitline writing scheme. The write delay of single-ended cells depends on whether ‘0’ or ‘1’ is being written. ‘1’ writing is particularly difficult to perform and takes more time to complete the write operation (Pal et al. 2019b). Consequently, SEDF9T shows longer delay than FD8T.

Fig. 16 Write access time of different cells at various supply voltages

The write delay of the 7T cell for writing ‘1’ to storage node Q is considerably lengthened, as compared with the SEDF9T cell, owing to the use of the feedback-cutting method (Pal et al. 2019b). Since the proposed cell employs a feedback-cutting mechanism similar to that of 7T and has a TG in its access path, it shows the same write delay as 7T. Therefore, from Table 1, which provides the TWA of the various cells at a supply voltage of 0.7 V, the SE9T cell shows 1.12× shorter and 2.5× longer TWA than SEDF9T and FD8T, respectively.

3.5 Dynamic or active power consumption

The power consumed by an SRAM cell due to the charging/discharging of capacitance is defined as dynamic power (PDYNAMIC) (Morifuji et al. 2006). The overall PDYNAMIC consumption is estimated as the sum of the power dissipated by the assertion of the various control signals and the power dissipated by the charging/discharging of the bitlines. PDYNAMIC is directly proportional to the effective capacitance. Thus, a higher PDYNAMIC is required to drive a control line if the capacitance associated with it is larger. Accordingly, an array size of 256 × 16 has been assumed and the approximate capacitance associated with the BL, WL, etc. of every cell has been estimated for simulation purposes. An estimated capacitance of 17 fF has been assigned to the BL of the SE9T cell, while the row-based WL/WE/WEB/RWL and the column-based CSL signals have been assigned estimated capacitances of 0.5 fF/0.7 fF/0.7 fF/0.2 fF and 10 fF, respectively.
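To show how these line capacitances translate into switching cost, the sketch below evaluates the energy drawn from the supply when a line is charged from GND to VDD, E = C·VDD². The capacitance values are the ones quoted above for the SE9T cell; the dictionary and function names are hypothetical, and a full rail-to-rail swing on every line is an assumption made for simplicity.

```python
# Estimated line capacitances for the SE9T cell in a 256 x 16 array (from the text).
LINE_CAP_F = {
    "BL": 17e-15,
    "WL": 0.5e-15,
    "WE": 0.7e-15,
    "WEB": 0.7e-15,
    "RWL": 0.2e-15,
    "CSL": 10e-15,
}


def switching_energy(cap_farads, vdd_volts):
    """Energy drawn from the supply to charge a line from GND to VDD: C * VDD^2."""
    return cap_farads * vdd_volts ** 2


# Per-line energy for one full charging event at VDD = 0.7 V.
energies = {name: switching_energy(cap, 0.7) for name, cap in LINE_CAP_F.items()}
```

At 0.7 V the bitline costs about 8.33 fJ per full charging event, far more than any row-based control line, which is why bitline activity dominates the dynamic power.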

It is seen from Fig. 17 that the 7T and SE9T cells consume significantly less write power than FD8T. This is because of their single-ended writing scheme, which implies that the BL need not be discharged for every write operation and that their αSWITCHING is maintained below 0.5 (Aly and Bayoumi 2007). Given that the majority of dynamic power consumption is due to the charging/discharging of bitlines (Wang et al. 2016), the write power consumed by 7T and SE9T is lower than that of cells like FD8T, which employ dual-bitline structures. The WE, WEB and WL control signal lines are row-based and hence are shared by fewer cells (16 cells) than the column-based signal lines (256 cells). Therefore, the capacitance associated with them is smaller, which results in lower power consumption.

Fig. 17 Write power consumed by different cells at various supply voltages

On the contrary, the CSL signal is columnar in nature, which implies that the capacitance associated with it is larger. This higher capacitance gives rise to a higher write power consumption (PWRITE) for the SE9T cell than for 7T, which uses only row-based control signal lines. The write operation of SEDF9T requires discharging of BL to GND for both write ‘1’ and write ‘0’ operations. As a result, it consumes higher power than the proposed cell. Thus, SE9T consumes considerably lower write power than FD8T and SEDF9T while consuming slightly higher write power than 7T (see Table 1).

Figure 18 shows the read power (PREAD) consumption of the different cells at various supply voltages. It is clear from Fig. 18 that the FD8T and 7T cells, which use a double-ended reading scheme, consume higher PREAD than SEDF9T and SE9T, which use a single-ended reading scheme.

Fig. 18 Read power consumed by different cells at various supply voltages

Since the control signals of 7T are row-based, the capacitance associated with them is smaller. This results in a lower PREAD than that of the FD8T cell, whose column-based CSL has a larger capacitance. Owing to their single bitline, the SE9T and SEDF9T cells consume lower PREAD. SEDF9T consumes slightly lower read power than SE9T, in which multiple row-based control signals such as WL, RWL, WE and WEB are asserted.

3.6 Static power dissipation

Since leakage power dissipation is a significant concern in submicron technologies, its reduction is one of the major aims of any SRAM design (Chiu and Hu 2014). The hold power, or leakage power dissipation, of the various cells at different VDD is illustrated in Fig. 19. The 7T bitcell exhibits the highest leakage power dissipation, as mentioned in Pal et al. (2019b).

Fig. 19 Hold power consumed by different cells at various supply voltages

The leakage power dissipation exhibited by single-ended cells like SEDF9T and SE9T is significantly lower compared to dual ended FD8T and 7T (Pal et al. 2019b). Further reduction in bitline leakage is obtained in these cells because of transistor stacking in the read path.

From Table 1, which shows the static power consumption at a 700 mV supply voltage, we see that the static power consumption of the SEDF9T and SE9T cells is nearly equal. However, the proposed SE9T cell consumes 2.92 times and 1.04 times lower hold power than the 7T and FD8T cells, respectively.

3.7 Mitigation of half-select disturbance

Figure 2 shows the memory architecture of the proposed SE9T cell. If ‘1’ needs to be stored at ‘Q’ of the bitcell at the top left of the memory architecture, the cell is fully selected by setting the control lines as specified in Sect. 2.1 and the write operation is successfully completed. All other cells in the same row share the row-based WL_0/WLB_0 and WE_0, set at VDD/GND and GND respectively, with the selected cell. Consequently, their access TG is turned ON while MN6 is turned OFF. Thus, these cells are row half-selected. However, since CSL is columnar in nature, none of the row half-selected cells shares CSL_0, set at VDD, with the selected cell. Consequently, their respective CSL signals, kept at GND, maintain the access transistor MN4 in the OFF state and separate the storage nodes from the bitlines, which avoids miswriting in row half-selected cells.

Given the severe leakage in submicron technologies, one may think that miswriting may still take place. Figure 20 shows the simulated node voltages (Monte Carlo simulations with 5000 samples) of a row half-selected cell for both cases, i.e., during write ‘1’ at ‘Q’ (Fig. 20a) and write ‘0’ at ‘Q’ (Fig. 20b), over a much longer time than TWA. It can be seen that the ‘Q2’ voltage does not rise or fall to the switching threshold, VM, of the inverter. Therefore, the stored data are preserved. This is further confirmed by the butterfly curve shown in Fig. 6b, which is obtained for a cell that is row half-selected during the write ‘1’ operation. The figure illustrates that the cell exhibits a considerable SNM to resist flipping of the stored data.

Fig. 20 Simulated node voltages of row-half selected SE9T cell while (a) writing ‘1’ and (b) writing ‘0’ to ‘Q’

Cells that are neither in the same row nor in the same column as the selected cell are unselected, as their respective WL/WLB and CSL signals are deactivated. All the cells in the same column as the selected cell share CSL_0, set at VDD, with it. This turns their access transistor MN4 ON and, as a result, these cells are column half-selected. However, given that each of the column half-selected cells is located in a different row, they do not share WL_0/WLB_0, set at VDD/GND, with the selected cell. Consequently, miswriting is prevented in the absence of any write path, as the access TG of each column half-selected cell is turned OFF by its respective WL/WLB signals, set at GND/VDD.

Similarly, during the read operation, misreading in half-selected cells is prevented by the row-based RWL and the columnar CSL. This is reflected by the butterfly curve of the row half-selected cell (see Fig. 6b), which exhibits a significant SNM. During the hold operation, the butterfly curve of the row half-selected cell shows that it is capable of preserving the stored data (see Fig. 6b). Thus, the proposed SE9T cell is free from half-select disturbance.
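The cross-point selection argument of this subsection can be checked with a small behavioural sketch: a write path exists only where the column-based CSL (MN4) and the row-based WL/WLB (TG) are both active. The function names are hypothetical, and the array is taken as 256 rows by 16 columns, following the sharing figures given in Sect. 3.5.

```python
def classify_cell(row, col, sel_row, sel_col):
    """Label a cell relative to the cell being written."""
    if row == sel_row and col == sel_col:
        return "selected"
    if row == sel_row:
        return "row half-selected"
    if col == sel_col:
        return "column half-selected"
    return "unselected"


def write_path_open(row, col, sel_row, sel_col):
    """BL reaches the cell only if both MN4 (CSL) and the TG (WL/WLB) conduct."""
    mn4_on = col == sel_col  # CSL is column-based
    tg_on = row == sel_row   # WL/WLB are row-based
    return mn4_on and tg_on


ROWS, COLS = 256, 16

# Writing to the top-left cell (row 0, column 0): list every cell with an open write path.
open_cells = [
    (r, c) for r in range(ROWS) for c in range(COLS) if write_path_open(r, c, 0, 0)
]
```

Only the fully selected cell ends up with an open write path; every row or column half-selected cell has one of the two series access devices OFF.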

3.8 Layout area

The layout views of the SE9T, 7T, FD8T and SEDF9T cells are displayed in Fig. 21. They have been designed as mentioned in Pal et al. (2019b). All the cell areas have been normalized with respect to the proposed SE9T, where the row-based RDT has been excluded and a higher metal layer (not shown) is used to connect node X2 to the transistor. However, the RDT, if sketched within the row pitch at the leftmost side of the row, causes negligible area overhead because all the cells of that row share it. The areas consumed by the different cells considered for comparison are listed in Table 1. The 7T cell consumes marginally less area (0.81×) than SE9T because it has fewer transistors. For SEDF9T, the transistors that decouple the storage node during the read operation and form the read path fit exactly in the space left by the upsized PD, so it consumes less area than SE9T. The extra inverter causes FD8T to consume more area, and the TG causes SE9T to consume a larger area than the other cells. However, the extra PMOS (MP3) in FD8T is not directly connected to the core inverter and requires a separate (larger) n-well, which results in relatively more area, whereas in SE9T the extra PMOS (MP3) is directly connected to the internal cross-coupled inverter.

Fig. 21 Layout view of SE9T

4 Comparison summary

Table 1 reports the comparison of SE9T with the previously discussed FD8T, 7T and SEDF9T cells along with three additional designs: feedback-cutting 7T (FC7T) (Ensan et al. 2019), ultra-low-power 9T (ULP9T) (Moghaddam et al. 2016) and feedback-cutting 11T (FC11T) (Ensan et al. 2018). For a fair comparison of simulation results, all the cells were assigned appropriate sizing and capacitances. As is evident from Table 1, the SE9T cell shows higher WSNM than most comparison cells due to the use of the feedback-cutting method. In addition, the proposed cell exhibits considerably higher RSNM than FC7T, 7T and FD8T, and the same RSNM as other read-decoupled cells like SEDF9T and FC11T. Although the ULP9T is read-decoupled in nature, depending on the data stored in its storage nodes, the stacked PMOS in its core cell may be turned OFF, which disconnects the cross-coupled inverters from VDD and deteriorates its ability to retain the stored data. Consequently, it exhibits a poor RSNM. Since the SE9T cell employs a single-ended writing scheme, it exhibits longer TWA than FD8T and ULP9T. The proposed bitcell exhibits a marginally longer write time than SEDF9T owing to the use of the feedback-cutting mechanism. However, owing to the presence of a TG in its access path, it shows a delay similar to that of other single-ended writing cells such as FC7T, 7T and FC11T, which employ feedback-cutting techniques as well. Although SE9T exhibits a longer TRA than FC7T, 7T and FD8T, it shows a significantly shorter TRA than the read-decoupled SEDF9T and FC11T, while exhibiting a slightly longer TRA than ULP9T.

The single-ended writing FC7T, 7T and SE9T cells consume considerably lower write power than differential writing cells such as FD8T and ULP9T. The assertion of its column-based CSL causes SE9T to consume higher write power than the FC7T and 7T cells. Moreover, the proposed cell consumes significantly lower read power than differential reading cells like 7T and FD8T. In addition, its leakage power dissipation is lower than that of most of the comparison cells. Although the ULP9T exhibits lower leakage power due to power gating by the stacked PMOS transistor in its core cell, this improvement is obtained at the expense of a severely degraded hold stability (HSNM), as its cross-coupled inverters may be disconnected from VDD if the stacked PMOS is turned OFF based on the data stored in its storage nodes.

The power delay product (PDP) is an important design metric that reflects the combined effect of delay and power consumption during the read/write operations of an SRAM cell; the lower it is, the better. On the other hand, the stability of the cell is quantified by its RSNM, WSNM and HSNM, which should be as high as possible. In addition, the effective design of an SRAM cell requires efficiency in terms of layout area as well. Therefore, to comprehensively assess the performance of the different cells used in this work, a design metric called the SNM per unit area to PDP ratio (SAPR) has been used, as specified in Ahmad et al. (2016). It is given by:

$$ SAPR = \frac{RSNM \times WSNM \times HSNM}{R_{PDP} \times W_{PDP} \times Area} \tag{2} $$

where RPDP and WPDP are the PDP obtained during read and write operations, respectively. The SAPR of various cells normalized to SE9T, at VDD = 0.7 V, is reported in Table 1. As is evident, the SE9T, 7T and FC7T cells exhibit considerably higher SAPR than most other cells owing to their single-ended nature which reduces overall power consumption, the use of feedback-cutting technique which enhances write ability as well as lower area consumption. However, the proposed cell exhibits the highest SAPR due to read-decoupling technique which enhances its read stability, the use of feedback-cutting mechanism which enhances its writing ability as well as transistor stacking in the read path which reduces leakage power dissipation significantly.
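Equation (2) and the normalization to a reference cell can be written as a short helper. The function names are hypothetical, and the numeric values below are placeholders chosen only to exercise the formula; they are not the Table 1 results.

```python
def sapr(rsnm, wsnm, hsnm, r_pdp, w_pdp, area):
    """SNM per unit area to PDP ratio, following Eq. (2)."""
    return (rsnm * wsnm * hsnm) / (r_pdp * w_pdp * area)


def normalized_sapr(cell, reference):
    """SAPR of a cell divided by the SAPR of a reference cell."""
    return sapr(**cell) / sapr(**reference)


# Placeholder metrics (arbitrary units), not values from this work.
reference_cell = dict(rsnm=0.2, wsnm=0.3, hsnm=0.25, r_pdp=1.0, w_pdp=1.0, area=1.0)
smaller_cell = dict(reference_cell, area=0.5)  # same metrics, half the area

self_norm = normalized_sapr(reference_cell, reference_cell)
area_gain = normalized_sapr(smaller_cell, reference_cell)
```

Because the metric is a pure ratio, halving the area or either PDP doubles the SAPR, while any gain in RSNM, WSNM or HSNM raises it proportionally.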

5 Conclusion

We propose a power-aware, half-select disturbance free 9T (SE9T) cell. It exhibits improved read stability owing to its decoupled single-ended read operation, while the feedback-cutting technique enhances its write ability. A reduction in PDYNAMIC consumption is achieved due to the reduced activity factor of bitline switching, as the cell is single-ended. Leakage power dissipation is also curtailed due to the stacking of transistors in the read path. The proposed circuit exhibits robust behavior even when subjected to severe process variations. Thus, the proposed SE9T cell can be chosen for low-power SRAM design for WBAN sensor nodes.