1 Introduction

The use of static random access memories (SRAMs) in system-on-chip (SoC) designs is continuously increasing to improve logic performance [1]. The demand for portable applications such as mobile phones, laptops, and medical equipment, and for battery-operated devices such as internet-of-things (IoT) devices and wireless sensor networks (WSNs), requires low power consumption in SoCs [2]. Hence, the design of low-power SRAMs is a priority. Downscaling the supply voltage (VDD) is the most efficient way to curtail the total power consumption of an SRAM cell and thereby lengthen battery life. This is because the dynamic power and leakage power consumptions reduce quadratically and linearly with VDD reduction, respectively [3]. Although an SRAM cell dissipates lower power in the near-/sub-threshold region (where VDD is slightly higher/lower than the threshold voltage (Vth) of the transistor), it must face increased process, voltage, and temperature (PVT) variations, reduced cell stability, degraded voltage margin, and prevailing leakage current in this region [4, 5]. In advanced technology nodes coupled with VDD reduction, random dopant fluctuations (RDF) increase the Vth variations and lead to on-chip SRAM malfunction during the read and write operations because there is an exponential relationship between the SRAM static noise margin (SNM) and Vth in the sub-Vth region [2, 5]. Furthermore, in scaled technology nodes, SRAM cells become susceptible to soft-errors. Soft-errors can take the form of single-cell upsets (SCUs), in which a single bit of data is affected, and multi-cell upsets (MCUs), in which multiple bits of data are affected [1]. Soft-errors (SERs) are often considered when designing for space applications and other high-radiation environments [6]. This is because the minimum charge deposited in a sensitive node of an SRAM cell that results in a data flip, which is called the critical charge (Qcrit), reduces with process technology scaling along with VDD scaling when the cell operates in a high-radiation space environment, such as under high-energy \(\alpha\) radiation [7].

For the conventional 6-transistor SRAM cell (hereafter called C6T), read SNM (RSNM) and write SNM (WSNM) are two main conflicting design metrics [8]. The C6T cell shows poor read stability at low VDD; hence, data stored in the cell may flip during the read operation, resulting in a read upset [9]. At even lower VDD, write failure may also occur in the C6T cell because it cannot maintain the device strength ratios at such a VDD [10]. The C6T cell suffers from read-disturbance and half-select disturb issues [11]. In addition, it shows larger variability in nanoscale technologies due to severe PVT variations [12]. Since PVT variations increase the Vth fluctuation in ultrashort-channel devices, the C6T cell and its variants cannot be operated at lower VDD without parametric and functional failures, causing yield loss [13]. There is thus a need to develop circuit-level techniques to overcome these issues. The most common approaches available in the literature are: the read-decoupling technique, the feedback-cutting technique [14,15,16,17,18,19,20], the power-gating technique [2, 21,22,23], floating the cell VSS [24, 25], single-ended operations [22, 26,27,28], Schmitt-trigger (ST)-based SRAM design [3, 15, 17,18,19, 22, 28,29,30], stacking of transistors [31], bit-interleaving [14, 16, 19, 20, 29, 32], negative bitline [26, 33], and wordline boosting [33,34,35]. The well-known technique for RSNM improvement is read-decoupling, in which the bitlines are fully decoupled from the internal storage nodes of the cell during the read operation. This technique raises the RSNM to as high as the hold SNM (HSNM) at the cost of extra transistors or bitline leakage in the read path.

2 Related works: design and challenge

To overcome the aforementioned issues and challenges, several configurations of SRAM cells have been proposed in the literature. The ST-based SRAM cells proposed in [3, 30] improve both the RSNM and WSNM simultaneously but suffer from the read-disturbance issue and high leakage current. The 7 T and transmission gate (TG)-based 9 T cells proposed in [36, 37], respectively, employ an additional NMOS access transistor inside their cross-coupled inverter pair to eliminate the writing ‘1’ issue in single-ended SRAM cells. This NMOS device cuts off one of the feedback paths of the cross-coupled inverters during the write operation and leads to an improvement in the WSNM. However, these SRAM cells exhibit high leakage power because the leakage current at their half-cell is much higher than its counterpart in the C6T SRAM cell. Furthermore, these SRAM cells suffer from poor RSNM due to the lack of the read-decoupling method. The 9 T SRAM cell proposed in [38] utilizes a fully differential read-disturb-free operation; hence, the RSNM improves. However, this SRAM cell incurs a penalty in the read delay and dynamic power due to doubling the number of access transistors connected to the same bitlines. In order to exploit the column-interleaving structure, 9 T [26, 33, 39] and 10 T [40,41,42] SRAM cells have been proposed. These cells use a separate path for reading the data stored in the cell, which enhances the cell RSNM. However, the WSNM degrades due to the presence of two series-connected NMOS access transistors in their write path. Two bit-interleaving SRAM cells were proposed in [43, 44] to enhance both RSNM and WSNM by utilizing an isolated read path and a data-dependent feedback-cutting write-assist scheme, respectively. However, these cells incur a penalty in the write delay and dynamic write power. This is due to the reduced voltage swing at the gate of the write-access transistors and the differential writing structure, respectively. Furthermore, the cell in [43] achieves a low read current due to the presence of three series-connected NMOS access transistors in its read path. The 12 T SRAM cell proposed in [45] contains an additional pair of transmission gates (TGs) in its access path to refresh the stored data. However, as the gates of the PMOS devices of the TGs are connected to the storage nodes of the core cell, one of the PMOS devices is always in the ON state, and as a result the cell consumes significantly higher leakage power.

Hence, research is still in progress on circuit-level techniques for SRAM cell design that prevent leakage and improve the RSNM and WSNM at scaled VDD. Furthermore, the simple cross-coupled inverter pair topology, as in the C6T SRAM cell, exhibits less immunity to process variations at lower VDD.

To address the above-mentioned problems and challenges, we propose a bit-interleaved 12 T SRAM cell that utilizes the write data-dependent feedback-cutting write-assist and read-decoupling techniques. The proposed 12 T SRAM cell, which we hereafter call the WFC12T SRAM cell, uses an inverter-based AND gate, which offers full voltage swing at its output compared to the NMOS device used in the SRAM cells designed in [8, 43, 44], to control the write-access transistors and eliminate write half-select issues, reducing the write failure probability under harsh PVT variations. It also leads to an improvement in WSNM and write delay. The differential writing structure, along with the write data-dependent feedback-cutting write-assist technique used in the proposed WFC12T SRAM cell, improves the WSNM and write delay. A TG placed inside the cell core of the SRAM proposed in [46] to improve the WSNM increases the write delay due to the formation of two cascaded inverters. Our proposed design has no series-connected access transistors in its write path and uses a single-ended bitline read operation, which results in greater improvements in the WSNM, write delay, and dynamic read power consumption compared to the 12 T SRAM cells presented in [1, 47]. To summarize, the main characteristics of the proposed WFC12T SRAM cell are as follows.

  1) Offering better voltage transfer characteristics (VTCs) of the left/right half-cell by NMOS stacked structure.

  2) Improving the RSNM by the isolated read access path and sharp VTC.

  3) Enhancing the cell’s WSNM using feedback cutting-aware differential write operation.

  4) Low leakage power due to the high VGND in its read access buffer and the presence of stacked transistors in its read buffer and inside the cell core.

  5) Low dynamic read power and moderate dynamic write power consumption.

  6) More reliability when subjected to grave PVT variations.

  7) Supporting bit-interleaving architecture.

  8) Reducing multi-bit soft-errors by offering high Qcrit, which indicates its suitability for space applications.

The rest of the paper is organized as follows. Section 3 introduces the proposed WFC12T cell and describes its different operations. The simulation results and the discussion are presented in Sect. 4. Finally, Sect. 5 concludes this paper.

3 The proposed WFC12T cell and its operations

Figure 1 shows the schematic of the WFC12T cell along with its timing diagram. In this cell, the column select line (CSL), write bitline (WBL), complementary write bitline (WBLB), and read bitline (RBL) are column-based, while the write wordline (WWL), read wordline (RWL), and virtual ground (VGND) are row-based. The access transistors, M7/M8, are activated by the local write wordline (LWWL), which is the output of the INV-based AND gate formed by MP1/MN1. The M9/M10 are gated by Q/RWL and form the read access buffer that isolates the storage nodes Q/QB from RBL during the read operation. Two additional NMOS transistors, M2/M5, which are driven by WBLB/WBL, are placed in series with the pull-up (M3/M6) and pull-down (M1/M4) transistors inside the cell core. Subsections 3.1 to 3.3 describe the different operations of the WFC12T cell.

Fig. 1 Schematic of the WFC12T SRAM cell along with its timing diagram

3.1 Hold operation

During hold operation or standby mode, the WWL and CSL are kept at VDD and GND, respectively, to set LWWL at GND and turn off the access transistors, M7/M8. The RWL is kept at GND to deactivate the M9 transistor. To keep M2/M5 on and retain the data stored in the cell, the WBL, WBLB, and VGND are held at VDD. The RBL is left floating.

3.2 Read operation

Before read access, the RBL is precharged to VDD. The WBL and WBLB are kept at VDD to turn on the M2/M5. The CSL and WWL are set at GND and VDD, respectively, to produce output, LWWL, at a low level. Since the LWWL is at GND, the access transistors, M7/M8, are off. The RWL is set to VDD and hence M9 is turned on, while VGND is kept at GND to provide a large read current for discharging the RBL voltage. Depending on what data storage node QB stores, either the RBL is discharged to the ground through the path comprising M9/M10/VGND or remains at its precharged high value (see Fig. 2(a)).

Fig. 2 a Read ‘0’ operation. b Write ‘0’ operation

3.3 Write operation

The write operation begins with the WWL and CSL being set to GND and VDD, respectively, in order to set LWWL at VDD. Hence, the access transistors, M7/M8, are turned on. The RBL and VGND are maintained at VDD and GND, respectively, while the RWL is set to GND. Depending on the data to be written to Q/QB, the WBL/WBLB is set at VDD/GND or GND/VDD. For instance, consider the case in which the logical value ‘0’ is to be written to a node Q that stores ‘1’. In this case, WBL/WBLB is kept at GND/VDD and hence M2/M5 is turned on/off. Since M5 is off, there is no path from QB to ground. Therefore, WBLB successfully charges QB up to ‘1’ through M8. Then, the ‘1’ stored at QB completely turns off M3 and thus helps to discharge node Q to ground through the paths comprising M2/M1 and M7, successfully writing ‘0’ into Q (see Fig. 2(b)). Writing ‘1’ at Q is performed using a similar procedure.
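The control-signal levels stated in Sects. 3.1 to 3.3 can be summarized with a small Boolean sketch. It assumes that the INV-based AND gate behaves logically as LWWL = CSL AND (NOT WWL); the function and dictionary names are hypothetical and only illustrate the description above, not the transistor-level behavior.

```python
# Boolean sketch of the WFC12T control signals (True = VDD, False = GND).
# Assumption: the INV-based AND gate (MP1/MN1) is modelled logically as
# LWWL = CSL AND (NOT WWL), i.e. WWL is treated as an active-low signal.

def lwwl(wwl: bool, csl: bool) -> bool:
    """Local write wordline: high only when CSL is high and WWL is low."""
    return csl and not wwl

# Levels per operation as stated in the text. WBL/WBLB carry the data
# during a write and are held at VDD otherwise, so they are omitted here.
OPERATIONS = {
    "hold":  {"WWL": True,  "CSL": False, "RWL": False, "VGND": True},
    "read":  {"WWL": True,  "CSL": False, "RWL": True,  "VGND": False},
    "write": {"WWL": False, "CSL": True,  "RWL": False, "VGND": False},
}

def access_transistors_on(operation: str) -> bool:
    """True when M7/M8 conduct, i.e. when LWWL is high."""
    levels = OPERATIONS[operation]
    return lwwl(levels["WWL"], levels["CSL"])

for name in OPERATIONS:
    print(name, "-> M7/M8 on:", access_transistors_on(name))

# Half-selected cells under the same Boolean model:
row_half_selected = lwwl(wwl=False, csl=False)     # selected row, other column
column_half_selected = lwwl(wwl=True, csl=True)    # selected column, other row
print("half-selected cells opened:", row_half_selected or column_half_selected)
```

Under this model only the fully selected cell (WWL low and CSL high) opens its write-access transistors, which is consistent with the half-select-free behavior claimed for the cell.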

4 Simulation results and discussion

This section presents the simulation results and the discussion for the proposed cell and the cells selected for comparison, based on the setup described in Sect. 4.1. The circuit diagrams of the SRAM cells selected in this study are shown in Fig. 3. They include the fully differential 8 T (FD8T) [48], single-ended TG-based 9 T (TG9T) [46], single-ended disturb-free 9 T (SEDF9T) [26], differential self-refreshing logic-based 12 T (WWL12T) [45], and 12 T (Kim12T) [49] cells. Table 1 compares these cells with the proposed WFC12T cell based on the cell features.

Fig. 3 Schematic of SRAM cells, a FD8T [48], b TG9T [46], c SEDF9T [26], d WWL12T [45], e Kim12T [49]

Table 1 A structural and features comparison among various SRAM cells selected in this paper with the WFC12T cell

4.1 Simulation setup and transistor sizing

We used HSPICE with the 16-nm CMOS predictive technology model (PTM) [50] to analyze the main design metrics of the proposed WFC12T cell. The proposed cell is compared with the FD8T [48], TG9T [46], SEDF9T [26], WWL12T [45], and Kim12T [49] SRAM cells to assess its relative superiority in terms of important indicators. Furthermore, we have considered the impact of PVT variations on the SRAMs’ design metrics by employing the simulation setup described in [8]. Moreover, the VDD is linearly varied by ± 10% and ± 20% from the nominal value of 0.7 V.

Transistor sizing plays an important role in determining the behavior of an SRAM cell. The transistor sizing of the SRAM cells has been determined by satisfying the following three main conditions.

  1) Attention to read–write conflicts in the FD8T cell. To satisfy this condition, the size ratio of pull-down and access transistors (\(\beta_{ratio} = \beta_{pull - down} /\beta_{access} , \beta = C_{ox} W/L\)) must be between 1.2 and 3 [51]. The pull-up to access transistors size ratio (\(\gamma_{ratio} = \beta_{pull - up} /\beta_{access}\)), on the other hand, must be less than or equal to 1.8 [52]. Thus, we have chosen \(\beta_{ratio} = 1.33\) and \(\gamma_{ratio} = 0.67\) because these values strike a balance between the values of RSNM and WSNM [53].

  2) The fight against the impact of PVT variations in nanoscale technology. The Vth shift induced by random dopant fluctuation (RDF) shows an inverse dependence on the square root of the device area as follows [54]

    $$\sigma_{vt} \propto \frac{EOT}{\sqrt {W \times L} }$$
    (1)

    where EOT is the effective oxide thickness, and W and L are the effective channel width and length, respectively. From Eq. (1), it is observed that the simplest way to fight against variability issues is to increase the device area [13].

  3) All SRAMs have been sized accordingly for fair comparison [30].

Therefore, in the proposed design, we have assigned a width of 160 nm for MN1, M1, M2, M4, and M5 and a width of 80 nm for MP1, M3, and M6. Furthermore, we have chosen a width of 120 nm for the transistors M7 and M8. The read path transistors, M9 and M10, have been sized with a width of 120 nm (see Fig. 1). Other studied SRAM cells have been sized based on this strategy.
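As a quick arithmetic check of the sizing above, the following sketch recomputes the two ratios from the quoted widths and illustrates the area dependence of Eq. (1). It assumes equal channel length and \(C_{ox}\) for all devices, so that the \(\beta\) ratios reduce to width ratios, and a fixed EOT; the helper names are hypothetical.

```python
import math

# Widths quoted in the text for the WFC12T cell, in nm.
WIDTH_NM = {"pull_down": 160, "pull_up": 80, "access": 120}

def width_ratio(numerator_nm: float, denominator_nm: float) -> float:
    """Beta ratio under the assumption of equal L and C_ox (width ratio only)."""
    return numerator_nm / denominator_nm

beta_ratio = width_ratio(WIDTH_NM["pull_down"], WIDTH_NM["access"])
gamma_ratio = width_ratio(WIDTH_NM["pull_up"], WIDTH_NM["access"])

def relative_sigma_vt(width_scale: float, length_scale: float = 1.0) -> float:
    """sigma_Vt relative to a reference device, from sigma_Vt ~ EOT / sqrt(W * L).

    EOT is assumed unchanged, so only the area scaling matters.
    """
    return 1.0 / math.sqrt(width_scale * length_scale)

print(round(beta_ratio, 2), round(gamma_ratio, 2))   # 1.33 0.67
print(round(relative_sigma_vt(2.0), 3))              # 0.707 for a doubled width
```

Doubling the device width therefore lowers the RDF-induced \(\sigma_{vt}\) by roughly 29% under this proportionality, at the cost of cell area.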

4.2 Read access time analysis

The speed of the read operation of an SRAM cell is determined in terms of read access time or read delay (TRA). We have plotted the TRA of all SRAM cells at different VDD values in Fig. 4(a). The TRA of single-ended reading bitcells is defined as the time required to discharge the bitline voltage by 50 mV from its initial high precharged value [46, 55]. For differential reading bitcells, TRA is obtained as the time needed to reach a 50 mV voltage difference between the two bitlines [24]. The WFC12T cell uses the read access buffer and suffers from the larger body effect introduced by it. Hence, it needs a longer time (TRA) to discharge the bitline voltage. The node voltages of the proposed WFC12T cell during its read operation, assuming that nodes Q/QB store ‘0’/‘1’ at \(V_{DD} = 0.7\) V, are shown in Fig. 4(b). According to the initial logical value stored at node QB, M10 is turned on, and M9 is activated by setting RWL to VDD. Since the bitline is precharged to VDD before the read operation, a non-zero voltage develops at node X1 of the read buffer because of the voltage divider formed by M9 and M10. From Fig. 4(b), it is observed that the node X1 voltage reaches a maximum value of 0.1599 V and gradually decreases to 0.1546 V at the time TRA is recorded (i.e., the time at which RBL has discharged by 50 mV from its initial high precharged value). This causes a negative VBS (body-to-source voltage) of M10 and, as a result, increases Vth according to Eq. (2) [56].

$$V_{th} = V_{th0} - \lambda_{BS} V_{BS} - \lambda_{DS} V_{DS}$$
(2)

in which Vth0 is the initial threshold voltage, \(\lambda_{BS} > 0\) and \(\lambda_{DS} > 0\) are the body bias coefficient and the drain-induced barrier lowering (DIBL) coefficient, respectively, and VDS is the drain-to-source voltage. Due to the increase in Vth, the driving strength of the transistor reduces, and therefore RBL is discharged slowly. The WFC12T cell shows 1.37X longer and 1.69X shorter TRA than the FD8T and SEDF9T SRAM cells, respectively, at \(V_{DD} = 0.7\) V. The TG9T and SEDF9T SRAM cells employ the same read-decoupling technique with three series NMOS transistors, and therefore show the same TRA. The WFC12T cell shows the same TRA as the Kim12T and WWL12T SRAM cells due to the same read path.
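The effect of the raised node X1 on the threshold voltage can be illustrated numerically with Eq. (2). In the sketch below, only the 0.1546 V value comes from the text; the values of Vth0, \(\lambda_{BS}\), and \(\lambda_{DS}\) are hypothetical placeholders rather than parameters of the 16-nm PTM, and the transistor is assumed to have its body at ground and its source at X1.

```python
def threshold_voltage(vth0: float, vbs: float, vds: float,
                      lambda_bs: float, lambda_ds: float) -> float:
    """Eq. (2): Vth = Vth0 - lambda_BS * V_BS - lambda_DS * V_DS."""
    return vth0 - lambda_bs * vbs - lambda_ds * vds

# Hypothetical device parameters, chosen only to show the trend.
VTH0 = 0.45        # V
LAMBDA_BS = 0.10   # body bias coefficient (assumed)
LAMBDA_DS = 0.05   # DIBL coefficient (assumed)

X1 = 0.1546        # V, node X1 voltage at the time TRA is recorded (from the text)
VDS = 0.5          # V, arbitrary drain-to-source voltage kept equal in both cases

vth_grounded_source = threshold_voltage(VTH0, 0.0, VDS, LAMBDA_BS, LAMBDA_DS)
vth_raised_source = threshold_voltage(VTH0, -X1, VDS, LAMBDA_BS, LAMBDA_DS)

# A source raised to X1 with a grounded body gives V_BS = -X1 < 0,
# so Vth rises by lambda_BS * X1.
print(f"Vth shift: {1e3 * (vth_raised_source - vth_grounded_source):.2f} mV")
```

With these placeholder coefficients, the shift is simply \(\lambda_{BS}\) times the X1 voltage; the real magnitude depends on the technology parameters.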

Fig. 4 a TRA of various SRAM cells at different VDD values. b Nodes voltage of WFC12T while read operation at \(V_{DD} = 0.7\) V

4.3 Read stability analysis

We obtained the RSNM value using the approach suggested in [57, 58]. Figure 5(a) plots the butterfly VTCs for all SRAM cells during the read operation at \(V_{DD} = 0.7\) V. It is clear that the WFC12T offers 5.2X, 1.26X, and 1.18X improvement in RSNM over FD8T, Kim12T (also SEDF9T and TG9T), and WWL12T, respectively. Recent studies have shown that an SRAM cell is highly stable if the ratio of its RSNM to VDD is larger than or equal to 0.25 [59]. Hence, the WFC12T cell shows high stability, with a ratio equal to 0.30. Figure 5(b) plots the variation of the RSNM of all SRAM cells with a linear variation of VDD, which shows that the WFC12T cell has the highest RSNM amongst all SRAM cells at all VDD values considered for simulations. The FD8T cell shows the lowest RSNM because this cell is in essence a conventional 6 T cell, and therefore suffers from read disturbance. The other SRAM cells use the read-decoupling technique, which fully decouples the bitlines from the internal storage nodes during the read operation, and thus their RSNM is as good as the HSNM. However, the WWL12T exhibits a slightly lower RSNM than the read-decoupling SRAM cells because it uses a self-refreshing technique, in which the internal storage nodes are affected through the path created by this technique. The proposed WFC12T cell shows the highest RSNM due to its better VTC, which results from the presence of the stacked transistors M2 and M5 in the pull-down paths of the half-cells.
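For readers unfamiliar with butterfly-curve extraction, the sketch below shows one common way to compute an SNM as the side of the largest square that fits inside each lobe; it is not necessarily the exact procedure of [57, 58]. The inverter VTC is a hypothetical tanh model with an arbitrary gain parameter, used as a stand-in for simulated curves.

```python
import math

def vtc(vin: float, vdd: float, gain: float) -> float:
    """Hypothetical smooth inverter transfer curve (tanh model, not SPICE data)."""
    return 0.5 * vdd * (1.0 - math.tanh(gain * (vin - 0.5 * vdd) / vdd))

def _solve_increasing(func, target: float, lo: float, hi: float, iters: int = 60) -> float:
    """Bisection for func(x) = target, where func is increasing on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if func(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def snm(vdd: float, gain: float, steps: int = 400) -> float:
    """Largest-square SNM of two identical cross-coupled inverters.

    Each line y = x + c of slope +1 cuts both butterfly curves; the
    horizontal distance between the two intersections is the side of the
    square whose diagonal lies on that line.
    """
    best_upper = 0.0
    best_lower = 0.0
    g = lambda v: v - vtc(v, vdd, gain)   # increasing in v
    for i in range(1, steps):
        c = -vdd + 2.0 * vdd * i / steps
        # Curve 1 is y = f(x): on the line, x - f(x) = -c.
        x1 = _solve_increasing(g, -c, 0.0, vdd)
        # Curve 2 is x = f(y): on the line, y - f(y) = c.
        y2 = _solve_increasing(g, c, 0.0, vdd)
        x2 = vtc(y2, vdd, gain)
        if c > 0:
            best_upper = max(best_upper, x1 - x2)   # upper-left lobe
        elif c < 0:
            best_lower = max(best_lower, x2 - x1)   # lower-right lobe
    return min(best_upper, best_lower)

margin = snm(vdd=0.7, gain=400.0)
print(f"SNM = {margin:.3f} V, SNM/VDD = {margin / 0.7:.2f}")
```

A sharper VTC (higher gain in this toy model) pushes the margin toward its ideal limit of VDD/2, which is the mechanism the text attributes to the stacked transistors M2 and M5. With simulated data, the analytic `vtc` would be replaced by interpolation over the measured curves.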

Fig. 5 a Butterfly VTCs at \(V_{DD} = 0.7\) V and b RSNM value at different VDD values for various SRAM cells

4.4 Write access time analysis

Figure 6 plots the write access time (TWA) of all SRAM cells at different VDD values for writing ‘1’ to a node Q that stores ‘0’; TWA is measured as the time required to charge the node up to 90% of VDD [30, 60]. Writing ‘1’ is the worst case for the SEDF9T and TG9T SRAM cells due to their asymmetrical single-ended writing structure and the presence of two series-connected NMOS transistors in their write paths [26, 46]. Thus, these cells exhibit longer TWA than the other cells. Of the two, the TG9T cell shows the longest TWA amongst all the SRAM cells. This is attributed to the fact that one of the inverters in the core of this cell is followed by another one. The WWL12T cell shows the third-worst TWA due to the presence of an NMOS transistor in series with the transmission gate in its write path. The Kim12T and proposed WFC12T SRAM cells show the best TWA because writing in these cells is differential and there is only one access transistor in their write paths (as compared to the WWL12T cell). The employment of the write-assist technique reduces the TWA when compared with the FD8T SRAM cell. However, the proposed design shows a slightly shorter TWA than the Kim12T cell, because the Q node in the Kim12T cell is not fully charged by WBL through the NMOS transistor M5 and therefore needs to be charged by VDD through the two series-connected PMOS transistors M7 and M2. These series transistors increase the time required for the Q node to reach 90% of VDD. From this figure, it is observed that the WFC12T shows 1.56X, 1.19X, 2.78X, 5.35X, and 2.05X shorter TWA as compared to FD8T, Kim12T, SEDF9T, TG9T, and WWL12T, respectively, at \(V_{DD} = 0.7\) V.
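The two timing definitions used in this work (a 50 mV bitline drop for TRA and a 90% of VDD crossing for TWA) amount to finding a threshold crossing in a sampled waveform. The sketch below shows that measurement with linear interpolation; the exponential waveforms and the 10 ps time constant are hypothetical stand-ins for simulator output, not results from this study.

```python
import math

def crossing_time(times, volts, level):
    """First time the sampled waveform crosses `level` (linear interpolation).

    Returns None when the waveform never reaches the level.
    """
    for i in range(len(times) - 1):
        v0, v1 = volts[i], volts[i + 1]
        if v0 != v1 and (v0 - level) * (v1 - level) <= 0:
            return times[i] + (level - v0) * (times[i + 1] - times[i]) / (v1 - v0)
    return None

VDD = 0.7          # V
TAU_PS = 10.0      # hypothetical time constant, ps
times_ps = [0.1 * i for i in range(1001)]   # 0 to 100 ps

# Hypothetical storage-node waveform while writing '1' (RC-like charging).
q_node = [VDD * (1.0 - math.exp(-t / TAU_PS)) for t in times_ps]
# Hypothetical RBL waveform while reading (RC-like discharge from VDD).
rbl = [VDD * math.exp(-t / TAU_PS) for t in times_ps]

twa_ps = crossing_time(times_ps, q_node, 0.9 * VDD)   # 90% of VDD
tra_ps = crossing_time(times_ps, rbl, VDD - 0.05)     # 50 mV drop

print(f"TWA = {twa_ps:.2f} ps, TRA = {tra_ps:.3f} ps")
```

In a SPICE flow the same quantities would normally be extracted by the simulator's own measurement statements; the function above only makes the definitions explicit.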

Fig. 6 TWA of various SRAM cells at different VDD values

4.5 Write-ability analysis

The WSNM value has been determined as the side length of the minimum square that can be inserted in the write butterfly curve [57]. We have plotted the WSNM of various SRAM cells for writing ‘1’ at \(V_{DD} = 0.7\) V and its variation with a linear VDD sweep in Fig. 7(a) and (b), respectively. The SEDF9T cell offers the lowest WSNM due to its single-ended nature and the presence of two series NMOS transistors in its write path. The WWL12T cell shows the second-lowest WSNM, attributed to the presence of one NMOS transistor in series with the transmission gate. However, writing in this cell is differential, which is why the WWL12T shows a higher WSNM than the SEDF9T cell. Due to its differential writing structure and the existence of only one NMOS access transistor in its write paths, the FD8T SRAM cell exhibits a higher WSNM than the above-mentioned SRAM cells. Although the TG9T SRAM cell employs a write-assist technique, it shows only the third-highest WSNM, after the proposed design and the Kim12T SRAM cell. This can be explained by the presence of two series NMOS access transistors in its write path. The proposed design and Kim12T show almost the same WSNM at all VDD values. From these figures, it is observed that the WFC12T offers 1.62X, 1.99X, 2.34X, and 1.11X improvement in WSNM when compared to FD8T, WWL12T, SEDF9T, and TG9T, respectively, at \(V_{DD} = 0.7\) V.

Fig. 7 WSNM of various SRAM cells, a at \(V_{DD} = 0.7\) V, b at different VDD values

4.6 Leakage power (Pleakage) dissipation

Leakage power (Pleakage) dissipation is one of the major problems in nanoscale SRAM cells since they remain in hold mode most of the time [10, 61]. Hence, it contributes a large portion of the total power consumption, and it is important to reduce Pleakage in SRAM cells. For sub-100 nm technology, the total leakage current mainly includes the subthreshold leakage current (ISUB), the gate leakage current (IGATE), and the junction leakage current (IJUNC) [62]. We have shown the paths of these leakage components in the FD8T and WFC12T in hold ‘1’ mode (i.e., Q maintains ‘1’) in Fig. 8(a) and (b), respectively. The ISUB and IJUNC flow in OFF-state MOS transistors, while the IGATE flows in all transistors regardless of their state. Table 2 compares the FD8T and the proposed WFC12T in hold mode based on the number of leakage components. From this table, it might seem that the WFC12T dissipates higher Pleakage than the FD8T because it has more leakage components. However, the WFC12T exhibits much lower Pleakage than the FD8T as well as the other cells at different VDD values, as shown in Fig. 9. This can be explained by Eq. (2) and by Eq. (3) [63], which expresses the basic subthreshold current model for a MOS device, given the non-zero voltages at nodes A (\(\approx\) 504 mV) and X1 (\(\approx\) 231 mV) and the high VGND.

$$I_{SUB} = I_{0} \exp \left[ {\frac{{V_{GS} - V_{th} + \lambda_{BS} V_{BS} + \lambda_{DS} V_{DS} }}{{\eta V_{T} }}} \right]\left[ {1 - exp\left( {\frac{{ - V_{DS} }}{{V_{T} }}} \right)} \right]$$
(3)

Fig. 8 Schematic of the leakage paths in a FD8T, b WFC12T

Table 2 Comparison of the number of leakage components in FD8T and proposed WFC12T

Fig. 9 Leakage power dissipation of various SRAM cells at different VDD values

In Eq. (3), I0 is the subthreshold current when \(V_{GS} = V_{th}\), \(\eta\) is the subthreshold swing factor, and \(V_{T} = kT/q\) is the thermal voltage.

Due to the following reasons, the proposed WFC12T cell shows the minimum Pleakage:

  1) The VDS of M2, M9, and M10 are reduced due to the non-zero voltages at nodes A and X1 and the high VGND, respectively. Consequently, the reduced VDS increases the effective Vth of these transistors, according to Eq. (2), and hence the ISUB is exponentially reduced, according to Eq. (3).

  2) As nodes A and X1 are at positive voltages and VGND is high, the VBS of M2, M9, and M10, respectively, becomes negative. Thus, their Vth increases, which leads to a reduction of the ISUB.

  3) As the voltages at nodes A and X1 are positive and VGND is set high in hold mode, the VGS of M2 is reduced and that of M9/M10 becomes negative. Thus, the ISUB through these transistors is reduced.
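The three mechanisms above can be checked against Eq. (3) with a short numerical sketch. Only the 231 mV source voltage is taken from the text; I0, Vth, \(\lambda_{BS}\), \(\lambda_{DS}\), \(\eta\), and the bias points are hypothetical values chosen to show the trend, not extracted from the 16-nm PTM.

```python
import math

K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C

def thermal_voltage(temp_k: float = 300.0) -> float:
    """V_T = kT/q."""
    return K_B * temp_k / Q_E

def i_sub(vgs: float, vds: float, vbs: float, i0: float = 1e-7,
          vth: float = 0.45, lambda_bs: float = 0.10,
          lambda_ds: float = 0.05, eta: float = 1.5,
          temp_k: float = 300.0) -> float:
    """Eq. (3) with hypothetical default parameters (all values assumed)."""
    vt = thermal_voltage(temp_k)
    exponent = (vgs - vth + lambda_bs * vbs + lambda_ds * vds) / (eta * vt)
    return i0 * math.exp(exponent) * (1.0 - math.exp(-vds / vt))

VDD = 0.7
V_SOURCE = 0.231   # V, e.g. a source node lifted to about 231 mV as for X1

# Reference OFF device: gate, source, and body at ground, drain at VDD.
i_reference = i_sub(vgs=0.0, vds=VDD, vbs=0.0)
# Same device with its source lifted: V_GS and V_BS turn negative, V_DS shrinks.
i_lifted = i_sub(vgs=-V_SOURCE, vds=VDD - V_SOURCE, vbs=-V_SOURCE)

print(f"I_SUB reduction factor: {i_reference / i_lifted:.0f}x")
```

Because all three bias changes enter the same exponent, even a source voltage of a few hundred millivolts reduces the modelled subthreshold current by orders of magnitude for these placeholder parameters.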

Furthermore, in the WFC12T cell, M2/M5 is connected in series with M1/M4, which increases the effective channel length of the pull-down transistors in the right/left half-cell. This increases Vth and consequently reduces ISUB further. In addition, stacking of the MOS devices, M1/M2 and M4/M5, increases the resistance of the pull-down path and thus reduces the overall Pleakage. From Fig. 9, it can be seen that the WFC12T cell shows an improvement of 2.17X, 2.65X, 2.13X, 2.02X, and 2.95X in Pleakage compared to FD8T, Kim12T, SEDF9T, TG9T, and WWL12T at \(V_{DD} = 0.7\) V, respectively. The WWL12T cell shows the highest Pleakage because the PMOS transistors in the transmission gates, which are controlled by the internal storage nodes, are always ON. The Kim12T cell dissipates the second-highest Pleakage because four of its PMOS transistors are supplied from VDD. The SEDF9T and TG9T SRAM cells consume slightly lower Pleakage than the FD8T cell due to their single-ended nature and the presence of stacked transistors in their read and write paths.

4.7 Dynamic power consumption

Dynamic power is the main portion of the total power consumption. Equation (4) expresses the read/write dynamic power consumption (\(P_{read/write}\)) of an SRAM cell in terms of the bitline switching activity factor (\(\alpha_{bitline}\)), the effective capacitance (\(C_{eff}\)), the operating voltage (\(V_{DD}\)), and the read/write operation frequency (\(f_{read/write}\)) [64].

$$P_{read/write} = \alpha_{bitline} \times C_{eff} \times V_{DD}^{2} \times f_{read/write}$$
(4)

Assuming a 256 × 16 array, column-based control signals consume more power than their row-based counterparts because a column contains more cells than a row. Figure 10(a) and (b) show the read power and write power, respectively, of all cells considered in this study along with the proposed WFC12T cell at different VDD values. The FD8T and WWL12T SRAM cells employ a differential reading structure, and therefore \(\alpha_{bitline}\) for these cells is equal to 1 during the read operation. This is why these SRAM cells consume higher dynamic power during the read operation. However, the WWL12T cell consumes lower read power than the FD8T cell, attributed to its use of row-based control signals only and its lower \(f_{read}\). The other SRAM cells (TG9T, SEDF9T, Kim12T, and the proposed WFC12T) employ a single-ended reading operation, which reduces \(\alpha_{bitline}\) to less than 0.5. The SEDF9T cell consumes the least read power amongst the single-ended reading cells. This can be explained by the use of row-based signals during the read operation and by its higher TRA, which reduces \(f_{read}\). In the TG9T, several row-based control signals must be asserted during the read operation, which increases the read power. For this reason, the TG9T cell consumes the highest read power amongst the single-ended reading cells. The proposed WFC12T and Kim12T consume almost the same read power, the second-lowest, due to the same number of row-based control signals and the same TRA.
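The scaling implied by Eq. (4) can be made concrete with a short calculation. The capacitance and frequency below are hypothetical placeholders; only the functional form, the nominal 0.7 V supply, the 20% supply reduction, and the activity factors of 1 and 0.5 relate to the text.

```python
import math

def dynamic_power(alpha: float, c_eff: float, vdd: float, freq: float) -> float:
    """Eq. (4): P = alpha * C_eff * VDD^2 * f."""
    return alpha * c_eff * vdd ** 2 * freq

C_EFF = 10e-15   # F, hypothetical effective bitline capacitance
FREQ = 100e6     # Hz, hypothetical operation frequency

p_differential = dynamic_power(alpha=1.0, c_eff=C_EFF, vdd=0.7, freq=FREQ)
p_single_ended = dynamic_power(alpha=0.5, c_eff=C_EFF, vdd=0.7, freq=FREQ)
p_scaled_vdd = dynamic_power(alpha=1.0, c_eff=C_EFF, vdd=0.56, freq=FREQ)

# Halving the activity factor halves the power; a 20% lower VDD leaves
# (0.8)^2 = 64% of the power, all else being equal.
print(f"single-ended / differential: {p_single_ended / p_differential:.2f}")
print(f"0.56 V / 0.7 V:              {p_scaled_vdd / p_differential:.2f}")
```

The comparison holds only when the capacitance and frequency terms are equal, which is why the text also attributes differences between cells to the number of asserted control signals and to TRA.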

Fig. 10 Dynamic power consumption during a Read operation, b Write operation

The write operation in the TG9T cell is single-ended, and therefore \(\alpha_{bitline}\) is reduced to less than half. This is why the TG9T cell shows the lowest write power amongst all the SRAM cells. The other SRAM cells (FD8T, SEDF9T, Kim12T, the proposed WFC12T, and WWL12T) use a dual-bitline structure, and thus their write power is higher. Although the SEDF9T cell writes through only one bitline, its bitline must be fully discharged during every write operation. The WWL12T and Kim12T cells consume the second- and third-lowest write power because they use row-based control signals only. Owing to its higher TWA, the WWL12T cell offers lower write power than the Kim12T cell. Although both the FD8T and the proposed WFC12T cells use an inverter-based AND gate, controlled by row- and column-based signals, to drive the access transistors during the write operation, the FD8T cell consumes slightly lower write power. This is because the FD8T cell shows a lower \(f_{write}\). The proposed WFC12T cell shows lower write power than the SEDF9T cell, attributed to the smaller number of column-based signals asserted during the write operation.

To summarize, the proposed WFC12T cell offers 2.22X, 1.15X, and 1.76X lower read power compared to FD8T, TG9T, and WWL12T cell at \(V_{DD} = 0.7\) V. However, it incurs a penalty of 1.32X in read power when compared with SEDF9T cell at this supply voltage. Furthermore, the proposed WFC12T cell shows 1.16X lower write power than that of the SEDF9T cell and incurs at least 1.42X higher write power when compared with other studied SRAM cells.

4.8 Critical charge (Qcrit) analysis and half-select issues

SERs occur when the charge generated by the electron–hole pairs at a node hit by a high-energy \(\alpha\) particle exceeds the critical charge Qcrit of that node [6]. The scaling down of technologies along with VDD reduction dramatically increases the susceptibility of an SRAM to SER, since Qcrit reduces with VDD and the SER of an SRAM cell depends exponentially on Qcrit, as expressed by Eq. (5) [65].

$$SER = K\phi A exp\left( { - \frac{{Q_{crit} }}{{Q_{s} }}} \right)$$
(5)

in which K is a proportionality constant, \(\phi\) is the neutron flux with energy greater than 1 MeV, A is the sensitive area of the circuit, and Qs is the charge collection efficiency of the device in fC [6, 65]. We determined the Qcrit of the proposed WFC12T by injecting the double exponential current pulse (Iinj) as expressed in Eq. (6) into its sensitive node in hold operation [7]. We have considered the ‘1’ storing node (Q) for all cells because this node is more vulnerable to SERs than the ‘0’ storing node (QB) due to the higher carrier mobility of NMOS transistors [1].

$$I_{inj} \left( t \right) = I_{peak} \left(e^{{ - \frac{t}{{\tau_{f} }}}} - e^{{ - \frac{t}{{\tau_{r} }}}} \right)$$
(6)

In Eq. (6), Ipeak is the magnitude of the current pulse, and \(\tau_{f}\) and \(\tau_{r}\) are the material-dependent falling and rising time constants, respectively. In this study, we have chosen \(\tau_{f} = 50 {\text{ps}}\) and \(\tau_{r} = 1 {\text{ps}}\) since Iinj has a long fall time and a short rise time [6, 61]. To calculate the Qcrit value, we obtained, through numerous runs, the minimum Ipeak that is adequate to flip the data stored in a cell and the corresponding critical time (Tcrit). Tcrit is defined as the time between the start of Iinj and the time when the storage nodes Q and QB cross each other. After determining Ipeak and Tcrit, the Qcrit is obtained as the area under Iinj up to Tcrit [6, 61], as shown in Fig. 11(a). We obtained the Qcrit of various SRAM cells at different VDD values, as depicted in Fig. 11(b). Furthermore, Table 3 shows the Ipeak, Tcrit, and Qcrit values of these cells at \(V_{DD} = 0.7\) V. It can be seen from these figures and the table that the WFC12T cell shows 1.32X, 1.37X, and 1.13X higher Qcrit than FD8T, Kim12T, and SEDF9T, respectively. From Eq. (5), it can be seen that, owing to the exponential dependence, even a small increase in Qcrit reduces the SER considerably. This makes the proposed WFC12T cell a good choice for space applications. To implement a bit-interleaving structure, which deals with multiple-bit errors, a cell must be half-select disturb-free [46]. The proposed WFC12T cell is half-select disturb-free because in this cell the CSL, WBL, WBLB, and RBL are column-based, while the WWL, RWL, and VGND are row-based, as in the ultra-dynamic voltage scalable (U-DVS) 10 T SRAM cell proposed in [66]. As seen earlier, the proposed WFC12T cell offers high Qcrit and hence shows lower susceptibility to single-bit SERs. Therefore, the WFC12T will achieve high immunity to MCUs.
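Since Qcrit is defined as the area under Eq. (6) up to Tcrit, it has a closed form that can be cross-checked numerically. In the sketch below, the time constants are those quoted in the text, whereas the Ipeak and Tcrit values are hypothetical (Table 3 is not reproduced here); the SER ratio follows from Eq. (5) with K, \(\phi\), and A held fixed, and the Qs value is an arbitrary placeholder.

```python
import math

TAU_F = 50e-12   # s, falling time constant (from the text)
TAU_R = 1e-12    # s, rising time constant (from the text)

def i_inj(t: float, i_peak: float) -> float:
    """Eq. (6): double-exponential injected current."""
    return i_peak * (math.exp(-t / TAU_F) - math.exp(-t / TAU_R))

def q_crit(i_peak: float, t_crit: float) -> float:
    """Closed-form integral of Eq. (6) from 0 to t_crit."""
    return i_peak * (TAU_F * (1.0 - math.exp(-t_crit / TAU_F))
                     - TAU_R * (1.0 - math.exp(-t_crit / TAU_R)))

def q_crit_numeric(i_peak: float, t_crit: float, steps: int = 20000) -> float:
    """Trapezoidal integration of Eq. (6), used only as a cross-check."""
    dt = t_crit / steps
    total = 0.5 * (i_inj(0.0, i_peak) + i_inj(t_crit, i_peak))
    total += sum(i_inj(k * dt, i_peak) for k in range(1, steps))
    return total * dt

def ser_ratio(q_new: float, q_old: float, q_s: float) -> float:
    """SER_new / SER_old from Eq. (5), with K, phi, and A unchanged."""
    return math.exp(-(q_new - q_old) / q_s)

I_PEAK = 40e-6    # A, hypothetical minimum flipping current
T_CRIT = 30e-12   # s, hypothetical critical time

q = q_crit(I_PEAK, T_CRIT)
print(f"Qcrit = {q * 1e15:.3f} fC")
# A 1.32X higher Qcrit with a hypothetical Qs equal to the original Qcrit:
print(f"SER ratio = {ser_ratio(1.32 * q, q, q_s=q):.3f}")
```

The SER ratio depends strongly on Qs, which is technology-specific, so the printed ratio is only an illustration of the exponential sensitivity and not a prediction for the studied cells.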

Fig. 11 (a) Graphical definition of Qcrit as the area under Iinj up to Tcrit for the ‘1’ storing node Q at \(V_{DD} = 0.7\) V [6]. (b) Qcrit of various cells at different VDD values

Table 3 Ipeak, Tcrit, and Qcrit value of various SRAM cells at \(V_{DD} = 0.7\) V

4.9 Statistical analysis of design metrics

We accounted for the effect of variations on the SRAM cells’ design metrics, in terms of variability, using Monte-Carlo (MC) simulations. The mean (µ), standard deviation (σ), and variability (σ/µ) [59] values of TRA, RSNM, and WSNM of various SRAM cells at \(V_{DD} = 0.7\) V are given in Table 4. Figure 12 shows the TRA distribution plots of various SRAM cells. The WFC12T cell shows 1.91X lower TRA variability than the FD8T SRAM cell at \(V_{DD} = 0.7\) V. The FD8T cell shows the highest TRA variability because it lacks a read decoupling technique. Furthermore, the proposed design offers 1.17X lower TRA variability than the TG9T (SEDF9T) cell. This is attributed to the higher voltage developed at intermediate node X1 of the proposed WFC12T cell (0.1546 V) compared to TG9T (0.1082 V), which results in a larger body effect in the proposed design. Body bias acts as a tuning knob that compensates for parameter variations, and body biasing is effective in addressing fluctuations in design metrics caused by process and temperature variations; therefore, the proposed design offers lower variability than the TG9T cell [44].
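The variability figure used in this section is the ratio σ/µ computed over the MC samples of a metric. A minimal sketch of that calculation follows. The sample data are synthetic Gaussian stand-ins and do not reproduce Table 4, the cell names `cell_a` and `cell_b` are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the estimator is not stated here.

```python
import random
import statistics


def variability(samples):
    """Return (mean, sample standard deviation, sigma/mu) of MC samples."""
    mu = statistics.fmean(samples)
    sigma = statistics.stdev(samples)  # sample (n - 1) estimator, assumed
    return mu, sigma, sigma / mu


# Synthetic stand-ins for MC read-access-time samples (seconds).
rng = random.Random(0)
tra_cell_a = [rng.gauss(100e-12, 10e-12) for _ in range(5000)]
tra_cell_b = [rng.gauss(100e-12, 20e-12) for _ in range(5000)]

_, _, var_a = variability(tra_cell_a)
_, _, var_b = variability(tra_cell_b)

print(f"cell_a sigma/mu = {var_a:.3f}")
print(f"cell_b sigma/mu = {var_b:.3f}")
# A statement such as "N X lower variability" is the ratio of the two values.
print(f"cell_a shows {var_b / var_a:.2f}X lower variability than cell_b")
```

Because σ/µ is dimensionless, the same helper applies unchanged to TRA, RSNM, and WSNM samples.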

Table 4 Mean, standard deviation, and variability of TRA, RSNM, and WSNM of various cells at \(V_{DD} = 0.7\) V

Fig. 12 TRA distribution plots of various SRAM cells at \(V_{DD} = 0.7\) V

To evaluate the reliability of the SRAM cells in terms of RSNM, we plotted the RSNM distributions obtained by MC simulations at \(V_{DD} = 0.7\) V, as shown in Fig. 13(a). Furthermore, the RSNM variability of various SRAM cells at different VDD values is shown in Fig. 13(b). The WFC12T shows 4.98X, 1.36X, and 1.44X lower spread in RSNM than the FD8T, Kim12T, and WWL12T, respectively, at \(V_{DD} = 0.7\) V. The FD8T shows the highest RSNM variability, which is attributed to its lack of a read decoupling technique. The lower RSNM variability of the WFC12T cell is due to the higher RSNM produced by the stacked transistors M2 and M5 in its cell core, which improve the VTC.

Fig. 13 (a) RSNM distribution plots at VDD = 0.7 V, (b) RSNM variability at different VDD values for various SRAM cells

Figure 14(a) shows the WSNM distribution plots of various SRAM cells at \(V_{DD} = 0.7\) V. From this figure and Table 4, it is seen that the WFC12T cell exhibits 1.86X, 3.04X, 1.04X, and 2.70X tighter spread in WSNM than the FD8T, SEDF9T, TG9T, and WWL12T, respectively. Furthermore, Fig. 14(b) shows the WSNM variability of all cells as VDD is swept, which indicates the robustness of the WFC12T under severe PVT variations at all VDD values. The SEDF9T shows the highest WSNM variability due to its single-ended nature and lack of a write-assist technique.

Fig. 14 (a) WSNM distribution plots at VDD = 0.7 V, (b) WSNM variability at different VDD values for various SRAM cells

4.10 Layout area and quality metric

In this section, we compare the studied SRAM cells with the proposed WFC12T cell based on their layout area and then propose a new quality metric for SRAM cells. The layouts of all cells were implemented in 32-nm CMOS technology, and their areas are given in Table 5, normalized to that of the WFC12T cell. Figure 15 shows the layout of the proposed cell. The FD8T, SEDF9T, and TG9T cells occupy a smaller area than the proposed cell because they use fewer transistors. Due to the presence of PMOS transistors in their TGs/supply-cutoff transistors, the WWL12T/Kim12T cells occupy 1.08X/1.07X higher area.

Table 5 A comparison between SRAM cells based on area and quality metric

Fig. 15 Layout of the proposed WFC12T SRAM cell implemented in 32-nm CMOS technology

To assess the superiority of an SRAM cell, several figures of merit (FoMs) have been suggested in the literature [24, 67]. The electrical quality metric (EQM) proposed in [67] ignores write delay and the variability of the design metrics. Moreover, the stability (i.e., RSNM) per unit area to power-delay-product (PDP) ratio (SAPR) introduced in [24] does not take into account some design metrics of an SRAM cell, such as WSNM, HSNM, variability, and leakage power. We introduce a new quality metric that considers the demands of space applications, such as energy consumption, stability, and area. The proposed quality metric is defined as the variation and space particles radiation-aware stability per area to energy ratio (VS2AER) and is expressed in Eq. (7). The last three rows of Table 5 report the normalized values of SAPR, EQM, and VS2AER at \(V_{DD} = 0.7\) V, which indicate the superiority of the proposed SRAM cell.

$$VS^{2}AER = \frac{Q_{\text{crit}} \times \text{HSNM} \times \text{RSNM} \times \text{WSNM}}{\left( \frac{\sigma}{\mu}\text{RSNM} \times \frac{\sigma}{\mu}\text{WSNM} \times \frac{\sigma}{\mu}T_{\text{RA}} \right) \times T_{\text{RA}} \times T_{\text{WA}} \times P_{\text{read}} \times P_{\text{write}} \times P_{\text{leakage}} \times \text{Area}}$$
(7)
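Eq. (7) can be evaluated directly once every metric of a cell is known. The sketch below is one possible way to do so. The `CellMetrics` container, its field names, and the numbers in `candidate` are hypothetical and are not values from Table 5. All inputs are assumed to be normalized to a reference cell, so the reference evaluates to 1 by construction.

```python
from dataclasses import dataclass


@dataclass
class CellMetrics:
    """Metrics entering Eq. (7); field names are illustrative."""
    q_crit: float
    hsnm: float
    rsnm: float
    wsnm: float
    var_rsnm: float   # sigma/mu of RSNM
    var_wsnm: float   # sigma/mu of WSNM
    var_tra: float    # sigma/mu of read access time
    t_ra: float       # read access time
    t_wa: float       # write access time
    p_read: float
    p_write: float
    p_leakage: float
    area: float


def vs2aer(m):
    """Evaluate Eq. (7): robustness terms over variability, energy, and area."""
    numerator = m.q_crit * m.hsnm * m.rsnm * m.wsnm
    denominator = ((m.var_rsnm * m.var_wsnm * m.var_tra)
                   * m.t_ra * m.t_wa
                   * m.p_read * m.p_write * m.p_leakage
                   * m.area)
    return numerator / denominator


# Hypothetical data: every metric normalized to the reference cell.
reference = CellMetrics(*([1.0] * 13))
candidate = CellMetrics(q_crit=1.1, hsnm=1.0, rsnm=1.2, wsnm=1.0,
                        var_rsnm=0.8, var_wsnm=1.0, var_tra=1.0,
                        t_ra=1.0, t_wa=1.0,
                        p_read=1.0, p_write=1.0, p_leakage=1.0, area=1.0)

print(f"reference VS2AER = {vs2aer(reference):.2f}")
print(f"candidate VS2AER = {vs2aer(candidate):.2f}")
```

Since the metric is a pure product and quotient, a given percentage change in any single term shifts VS2AER by the same percentage, which is why it is reported in normalized form.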

4.11 Comprehensive comparison

This section comprehensively compares the performance of the proposed WFC12T SRAM cell with that of other state-of-the-art SRAM cells that employ a decoupled read access path and a write data-dependent feedback-cutting write-assist scheme. The SRAM cells considered for this comparison are the data-dependent write-assist 11T (D2WA11T) [44], the PMOS-PMOS-NMOS-based cell core 12T (PPN12T) [43], and the dynamic loop cutting write-assist 12T (DWA12T) [1]. All these SRAM cells have been sized properly for a fair comparison with the proposed cell. The simulation results at a 0.7 V supply voltage and room temperature are given in Table 6. The cell core of all the compared SRAM cells consists of a cross-coupled inverter pair with stacked transistors. The PPN12T SRAM cell shows the lowest HSNM due to its PMOS stacked structure. The other SRAM cells show equal HSNM because of their NMOS stacked structure. All cells offer an RSNM equal to their HSNM, which can be attributed to the use of the read decoupling technique. However, the proposed WFC12T SRAM cell offers 1.19X/1.17X improvement in RSNM/RSNM variability compared to the PPN12T SRAM cell. Because they have only two series transistors in their read path, the D2WA11T, DWA12T, and proposed WFC12T SRAM cells have the same TRA, which is 40.68% lower than that of the PPN12T SRAM cell, whose read buffer uses three series transistors.

Table 6 Comparison of the proposed WFC12T SRAM cell with other write data-dependent write-assist SRAM cells at VDD = 0.7 V

Both the D2WA11T and PPN12T SRAM cells utilize an NMOS device to drive their write-access transistors during the write operation. This single NMOS transistor passes a weak logic ‘1’, and therefore the driving strength of those transistors is reduced. This reduces the cell’s WSNM and increases the write delay. These cells show a 67.42%/7.31%/12.86% penalty in write delay/WSNM/WSNM variability compared to the proposed WFC12T SRAM cell. Furthermore, because of the two series transistors in its write paths, the DWA12T SRAM cell exhibits a 47.49%/2.05%/4.69% penalty in write delay/WSNM/WSNM variability. The DWA12T SRAM cell employs a fully differential structure and consumes the highest dynamic read power (1.32X higher than the WFC12T cell). However, it consumes 4.53% lower dynamic write power than the WFC12T cell due to its higher write delay.
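The comparison in this section mixes percentage figures (for example, a TRA that is 40.68% lower) with ratio figures (for example, 1.32X higher read power). The small sketch below converts between the two forms so that the numbers can be read on a common scale. It assumes that "p% lower" means (B − A)/B and that "N X higher" means a plain ratio B/A = N; the function names are hypothetical, and the "penalty" percentages are not converted because their reference cell is not stated explicitly.

```python
def factor_from_percent_lower(percent_lower):
    """A is `percent_lower` % lower than B  ->  return the ratio B / A."""
    return 1.0 / (1.0 - percent_lower / 100.0)


def percent_lower_from_factor(factor):
    """B is `factor` times A  ->  return by how many percent A is below B."""
    return (1.0 - 1.0 / factor) * 100.0


# Figures quoted in the text of this section.
tra_factor = factor_from_percent_lower(40.68)      # PPN12T TRA vs. WFC12T TRA
read_power_saving = percent_lower_from_factor(1.32)  # WFC12T vs. DWA12T read power

print(f"40.68% lower TRA corresponds to a {tra_factor:.2f}X ratio")
print(f"1.32X higher read power corresponds to {read_power_saving:.2f}% lower")
```

Under these assumptions, a 40.68% lower TRA means that the PPN12T read access time is about 1.69X that of the other three cells.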

Because it employs an additional column-based control signal (RCWL) compared to the WFC12T cell, the PPN12T cell consumes 1.09X higher dynamic read power. However, it dissipates the lowest leakage power among the compared SRAM cells due to its higher count of PMOS devices. Although the proposed WFC12T SRAM cell occupies the largest area and consumes the highest dynamic write power and leakage power among the SRAM cells considered for comparison, it achieves the highest value of the proposed quality metric (VS2AER). This implies that the proposed WFC12T SRAM cell outperforms the other SRAM cells with an isolated read path and a data-dependent feedback-cutting write-assist scheme, and that it can be a good choice for reliable low-power applications.

5 Conclusion

In this paper, we proposed a reliable twelve-transistor SRAM cell, namely WFC12T, to reduce multi-bit soft errors. The proposed cell used a single-bitline read decoupling technique and a write data-dependent feedback cutting-aware differential writing structure. Hence, the WFC12T cell offered a great improvement in both RSNM and WSNM. The proposed cell minimized Pleakage through stacked transistors employed in the right/left half-cell and the read access buffer. The WFC12T cell exhibited high reliability when subjected to severe PVT variations, which indicates the robustness of the proposed cell. Furthermore, this cell eliminated the write half-select issues by employing a separate gate to drive the access transistors, and hence a bit-interleaving architecture and various error correction codes can be applied to this cell to enhance its soft-error immunity. Moreover, the proposed WFC12T cell offered the second-best read power and showed a high critical charge under high-energy space radiation. Together, these results made the proposed WFC12T cell a good choice for reliable, stable, and low-power portable applications and battery-operated devices. Future work can be directed toward improving the stability of the column and row half-selected SRAM bitcells during a normal write operation on the accessed bitcell. Similarly, lowering the leakage power dissipation of idle SRAM bitcells in a bit-interleaving (BI) architecture, by reducing their VDD to a level at which they can still retain the stored data, would be a promising investigation.