1 Introduction

The design of low-power, high-performance SRAM cells has become a necessity in today’s microprocessors because SRAM is a critical component in memory design. It occupies a significant portion of the chip area and consumes a significant portion of the total power. Hence, new design techniques for low-power, high-performance SRAM cells are essential to reduce power consumption and improve cell stability. The conventional 6T cell has a simple structure and large storage capacity but consumes large power and has poor read stability. Leakage current in the cell dominates the total chip leakage power, whereas switching activity on the highly capacitive bit line and word line is costly in terms of energy. Subthreshold SRAM cells [1–6] have been proposed to reduce the leakage current and the active power consumption of the cell. The main drawbacks of these cells are their poor read stability and write ability. The 6T SRAM cell consumes maximum power during the write operation, when the large bit line has to charge or discharge completely. To overcome this problem, various design techniques have been proposed [7–11]. The 9T disturb-free cell reported in Ref. [4] enhances the write ability of the cell using a write-assist technique at the cost of large parasitic capacitance, which leads to increased delay. In Ref. [12], the 9T data-aware cell is proposed to save write power, but the cell suffers from leakage current in the hold mode and imposes a hardware burden due to signal HD. The 130 mV subthreshold SRAM cell in Ref. [13] achieved lower write power consumption by reducing the discharging activity at the respective bit line. It uses two write signals, SCL and SCR, at the cost of wiring overhead and degraded access delay.

In this paper, we propose a data-dependent write-assist dynamic SRAM cell which uses a single bit line for each of the read and write operations. The isolated read and write operations improve the read stability of the cell. The write signal WS is introduced to lower the discharging activity at the write bit line BL and to enhance the write margin (WM). The data at the storage nodes flip faster during the write operation because the feedback path is broken; the path is restored in read and hold modes (WL = 0 V). The leakage current in the read circuit is controlled by two OFF transistors. The proposed cell results in an average power saving of approximately 60 % during the write operation and 52.8 % during the read operation.

The rest of the paper is organised as follows. Section 2 deals with the architecture of the proposed cell, along with a detailed discussion of the write and read operations. In Sect. 3, the simulation results and comparisons with other cells reported in the literature are presented. In Sect. 4, the implementation of a 4 × 4 array using the proposed cell is given. This section also describes how to avoid the half-select row and column disturbances in the data-aware cell. Section 5 concludes the work.

2 Architecture of the proposed cell

Figure 1(a) shows the architecture and layout of the proposed cell. It has distinct read and write ports. The write circuit is similar to that of the conventional 6T cell, except that the write operation is performed on a single bit line BL using the write signal WS. WS is used to reduce the discharging activity at BL and thereby save dynamic power, noting that the maximum power in the conventional 6T cell is consumed due to the large voltage swing on the bit line. The feedback path is cut off by the OFF transistor PM3, and thus the cell behaves as a dynamic cell. The read operation is performed at the read bit line RBL using three series-connected NMOS transistors, which control the leakage current effectively because transistors NM4 and NM6 are turned off during write and hold modes. During read and hold modes, the latch property of the cell is restored (WL = LOW) to keep the data intact in the cell. The proposed cell shows higher immunity against PVT (process, voltage and temperature) variation and aging effect. The detailed operation of the proposed cell is explained below.

Fig. 1 (a) Architecture and layout of the proposed cell. (b) Write 1 and write 0 waveforms

2.1 Write operation

The high voltage at WL breaks the latch property of the cell so that data can be transferred easily and faster. In the write 1 operation, the write signal is grounded, which turns OFF the transistor NM1, and hence a high resistive path is established between node Q and ground. This high resistive path flips node Q to HIGH without allowing BL to discharge as in the case of the 6T cell (Fig. 1(b)). The lower discharging activity at BL results in lower dynamic power consumption. The read leakage current becomes negligible due to three series-connected OFF transistors in the read path.

In write 0 operation, signal WS is connected to Vdd. Transistor PM1 turns OFF and no current flows from Vdd to storage node Q, hence node Q flips to LOW (Fig. 1(b)) without any appreciable voltage drop at BL. Two OFF transistors in the read path minimize the leakage current.

In both operations, the discharging activity at BL is reduced which results in considerable dynamic power saving.

2.2 Read operation

The low voltage at WL restores the latch property of the cell. During the read 0 operation (QL = high), the read bit line RBL discharges through three series-connected ON transistors. In the read 1 operation, transistor NM5 turns OFF, which does not allow the precharged RBL to discharge.
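The write and read behaviour described in Sects. 2.1 and 2.2 can be summarised in a small logic-level sketch. This is only an abstraction of the text, not the authors' simulation setup: the class name `CellModel` and its methods are hypothetical, and all analog effects (bit line capacitance, leakage, transistor sizing) are ignored.

```python
VDD = 1.2  # supply voltage in volts, as used in the reported simulations


class CellModel:
    """Hypothetical logic-level abstraction of the proposed cell."""

    def __init__(self, q=0):
        self.q = q      # storage node Q
        self.wl = 0.0   # WL low: feedback path (latch) is restored

    @property
    def ql(self):
        # Complementary storage node QL
        return 1 - self.q

    def write(self, bit):
        """Write `bit` and return the WS level that was applied."""
        self.wl = VDD                   # WL high breaks the feedback path
        ws = 0.0 if bit == 1 else VDD   # WS grounded for write 1, at Vdd for write 0
        self.q = 1 if ws == 0.0 else 0  # node Q follows the WS-selected path
        self.wl = 0.0                   # back to hold: latch restored
        return ws

    def read(self):
        """Return (data, rbl_discharged) for a read with WL low."""
        rbl_discharged = self.ql == 1   # read 0: QL high, RBL discharges
        data = 0 if rbl_discharged else 1
        return data, rbl_discharged


cell = CellModel()
ws_for_one = cell.write(1)
read_after_one = cell.read()
ws_for_zero = cell.write(0)
read_after_zero = cell.read()
print(ws_for_one, read_after_one, ws_for_zero, read_after_zero)
```

In this sketch a read never modifies `q`, which mirrors the isolated read port of the cell.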

2.3 Hold mode

During hold mode, the cell gives lower standby power consumption due to lower leakage current in read and write circuits.

3 Simulation results and discussion

The proposed cell and other cells reported in the literature were simulated for energy (power), access delay, PVT (process, voltage and temperature) variation and aging effect under static stress using MOSIS TSMC 65 nm model parameters at Vdd = 1.2 V. The layout of the cell (Fig. 1(a)) was drawn using Cadence 6.1 design rules. The read stability, write margin and leakage currents of the cell were studied in detail.

3.1 Write energy/power consumption

Due to the write signal WS, the voltage drop (ΔVBL) on the bit line BL is reduced (Table 1), which results in lower dynamic power consumption according to the relation Pdynamic = CL × Vdd × ΔVBL × f, where CL is the lumped capacitance, f is the frequency and ΔVBL is the bit line voltage drop.
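The relation above can be evaluated numerically to see how a smaller bit line voltage drop translates into dynamic power. The following is a minimal sketch: the function name `dynamic_power` and all numeric inputs (capacitance, frequency, the reduced voltage drop) are illustrative assumptions and are not values from this paper.

```python
def dynamic_power(c_load, vdd, delta_v_bl, freq):
    """Return Pdynamic = CL * Vdd * dVBL * f in watts."""
    return c_load * vdd * delta_v_bl * freq


# Illustrative assumptions only (not taken from the paper)
C_L = 50e-15    # lumped bit line capacitance in farads
VDD = 1.2       # supply voltage in volts
FREQ = 100e6    # operating frequency in hertz

p_full_swing = dynamic_power(C_L, VDD, VDD, FREQ)  # bit line discharges fully
p_reduced = dynamic_power(C_L, VDD, 0.3, FREQ)     # assumed 0.3 V drop on BL
saving = 1.0 - p_reduced / p_full_swing

print(f"full swing: {p_full_swing * 1e6:.2f} uW")
print(f"reduced swing: {p_reduced * 1e6:.2f} uW")
print(f"saving: {saving * 100:.1f} %")
```

Because the relation is linear in ΔVBL, the relative saving depends only on the ratio of the two voltage drops, not on the assumed capacitance or frequency.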

Table 1 Voltage drop on BL during write 1 operation in different cells

Due to the two OFF transistors in the read path, the leakage power (Pleakage = Pstatic = Vdd × Ileakage) of the cell is also reduced. The reduced dynamic and static power result in lower energy dissipation, as shown in Fig. 2(a). The average energy saving is 54.1 % compared to the 8T subthreshold cell [3], 54.7 % compared to the 9T disturb-free cell [4] and 55.8 % compared to the 9T DA cell [12]. The three series-connected OFF transistors in the read path of the proposed cell during the write 1 operation result in approximately 2 % lower energy dissipation compared to the write 0 operation.
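As a quick numerical illustration of the leakage power relation, the sketch below evaluates Pstatic = Vdd × Ileakage for a single leakage path. The helper names are hypothetical, and the example current of 1.16 nA is used only as a representative order of magnitude (it is the value the paper reports for transistor PM3 at the SS corner); it is not the total cell leakage.

```python
def static_power(vdd, i_leak):
    """Return Pstatic = Vdd * Ileakage in watts."""
    return vdd * i_leak


def percent_saving(p_new, p_ref):
    """Return the saving of p_new relative to p_ref, in percent."""
    return 100.0 * (1.0 - p_new / p_ref)


VDD = 1.2              # supply voltage in volts
I_LEAK = 1.16e-9       # example leakage current in amperes (one path only)

p_static = static_power(VDD, I_LEAK)
print(f"static power: {p_static * 1e9:.3f} nW")

# Halving the leakage current at a fixed Vdd halves the static power.
print(f"saving: {percent_saving(static_power(VDD, I_LEAK / 2), p_static):.1f} %")
```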

Fig. 2 (a) Write energy dissipation in various cells. (b) Write 1 power consumption for various process corners in different cells. (c) Write access delay in different cells. (d) Percentage write access time degradation under static stress

From Fig. 2(b) it can be observed that the proposed cell consumes lower power than the other cells irrespective of the process corners. For TT corner, 67.2 % lower power consumption is achieved compared to the 8T Subthreshold cell [3], 52.8 % compared to the 9T disturb free cell [4], 56.5 % compared to 130 mV subthreshold cell [13] and 61.12 % compared to the conventional 6T cell.

3.2 Write access delay

Since write operation is performed by WS instead of WL in the proposed cell, we have defined the write access delay as the time taken to flip the data of the storage nodes once WS is asserted as HIGH or LOW. Due to dynamic nature of the proposed cell and write signal WS, the access delay is lower than the other cells (Fig. 2(c)). The percentage write access degradation in the proposed cell is lower than the 6T cell and DA cell under static stress condition (Fig. 2(d)).

3.3 Write margin

The write margin of the proposed cell is defined as the maximum WS voltage required to flip the data at the storage node. The write margin of the proposed cell is 500 mV for an LVT NM1 transistor (Fig. 3(a)), which is much larger than that of the other cells and does not show any significant deviation from its nominal value as the temperature rises from −20 to 120 °C. The larger write ability arises because WL plays no role in the write margin of the proposed cell. The write margin increases to 680 mV for an HVT NM1 transistor. Figure 3(a) also shows the simulated N-curve of the proposed cell. The value of the negative peak current is about 303 pA and the voltage difference between the two turning points is 280 mV. The large value of WTI (write trip current) makes the cell ready to cope with external noise during the write operation. The data at the storage node flips to a strong 0 or strong 1 for β (pull-up ratio) = 1 or slightly lower than 1, whereas WM degrades for β > 1 due to an increase in leakage current. The proposed cell shows less percentage degradation in WM under static stress (Fig. 3(b)). The higher WM and larger immunity against stress are due to the smaller signal hardware burden, which results in lower parasitic capacitance.

Fig. 3 (a) Write margin and N-curve of the proposed cell. (b) Write margin degradation under static stress condition

3.4 Write leakage current

The static power consumption of the circuit is now a major source of power consumption in any submicron device. It is mainly due to subthreshold leakage current. The leakage current through transistor PM3 at different process corners is shown in Fig. 4(a); it is found to be approximately 1.16 nA for the SS corner. The current through the OFF transistor PM3 increases as the voltage on WL increases, due to more charge injection from BL. This result indicates that the overall power consumption of the cell is lower for WL = 0 V than for WL = Vdd. The current through the OFF transistor NM1 is also lower for WL = 0 V than for WL = 1.2 V, which confirms our finding that the total power consumption of the proposed cell is lower for grounded WL at the cost of increased write access delay. The leakage current in the proposed cell, the 6T cell and the 9T DA cell [12] increases with increasing voltage on RBL, whereas for the 9T disturb-free cell [4] and the 130 mV subthreshold cell [13] it remains almost constant (Fig. 4(b)). Due to the OFF transistors in the read circuit, the leakage current through RBL in the proposed cell is always lower than in the other cells. If RBL is precharged to 0.8 V instead of 1.2 V, the read bit line leakage current can be reduced further. The current through the OFF transistor NM1 decreases with Vdd due to signal WS, which keeps NM1 in the strong cutoff region (Fig. 4(c)). Leakage current generally increases with temperature and hence is a serious issue in any cell. Figure 4(d) shows the robustness of the proposed cell against temperature, which makes it possible to use it at the extreme conditions (T = −20 and 120 °C) with minimal power loss.

Fig. 4 (a) Leakage current through feedback transistor at various process corners in the proposed cell. (b) Variation of RBL current with RBL voltage during write operation in different cells. (c) Variation of NM1 current with Vdd. (d) Variation of NM1 current with temperature during write 1 operation

3.5 Read power consumption

In the proposed cell, the precharged read bit line RBL does not discharge during read 1 operation due to OFF transistor NM5 whereas during read 0 operation RBL discharges to 393.2 mV in the proposed cell, 247.2 mV in 8T subthreshold cell, 462.13 mV in 9T disturb free cell and 650 mV in the 6T conventional cell at Vdd = 1.2 V. Due to lower leakage current in the write and read circuits, total power consumption in the proposed cell is lower than the other cells (Fig. 5(a)). The average read power saving is approximately 31 % at Vdd = 0.8 V and 25 % at Vdd = 1.2 V. The overall read 0 power consumption reduces by approximately 18 % for SS corner compared to TT corner in the proposed cell due to lower current between RBL and ground (Fig. 5(b)).
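The read 0 figures quoted above are final RBL voltages. Assuming RBL is precharged to Vdd = 1.2 V (an assumption made for this sketch; the precharge level for each cell is not restated here), the corresponding bit line swing is simply Vdd minus the final voltage. The dictionary and variable names are illustrative.

```python
VDD_MV = 1200.0  # assumed precharge level of the read bit line, in millivolts

# Final read bit line voltage after a read 0, in millivolts (values from the text)
rbl_final_mv = {
    "proposed cell": 393.2,
    "8T subthreshold cell": 247.2,
    "9T disturb-free cell": 462.13,
    "6T conventional cell": 650.0,
}

# Swing = precharge level minus final voltage
swing_mv = {name: VDD_MV - v_final for name, v_final in rbl_final_mv.items()}

for name, swing in swing_mv.items():
    print(f"{name}: swing = {swing:.2f} mV")
```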

Fig. 5 (a) Read power consumption in different cells at Vdd = 0.8 V and Vdd = 1.2 V. (b) Read 0 power consumption for various process corners in different cells. (c) Read 1 delay in different cells at two different supply voltages. (d) Read 0 delay for various process corners in different cells

3.6 Read delay

During the read 1 operation, RBL is prevented from discharging, which results in lower read access delay compared to the other cells irrespective of the Vdd value, as shown in Fig. 5(c). Due to the larger threshold voltage of the transistors in the SS corner, the read access delay is increased by 17.6 % compared to the TT corner. From Fig. 5(d) it is clear that, for the TT corner, the proposed cell reads data 1.33× faster than the 130 mV subthreshold cell [13] and 1.15× faster than the 9T disturb-free cell [4]. The proposed cell gives larger read access delay than the conventional 6T cell for all process corners except the SF and SS corners. The proposed cell reads data 1.046× slower than the conventional 6T cell for the TT corner due to the three series-connected NMOS transistors in the read path, which can be compensated by enlarging transistor NM5.

3.7 Read leakage current

The leakage current through write bit line is 589.48 pA (read 1)/800.84 pA (read 0) in the proposed cell compared to 4.523 nA (read 1)/12.31 nA (read 0) in 8T subthreshold cell and 1.408 nA (read 1)/1.65 nA (read 0) in 9T disturb free cell. The lower leakage current through BL reduces the impact of the leakage from unaccessed cells and gives the additional advantage of connecting more cells on the single bit line during read operation. Due to lower leakage from bit line into unaccessed bit cells there is no undesired voltage drop and hence no distortion in reading the data.
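To put the bit line leakage figures above on a common scale, the following sketch converts them to picoamperes and computes how many times larger each reference cell's leakage is than that of the proposed cell. The data structure and names are illustrative; the currents are the values quoted in the text.

```python
# Write bit line leakage in picoamperes: (read 1, read 0), values from the text
bl_leakage_pa = {
    "proposed cell": (589.48, 800.84),
    "8T subthreshold cell": (4523.0, 12310.0),
    "9T disturb-free cell": (1408.0, 1650.0),
}

ref_read1, ref_read0 = bl_leakage_pa["proposed cell"]

# Ratio of each cell's leakage to the proposed cell's leakage
leakage_ratio = {
    name: (read1 / ref_read1, read0 / ref_read0)
    for name, (read1, read0) in bl_leakage_pa.items()
}

for name, (r1, r0) in leakage_ratio.items():
    print(f"{name}: {r1:.2f}x (read 1), {r0:.2f}x (read 0)")
```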

3.8 Static power consumption

In hold mode, WS maintains its value due to the internal latch. The static power consumption of the proposed cell is lower than that of the other cells irrespective of the power supply and temperature (Fig. 6(a), (b)) due to lower leakage through the write bit line BL and the stack effect in the read circuit. For the SS corner, the proposed cell saves 12 % power compared to the 8T subthreshold cell and 30.3 % compared to the 9T disturb-free cell (Fig. 6(c)).

Fig. 6 (a) Variation of hold power consumption in different cells with Vdd. (b) Variation of hold power consumption against temperature in different cells. (c) Static power consumption at various process corners in different cells. (d) Noise margin curve for the proposed cell. (e) Percentage RSNM degradation under static stress

Due to isolated read and write ports and signal WS, the data at the storage nodes are maintained strongly at their respective values in hold mode. The stored data does not change as the temperature changes for TT corner due to restricted leakage current. For SS corner, lower data at Q gets distorted as temperature increases whereas higher data remains intact at any temperature.

3.9 Static noise margin (SNM)

The read static noise margin (RSNM) of the proposed cell is equal to the ideal hold static noise margin (HSNM), as shown in Fig. 6(d). The RSNM is found to be 390 mV for the proposed cell, 375 mV for the 8T subthreshold SRAM cell and 300 mV for the conventional 6T cell at Vdd = 1.2 V and T = 27 °C for the TT corner. RSNM further improves for the SS corner due to lower subthreshold leakage current in the cell (Table 2). We have also simulated the SNM at different temperatures and observed no significant shift in its value. For a larger cell ratio (γ > 1), the voltage at QL degrades slightly, whereas a lower cell ratio maintains the data at their strong values (i.e. strong low and strong high). The percentage degradation in SNM is lower than in the other cells (Fig. 6(e)).

Table 2 RSNM for process corners

The data at the storage nodes are maintained strongly at their respective values for the power supply range 300 mV ≤ Vddmin ≤ 400 mV.

3.10 Monte Carlo simulation

To evaluate the effectiveness of the proposed cell, Monte Carlo (MC) simulations were performed under global and local mismatch at Vdd = 1.2 V for the write and hold cases. From Fig. 7 it is clearly observed that nodes Q and QL are clamped to strong high and strong low levels for hold “1” and hold “0”. All the Q and QL distribution points fall in the µV region for the low voltage level. There is only a 0.03 % voltage drop compared to Vdd, with a very low mean value. This small variation indicates a strong hold SNM in the proposed cell. To examine the hold stability, an MC simulation was run for the proposed cell and the 8T subthreshold cell [3] for N = 1000 at Vdd = 1.2 V, T = 27 °C and the TT process corner (Fig. 8(a)). Statistically, the proposed cell gives 1.02× the mean SNM and a 74.6 % reduction in the standard deviation compared to the 8T subthreshold cell, due to controlled leakage current in the write and read circuits. Figure 8(b) shows the MC results for write ability. It is observed that as long as WS is set at its respective value, the two storage nodes flip strongly to the desired values. This confirms the robustness of the write ability of the proposed cell.
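The two statistics quoted for the hold SNM comparison (ratio of means and percentage reduction in standard deviation) can be computed from MC samples as sketched below. The sample data here are synthetic Gaussian draws with arbitrary, assumed parameters; they are not the paper's simulation output and will not reproduce its 1.02× and 74.6 % figures. The function name `compare_snm` is hypothetical.

```python
import random
import statistics


def compare_snm(proposed, reference):
    """Return (mean ratio, percentage reduction in standard deviation)."""
    mean_ratio = statistics.mean(proposed) / statistics.mean(reference)
    std_reduction = 100.0 * (
        1.0 - statistics.stdev(proposed) / statistics.stdev(reference)
    )
    return mean_ratio, std_reduction


# Synthetic samples in millivolts; means and spreads are arbitrary assumptions
rng = random.Random(0)
N = 1000
reference_snm = [rng.gauss(380.0, 20.0) for _ in range(N)]
proposed_snm = [rng.gauss(390.0, 5.0) for _ in range(N)]

ratio, reduction = compare_snm(proposed_snm, reference_snm)
print(f"mean SNM ratio: {ratio:.3f}x")
print(f"std reduction: {reduction:.1f} %")
```

`statistics.stdev` is the sample standard deviation; whether the paper's tool reports the sample or population value is not stated.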

Fig. 7 MC simulation for steady state voltages at Q and QL with Vdd = 1.2 V, N = 2500

Fig. 8 (a) Monte Carlo simulation results for the hold SNM with N = 400. (b) Monte Carlo simulation for write ability of the proposed cell (N = 1000)

4 Array architecture and half-select row/column issue

Figure 9(a) shows a 4 × 4 array implemented using the proposed cell and a column-based approach. In this approach, signal WS is routed parallel to the write bit line BL because WS has to track BL. The global WS (GWS) generator is placed in the column multiplexer block, which has data-in and write enable (WE) as inputs (WE = high for write and WE = 0 for read/hold). Each column in the array has its own local WS, which is connected to GWS through an NMOS pass transistor. The toggling of the WS signal causes a disturbance in the unselected cells on the selected row during the write operation. Hence, to avoid any instability on the storage nodes of the unaccessed cells sharing the same WL, we propose the peripheral circuit shown in Fig. 9(b). When a particular column address CA is asserted high, the corresponding local WS is connected to GWS, whereas the other local WS lines maintain their previous data through an ON PMOS pass transistor (Fig. 9(b)). Figure 10(a) shows two write operation sequences, 1100 and 1010, in different column cells on the selected row. The write operation in the array is performed at the storage nodes Qi (i = 1–4) of the various cells. As seen, when the write operation is performed at storage node Q4 of the selected cell4, the data at the storage nodes of the other unselected cells remain undisturbed.

Fig. 9 (a) 4 × 4 array using proposed cell. (b) Circuit for avoiding column-half select disturbance on selected row

Fig. 10 (a) Transient response of the column half selected cells on the selected row. (b) Transient response of the row half selected cells on selected column

To avoid the row half-select disturbance on the storage nodes when WS toggles during the write operation, we have connected each cell to the column WS through an NMOS pass transistor whose gate is controlled by the row address RA, so that only the selected cell is connected to the global WS while the other cells maintain their data through the internal latch (WL = 0). Asserting the row addresses at different intervals of time ensures that no disturbance takes place on the unselected cells (for which WL = low). Figure 10(b) shows two write operation sequences in different cells on the selected column. The data are maintained in the unselected cells of the selected column even though WS toggles.
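The column and row gating described above can be captured in a logic-level sketch of a 4 × 4 array. This is an illustration of the selection rules only, under the assumption that a cell is updated exactly when both its column address and row address are asserted; the function name `write_bit` and the list-based array are hypothetical, and no timing or analog disturbance is modelled.

```python
def write_bit(cells, local_ws, row, col, bit):
    """Write `bit` into cells[row][col]; every other cell keeps its data."""
    gws = 0 if bit == 1 else 1  # logic level of global WS: low for write 1, high for write 0

    for c in range(len(local_ws)):
        if c == col:
            # CA high: NMOS pass transistor connects the local WS to GWS
            local_ws[c] = gws
        # CA low: PMOS pass transistor keeps the previous local WS value

    for r in range(len(cells)):
        for c in range(len(cells[r])):
            if r == row and c == col:
                # RA and CA both high: the cell sees WS and is written
                cells[r][c] = 1 if local_ws[c] == 0 else 0
            # otherwise the cell holds its data through the internal latch (WL = 0)


cells = [[0] * 4 for _ in range(4)]
local_ws = [1] * 4  # arbitrary initial state of the local WS lines

# Write the sequence 1100 and then 1010 into row 0, one column at a time
for col, bit in enumerate([1, 1, 0, 0]):
    write_bit(cells, local_ws, 0, col, bit)
row0_after_1100 = list(cells[0])

for col, bit in enumerate([1, 0, 1, 0]):
    write_bit(cells, local_ws, 0, col, bit)
row0_after_1010 = list(cells[0])

print(row0_after_1100, row0_after_1010, cells[1:])
```

The two sequences match those shown for the selected row in Fig. 10(a); the unselected rows stay at their initial values throughout.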

In the read operation, half-select disturbance does not arise in the array because WS does not toggle. Due to the single read and write stacks in the proposed cell, as well as the single read and write bit lines, the need for separate RWL and WS drivers will not add any area overhead. The read and write paths of the array can be optimized independently by sharing the read and write bit lines across different numbers of bits.

All the peripheral circuits in the array can be designed using CMOS logic for functional robustness and simplicity. Although the proposed cell adds area overhead compared to the 6T cell, the overall area penalty in the array will be smaller because more cells can be connected to a single bit line.

5 Conclusion

The proposed cell consumes lower power and enhances the write margin and read stability. The write access delay is reduced due to the feedback cut-off technique. The proposed design makes the cell more tolerant of process, temperature and voltage variation as well as aging effect under static stress. The storage node does not float during the read operation, and thus the cell is insensitive to positive noise. The power saving is more than 50 % in write and read operations compared to the other cells. The proposed cell shows larger immunity to statistical variation due to signal WS. The area overhead due to the increased number of transistors can be compensated by connecting more cells to a single bit line in the array.