1 Introduction

Designing low-power, high-speed, high-density, low-complexity integrated circuits has become a challenge in the nanotechnology era. Advanced complex digital circuits are built with transistor-level technologies; CMOS technology in particular dominates the current electronics market because it has followed Moore's law [1] and so maintained high device scalability. Now, however, technology beyond CMOS is required to optimize device area, complexity, delay, power, and cost. Thus, in this paper, an advanced technology, quantum-dot cellular automata (QCA) with an electron-spin operational criterion, is selected to design a novel advanced shift register.

QCA was introduced by Lent et al. in 1993 [2, 3]. This platform is considered a low-power, high-speed technology because of the quantum wire, which carries information from input to output through electrostatic repulsion between consecutive quantum cells, and it can operate in the terahertz (THz) frequency range [7]. A basic 4-dot quantum cell is used in this work; when such cells are placed one after another, they form a quantum wire [4]. Two of the four dots in a quantum cell are occupied by mobile electrons, which always sit diagonally opposite each other because of the electrostatic repulsion between like charges. The electrons move by tunneling from one dot to another within a cell. Thus, in this technology, the leakage current is very low and the energy dissipation is in the picojoule range. The quantum cells can also be placed easily in different layers, giving a 3D arrangement that reduces the unit area occupied by the proposed circuitry.

In this paper, QCA with multilayer circuitry is used to obtain a novel advanced shifter. There are four basic types of shift register, and among them serial-in-serial-out (SISO) and parallel-in-parallel-out (PIPO) are less complex than the others. Moreover, the QCA-based PIPO design is more optimized than the QCA-based SISO design, as shown in [6]. Thus, the PIPO shift register is selected in this paper and is optimized further by using a multilayer structure and reversible ‘D’ flip-flops. The main contributions of this paper are:

  • Design a reversible QCA-based ‘D’ flip-flop with lower complexity, area occupation, power dissipation, delay, and cost than the most optimized existing design.

  • Design a QCA-based multilayer 3D PIPO shift register of up to 4 bits using the proposed ‘D’ flip-flop, with lower complexity, area occupation, power dissipation, delay, and cost than the most optimized existing design.

  • Extend the proposed circuitry to 8 bits by combining two of the proposed 4-bit registers in a pipelined manner, and check the complexity, area occupation, power dissipation, delay, and cost.

  • Check the output of the proposed register design as the temperature rises above room temperature and as the layer separation gap decreases.

The remainder of the paper is organized in five sections: Sect. 2 presents the theoretical background of the technology and logic used, Sect. 3 reviews related work, Sects. 4 and 5 present the design and outcomes of the proposed ‘D’ flip-flop and the proposed multilayer multi-bit register, respectively, and Sect. 6 concludes the work.

2 Theoretical Background

The “3-input majority gate (MG),” the “5-input majority gate (MG),” and the “inverter gate” (which outputs the inverted form of its input) are the most effective and most widely used gates in the proposed low-power 4-dot QCA design technology, based on the binary ‘0’ and binary ‘1’ encoding of the QCA platform. In a “3-input MG,” the inputs and the output have matching polarity. The output of the “3-input MG” is given in Eq. 1, where A, B, and C are the three inputs. An “AND Gate” and an “OR Gate” are obtained by fixing one of the three inputs of the “3-input MG” to polarity −1 and +1, respectively (Eqs. 2 and 3); Fig. 1a shows the “3-input MG” [7,8,9,10]. Another important multi-input MG is the “5-input MG.” Equation 4 represents a “5-input MG” with inputs A, B, C, D, and E. If three of the five inputs are merged and the clock zone (discussed below in the part on the QCA clocking scheme) is changed from clock 0 to clock 1 near the output section, the gate gives a “3-input XOR” output. Without the clock-zone change, it gives the output of a normal “3-input MG” with a 4% increase in output strength, but the cell count increases from 5 to 11. Figure 1b shows the “5-input MG,” which is representative of the “3-input XOR” operation. The “inverter gate,” which gives a “NOT-Gate” outcome, is frequently required in the design of digital circuitry. The basic single-layer “inverter gate” is therefore also included in this overview of widely used conventional QCA logic gates and is given in Fig. 1c.

$$3{\text{ input MG}}\left( {{\text{A}},{\text{B}},{\text{C}}} \right) = {\text{AB}} + {\text{BC}} + {\text{AC }}$$
(1)
$$3\,{\text{input}}\,{\text{MG}}\left( {{\text{A}},{\text{B}},0} \right) = {\text{A}}.{\text{B}}$$
(2)
$$3\,{\text{input}}\,{\text{MG}}\left( {{\text{A}},{\text{B}},1} \right) = {\text{A}} + {\text{B}}$$
(3)
$$\begin{aligned} 5\,{\text{input}}\,{\text{MG}}\left( {{\text{A}},{\text{B}},{\text{C}},{\text{D}},{\text{E}}} \right) & = {\text{ABC}} + {\text{ABD}} + {\text{ABE}} + {\text{ADE}} \\ & \quad + {\text{ACE}} + {\text{ACD}} + {\text{BCE}} + {\text{BCD}} \\ & \quad + {\text{BDE}} + {\text{CDE}} \\ \end{aligned}$$
(4)

Fig. 1 QCA-based basic single-layer structure of (a) “3-input MG,” (b) “5-input MG,” and (c) “Inverter Gate”
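The Boolean behavior in Eqs. 1 to 4 can be checked with a short script. The following is a minimal sketch at the logic level only: it assumes the usual mapping of polarization −1 to binary 0 and +1 to binary 1, the function names are hypothetical, and it does not model cell layout, clocking, or output strength.

```python
from itertools import combinations, product


def maj3(a: int, b: int, c: int) -> int:
    """3-input majority gate, Eq. 1: AB + BC + AC."""
    return (a & b) | (b & c) | (a & c)


def and_gate(a: int, b: int) -> int:
    """Eq. 2: third input fixed to binary 0 (polarization -1)."""
    return maj3(a, b, 0)


def or_gate(a: int, b: int) -> int:
    """Eq. 3: third input fixed to binary 1 (polarization +1)."""
    return maj3(a, b, 1)


def maj5(a: int, b: int, c: int, d: int, e: int) -> int:
    """5-input majority gate: 1 when at least three inputs are 1."""
    return int(a + b + c + d + e >= 3)


def maj5_sum_of_products(*bits: int) -> int:
    """Eq. 4 written out: OR over the ten 3-input product terms."""
    return int(any(all(triple) for triple in combinations(bits, 3)))


# Exhaustive checks over every input combination.
for a, b in product((0, 1), repeat=2):
    assert and_gate(a, b) == (a & b)
    assert or_gate(a, b) == (a | b)

for bits in product((0, 1), repeat=5):
    assert maj5(*bits) == maj5_sum_of_products(*bits)
```

The last loop confirms that the ten product terms of Eq. 4 are exactly the "at least three of five" condition.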

In QCA, a clocking scheme controls the information flow from one part of a circuit to another, maintains the power gain by restoring the signal energy lost to the environment, and determines the delay of the design. The structure is pipelined, with four clock zones and four clock phases. The four clock zones are clock zone-1, clock zone-2, clock zone-3, and clock zone-4, and the four clock phases, with a 90° phase difference, are switch, hold, release, and relax, as shown in Fig. 2 [11]. In this clocking scheme, when the clock is high, the potential barrier between two dots is low and the total polarization of the circuit is 0. When the clock goes low, the potential barrier between two dots is high, and the electrons tunnel into the dots according to the polarization of the cell, which depends on the neighboring cells.

Fig. 2 Clock phases used in QCA technology
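The four-zone, four-phase scheme can be illustrated with a small timing sketch. It assumes the common convention that each successive clock zone lags the previous one by a quarter of a clock period (90°), so that a signal crossing one zone accumulates 0.25 clock cycles of delay; the function names are hypothetical, and the sketch is not tied to any particular simulator.

```python
PHASES = ("switch", "hold", "release", "relax")


def zone_phase(zone: int, quarter: int) -> str:
    """Phase of clock zone `zone` (1..4) during quarter-period `quarter` (0, 1, 2, ...).

    Assumption: zone k lags zone k-1 by one quarter period (90 degrees).
    """
    if zone not in (1, 2, 3, 4):
        raise ValueError("zone must be 1, 2, 3, or 4")
    return PHASES[(quarter - (zone - 1)) % 4]


def delay_in_cycles(zones_crossed: int) -> float:
    """Delay of a path, assuming 0.25 clock cycles per clock zone crossed."""
    return zones_crossed / 4


# Print the phase of every zone over one full clock period.
for quarter in range(4):
    print(quarter, [zone_phase(z, quarter) for z in (1, 2, 3, 4)])
```

Under this convention, zone 2 is in its switch phase while zone 1 holds, which is what lets data pass from one zone to the next in a pipelined way.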

In the conventional logic gates discussed above, “information erase with copy” is not possible, so energy is dissipated per bit. This dissipation can be controlled through design adiabaticity, and adiabatic logic can be realized with reversible gates, in which “information erase with copy” is maintained by the “Bennett clocking scheme” [12,13,14]. Adding a reversible gate therefore controls the energy dissipation per bit. In a conventional gate, the outputs depend on the inputs only; in a reversible gate, the inputs can also be determined from the outputs. That is, the output pattern uniquely represents the input pattern and vice versa. To obtain this property, a reversible gate must have the same number of inputs and outputs. Thus, the advantages of QCA-based circuitry can be properly exploited by using a reversible gate, and in a multilayer platform it becomes more effective because of its energy-controlled nature. For this reason, the proposed design is formed in a hybrid manner by combining a reversible gate with the widely used “3-input MG.” Figure 3 shows the block diagram of a basic reversible gate.

Fig. 3 Block diagram of basic reversible gate
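The two conditions stated above (equal numbers of inputs and outputs, and a one-to-one mapping between input and output patterns) can be tested mechanically. The sketch below is illustrative only: the checker `is_reversible` is a hypothetical helper, and the Feynman (controlled-NOT) gate is used as a well-known reversible example, not as the gate proposed in this paper.

```python
from itertools import product


def is_reversible(gate, n_inputs: int) -> bool:
    """True if `gate` has as many outputs as inputs and maps every
    input pattern to a distinct output pattern (a bijection)."""
    outputs = [tuple(gate(*bits)) for bits in product((0, 1), repeat=n_inputs)]
    same_width = all(len(out) == n_inputs for out in outputs)
    one_to_one = len(set(outputs)) == len(outputs)
    return same_width and one_to_one


def feynman(a: int, b: int):
    """Feynman (CNOT) gate: P = A, Q = A XOR B. Used here only as an example."""
    return (a, a ^ b)


def and_only(a: int, b: int):
    """Conventional AND gate: two inputs, one output, so inputs cannot be recovered."""
    return (a & b,)


print(is_reversible(feynman, 2))   # reversible example
print(is_reversible(and_only, 2))  # conventional gate for contrast
```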

In circuit design, the crossing of two wires is common and important, and it becomes more complex as the number of operations in a circuit increases. Delay, area, output strength, and power dissipation also depend on it. Selecting the crossover design is therefore a challenging part of circuit formation. QCA offers coplanar crossover, multilayer crossover, and crossover by changing the clock zones of the two crossing wires [15]. In our work, multilayer crossover is used, where cells are placed in different layers and act as an inverter between two consecutive layers. This type of structure can nevertheless increase the output strength compared to the coplanar form, and also compared to a single-layer “inverter gate,” with a 25% delay reduction. In this multilayer structure, the vertically separated quantum cells are tuned to match their kink energy in the horizontal plane, unlike the transistor-based structure [16]. Figure 4 shows this bridge-like multilayer QCA-based structure.

Fig. 4 Multilayer QCA-based structure [5]

3 Review of Related Work

The ‘D’ flip-flop is an important and widely used sequential circuit in digital design because of its simple operation, and in this paper the ‘D’ flip-flop is used to form a PIPO shift register on the QCA platform. Because of the simplicity and wide application of ‘D’ flip-flop-based circuitry, various novel designs related to this work have been published over the years. Here, some optimized related designs from 2019 to 2021 are discussed. In March 2019, a QCA-based level-sensitive ‘D’ flip-flop requiring only 28 quantum cells was introduced by Ting Li et al. [22]. That paper also showed, on a single-layer platform, the optimization advantages of a 3-bit PIPO register over a 3-bit SISO register, both based on its ‘D’ flip-flop.

Four months later, a novel Universal Shift Register, of which a 4-bit PIPO is a part, was presented by Jun-Cheol Jeon (published in 2020); this design uses a 24-quantum-cell ‘D’ flip-flop with a delay 0.25 higher than that of the previous one on a single-layer platform [23]. Next, in 2020, 3-bit PIPO and SISO registers using a dual edge-triggered ‘D’ flip-flop were presented by Shuyan Fan et al. to show again the advantage of PIPO over SISO in terms of area occupation, cell complexity, and delay [6].

In 2021, another novel ‘D’ flip-flop using only 21 quantum cells with the same delay was presented by Salma Yaqoob et al. [24] to form a single-layer SISO register. However, the ‘D’ flip-flop can be optimized further, and the shift registers can be optimized as well, as shown in this paper.

4 Proposed ‘D’ Flip-Flop

The ‘D’ flip-flop (Delay flip-flop or Data flip-flop) can store data, and it can operate with reversible logic [4]. The data-storing process follows Eq. 5. In this work, a level-sensitive ‘D’ flip-flop with reversible logic is proposed. The block diagram of the ‘D’ flip-flop is given in Fig. 5; based on this diagram, a novel level-sensitive ‘D’ flip-flop is formed using 20 quantum cells (each with an area of 18 × 18 nm²).

$$Q = D.{\text{CLK}} + Q.{\text{CLK}}^{\prime }$$
(5)

Fig. 5 Block diagram of proposed ‘D’ flip-flop [22]

The proposed ‘D’ flip-flop, which is reversible, is shown in Fig. 6. Output C follows the input CLK to make the design reversible. When CLK is 0, the output Q holds the previous state; otherwise, Q follows the ‘D’ input. This behavior is shown by the simulated outcome of the proposed level-sensitive ‘D’ flip-flop (Fig. 7), obtained with the QCA Designer 4.0 [25, 26] software.

Fig. 6 Proposed level-sensitive ‘D’ flip-flop with reversibility

Fig. 7 Simulated outcomes of proposed level-sensitive ‘D’ flip-flop with reversibility
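The level-sensitive behavior of Eq. 5, together with the extra output C that copies CLK, can be sketched behaviorally. This is a logic-level illustration only: the function names are hypothetical, and clock-zone latency, cell layout, and output strength are not modeled.

```python
def d_latch(d: int, clk: int, q_prev: int):
    """One evaluation of the level-sensitive 'D' flip-flop.

    Q = D.CLK + Q_prev.CLK'  (Eq. 5), and C simply follows CLK.
    """
    q = (d & clk) | (q_prev & (1 - clk))
    c = clk
    return q, c


def run(sequence, q_init: int = 0):
    """Apply a list of (D, CLK) pairs and collect the Q value after each step."""
    q = q_init
    trace = []
    for d, clk in sequence:
        q, _ = d_latch(d, clk, q)
        trace.append(q)
    return trace


# D = 1 is ignored while CLK = 0, captured when CLK = 1, then held.
print(run([(1, 0), (1, 1), (0, 0), (0, 1)]))
```

With CLK at 0 the previous state is held, and with CLK at 1 the output follows D, which matches the operation described for the proposed flip-flop.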

The outcome of the proposed level-sensitive ‘D’ flip-flop in Fig. 7 is obtained in the THz frequency range with a 2 ps delay at output C. The proposed structure is more optimized: a delay reduction of 33.3% is achieved for only a 2% reduction in output strength compared to the most optimized structures discussed previously, as shown in Table 1.

Table 1 Comparison of the level-sensitive ‘D’ flip-flops proposed in different papers

5 Proposed PIPO Shift Register

As discussed previously, PIPO is better than SISO for obtaining a register optimized in terms of area occupation, delay, areal power dissipation [17,18,19,20,21], complexity, and cost. In this work, the PIPO register is optimized further by using the proposed ‘D’ flip-flop on the multilayer QCA platform. The novel proposed structure of the multilayer 3D PIPO, with its simulated result and a parametric comparison, is presented in this section. First, the 4-bit structure of the proposed PIPO is presented; it is then extended to 8 bits by adding another 4-bit structure.

The three layers of the proposed novel 4-bit PIPO register are shown separately in Fig. 8 and combined in Fig. 9, where the data are applied at D0-D3 and the outputs are taken from Q0-Q3, along with another output C, which is the direct output of the CLK signal (used to make the design reversible without increasing the cell count or area occupation). The outcome of the proposed 4-bit PIPO is presented in Fig. 10, where the outputs Q0-Q3 are obtained after 0.5 clock cycles. Next, this 4-bit structure is used to design an 8-bit PIPO with the same number of layers to control the high-temperature and complexity issues. The three layers of the proposed 8-bit structure, in which two of the 4-bit structures are combined, are shown separately in Fig. 11 and combined in Fig. 12, with the simulated outcomes in Fig. 13.

Fig. 8 Three layers of proposed 4-bit PIPO register using level-sensitive ‘D’ flip-flop with reversibility

Fig. 9 Proposed combined multilayer 3D 4-bit PIPO register using level-sensitive ‘D’ flip-flop with reversibility

Fig. 10 Simulated result of proposed multilayer 3D 4-bit PIPO register using level-sensitive ‘D’ flip-flop with reversibility

Fig. 11 Three layers of proposed 8-bit PIPO register using level-sensitive ‘D’ flip-flop with reversibility

Fig. 12 Proposed combined multilayer 3D 8-bit PIPO register using level-sensitive ‘D’ flip-flop with reversibility

Fig. 13 Simulated result of proposed multilayer 3D 8-bit PIPO register using level-sensitive ‘D’ flip-flop with reversibility
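The parallel-in-parallel-out behavior of the 4-bit register, and its extension to 8 bits from two 4-bit blocks, can be sketched at the logic level. The sketch assumes that both 4-bit blocks share one CLK and that each bit is an independent level-sensitive ‘D’ flip-flop following Eq. 5; the function names are hypothetical, and the 0.5-clock-cycle latency, the three-layer layout, and the pipelined interconnection are not modeled.

```python
def d_latch(d: int, clk: int, q_prev: int) -> int:
    """Level-sensitive 'D' flip-flop, Eq. 5: Q = D.CLK + Q_prev.CLK'."""
    return (d & clk) | (q_prev & (1 - clk))


def pipo4_step(data, clk: int, state):
    """One evaluation of a 4-bit PIPO register.

    `data` is (D0, D1, D2, D3) and `state` is the previous (Q0, Q1, Q2, Q3).
    Returns the new (Q0..Q3) and the output C, which copies CLK.
    """
    if len(data) != 4 or len(state) != 4:
        raise ValueError("pipo4_step expects 4 data bits and 4 state bits")
    new_state = tuple(d_latch(d, clk, q) for d, q in zip(data, state))
    return new_state, clk


def pipo8_step(data, clk: int, state):
    """8-bit PIPO built from two 4-bit blocks that share CLK (an assumption)."""
    low, c = pipo4_step(tuple(data[:4]), clk, tuple(state[:4]))
    high, _ = pipo4_step(tuple(data[4:]), clk, tuple(state[4:]))
    return low + high, c


# Load 1011 while CLK = 1, then hold it while CLK = 0 and the inputs change.
state, c = pipo4_step((1, 0, 1, 1), 1, (0, 0, 0, 0))
state, c = pipo4_step((0, 0, 0, 0), 0, state)
print(state, c)
```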

The parametric and simulated outcomes of the proposed designs show the advancement of the 4-bit PIPO over the 3-bit design of [22] and the 4-bit design of [23]. After this advancement was verified, the optimized 4-bit PIPO structure was doubled without increasing the number of layers. Table 2 presents a parametric investigation of different multi-bit shift register structures based on cell complexity, area occupation, delay, areal power dissipation, and cost. The effect of temperature increment on device performance is also checked in this work. The design works properly with the same output strength up to 5 K above room temperature; beyond 5 K, the output strength is reduced because of electron scattering at high temperature. At 24 K, the output strength is reduced by 37% compared to the original output strength, and above this the device starts to malfunction. Multilayer structures face another problem, which is the volume increment of the structure. However, the proposed design works properly with the same output strength when the layer separation gap is reduced from 11.5 nm to 2.5 nm (a 78% reduction in the layer separation gap), with the same temperature tolerance, power dissipation, and delay. Figure 14 gives a graphical representation of these high-temperature and layer separation gap reduction effects.

Table 2 Parametric investigation of the multi-bit PIPO shift registers proposed in different papers

Fig. 14 Graphical representation of the effect of temperature increment on device output strength, and of temperature tolerance under layer separation gap reduction
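Percentage figures of this kind follow from a simple relative-reduction formula, which can be checked against values stated in the text. The sketch below uses the layer separation gaps of 11.5 nm and 2.5 nm and, as a second example, the 21-cell ‘D’ flip-flop of [24] against the 20-cell design of Sect. 4. Pairing those two cell counts is an assumption made here for illustration, and the helper name is hypothetical.

```python
def percent_reduction(old: float, new: float) -> float:
    """Relative reduction from `old` to `new`, in percent."""
    if old == 0:
        raise ValueError("old value must be non-zero")
    return 100.0 * (old - new) / old


# Layer separation gap reduced from 11.5 nm to 2.5 nm.
gap_reduction = percent_reduction(11.5, 2.5)

# Assumed pairing: 21 cells in [24] versus 20 cells in the proposed flip-flop.
cell_reduction = percent_reduction(21, 20)

print(f"layer separation gap reduction: {gap_reduction:.1f}%")
print(f"cell count reduction: {cell_reduction:.1f}%")
```

The first value is consistent with the 78% reduction quoted for the layer separation gap.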

6 Conclusion

A novel QCA-based multilayer 3D PIPO shift register in 4-bit and 8-bit forms, efficient in area, delay, complexity, cost, and dissipated power, is presented in this paper using an optimized novel level-sensitive ‘D’ flip-flop with reversibility. The proposed ‘D’ flip-flop achieves reductions of 4.8% in cell complexity, 33.3% in delay, and 50% in cost, with the same area occupation and areal power dissipation, compared to the previously published most optimized ‘D’ flip-flop parameters [22, 22]. Further, the proposed 4-bit multilayer PIPO structure reduces cell complexity by 19%, unit area occupation and areal power dissipation by 20%, delay by 33.3%, and cost by 46.7% compared to the previously published optimized parameters of the single-layer 3-bit PIPO register [22]. The novel efficient multilayer 8-bit structure is formed from the optimized components discussed above with the same number of layers as the 4-bit structure. The proposed 8-bit 3-layer PIPO register performs with efficient output strength up to 5 K above room temperature, and this performance is maintained when the layer separation gap is reduced from 11.5 nm to 2.5 nm. Fabrication and hardware verification of multi-bit advanced shift registers remain as future work.