1 Introduction

CMOS, the dominant technology for VLSI, faces limitations at the nanoscale such as short-channel effects [1]. Consequently, alternative technologies such as Silicon On Insulator (SOI) [2,3,4,5,6,7,8], Carbon NanoTube Field Effect Transistors (CNTFETs) [9,10,11,12,13], molecular devices [14], single-electron transistors [15], spintronics [16] and Quantum-dot Cellular Automata (QCA) [17,18,19] have been proposed for circuit design at the nanoscale.

QCA technology is a possible alternative to CMOS technology [20]. The design of QCA arithmetic and logic circuits has recently attracted considerable research interest. Many logic gates and circuits, such as Full Adder (FA) circuits [21, 22], multiplier circuits [23, 24], shift register circuits [25, 26], comparator circuits [27, 28] and multiplexer circuits [29, 30], have been designed in QCA technology. The full adder is one of the important building blocks of digital circuits. It plays a vital role in computing and arithmetic circuits such as ALUs and microprocessors [23].

Many QCA full adders have been designed using three 3-input majority gates and two inverter gates in one layer [31,32,33,34,35,36,37,38]. Some of these designs have been implemented without wire crossing [39]. A few designs use one fewer inverter gate to implement QCA FAs in single-layer [40] and multilayer [41, 42] wire-crossing layouts. The 5-input majority gate (MG5) has also been used to design QCA FAs [22, 31, 43,44,45,46,47,48,49,50,51,52,53,54,55], both coplanar [43,44,45,46,47,48,49,50] and multilayer [22, 31, 51,52,53,54,55]. Some designers have realized QCA FAs using a 3-input XOR gate, in both coplanar [17, 56,57,58] and multilayer [59, 60] QCA layouts.

In this paper, we propose a novel and efficient multilayer QCA full adder circuit. The sum output is produced by one efficient QCA XOR gate, and the carry output by one efficient 3-input majority gate. We then propose an efficient 4-bit Ripple Carry Adder (RCA) circuit built from the proposed full adder. The proposed circuits are simulated using QCADesigner version 2.0.3 [61]. The simulation results demonstrate that the proposed circuits work correctly. The comparison shows that the proposed QCA circuits have advantages over other QCA circuits in terms of area, latency, and cost.

The rest of this paper is organized as follows. In Section 2, an overview of the QCA technology is presented. In Section 3, previously reported designs are presented. In Section 4, the proposed new efficient 1-bit full adder and 4-bit QCA RCA are presented. Section 5 shows simulation results and compares proposed circuits to other QCA circuits. Finally, Section 6 concludes the paper.

2 Background

2.1 Quantum Cells

A normal QCA cell is usually a square with four quantum dots at its corners. Generally, two electrons are injected into each cell, and Coulombic repulsion forces them onto diagonally opposite dots [62, 63]. There are therefore two stable electron arrangements, which map onto a binary system as shown in Fig. 1 [63].

Fig. 1 Normal QCA cell [63]

Unlike traditional structures, QCA cells encode binary logic in the position of the electrons in the dots rather than in a voltage level. The two stable diagonal arrangements of the electrons in a QCA cell give two polarization states, P = −1 and P = +1, which are equivalent to logic “0” and “1”, respectively [64]. The polarization P is computed from Eq. (1).

$$ P=\frac{\left({p}_1+{p}_3\right)-\left({p}_2+{p}_4\right)}{p_1+{p}_2+{p}_3+{p}_4} $$
(1)

where p_i denotes the electronic charge at the i-th dot. When the clock signal is applied, the potential barriers are lowered and electrons can move between dots by tunneling. The cells are constructed so that they are isolated from each other, and tunneling between the dots of adjacent cells cannot occur [20].
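As a quick numerical check of Eq. (1), the following sketch evaluates P for the two diagonal arrangements. It assumes that each p_i is a normalized occupancy value for dot i and that dots 1 and 3 lie on one diagonal; the function name `cell_polarization` and this dot numbering are illustrative assumptions, not taken from QCADesigner or the cited works.

```python
def cell_polarization(p1, p2, p3, p4):
    """Evaluate Eq. (1) for a single four-dot cell."""
    total = p1 + p2 + p3 + p4
    if total == 0:
        raise ValueError("empty cell: polarization is undefined")
    return ((p1 + p3) - (p2 + p4)) / total

# Electrons on dots 1 and 3 (assumed to share a diagonal): logic "1"
p_one = cell_polarization(1, 0, 1, 0)
# Electrons on dots 2 and 4 (the other diagonal): logic "0"
p_zero = cell_polarization(0, 1, 0, 1)
# Occupancy spread evenly over all four dots: no net polarization
p_null = cell_polarization(0.5, 0.5, 0.5, 0.5)
```

With these inputs the two diagonal arrangements evaluate to +1 and −1, and the evenly spread case evaluates to 0.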

2.2 QCA Clocking

QCA clocking differs from CMOS clocking. The QCA clock is used to control and synchronize signals; furthermore, there are no power lines in QCA technology. Each QCA cell passes through four clock phases named switch, hold, release, and relax. The clock signal changes the potential barrier for tunneling between dots and thereby controls electron mobility, which makes the cell polarization controllable. As a result, the cell takes four states in these phases: polarizing, fixed polarization, depolarizing, and depolarized, respectively. The clock phases and the signal propagation direction are illustrated in Fig. 2.
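The phase sequence can be made concrete with a small sketch. It assumes the common four-zone scheme in which each clock zone lags the previous one by one phase, and that one clock cycle spans four phases; the names `zone_phase` and `phases_to_cycles` are hypothetical helpers introduced only for illustration.

```python
# Phase order and the corresponding cell state, as listed in the text.
CLOCK_PHASES = ("switch", "hold", "release", "relax")
CELL_STATE = {
    "switch": "polarizing",
    "hold": "polarization fixed",
    "release": "depolarizing",
    "relax": "depolarized",
}

def zone_phase(zone, step):
    """Phase of clock zone `zone` at quarter-cycle `step`.

    Assumption: zone k lags zone k-1 by exactly one phase.
    """
    return CLOCK_PHASES[(step - zone) % 4]

def phases_to_cycles(phases):
    """Convert a latency in clock phases to clock cycles (4 phases per cycle)."""
    return phases / 4

# schedule[step][zone] is the phase of each of four zones over one full cycle.
schedule = [[zone_phase(zone, step) for zone in range(4)] for step in range(4)]
```

Under this assumption, zone 1 enters its switch phase while zone 0 is in hold, which is what lets a value propagate along a wire from one zone to the next.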

Fig. 2 Clock phases of cells and signal propagation in a wire [65]

2.3 QCA Gates

Three important primitive gates in QCA technology are the Inverter Gate (IG), the Majority Vote Gate (MVG or MG) and the XOR gate. Figure 3 illustrates the layouts of some IGs [66].

Fig. 3 Various types of IGs [66]

The MG is an important gate for digital circuit design in QCA technology. Its output follows the superposition of the effects of its inputs. Figure 4 depicts two common layouts of 3-input MGs [67].

Fig. 4 Two types of 3-input majority gates: (a) original, (b) rotated [67]

The logical function of the 3-input majority gate (MG3) is defined by Eq. (2).

$$ MG3\left(A,B,C\right)= AB+ BC+ CA $$
(2)

The 2-input AND (AND2) and OR (OR2) gates are obtained by fixing the polarization of one input of MG3 to −1 and +1, respectively [67].
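The reduction of MG3 to AND2 and OR2 can be checked exhaustively. The sketch below models Eq. (2) on binary values, mapping polarization −1 to logic 0 and +1 to logic 1; the function names are illustrative.

```python
from itertools import product

def mg3(a, b, c):
    """3-input majority gate, Eq. (2): AB + BC + CA."""
    return (a & b) | (b & c) | (c & a)

def and2(a, b):
    # Third input fixed to logic 0 (polarization -1)
    return mg3(a, b, 0)

def or2(a, b):
    # Third input fixed to logic 1 (polarization +1)
    return mg3(a, b, 1)

# Compare against the built-in bitwise operators for all input pairs.
and_ok = all(and2(a, b) == (a & b) for a, b in product((0, 1), repeat=2))
or_ok = all(or2(a, b) == (a | b) for a, b in product((0, 1), repeat=2))
```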

Another important QCA gate is the 3-input XOR gate (XOR3). Figure 5 shows the layout of the QCA 3-input XOR gate [56].

Fig. 5 The 3-input XOR gate used in [56]

3 QCA Full Adder

3.1 Circuit Theory

The outputs of the QCA full adder can be computed as follows [31,32,33,34,35,36,37,38,39]:

$$ Sum=A\oplus B\oplus {C}_{in}= MG\left({C}_{in},\overline{C_{out}}, MG\left(A,B,\overline{C_{in}}\right)\right); $$
(3)
$$ {C}_{out}= AB+A{C}_{in}+B{C}_{in} $$
(4)
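Equations (3) and (4) can be checked against the arithmetic definition of a full adder over all eight input combinations. This is a minimal logic-level sketch, with inversion modeled as `1 - x`; it says nothing about layout or timing.

```python
from itertools import product

def mg(a, b, c):
    """3-input majority gate."""
    return (a & b) | (b & c) | (c & a)

def full_adder_eq3(a, b, cin):
    cout = mg(a, b, cin)              # Eq. (4)
    inner = mg(a, b, 1 - cin)         # MG(A, B, NOT Cin)
    total = mg(cin, 1 - cout, inner)  # Eq. (3)
    return total, cout

# Sum must equal A xor B xor Cin, and Cout the carry of A + B + Cin.
eq3_matches = all(
    full_adder_eq3(a, b, c) == (a ^ b ^ c, (a + b + c) // 2)
    for a, b, c in product((0, 1), repeat=3)
)
```

The function uses three majority gates and two inversions, matching the gate count quoted for this family of designs.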

Figure 6 shows the QCA block diagram for the implementation of this full adder circuit.

Fig. 6 Logical diagram of the QCA FA in [31,32,33,34,35,36,37,38,39]

In addition, the sum output of the full adder can be computed as follows [40,41,42]:

$$ Sum= MG\left(\overline{MG\left(A,B,{C}_{in}\right)}, MG\left(\overline{MG\left(A,B,{C}_{in}\right)},B,{C}_{in}\right),A\right)= MG\left(\overline{C_{out}}, MG\left(\overline{C_{out}},B,{C}_{in}\right),A\right) $$
(5)
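Equation (5) reuses the inverted carry twice, so only one inverter is needed. A short logic-level check, with illustrative function names, confirms that it still produces the full adder sum:

```python
from itertools import product

def mg(a, b, c):
    """3-input majority gate."""
    return (a & b) | (b & c) | (c & a)

def sum_eq5(a, b, cin):
    not_cout = 1 - mg(a, b, cin)      # single inversion, shared below
    inner = mg(not_cout, b, cin)
    return mg(not_cout, inner, a)     # Eq. (5)

eq5_matches = all(
    sum_eq5(a, b, c) == (a ^ b ^ c)
    for a, b, c in product((0, 1), repeat=3)
)
```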

Figure 7 shows the QCA block diagram for the implementation of this full adder circuit.

Fig. 7 Logical diagram of the QCA FA used in [40,41,42]

Moreover, the sum output can be computed as follows [22, 43,44,45,46,47,48,49,50, 52,53,54,55]:

$$ Sum= MG5\left(\overline{C_{out}},\overline{C_{out}},A,B,{C}_{in}\ \right) $$
(6)
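The 5-input form in Eq. (6) can be verified in the same way. The sketch below uses a generic majority function over an odd number of binary inputs, which is an illustrative modeling choice rather than a description of any particular MG5 layout.

```python
from itertools import product

def majority(*bits):
    """Majority vote over an odd number of binary inputs."""
    return int(sum(bits) > len(bits) // 2)

def full_adder_eq6(a, b, cin):
    cout = majority(a, b, cin)
    not_cout = 1 - cout
    total = majority(not_cout, not_cout, a, b, cin)  # Eq. (6)
    return total, cout

eq6_matches = all(
    full_adder_eq6(a, b, c) == (a ^ b ^ c, (a + b + c) // 2)
    for a, b, c in product((0, 1), repeat=3)
)
```

Feeding the inverted carry in twice gives it two of the five votes, which is what turns the majority function into a three-input parity function.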

Figure 8 shows the QCA block diagram for the implementation of this full adder circuit [22, 43,44,45,46,47,48,49,50,51,52,53,54,55].

Fig. 8 Logical diagram of the QCA FA in [22, 43,44,45,46,47,48,49,50, 52,53,54,55]

Recently, the 3-input QCA XOR gate has made QCA FA designs more compact, since only two gates (i.e. XOR3 and MG3) are needed [21, 56,57,58,59,60, 68]. Although this method uses a minimum number of gates, the main difficulty in the coplanar approach is interconnecting the inputs of the two gates, because access to the inputs becomes restricted when the FA is used in larger QCA circuits.

3.2 Previous QCA Full Adder Layouts Design

In this section, the previous multilayer QCA full adders are reviewed.

Figure 9 shows the FA used in [51], which is based on the block diagram shown in Fig. 8. It consists of 51 cells in an effective area of 0.03 μm² on three layers and has a delay of 3 clock phases.

Fig. 9 The layout of the QCA FA in [51]

Figure 10 shows the layout of the full adder designed in [53]. This design uses 52 QCA cells, 3 clock phases and an area of 0.04 μm² in three layers.

Fig. 10 The layout of the FA in [53]

Figure 11 illustrates the layout of the QCA FA design in [54]. The cell count, area and delay of this design are 31 cells, 0.01 μm², and 2 clock phases, respectively.

Fig. 11 The layout of the FA in [54]

Figure 12 shows two layouts of the full adders designed by Navi et al. in [22, 52]. The first layout, illustrated in Fig. 12a, has 73 cells, an area of 0.04 μm², and a delay of 3 clock phases. The second layout, illustrated in Fig. 12b, has 61 QCA cells, a delay of 3 clock phases, and an area of 0.03 μm².

Fig. 12 The layouts of the designed FAs: (a) in [22], (b) in [52]

Another design, realized in [55], is shown in Fig. 13. This FA has 22 QCA cells, 3 clock phases, and an area of 0.01 μm². In this layout, the output cells cannot be accessed without using extra layers.

Fig. 13 Three layers of the designed FA in [55]

Note that the designs reviewed in Figs. 10, 11, 12 and 13 follow the block diagram shown in Fig. 8.

Figure 14 shows a three-dimensional view of the QCA FA circuit used in [67]. This design consists of 23 cells, 3 clock phases, and an area of 0.01 μm² in three layers.

Fig. 14 The layout of the designed FA in [67]

In addition, the multilayer approach makes it feasible to place the XOR3 and MG3 gates on two separate layers. In such circuits, the inputs and outputs are easier to access when the FA is used in larger circuits.

In this way, Safoev et al. [59, 60] have proposed an efficient three-layer FA that uses only 31 cells, as shown in Fig. 15. The latency of this design is 2 clock phases, and the required area is 0.02 μm².

Fig. 15 Three layers of the FA proposed in [59, 60]

4 The Proposed Circuits

4.1 The Proposed QCA Full Adder

As described in [18], when the two side input signals of the MG3 in Fig. 15 (i.e. the B and Cin inputs) have the same polarization and arrive sooner than the third input signal (i.e. the A input), the B (or Cin) value temporarily dominates the A input value. Accordingly, the MG3 acts like an inverter gate in this case. Lengthening the output paths can therefore amplify noise through a synergistic effect. It can be verified that lengthening the input paths in [59] can lead to a wrong response when B = Cin.

Hence, drawing the middle input (i.e. the A input) closer to the device cells of the two gates (i.e. XOR3 and MG3) gives a more reliable output response. In this paper, we use this property to design an efficient QCA full adder circuit.

The logical block diagram of the proposed FA circuit is presented in Fig. 16. In addition, Fig. 17 shows the three layers of the proposed efficient FA.

Fig. 16 Logical diagram of the proposed FA

Fig. 17 QCA layout of the proposed 3-layer QCA FA

Figure 18 shows the three layers of the proposed FA separately. As illustrated in Fig. 18, the XOR3 gate is placed on the main layer (i.e. layer 0) and the MG3 gate is placed on layer 2. Layer 1 consists of 3 cells that interconnect the two layers.

Fig. 18 Three layers of the proposed FA

The proposed design consists of 28 QCA cells and has a delay of 2 clock phases. The proposed full adder occupies an area of 0.01 μm².
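At the logic level, the proposed FA reduces to one XOR3 for the sum and one MG3 for the carry, both driven by the same three inputs. The following sketch models only this Boolean behavior; it does not capture the layer assignment, cell placement or clocking of the layout, and the function names are illustrative.

```python
from itertools import product

def xor3(a, b, c):
    """3-input XOR gate (sum path)."""
    return a ^ b ^ c

def mg3(a, b, c):
    """3-input majority gate (carry path)."""
    return (a & b) | (b & c) | (c & a)

def proposed_fa(a, b, cin):
    """Return (Sum, Cout) from the two gates sharing inputs A, B, Cin."""
    return xor3(a, b, cin), mg3(a, b, cin)

# Truth table keyed by (A, B, Cin); each value is (Sum, Cout).
truth_table = {
    inputs: proposed_fa(*inputs) for inputs in product((0, 1), repeat=3)
}
```

For every row, `2 * Cout + Sum` equals `A + B + Cin`, which is the defining property of a full adder.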

4.2 The Proposed 4-Bit RCA

The inputs and outputs of the proposed FA are easily accessible, which makes it simple to use in larger designs. Therefore, we use the proposed FA to design an efficient 4-bit ripple carry adder. Figures 19 and 20 show the logical diagram and the layout of the proposed 4-bit QCA RCA, respectively.

Fig. 19 The logical diagram of the proposed 4-bit QCA RCA

Fig. 20 The layout of the proposed 4-bit QCA RCA

The layout of the proposed RCA contains only 135 cells and occupies an area of 0.06 μm². This design can easily be extended to an n-bit RCA circuit.
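The ripple carry structure, including its extension to n bits, can be sketched at the logic level by chaining full adders so that each carry output feeds the next stage. The helper names and the little-endian bit order below are illustrative assumptions; the sketch does not model the QCA layout or its latency.

```python
def full_adder(a, b, cin):
    """Return (Sum, Cout) using XOR3 for the sum and MG3 for the carry."""
    return a ^ b ^ cin, (a & b) | (b & cin) | (cin & a)

def ripple_carry_add(a_bits, b_bits, cin=0):
    """Add two equal-length, little-endian bit lists; return (sum_bits, cout)."""
    if len(a_bits) != len(b_bits):
        raise ValueError("operands must have the same width")
    sum_bits, carry = [], cin
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)  # carry ripples to the next stage
        sum_bits.append(s)
    return sum_bits, carry

def to_bits(value, width=4):
    """Little-endian bit list of `value`."""
    return [(value >> i) & 1 for i in range(width)]

def from_bits(bits):
    """Integer value of a little-endian bit list."""
    return sum(bit << i for i, bit in enumerate(bits))

# 4-bit example: 11 + 6 = 17, i.e. sum bits encode 1 with a carry out of 1.
sum_bits, carry_out = ripple_carry_add(to_bits(11), to_bits(6))
```

Changing `width` in `to_bits` is all that is needed to exercise wider adders, since the loop places no limit on the number of stages.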

5 Simulation Results and Comparison

The proposed 1-bit QCA full adder and 4-bit QCA RCA are simulated using QCADesigner version 2.0.3. In this section, the cost parameter is determined by the following equation:

$$ \mathrm{Cost}=\mathrm{Area}\ \left({\upmu \mathrm{m}}^2\right)\times \mathrm{Latency}\ \left(\mathrm{clock}\ \mathrm{cycle}\right) $$
(7)
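As a worked instance of Eq. (7), the sketch below evaluates the cost of the two proposed circuits from the figures quoted in this paper (0.01 μm² and 2 clock phases for the FA, 0.06 μm² and 5 clock phases for the RCA). It assumes four clock phases per clock cycle; because the quoted areas are rounded, the results are approximate and may differ slightly from tabulated values.

```python
PHASES_PER_CYCLE = 4  # assumption: one clock cycle spans four phases

def cost(area_um2, latency_phases):
    """Eq. (7): area (um^2) times latency (clock cycles)."""
    latency_cycles = latency_phases / PHASES_PER_CYCLE
    return area_um2 * latency_cycles

# Proposed 1-bit FA: 0.01 um^2, 2 clock phases (0.5 cycles)
fa_cost = cost(0.01, 2)    # approximately 0.005
# Proposed 4-bit RCA: 0.06 um^2, 5 clock phases (1.25 cycles)
rca_cost = cost(0.06, 5)   # approximately 0.075
```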

5.1 The Proposed QCA Full Adder

Figure 21 shows the simulation results obtained with the bistable approximation engine using its default settings. The simulation results illustrate that the designed FA performs correctly. The latency is 0.5 clock cycles.

Fig. 21 The simulation results of the proposed FA

Table 1 compares our proposed 1-bit QCA FA with other existing designs. This comparison shows that our proposed QCA FA is the most cost- and delay-efficient design among the QCA FA circuits in [21, 22, 31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59, 67, 68].

Table 1 Comparison of the QCA Full Adders

Based on the results shown in Table 1, our proposed QCA FA circuit has the lowest cell count, area, delay and cost in comparison with the previous designs in [21, 31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51, 53, 54, 56,57,58,59]. For example, the proposed QCA FA circuit improves cell count, area, latency, and cost by about 26, 50, 33 and 66%, respectively, compared to [42].

Although the designs in [55, 67] use fewer cells, the FA circuit designed in this paper has better delay and cost, improving both by about 33%. Moreover, unlike the FA circuit presented in [55], our design provides access to its output cells.

Although the FA circuits designed in [21, 36, 44, 54, 56, 58, 59, 68] have the same delay as our proposed QCA FA circuit, our circuit is superior in cell count, area and cost. Compared with the FA designed in [21], our design has one fewer cell and improves area and cost by about 50%. In addition, our design improves cell count, area and cost by about 31, 75 and 75% in comparison with [56], 9, 50 and 50% in comparison with [54, 59], 15, 50 and 50% in comparison with [36], 37, 74 and 74% in comparison with [44], 31, 75 and 75% in comparison with [58], and 9, 66 and 66% in comparison with [68], respectively.

In practice, it can be verified that the proposed FA circuit gives a more reliable output response when its input and output lines are lengthened, and that it is more tolerant of changes to the input and output lines than the previous design in [59].

The results show that the proposed QCA full adder is efficient in terms of cell count, area, delay, and cost. Moreover, compatibility with other designs, accessibility of the inputs and outputs, and flexibility in changing the length of the input and output lines are further advantages of the proposed FA circuit. Hence, the proposed QCA FA is suitable for designing larger QCA circuits such as RCA circuits.

5.2 The Proposed 4-Bit RCA

Figure 22 illustrates the simulation results of the proposed 4-bit RCA, obtained with the bistable approximation engine using 220,000 samples. The other settings were left at their defaults. The proposed RCA has a latency of 5 clock phases, or 1.25 clock cycles.

Fig. 22 The simulation results of the proposed 4-bit QCA RCA

Table 2 compares the proposed 4-bit ripple carry adder circuit with previous circuits.

Table 2 Comparison of the 4-bit RCA circuits

According to the results in Table 2, the proposed 4-bit QCA RCA circuit has the best results in terms of cell count, area and cost among all designs listed in Table 2. Our design improves cell count, occupied area and cost by at least about 22, 40 and 40%, respectively, in comparison with the other QCA RCA circuits in this table. For example, despite a delay that is one clock phase longer than that of [36], the improvements in cell count, area and cost are 22, 57, and 46%, respectively.
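The cost figure quoted for [36] follows from the area and delay figures under Eq. (7), because cost is the product of area and latency. The sketch below reproduces that arithmetic using only the percentages and phase counts stated above (a 57% smaller area, and 5 clock phases against 4); the helper name `improvement_pct` is illustrative.

```python
def improvement_pct(reference, proposed):
    """Percentage reduction of `proposed` relative to `reference`."""
    return 100.0 * (reference - proposed) / reference

# Ratios of the proposed RCA to the design in [36], from the quoted figures.
area_ratio = 1 - 0.57      # area is about 57% smaller
latency_ratio = 5 / 4      # one clock phase more than the 4 phases of [36]

# Cost = area x latency, so the cost ratio is the product of the two ratios.
cost_ratio = area_ratio * latency_ratio
cost_improvement = improvement_pct(1.0, cost_ratio)  # about 46%
```

The longer delay therefore cancels part of the area saving, which is why the cost improvement is smaller than the area improvement in this comparison.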

Although its delay is similar to that of the designs proposed in [58, 60], our proposed design provides considerable improvement in cell count, occupied area, and final cost. Compared to [60], the cell count, area, and cost have been reduced by about 26, 39, and 40%, respectively.

6 Conclusion

QCA technology, a promising and developing alternative to CMOS technology, is a focus of research interest for designing ultra-dense and ultra-high-speed digital circuits. In this paper, we designed a new and efficient QCA full adder circuit by designing efficient circuits for the sum and carry outputs on separate layers. Then, we designed a novel and efficient 4-bit QCA RCA using 135 QCA cells in 0.06 μm² with a delay of 5 clock phases. The proposed designs were simulated using QCADesigner version 2.0.3, which demonstrated that they work correctly. In addition, the comparisons showed that the proposed QCA circuits have advantages over other QCA circuits in terms of area, latency and cost.