1 Introduction

At the nanoscale, the widely used CMOS technology runs into physical limits [1]. The emerging quantum-dot cellular automata (QCA) technology is expected to overcome these limitations. In this respect, QCA is regarded as a successor to CMOS technology, and it is a rapidly growing nanotechnology, particularly for nanocomputing [2]. The three significant advantages of QCA technology are high operating speed (negligible delay), high circuit density, and reduced power consumption [3]. A square-shaped cell (called a QCA cell), with four quantum dots, one near each corner, is the heart of this technology [2]. A cell can be fabricated in four ways: semiconductor, metallic, magnetic, and molecular [4]. Recent QCA-based research has focused on simple layout design and the corresponding energy estimation. Multiplexer and demultiplexer circuits [5,6,7], adder circuits [8, 9], multiplier circuits [10], comparator circuits [11, 12], counter circuits [13], memory circuits [6, 14], etc., have all been designed using QCA over the previous decade. The current article covers the design, energy estimation, and analysis of a simple QCA multiplexer.

The rest of the article is organized as follows: Sect. 2 gives an overview of QCA technology, and Sect. 3 reviews earlier related research. Section 4 presents the layout of the proposed multiplexer. Sections 5 and 6 present energy calculations using QDE and QCAPro, respectively. In Sect. 7, the cost functions are computed. In Sect. 8, the proposed circuit is studied and compared to similar existing designs. Finally, Sect. 9 concludes the paper.

2 Overview of QCA technology

As mentioned in the previous section, the backbone of QCA technology is a square-shaped cell with four quantum dots inside, located near the corners, which create potential wells [1]. These wells localize two electrons, which can move between the dots by quantum mechanical tunneling [4]. Because of Coulomb repulsion, the two electrons always try to occupy diagonally opposite corner positions to maximize their separation; therefore, two stable configurations are possible [2]. Each arrangement is called a ‘polarization state’ and is represented as polarization ‘−1’ (binary ‘0’) or polarization ‘+1’ (binary ‘1’), as shown in Fig. 1.

Fig. 1 QCA cell

A series of cells in which the input polarization is forwarded from cell to cell until it reaches the output cell is called a QCA wire [15]. The input polarization is reproduced at the output cell, as shown in Fig. 2.

Fig. 2 QCA wire

Another important block is the majority gate (majority voter), which comprises five cells: three input cells, one output cell, and a cell in the middle called the device cell or driver cell, as shown in Fig. 3. The logic expression of the majority gate is given in Eq. (1).

Fig. 3 Majority gate

$$MAJ\left(A,B,C\right)=AB+BC+CA$$
(1)

Here, A, B, and C are the inputs, and MAJ(A, B, C) is the output. The gate acts as a two-input AND gate if any one of the inputs is fixed at polarization ‘−1’, and as a two-input OR gate if any one of the inputs is fixed at polarization ‘+1’.
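The reduction of the majority gate to AND and OR can be checked with a short Boolean sketch. This is only a logic-level illustration of Eq. (1), not a physical QCA simulation; the function names `maj`, `and2`, and `or2` are chosen here for illustration, and binary 0 and 1 stand for polarization ‘−1’ and ‘+1’.

```python
def maj(a: int, b: int, c: int) -> int:
    """Majority vote of three binary inputs, following Eq. (1): AB + BC + CA."""
    return (a & b) | (b & c) | (c & a)


def and2(a: int, b: int) -> int:
    """Two-input AND: one majority input is fixed at binary 0 (polarization -1)."""
    return maj(a, b, 0)


def or2(a: int, b: int) -> int:
    """Two-input OR: one majority input is fixed at binary 1 (polarization +1)."""
    return maj(a, b, 1)


if __name__ == "__main__":
    print("A B AND OR")
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, and2(a, b), or2(a, b))
```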

As shown in Fig. 4, an inverter (NOT gate) is designed by placing two cells in a diagonal orientation, where output polarization is opposite to input polarization [2]. Figure 4a and b depict two distinct inverter arrangements.

Fig. 4 QCA NOT gate (a) conventional layout, and (b) corner layout

The information flow among the cells in QCA technology is controlled by applying a proper clock, whose primary function is to apply an electric field to the cells that controls the inter-dot barriers [16]. QCA clocking uses four clock phases, each with a 90° phase difference from the next. The four clock phases are switch, hold, release, and relax, as shown in Fig. 5. Cells become polarized in the switch phase according to the input polarization; they retain that polarization in the hold phase, lose their polarization in the release phase, and stay at null polarization in the relax phase [16]. Proper clocking ensures a correct flow of information in QCA circuit designs.

Fig. 5 QCA clocking
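The four-phase clocking scheme can be sketched as a simple schedule. The sketch below assumes, for illustration only, that clock zones are numbered 0 to 3, that time advances in quarter-cycle steps, and that each zone lags the previous one by one step (90°); the names `phase_of` and `latency_in_cycles` are hypothetical helpers, not part of any QCA tool.

```python
PHASES = ("switch", "hold", "release", "relax")


def phase_of(zone: int, step: int) -> str:
    """Phase of a clock zone at a quarter-cycle time step.

    Zone k is assumed to lag zone 0 by k quarter-cycles (k * 90 degrees).
    """
    return PHASES[(step - zone) % 4]


def latency_in_cycles(zones_crossed: int) -> float:
    """Input-to-output delay in clock cycles: four clock zones make one cycle."""
    return zones_crossed / 4


if __name__ == "__main__":
    for step in range(4):
        row = [phase_of(zone, step) for zone in range(4)]
        print(f"step {step}: " + ", ".join(row))
    print(latency_in_cycles(1))  # a circuit confined to one clock zone
```

Under these assumptions, a circuit that occupies a single clock zone has a latency of a quarter of a clock cycle, and one that spans all four zones has a latency of one full cycle.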

3 Prior reported works

A multiplexer (MUX) is an indispensable component in nanocomputing and nanocommunication, as it selects a single input from multiple input channels. For a 2:1 multiplexer, if the select line ‘SL’ is 0, it selects ‘IN1’, and if ‘SL’ is 1, it selects ‘IN2’, according to Eq. (2) below.

$$\mathrm{OUT}=\mathrm{IN1}\cdot \overline{\mathrm{SL}}+\mathrm{IN2}\cdot \mathrm{SL}$$
(2)

Equation (2) is derived from the logic operation table shown in Table 1, and the symbolic representation is shown in Fig. 6.

Table 1 Logical operational table of 2:1 multiplexer

Fig. 6 Symbol of 2:1 multiplexer
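The behavior described by Eq. (2) can be enumerated with a few lines of Python. This is a logic-level sketch only; the function name `mux2to1` is chosen here for illustration, and the printed rows follow directly from Eq. (2).

```python
def mux2to1(in1: int, in2: int, sl: int) -> int:
    """2:1 multiplexer following Eq. (2): OUT = IN1 * NOT(SL) + IN2 * SL."""
    not_sl = 1 - sl
    return (in1 & not_sl) | (in2 & sl)


# Enumerate every input combination as (SL, IN1, IN2, OUT).
rows = [
    (sl, in1, in2, mux2to1(in1, in2, sl))
    for sl in (0, 1)
    for in1 in (0, 1)
    for in2 in (0, 1)
]

if __name__ == "__main__":
    print("SL IN1 IN2 OUT")
    for sl, in1, in2, out in rows:
        print(f"{sl:2d} {in1:3d} {in2:3d} {out:3d}")
```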

Many QCA multiplexer designs have been reported in the literature; here we consider a few popular designs published in recent years.

Jeon proposed a QCA multiplexer with a cell complexity of 19, which employs three majority gates as NAND gates and four inverters [5]. This architecture has a delay of 0.5-clock-cycle and aids in developing NAND-based multiplexer equations [5]. Ahmadpour et al. proposed a single-layer 2:1 multiplexer with only 10 QCA cells and then extended the design to higher-bit operations [6]. It is a majority gate-less design with a latency of 0.5-clock-cycle [6]. Majeed et al. likewise avoided the majority gate and proposed a QCA multiplexer architecture with a significant decrease in cell count and area [17]. It has a cell count of 9 with a latency of 0.25-clock-cycle [17]. However, the irregular input–output arrangement and the unequal wire lengths are limitations of this design [17]. The current work is primarily motivated by the work reported in [17]. Recently, Jeon presented another efficient QCA multiplexer with 13 cells and designed registers using the same multiplexer [18]. It is a majority gate-less layout with a small input-to-output delay of 0.25-clock-cycle [18]. Khan and Mandal calculated the kink energy of a compact multiplexer with 17 cells [19]. Three majority gates and an inverter were used in this design, and the reported circuit has a latency of 0.75-clock-cycle [19]. Ahmadpour and Mosleh reported a fault-tolerant QCA multiplexer and calculated its power dissipation [20]. The reported multiplexer has 35 cells with three majority gates and an inverter [20]. Its latency, however, is 1-clock-cycle, which is quite high [20]. Mosleh designed a QCA multiplexer layout and proposed a new MV32-based gate concept for QCA layout design [21]. The reported multiplexer uses 21 cells, three majority gates, and an inverter [21]. The multilayer design and the high latency of 0.75-clock-cycle are the main limitations of this design [21].

Asfestani and Heikalabad recently presented an ultra-efficient multiplexer using only 12 cells [22]. This majority gate-less design has a latency of 0.25-clock-cycle, which is its main advantage [22]. Rashidi et al. proposed an efficient multiplexer using 15 cells and evaluated its performance using parameters such as delay, circuit complexity, and area [23]. The reported multiplexer uses two majority gates and an inverter and has a latency of 0.5-clock-cycle [23]. Das and De suggested a QCA multiplexer [24] that is comparable to the architecture reported in [19] but has lower latency. It consists of 17 QCA cells, three majority gates, and one inverter [24]. The authors concluded that the design might play a key role in nanocommunication [24]. Sen et al. proposed a modular design approach for a QCA multiplexer with 23 cells and demonstrated its effectiveness by synthesizing configurable logic blocks [25]. The block was designed with three majority gates and an inverter and has a delay of 0.5-clock-cycle [25].

All of the designs mentioned above share a critical shortcoming: they lack a comprehensive energy estimation, which remains a gap in the QCA literature. Each cell in a QCA circuit is treated as an energy bath. During operation, cell-to-cell energy transfer occurs according to the applied clock. For a thorough study, the energy transfer of each QCA energy ‘bath’ therefore needs to be evaluated, which is done here using the tool QCADesigner-E (QDE). To address these issues, this paper presents a simple QCA multiplexer unit that does not use any majority gate.

Additionally, a detailed calculation of energy dissipation is presented. The layout design and simulation tool QCADesigner 2.0.3 [26] has been used to verify the circuit and check its output. A newer tool, QCADesigner-E (QDE), which works in the coherence vector mode, is used for energy estimation of the design [27, 28]. In addition, another tool, QCAPro [29], is also used to estimate the energy of the proposed circuit.

4 Proposed multiplexer unit

This part of the article covers the layout design of the prior reported multiplexers and the proposed multiplexer using QCA technology. QCADesigner is the most extensively used QCA simulation tool and can be used to develop and validate any QCA-based circuit layout; the proposed work is designed and validated using it. The tool offers two simulation modes, coherence vector and bistable approximation; the designs in this study are examined using the coherence vector engine of QCADesigner version 2.0.3 with default software settings and simulation parameters.

4.1 Simulation layout of the proposed block

The QCA layout of a simple 2:1 multiplexer is shown in Fig. 7. There are two input lines, ‘IN1’ and ‘IN2’, one select line, ‘SL’, and one output line, ‘OUT’. The proposed multiplexer uses only one OR gate and a total of 10 QCA cells. This layout is not based on majority voters; it may instead be described as an OR-based MUX. Figure 7 depicts the previously reported multiplexer layouts alongside the proposed one.

Fig. 7 Prior reported and proposed QCA multiplexer layouts. (a) Layout reported in [5]. (b) Layout reported in [18]. (c) Layout reported in [19]. (d) Layout reported in [20]. (e) Layout reported in [22]. (f) Layout reported in [23]. (g) Layout reported in [24]. (h) Layout of proposed MUX

4.2 Simulation result of the proposed block

The layout has been simulated in the QCADesigner 2.0.3 environment using the coherence vector simulation mode. The temperature is fixed at 2 K, and Euler’s approximation method is used for the analysis. We have verified the generated output for all the samples, and the output of the proposed multiplexer is shown in Fig. 8. The simulation output can easily be checked against the truth table in Table 1. When ‘SL’ = 0, the output follows ‘IN1’ and ‘IN2’ does not affect it; e.g., if ‘SL’ = 0 and ‘IN1’ = 0 then ‘OUT’ = 0, and if ‘SL’ = 0 and ‘IN1’ = 1 then ‘OUT’ = 1. When ‘SL’ = 1, ‘IN1’ has no effect on the output and the result follows ‘IN2’; e.g., if ‘SL’ = 1 and ‘IN2’ = 0 then ‘OUT’ = 0, and if ‘SL’ = 1 and ‘IN2’ = 1 then ‘OUT’ = 1, as shown in Fig. 8a. Figure 8b shows the simulation result in bus mode, where the input lines are labeled ‘BUS_IN1’ and ‘BUS_IN2’, the select line ‘BUS_SL’, and the output line ‘BUS_OUT’. The layout has a delay of 0.25-clock-cycle because it operates in a single clock phase; in QCA circuits, four clock phases (four clock zones) amount to a delay of 1-clock-cycle.

Fig. 8 Simulation outputs of the proposed multiplexer

5 Energy estimation using QDE

The energy dissipation is estimated using the recently introduced tool QDE for each QCA cell coordinate, treating each one as a ‘bath’ of energy. The energy behavior is presented over complete QCA clock cycles. A QCA cell is in the depolarized state at the beginning of the clock cycle; to enter a polarization state, it therefore collects energy from the clock and the surrounding cells. At the end of the clock cycle, the cell returns to the depolarized state by restoring energy to the clock and passing it on to neighboring cells. A part of the energy is also dissipated to the environment.

Each QCA cell is treated as a bath of energy, and the total energy transferred in each clock cycle is denoted E_BATH [27, 28]. The energy transferred from the cells to the surrounding environment, E_EV, is taken to be equal to E_BATH, apart from a small inconsistency that is treated as an error below. Therefore,

$$E\_BATH=E\_EV$$
(3)

The dissipated energy may be divided into three major parts. The first part, E_EV, has already been discussed. The second part is the energy transfer associated with the clock signal, denoted E_CK. The last part, E_IO, is the energy transfer between QCA cells. E_IO is the net result of the incoming and outgoing energy components, as given in Eq. (4) below, which is simple to interpret for a QCA wire [28].

$$E\_IO=E\_IN-E\_OUT$$
(4)

A small error may arise during the calculation of energy dissipation; it is expressed in Eq. (5) below.

$$E\_RR=E\_EV-(E\_CK+E\_IO)$$
(5)

The error may be positive or negative depending on the direction of energy transfer: if it is (+), energy is transferred to the neighboring environment [27, 28], and if it is (−), energy comes from the environment. This analysis has been applied to compute the energy dissipation in the current work, where eight multiplexers have been considered as samples.
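The bookkeeping in Eqs. (4) and (5) can be illustrated with a minimal sketch. The numerical values below are hypothetical placeholders chosen only to show the arithmetic and the sign convention; they are not results from QCADesigner-E, and the function names are introduced here for illustration.

```python
def io_energy(e_in: float, e_out: float) -> float:
    """Eq. (4): net inter-cell energy, E_IO = E_IN - E_OUT."""
    return e_in - e_out


def energy_error(e_ev: float, e_ck: float, e_io: float) -> float:
    """Eq. (5): error term, E_RR = E_EV - (E_CK + E_IO)."""
    return e_ev - (e_ck + e_io)


# Hypothetical values in meV, for illustration only.
e_in, e_out = 0.50, 0.25
e_ck = 0.25
e_ev = 0.75

e_io = io_energy(e_in, e_out)
e_rr = energy_error(e_ev, e_ck, e_io)

# A positive error is read as energy going to the environment,
# a negative error as energy coming from the environment.
direction = "to the environment" if e_rr > 0 else "from the environment"

if __name__ == "__main__":
    print(f"E_IO = {e_io:.2f} meV, E_RR = {e_rr:.2f} meV ({direction})")
```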

Each cell of the array is treated as a fixed coordinate, and the default parameters of QCADesigner-E are used. A total of 500,000 samples were evaluated in the simulation, which was run in coherence vector mode.

The energy estimates of the prior reported and proposed multiplexers obtained using QDE are shown in Table 2. Small errors may occur during the energy estimation process, although they are negligible; Table 2 also lists these calculation errors. The proposed multiplexer’s total energy dissipation is 11.30 meV, with a calculation error of −1.21 meV, and its average energy dissipation is 1.02 meV, with a calculation error of −0.110 meV.

Table 2 Analysis of energy consumption of the proposed and existing multiplexers using QDE

6 Energy estimation using QCAPro

This experiment has been performed only on the proposed multiplexer, as QCAPro is a well-known and popular energy estimation tool with which a lot of previous work has been done. Here, the energy is estimated using the Hartree–Fock approximation, and the expectation value of the energy is expressed as

$$E = \langle H\rangle = \frac{\hbar }{2}\, \vec{\Gamma } \cdot \vec{\lambda }$$
(6)

Here, ħ is the reduced Planck constant, λ is the coherence vector, and Γ is the three-dimensional energy vector.

The instantaneous power is then:

$$P_{\mathrm{ins}}=\frac{dE}{dt}$$
(7)

The energy is estimated with QCAPro at a constant temperature of 2 K for three tunneling energy levels: γ = 0.5EK, γ = 1.0EK, and γ = 1.5EK. For the proposed multiplexer, the leakage energy dissipation is 3.70, 9.25, and 18.87 meV for γ = 0.5EK, γ = 1.0EK, and γ = 1.5EK, respectively. Similarly, the switching energy dissipation is 9.97, 7.45, and 5.23 meV, and the total energy dissipation is 10.37, 18.70, and 27.10 meV, for the same three levels. The QCAPro energy estimates of the proposed and existing multiplexers are tabulated in Table 3.

Table 3 Analysis of energy consumption of the proposed and existing multiplexers using QCAPro

7 Calculation of cost functions

When analyzing the performance of a QCA circuit, the cost is a crucial factor to consider. There are three significant cost functions to consider for QCA circuits.

The area-delay cost function is calculated using Eq. (8), where ‘A’ is the area in nm² and ‘L’ is the delay or latency in clock cycles. The unit of this cost function is generally nm²-square-clock cycle (nm²-scc) [30].

$$\mathrm{Area}-\mathrm{Delay}-\mathrm{Cost}=A\times {L}^{2}$$
(8)

The occupied area of the proposed layout is 7644 nm² and it has a delay of 0.25-clock-cycle. Therefore, the area-delay cost is 7644 × 0.25² ≈ 478 nm²-scc.
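The area-delay figure can be reproduced directly from Eq. (8). The following is a minimal sketch; the function name `area_delay_cost` is chosen here for illustration, and the inputs are the area and latency stated above.

```python
def area_delay_cost(area_nm2: float, latency_cycles: float) -> float:
    """Eq. (8): area-delay cost = A * L^2, in nm^2-scc."""
    return area_nm2 * latency_cycles ** 2


# Area and latency of the proposed layout as stated in the text.
cost = area_delay_cost(7644, 0.25)

if __name__ == "__main__":
    print(cost)         # exact product of Eq. (8)
    print(round(cost))  # value rounded to the nearest nm^2-scc
```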

Another cost function is the QCA-specific cost, and it is expressed in Eq. (9) below. Here, ‘M’ is the number of used majority gates, ‘I’ is the number of used NOT gates, ‘C’ is the number of applied clocks, and ‘V’ is the number of used crossovers. The general unit to express this function is square-clock cycle (scc) [30].

$$\mathrm{QCA}-\mathrm{specific}-\mathrm{cost}=({M}^{2}+I+{C}^{2})\times {V}^{2}$$
(9)

According to Eq. (9), the QCA-specific cost of the proposed multiplexer is 0.0625 scc.

If ‘E’ is the energy in meV and ‘L’ is the delay or latency in clock cycles, then the energy-delay cost is calculated using Eq. (10). The general unit for this cost function is meV²-square-clock cycle (meV²-scc).

$$Energy-Delay-cost={E}^{2}\times {L}^{2}$$
(10)

Using the QDE-based average energy dissipation of 1.02 meV and the delay of 0.25-clock-cycle, the calculated energy-delay cost for the proposed block is 1.02² × 0.25² ≈ 0.0650 meV²-scc.
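The energy-delay figure follows from Eq. (10) in the same way. The sketch below assumes that ‘E’ is the QDE average energy dissipation of 1.02 meV, since that is the value that reproduces the reported cost; the function name `energy_delay_cost` is chosen here for illustration.

```python
def energy_delay_cost(energy_mev: float, latency_cycles: float) -> float:
    """Eq. (10): energy-delay cost = E^2 * L^2, in meV^2-scc."""
    return energy_mev ** 2 * latency_cycles ** 2


# Assumption: E is the QDE average energy dissipation (1.02 meV);
# L is the 0.25-clock-cycle latency of the proposed layout.
ed_cost = energy_delay_cost(1.02, 0.25)

if __name__ == "__main__":
    print(f"{ed_cost:.4f} meV^2-scc")
```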

Table 4 summarizes the various cost functions of the proposed and previously reported multiplexers.

Table 4 Analysis of cost functions of the proposed and existing multiplexers using QDE

8 Comparisons and discussion

The most important advantages of the proposed design are its simple layout and negligible latency. The majority gate-less structure makes the design simpler, and it has ultra-low cell complexity and delay. In particular, there are ~17%, ~20%, and ~17% improvements in cell complexity, total area requirement, and cell area, respectively, compared to the previously reported design in [22]. Furthermore, the proposed design utilizes 42.38% of the available area, as shown in Table 5. The proposed design is also energy efficient compared to the earlier reported designs. Compared to the best-reported design [19], the proposed layout improves the average and total energy dissipation according to QDE by ~11% and ~9%, respectively.

Table 5 Comparative study of the proposed design with prior reported designs

Similarly, the proposed design is ~31% more energy efficient than the previously reported design [22], according to QCAPro (γ = 0.5EK). The proposed architecture is 1.25 times more cost effective in terms of energy-delay. Furthermore, the proposed multiplexer has a ~37% lower energy-delay cost. Therefore, the proposed multiplexer is efficient from the energy dissipation point of view, and it might be helpful for designing higher-order circuits.

9 Conclusion

Quantum-dot cellular automata have attracted wide research interest because of their ultra-low power consumption and high speed. The multiplexer is a critical component in QCA circuits, notably for nanocomputing and nanocommunication applications, and energy estimation is essential for properly evaluating the performance of any QCA circuit. This article presents a simple multiplexer layout and computes its energy dissipation. Other performance parameters have also been examined, and the results show that the proposed circuit outperforms the previously reported designs. In particular, the proposed design is a majority gate-less structure with ultra-low latency, and it improves on cell complexity and area requirements. Two tools, QDE and QCAPro, were used to calculate the energy dissipation. According to QDE, the proposed multiplexer’s total energy dissipation is 11.30 meV, with an average energy dissipation of 1.02 meV. Furthermore, according to QCAPro, the total energy dissipation of the proposed multiplexer is 18.70 meV at a fixed temperature of 2 K with tunneling level γ = 1.0EK. As a result, this simple multiplexer design is highly energy efficient and may be used as a building block for more complex circuits.