Introduction

Sequential designs need a clock signal to function. In many sequential schemes, the built-in components require several clocks that are related in frequency and phase. In today’s radio frequency (RF) integrated circuits and transceivers operating at microwave frequencies, one of the most critical units is the clock manager. A clock management unit (CMU) produces clock signals at different frequencies and controls the clock phase characteristics inside integrated circuits.

Among the most critical parameters of clock management units are operating frequency and power dissipation (Mandal and Bhattacharyya 2008; Safarian et al. 2006; Sengupta and Hashemi 2006). Digital CMUs realized in modern integrated-circuit technologies can work up to tens of gigahertz. The main factor limiting the use of these units at higher frequencies is the need for a very fast-settling frequency feedback loop (Coussy and Morawiec 2008; Shu and Sánchez-Sinencio 2004; Razavi et al. 1995). Therefore, a new technology that offers a fundamentally different approach to operating at higher frequencies is more necessary than ever.

QCA (quantum-dot cellular automata) is likely the most advantageous alternative for future power-efficient and high-speed systems. QCA is a physical realization of classical cellular automata based on quantum mechanical effects; it was suggested in Lent et al. (1993), developed in Lent and Tougaw (1997), and fabricated in Khatun et al. (2013), Yadavalli et al. (2006), and Hu et al. (2005). In QCA technology, quantum dots take the place of the transistors used in CMOS technology, and the position of the electrons in the quantum dots represents the logic states (Lu and Lent 2005). QCA structures have attractive features at the nanoscale, such as extremely low power dissipation (Blair et al. 2009) and high operational speed in the THz range (Lu et al. 2007; Dysart 2013).

The main objective of this paper is to propose a power-efficient and high-speed QCA-based structure for the CMU. First, innovative QCA structures for edge-sensitive D-FFs (D-type flip-flops) are presented with minimum complexity and power. Then, by integrating these well-optimized D-FFs as fundamental building blocks, a novel architecture for the CMU is proposed and implemented in QCA technology. The frequency synthesizer and the phase splitter are the basic functional components of the proposed unit. The frequency synthesizer divides the reference clock frequency by power-of-two division ratios to synthesize new clock frequencies. The phase splitter is integrated with the frequency synthesizer to provide quadrature-phase clock outputs shifted by 0°, 90°, 180°, and 270°. Consequently, the proposed QCA CMU can address clocking issues, particularly in high-performance integrated circuits operating at very high frequencies.

Quantum-dot cellular automata

This section presents the QCA basics used in this paper. QCA is an emerging technology that represents logic states not as voltage levels but by the positions of an electron pair (Lent et al. 1993). There are two types of QCA cells: 90° cells, shown in Fig. 1a, and 45° cells, shown in Fig. 1b. As shown in Fig. 1, a QCA cell consists of four quantum dots positioned at the corners of a square. Each QCA cell holds two extra mobile electrons that are allowed to tunnel between the dots inside the cell. The Coulombic repulsion between the two electrons makes the QCA cell a bistable device with a polarization (P) of −1 (binary logic 0) or +1 (binary logic 1) (Lent et al. 2006; Ravichandran et al. 2005), as shown in Fig. 1. The 90° cell and the 45° cell do not interact with each other when they are aligned, so 45° cells can be used for coplanar wire crossings. Two types of QCA wire are shown in Fig. 2. When QCA cells are placed next to each other to form a wire, the Coulombic repulsion between electrons forces the cells to take the same polarization, as shown in Fig. 2a, or opposite polarization, as shown in Fig. 2b (Kim et al. 2006). A coplanar crossover can be implemented with the two different types of QCA wire in one layer, as shown in Fig. 3.

Fig. 1 QCA cell. a 90° type. b 45° type

Fig. 2 QCA wire. a 90° type. b 45° type

Fig. 3 Coplanar crossover technique

In QCA technology, most circuits are designed from majority and inverter gates. The majority gate (Maj) is implemented with five QCA cells, as shown in Fig. 4, and its logic function is given by Eq. (1). Two-input AND and OR gates can be implemented with a three-input majority gate by fixing one of its inputs to the constant logic value “0” or “1,” respectively. The other basic QCA gate is the inverter, which can be implemented with seven cells or by placing two cells in a diagonal arrangement, as illustrated in Fig. 5. This gate reverses the polarization of QCA cells.

Fig. 4 QCA majority gate

Fig. 5 QCA inverter

$$ \mathrm{Maj}\left( A, B, C\right)= A\cdot B+ B\cdot C+ A\cdot C $$
(1)
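Eq. (1) and the fixed-input construction of AND and OR gates can be checked with a short truth-table sketch. It models only the Boolean behaviour, not cell layout or clocking, and the helper names (`maj`, `and2`, `or2`, `inv`) are illustrative rather than taken from any QCA tool.

```python
from itertools import product


def maj(a: int, b: int, c: int) -> int:
    """Three-input majority vote, Eq. (1): A.B + B.C + A.C."""
    return (a & b) | (b & c) | (a & c)


def and2(a: int, b: int) -> int:
    """Two-input AND: majority gate with one input fixed to 0."""
    return maj(a, b, 0)


def or2(a: int, b: int) -> int:
    """Two-input OR: majority gate with one input fixed to 1."""
    return maj(a, b, 1)


def inv(a: int) -> int:
    """Inverter: reverses the logic value (polarization) of its input."""
    return 1 - a


# Print the full truth table of the majority gate.
for a, b, c in product((0, 1), repeat=3):
    print(a, b, c, "->", maj(a, b, c))

# Confirm that the fixed-input gates behave as AND and OR.
for a, b in product((0, 1), repeat=2):
    assert and2(a, b) == (a & b)
    assert or2(a, b) == (a | b)
```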

A QCA cell has four clock phases (Lent and Tougaw 1997; Cho and Swartzlander 2009) called switch, hold, release, and relax, as shown in Fig. 6a. During the switch phase, the potential barriers rise, and the cells interact with their neighbors and become polarized. The actual computation is performed in this clock phase. In the hold phase, the barriers are held high to prevent electron tunneling and to fix the polarization of the QCA cells. In the release phase, the barriers are lowered, and in the relax phase the cells are completely unpolarized (Lent and Tougaw 1997; Cho and Swartzlander 2009). The clocking system of QCA structures is composed of four clock zones, as shown in Fig. 6b. All cells in a zone are controlled by the same QCA clock to perform a specific function and then serve as inputs for the cells in the next zone. Data is thus transmitted in a pipelined fashion.

Fig. 6 QCA clocking mechanism. a Four phases of the complete cycle. b Four clock zones
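The pipelined hand-off between clock zones can be pictured with a small schedule sketch. It assumes that each zone lags the previous one by a quarter of a clock cycle, so that a zone switches while its predecessor holds; the function names and the integer time step are illustrative assumptions.

```python
PHASES = ("switch", "hold", "release", "relax")


def phase_of(zone: int, step: int) -> str:
    """Phase of clock zone `zone` (0..3) at quarter-cycle time step `step`.

    Assumption: zone k lags zone k-1 by one quarter of the clock cycle.
    """
    return PHASES[(step - zone) % 4]


def schedule(steps: int = 4):
    """Phases of all four zones for the first `steps` quarter-cycles."""
    return [[phase_of(zone, t) for zone in range(4)] for t in range(steps)]


for t, row in enumerate(schedule()):
    print(f"step {t}: " + ", ".join(f"zone {z}={p}" for z, p in enumerate(row)))
```

In this schedule, whenever a zone is in the switch phase, the zone feeding it is in the hold phase, which is what allows data to move from zone to zone.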

For correct operation of a QCA circuit, the synchronization of majority gates with the four-phase clock signal and an optimized delay are important issues. QCA gates should be arranged in successive clocking zones, and additional cells may be placed at the inputs and outputs in different clock zones to synchronize the circuit (Kim et al. 2006). A closed loop should be implemented with at least four clocking zones so that the stored data can move through the loop cells (Frost et al. 2002). Moreover, clocking zones that contain only a single cell lead to noisy QCA structures (Kim et al. 2006). Conversely, if too many cells are placed in one clock zone, the clock rate may drop and the delay may degrade (Antonelli et al. 2004). Timing and layout issues related to QCA design rules are collected in Liu et al. (2011) and Liu et al. (2013). Following these rules gives the proposed QCA-based circuits consistency and robustness.

QCA D-type flip-flops

Flip-flops are key components of sequential circuit design (Vetteth et al. 2003). Implementing flip-flops in QCA technology can exploit several beneficial characteristics, such as consistency, high speed, and low power consumption (Sheikhfaal et al. 2015). However, QCA-based sequential designs still attract research attention because of their difficulties with hardware complexity and latency. Based on the inherent capabilities of QCA cells, the main methods for designing flip-flops are classified as line-based (Vankamamidi et al. 2005; Taskin and Hong 2008) and loop-based (Vankamamidi et al. 2008; Shamsabadi et al. 2009; Yang et al. 2010; Dehkordi et al. 2011; Hashemi and Navi 2012; Angizi et al. 2014) structures. In the line-based structure, data is stored in a three-clock-zone wire, while in the loop-based design, data is stored in a clock-zone loop. In QCA technology, the loop-based layout has lower complexity than the line-based layout.

A D-FF is one of the most fundamental elements in sequential logic. In this section, previously presented structures of edge-triggered D-FFs are described first. The proposed edge-triggered D-FFs are then introduced; they decrease the input-to-output delay by using fewer clock zones and reduce the area and power dissipation by using as few QCA cells as possible.

Previous QCA D-type flip-flops

The QCA D-FF has been implemented with embedded wires by utilizing the four-phase clocking arrangement (Huang et al. 2007). However, this method complicates the design of sequential circuits because of timing and synchronization constraints. To reduce the influence of these constraints, an independent clock (CLK) terminal should be used as an input to QCA flip-flops. Following this approach, various D flip-flop designs have been suggested and implemented in QCA technology. An overview of these previous structures is presented here.

The coplanar wire crossing scheme has been used in Shamsabadi et al. (2009) and Yang et al. (2010) to implement and optimize QCA D-FF structures. In Shamsabadi et al. (2009), a rising-edge-triggered D-FF has been introduced in QCA technology. This design is a loop-based structure, and its output is sensitive to the positive transition of the clock input. The QCA layout of this structure is shown in Fig. 7. The CLK signal and its inverse are used in this D-FF to control the evaluation of the output at each rising transition of the CLK signal. In Yang et al. (2010), a dual-edge-triggered D-FF design has been constructed, as shown in Fig. 8. In this structure, the output of the D-FF is evaluated at each falling and rising transition of the CLK signal using the current and delayed versions of the CLK signal. The input-to-output delay is 2.25 QCA clocking cycles.

Fig. 7 QCA layout of rising-edge-triggered D-FF (Shamsabadi et al. 2009)

Fig. 8 QCA layout of dual-edge-triggered D-FF (Yang et al. 2010)

On the other hand, the D-FF designs in Hashemi and Navi (2012) and Xiao et al. (2012) have been implemented using 90° QCA cells without any crossover wire. In Hashemi and Navi (2012), appropriate “edge to level” converters have been attached to the clock terminal to obtain rising-, falling-, and dual-edge-triggered D-FFs, as shown in Figs. 9, 10, and 11, respectively. In Xiao et al. (2012), a dual-edge-triggered D-FF design has been constructed, as shown in Fig. 12. This D-FF is based on a multiplexing scheme that evaluates the output at each falling and rising transition of CLK.

Fig. 9 QCA layout of rising-edge-triggered D-FF (Hashemi and Navi 2012)

Fig. 10 QCA layout of falling-edge-triggered D-FF (Hashemi and Navi 2012)

Fig. 11 QCA layout of dual-edge-triggered D-FF (Hashemi and Navi 2012)

Fig. 12 QCA layout of dual-edge-triggered D-FF (Xiao et al. 2012)

Proposed QCA D-type flip-flops

In this part, new efficient QCA structures of edge-triggered D-FFs are presented with the aim of building low-complexity and useful high-performance models for implementing different sequential circuits in QCA technology.

The D-FF is a memory element with two inputs (D and CLK) and one output (Q). It has two modes of operation: transparent (Q follows input D) and hold (Q remains unchanged). The operation of the proposed rising-edge-triggered D-FF is shown in Table 1. The delayed value of the clock signal, CLK(t−1), and its current value, CLK(t), are used to evaluate the output of the D-FF at each rising transition of the CLK signal. According to this table, when the CLK signal changes from “0” to “1,” the value of input D is stored in the output Q; otherwise, the output does not change. The corresponding schematic diagram of this D-FF is illustrated in Fig. 13. As shown in Fig. 13, this design requires four majority gates connected in three successive gate levels.

Table 1 Operation table of proposed rising-edge-triggered D-FF
Fig. 13 Rising-edge-triggered D-FF. a Graphic symbol. b Schematic diagram
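The behaviour described for the rising-edge-triggered D-FF can be sketched with four majority gates arranged in three levels. The gate arrangement below is one possible arrangement that matches the stated gate count and depth; it is an assumption and may differ from the actual schematic in Fig. 13. QCA clock-zone latency is ignored, and all function names are illustrative.

```python
def maj(a: int, b: int, c: int) -> int:
    """Three-input majority gate."""
    return (a & b) | (b & c) | (a & c)


def dff_rising_step(d: int, clk_prev: int, clk_now: int, q: int) -> int:
    """Next output of a rising-edge-triggered D-FF (assumed gate arrangement)."""
    # Level 1: rising-edge detector, (NOT CLK(t-1)) AND CLK(t).
    edge = maj(1 - clk_prev, clk_now, 0)
    # Level 2: pass D on an edge, otherwise recirculate the stored Q.
    take_d = maj(edge, d, 0)
    keep_q = maj(1 - edge, q, 0)
    # Level 3: OR of the two branches.
    return maj(take_d, keep_q, 1)


def simulate(d_seq, clk_seq, q0: int = 0, clk0: int = 0):
    """Apply the D-FF step to sampled D and CLK sequences."""
    q, clk_prev, out = q0, clk0, []
    for d, clk in zip(d_seq, clk_seq):
        q = dff_rising_step(d, clk_prev, clk, q)
        out.append(q)
        clk_prev = clk
    return out


clk = [0, 1, 1, 0, 1, 0]
d = [1, 1, 0, 0, 0, 1]
print(simulate(d, clk))  # Q changes only at the 0 -> 1 transitions of CLK
```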

The QCA layout of the proposed rising-edge-triggered D-FF is shown in Fig. 14. As shown in Fig. 14, when the CLK signal makes a positive transition (from low to high), the input bit D is transmitted to the output Q and stored in the closed loop; otherwise, the stored value is preserved in the loop by the QCA-pipelined process. The layout consists of 46 cells in an area of 0.04 μm² and has a latency of 1.25 QCA clocking cycles. The proposed structure is implemented in a single layer using 90° QCA cells without any crossover wire. Furthermore, the implementation is a robust QCA structure, achieved by applying the design rules given in Kim et al. (2006), Frost et al. (2002), Antonelli et al. (2004), Liu et al. (2011), and Liu et al. (2013).

Fig. 14 QCA layout of proposed rising-edge-triggered D-FF

Similarly, the proposed falling-edge-triggered D-FF operates according to Table 2. According to this table, when the CLK signal changes from “1” to “0,” the value of input D is stored in the output Q; otherwise, the output does not change. The corresponding schematic diagram of this D-FF is illustrated in Fig. 15. As shown in Fig. 15, this design requires four majority gates connected in three successive gate levels. The proposed QCA implementation in Fig. 16 acts as a falling-edge-triggered D-FF. As shown in Fig. 16, when the CLK signal makes a negative transition (from high to low), the input bit D is transmitted to the output Q and stored in the closed loop; otherwise, the stored value is preserved in the loop by the QCA-pipelined process. The layout consists of 47 cells in an area of 0.04 μm² and has a latency of 1.25 QCA clocking cycles.

Table 2 Operation table of proposed falling-edge-triggered D-FF
Fig. 15 Falling-edge-triggered D-FF. a Graphic symbol. b Schematic diagram

Fig. 16 QCA layout of proposed falling-edge-triggered D-FF

Furthermore, the proposed dual-edge-triggered D-FF is implemented as shown in Fig. 17, based on the operation table in Table 3. According to this table, when the CLK signal changes from “0” to “1” or from “1” to “0,” the value of input D is stored in the output Q. An OR operation between the rising-edge and falling-edge conditions of the CLK signal is therefore used to evaluate the output of the D-FF. The layout consists of 70 cells in an area of 0.07 μm² and has a latency of 1.5 QCA clocking cycles.

Fig. 17 QCA layout of proposed dual-edge-triggered D-FF

Table 3 Operation table of proposed dual-edge-triggered D-FF
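The three triggering modes differ only in the edge condition that lets D through. The following behavioural sketch compares them on the same input sequences; it does not model the QCA layouts or their latency, and the `mode` parameter and function names are illustrative assumptions.

```python
def edge_detect(clk_prev: int, clk_now: int, mode: str) -> int:
    """Return 1 when the selected clock transition occurs."""
    rising = (1 - clk_prev) & clk_now
    falling = clk_prev & (1 - clk_now)
    if mode == "rising":
        return rising
    if mode == "falling":
        return falling
    if mode == "dual":
        # OR of the rising and falling conditions (equals CLK(t-1) XOR CLK(t)).
        return rising | falling
    raise ValueError(f"unknown mode: {mode}")


def simulate(d_seq, clk_seq, mode: str, q0: int = 0, clk0: int = 0):
    """Behavioural D-FF: Q takes D on the selected edge, otherwise holds."""
    q, clk_prev, out = q0, clk0, []
    for d, clk in zip(d_seq, clk_seq):
        if edge_detect(clk_prev, clk, mode):
            q = d
        out.append(q)
        clk_prev = clk
    return out


clk = [0, 1, 1, 0, 0, 1, 1, 0]
d = [1, 1, 0, 0, 1, 0, 0, 1]
for mode in ("rising", "falling", "dual"):
    print(f"{mode:8s}", simulate(d, clk, mode))
```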

Proposed QCA clock management unit

The CMU consists of two integrated functional modules: a frequency synthesizer and a phase splitter. The frequency synthesizer divides the frequency of the input reference CLK by power-of-two division ratios to provide a flexible, wide range of output frequencies. The frequency of the output clock signal, expressed in Eq. (2), is obtained by dividing the reference clock frequency by a factor of 2^n (where n is a positive integer).

$$ {f}_{\mathrm{out}}={f}_{\mathrm{in}}/{2}^n $$
(2)
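As a worked instance of Eq. (2), the first three division ratios are listed below. The 1 THz reference frequency is a hypothetical value chosen only for illustration; it is not a figure reported in this paper.

$$ n=1:\ {f}_{\mathrm{out}}={f}_{\mathrm{in}}/2,\quad n=2:\ {f}_{\mathrm{out}}={f}_{\mathrm{in}}/4,\quad n=3:\ {f}_{\mathrm{out}}={f}_{\mathrm{in}}/8 $$

$$ {f}_{\mathrm{in}}=1\ \mathrm{THz}\ \Rightarrow\ {f}_{\mathrm{out}}=500\ \mathrm{GHz},\ 250\ \mathrm{GHz},\ 125\ \mathrm{GHz} $$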

The divide-by-two frequency divider can be constructed from the rising-edge-triggered D-FF, as shown in Fig. 18a. The inverted output of the D-FF is fed back directly to the input D, so that it is sampled at each rising edge of the CLK signal. As a result, the output changes to its opposite value at each rising clock edge, and the output frequency is half of the reference clock frequency. The output clock signal is CLK/2 with a 50% duty cycle.

Fig. 18 D-FF based CMU. a Divide-by-two frequency divider. b With phase splitter
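The divide-by-two behaviour can be reproduced with a behavioural D-FF whose D input is tied to the inverted output. This is a logic-level sketch on a sampled clock, with two samples per reference period; the function names and the initial state `q0` are illustrative assumptions.

```python
def dff_rising(d: int, clk_prev: int, clk_now: int, q: int) -> int:
    """Behavioural rising-edge D-FF: take D on a 0 -> 1 clock transition."""
    return d if (clk_prev, clk_now) == (0, 1) else q


def divide_by_two(clk_seq, q0: int = 0):
    """Feed NOT Q back to D so the output toggles on every rising edge."""
    q, clk_prev, out = q0, clk_seq[0], []
    for clk in clk_seq:
        q = dff_rising(1 - q, clk_prev, clk, q)  # D is tied to the inverted output
        out.append(q)
        clk_prev = clk
    return out


ref = [0, 1] * 4  # reference clock, period of 2 samples
print("CLK  ", ref)
print("CLK/2", divide_by_two(ref))  # period of 4 samples, 50% duty cycle
```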

Consequently, the proposed CMU with phase-splitting options is constructed from two D-FFs, one rising-edge-triggered and the other falling-edge-triggered, as shown in Fig. 18b. This construction takes a reference CLK and inverts the output of one D-FF at each rising edge and the output of the other D-FF at each falling edge. As a result, the two D-FF outputs are in quadrature (90° out of phase) with each other. The four clock outputs of the CMU are therefore 90° (0.25 period) out of phase with each other. The output clock signals are CLK/2, CLK/2_90, CLK/2_180, and CLK/2_270, each with a 50% duty cycle. The CLK/2_90 signal is phase shifted by 90° from the CLK/2 signal, and so on.
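The quadrature relationship between the four outputs can be checked with a logic-level sketch. It assumes that the 180° and 270° outputs are taken as the inverted outputs of the two D-FFs and that both D-FFs start from 0; these choices, like the function names, are assumptions made for illustration.

```python
def toggle_on_edge(ref, rising: bool = True, q0: int = 0):
    """Toggle the output at each rising (or falling) edge of `ref`."""
    wanted = (0, 1) if rising else (1, 0)
    q, prev, out = q0, ref[0], []
    for clk in ref:
        if (prev, clk) == wanted:
            q = 1 - q
        out.append(q)
        prev = clk
    return out


def phase_splitter(ref):
    """Four divide-by-two clocks, 90 degrees apart (assumed output mapping)."""
    q_rise = toggle_on_edge(ref, rising=True)   # toggles at rising edges
    q_fall = toggle_on_edge(ref, rising=False)  # toggles at falling edges
    return {
        "CLK/2": q_rise,
        "CLK/2_90": q_fall,
        "CLK/2_180": [1 - b for b in q_rise],
        "CLK/2_270": [1 - b for b in q_fall],
    }


ref = [0, 1] * 4  # reference clock, period of 2 samples
outs = phase_splitter(ref)
for name, wave in outs.items():
    print(f"{name:10s}", wave)
```

With two samples per reference period, each output has a period of four samples, so a shift of one sample corresponds to a 90° phase shift.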

The block diagram of the CMU, with two inputs and four outputs, is outlined in Fig. 19a. The control input (C) is used to initialize and start the CMU. The corresponding QCA layout is shown in Fig. 19b. The layout employs the proposed rising-edge-triggered and falling-edge-triggered D-FFs, which are marked by dashed rectangular boxes.

Fig. 19 Proposed CMU. a Block diagram. b QCA layout

The proposed QCA CMU in Fig. 19 is a modular architecture and is therefore straightforward to expand. Cascading n stages of the CMU model provides additional frequency-divided and phase-shifted versions of the reference clock. For instance, the QCA layout of a two-stage CMU is shown in Fig. 20. The layout employs two CMU models, marked by dashed rectangular boxes. Each control input (C1 and C2) is used to initialize and start the corresponding CMU stage. The output clock signals are CLK/2, CLK/2_90, CLK/2_180, CLK/2_270, CLK/4, CLK/4_90, CLK/4_180, and CLK/4_270, each with a 50% duty cycle. The CLK/2_90 clock is a version of the reference clock that is frequency divided by 2 and phase shifted by 90°, and so on. Similarly, a three-stage CMU is implemented by connecting three CMU models in cascade, and its QCA layout is shown in Fig. 21.

Fig. 20 QCA layout of proposed two-stage CMU

Fig. 21 QCA layout of proposed three-stage CMU
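The cascading of CMU stages can be sketched at the logic level as follows. The sketch assumes that each stage takes the 0° output of the previous stage as its reference clock, which is a guess about the interconnection rather than something read from Figs. 20 and 21. The control inputs are not modelled, and all names are illustrative.

```python
def toggle_on_edge(ref, rising: bool = True, q0: int = 0):
    """Toggle the output at each rising (or falling) edge of `ref`."""
    wanted = (0, 1) if rising else (1, 0)
    q, prev, out = q0, ref[0], []
    for clk in ref:
        if (prev, clk) == wanted:
            q = 1 - q
        out.append(q)
        prev = clk
    return out


def cascade(ref, stages: int):
    """Outputs of an n-stage CMU; stage k divides the reference by 2**k."""
    outputs = {}
    for k in range(1, stages + 1):
        base = f"CLK/{2 ** k}"
        q_rise = toggle_on_edge(ref, rising=True)
        q_fall = toggle_on_edge(ref, rising=False)
        outputs[base] = q_rise
        outputs[f"{base}_90"] = q_fall
        outputs[f"{base}_180"] = [1 - b for b in q_rise]
        outputs[f"{base}_270"] = [1 - b for b in q_fall]
        ref = q_rise  # assumption: the 0-degree output clocks the next stage
    return outputs


def rising_edges(wave) -> int:
    """Count the 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for a, b in zip(wave, wave[1:]) if (a, b) == (0, 1))


ref = [0, 1] * 8
outs = cascade(ref, stages=2)
for name, wave in outs.items():
    print(f"{name:10s}", wave)
```

Counting rising edges over the same window gives a quick check of the division ratio: each stage halves the count of its reference.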

Results and discussion

In this section, the proposed QCA structures are simulated to verify their function and to assess overall figures of merit such as area, number of QCA cells, delay, and power dissipation.

Performance study of proposed QCA structures

QCADesigner (Walus et al. 2004) is a popular QCA layout and simulation tool developed at the University of Calgary. For layout and performance analysis, all QCA circuits proposed in this paper are first examined using both the bistable approximation and the coherence vector simulation engines of QCADesigner version 2.0.3 with default parameters.

The simulation result of the proposed rising-edge-triggered D-FF in Fig. 14 is given as input and output waveforms in Fig. 22. According to this result, the first meaningful output appears after a delay of 1.25 clock cycles. An input waveform with varying binary values is applied at input D. When CLK changes from “0” to “1,” the value of the input (D) is transmitted to the output (Q); otherwise, the output Q remains unchanged. The colored arrows in Fig. 22 show the mapping from input D to output Q at positive clock transitions. Comparison with the expected behavior in Table 1 confirms that the circuit works correctly. The proposed structure provides a high polarization of ±0.988 (±9.88e-001) at the output.

Fig. 22 Simulation result of proposed rising-edge-triggered D-FF

Similarly, the simulation results of the proposed falling- and dual-edge-triggered D-FFs in Figs. 16 and 17 are shown in Figs. 23 and 24, respectively. The colored arrows in these figures indicate the output bits that correspond to the input bits at the relevant clock transitions. When the relevant clock transition occurs, the value of the input (D) is transmitted to the output (Q); otherwise, the output Q remains unchanged. These simulation results match the expected values in Tables 2 and 3, which confirms that the proposed implementations work correctly. Furthermore, the simulation results show that all proposed D-FFs achieve a high polarization at the output.

Fig. 23 Simulation result of proposed falling-edge-triggered D-FF

Fig. 24 Simulation result of proposed dual-edge-triggered D-FF

The simulation result of the single-stage CMU (Fig. 19b) is shown in Fig. 25 and is verified by comparison with the expected clock signals in Fig. 18b. The colored ovals and arrows in this figure mark the four generated phases of the divide-by-two output clocks. This simulation result indicates that the proposed implementation works as intended. Similarly, the simulation results of the two-stage and three-stage CMUs (Figs. 20 and 21) are shown in Figs. 26 and 27, respectively. The solid rectangular boxes in these figures mark the generated frequencies and phases of the output clocks. These results indicate that the proposed QCA implementations work as intended.

Fig. 25 Simulation result of proposed single-stage CMU

Fig. 26 Simulation result of proposed two-stage CMU

Fig. 27 Simulation result of proposed three-stage CMU

Simulations are then carried out to study the architectural complexity of the proposed QCA structures and to compare them with other structures. The most widely accepted metrics are covered area, number of QCA cells, and circuit latency. Table 4 compares the QCA D-FFs presented in this paper with the previous structures in Shamsabadi et al. (2009), Yang et al. (2010), Hashemi and Navi (2012), and Xiao et al. (2012). According to these results, the proposed D-FF structures considerably surpass the previous structures in terms of occupied area, number of cells, and computation delay, with better polarization at the output. Furthermore, the performance of the proposed QCA CMU structures is analyzed, and the estimated values are summarized in Table 5. These values may be helpful in designing larger integrated circuits and applications operating at very high frequencies (THz).

Table 4 Performance comparison of proposed QCA D-FF structures
Table 5 Performance estimation of proposed QCA CMU structures

Power study of proposed structures

Several studies have indicated the importance of QCA power dissipation. One of the most accurate power dissipation models has been presented in Timler and Lent (2002). The results of this model have been used in Srivastava et al. (2009) to estimate the upper-bound power dissipation of QCA structures. In Srivastava et al. (2011), an accurate energy estimation tool named QCAPro has been developed for QCA structures. This tool separates the total energy consumption of a QCA structure into two broad terms called “switching energy” and “leakage energy.” Energy losses during the switching periods of cells contribute to the “switching energy,” and energy losses during clock transitions contribute to the “leakage energy.” With this tool, the power dissipation is estimated at various tunneling energy levels under non-adiabatic switching, and the circuit functionality is verified by Bayesian network analysis (Srivastava et al. 2011).

The power dissipation of the proposed QCA D-FF structures, as well as of the previous structures presented in Shamsabadi et al. (2009), Yang et al. (2010), Hashemi and Navi (2012), and Xiao et al. (2012), is evaluated using the QCAPro tool version 1.0. All structures are examined at three different tunneling energy levels (0.5 Ek, 1 Ek, and 1.5 Ek) at a temperature of 2 K, and the power analysis in Table 6 is performed by estimating the average switching and leakage energy dissipations over all vector pairs. According to the results in this table, the proposed QCA structures of edge-sensitive D-FFs consume the lowest switching and leakage energies compared with the previous structures. The proposed QCA D-FFs can be considered low-power and low-complexity structures for implementing sequential circuits because they exclude wire crossings, use fewer QCA cells, adopt a well-organized arrangement of QCA gates, and apply the QCA design rules. Furthermore, the power of the proposed QCA CMU structures is analyzed, and the estimated values are summarized in Table 7. The results in Tables 5 and 7 indicate that the proposed QCA CMU structures have low complexity and low power dissipation. Consequently, the proposed CMU is a strong candidate building block for the next generation of microwave and RF integrated circuits.

Table 6 Power analysis of proposed QCA D-FF structures
Table 7 Power analysis of proposed QCA CMU structures

Conclusions

In this paper, innovative and robust designs of the edge-triggered D-type flip-flop (D-FF) and the clock management unit (CMU) have been introduced and analyzed, namely rising-, falling-, and dual-edge-triggered D-FFs as well as single-, two-, and three-stage CMUs. The proposed designs are efficient QCA structures in terms of complexity and power; simulations have been carried out using the QCADesigner and QCAPro tools to study performance and power, respectively. The proposed structures provide an opportunity to significantly improve performance, especially when designing larger power-efficient, high-speed QCA systems. To the best of the author’s knowledge, this paper is the first to demonstrate the use of QCA in the design of CMU structures.