1 Introduction

Reversible logic is a promising technology that has applications in emerging technologies such as quantum computing, optical computing, molecular computing, DNA computing, etc. [31]. Reversible logic makes it possible for no information loss during computations. Reversible logic circuits have one-to-one mapping between the inputs and the outputs and generate a unique output from each input. Researchers have proven that in irreversible logic computations, each 1 bit of information lost costs \(KT\ln {2}\) Joules of energy dissipation [14]. Bennett proved that this \(KT\ln {2}\) energy dissipation will not appear in reversible logic circuits [2]. Conservative logic is a logic family which exhibits the property that there are the same number of 1s in the outputs as there are in the inputs. Conservative logic is called reversible conservative logic when there is a one-to-one mapping between the input and the output vectors along with the property that there are an equal number of 1s in both the outputs and the inputs.

The barrel shifter is one of the main computing systems having applications in high-speed digital signal processing, floating-point arithmetic, field programmable gate arrays (FPGAs), and central processing units (CPUs). Barrel shifters can shift and rotate multiple bits in a single clock cycle. In the existing literature, researchers have proposed several designs of barrel shifters. The limitations of existing designs of reversible barrel shifters are that they have a large number of ancilla inputs and garbage outputs [12, 13, 18]. An ancilla input is the constant input in the reversible circuit. A garbage output is an output which exists in the circuit just to maintain one-to-one mapping and is neither a primary nor a useful output. Reversible circuits generate several bits of garbage output to maintain the reversibility. The increasing number of garbage outputs have several penalties such as increasing circuit size, extra interconnections, more heat generation, and increasing the number of I/O pins. Another motivation for this work is the implementation of reversible logic in emerging technologies is still in its infancy [24]. That is why there must be a major emphasis on reducing the complexity of the proposed reversible logic circuits. In the existing literature, the reversible barrel shifter was entirely constructed from Fredkin gates and Feynman gates. The use of those gates resulted in an enormous number of ancilla inputs and garbage outputs [12, 13, 18].

In this work, we proposed reversible designs of barrel shifters based on a new super conservative reversible logic gate (SCRL gate). The SCRL gate has 1 control input depending on the value of which it can swap any two \(n-1\) data inputs. This swapping property of the SCRL gate is useful in designing the reversible barrel shifters that are optimized in terms of the number of garbage outputs and the number of ancilla inputs compared to the Fredkin gate designs. We have developed five design methodologies for reversible barrel shifters: (1) right rotator, (2) logical right shifter, (3) arithmetic right shifter, (4) universal right shifter, and (5) universal bidirectional shifter. The proposed designs of reversible barrel shifters are compared with the existing works in the literature and have shown an improvement ranging from 8.57 to 91.62 % in terms of the number of ancilla inputs and from 17.72 to 91.62 % in terms of the number of garbage outputs. Furthermore, spintronics-based nanomagnetic logic (NML) [36] is one of the emerging nanotechnologies in which it is possible to implement reversible logic gates. NML makes it possible to achieve circuit densities beyond the limits of existing CMOS technology. The SCRL gate is mapped in logic devices that can be implemented in NML computing. It is illustrated that the SCRL gate-based designs of reversible barrel shifters have less NML cost (cost in terms of the number of inverters and majority voters) compared to the Fredkin gate-based designs of reversible barrel shifters.

The paper is organized as follows: Sect. 2 presents the basics of NML computing; Sect. 3 presents the SCRL gate; Sect. 4 presents the proposed designs for each type of barrel shifter; Sect. 5 provides a comparison of the proposed designs to existing designs; Sect. 6 presents the related work toward the proposed designs; in Sect. 7 we discuss the results of the comparisons to other designs.

2 Background

In this work, the reversible logic gates are implemented in logic devices based on NML computing. For that reason we are providing the introductory material on NML computing. NML computing consists of nanoscale magnetic cells [35]. These nanomagnets have two possible polarizations, down and up, which are used to correspond to logic ’0’ (down) and logic ’1’ (up). Figure 1a shows the possible logic states in NML computing. NML computing uses a three phase clocking system to transmit information through the cells. The main gate in NML is the majority voter which is shown in Fig. 1b. The majority voter is represented as \(\mathrm{Output}=AB+BC+AC\) where A, B, and C are the inputs. More details on NML computing can be found in [21, 3436].

Fig. 1
figure 1

Basics of NML computing. a Logic states in NML computing. b Majority voter

3 Applications in supercomputing

The argument is made in [5] that power consumption is a major limiting factor in increasing the computational capacity of supercomputers. The use of reversible logic has the potential to overcome these limitations due to its low-power nature. These new supercomputers could greatly exceed the capabilities of existing supercomputers. This would allow for the construction of scientific models that are currently impossible using existing technology. The specific example given by [5] is completely simulating the Earth’s climate for the purposes of researching global warming. Similar simulations could be constructed for other non-solid portions of the earth.

4 Super conservative reversible logic (SCRL) gate

In this work, we have proposed an \(n\times n\) reversible logic gate named the super conservative reversible logic gate (SCRL gate). The SCRL gate has n inputs that include one control bit input and \(n-1\) data inputs. An \(n\times n\) SCRL gate is shown in Fig. 2. Define the n inputs of the SCRL gate as \(c_{0}, a_{0}, a_{1}, a_{2}, \ldots , a_{n-2}\), where \(c_{0}\) is the control bit input and \(a_{0}\) to \(a_{n-2}\) are the data inputs. The SCRL gate outputs are defined as \(o_{0}, o_{1}, o_{2}, \ldots , o_{n-1}\). The output functions can be expressed as:

$$\begin{aligned} o_{0}= & {} c_{0}\end{aligned}$$
(1)
$$\begin{aligned} o_{i}= & {} {\overline{c}}_{0}a_{i}+c_{0}a_{j}\end{aligned}$$
(2)
$$\begin{aligned} o_{n-1}= & {} {\overline{c}}_{0}a_{n-2}+c_{0}a_{0} \end{aligned}$$
(3)

\(a_{i}\) is the ith input, where \(i\le n-3\) and \(a_{j}\) is the jth input, where \(j=i+1\). The outputs depend on the control input \(c_{0}\). When \(c_{0}=0\), the outputs are \(c_{0}, a_{0}, a_{1}, a_{2}, \ldots , a_{n-2}\). When \(c_{0}=1\), the outputs are \(c_{0}, a_{1}, a_{2}, \ldots , a_{n-2}, a_{0}\). However, the output functions are not unique as there can be more than one version of the SCRL gate. The diversity of the possible output functions of the SCRL gate allows it to implement different functions in reversible computing. Under the special condition of when \(n=3\), the \(3\times 3\) SCRL gate functions as a Fredkin gate (Fig. 3). The NML design of an \(n\times n\) SCRL gate is shown in Fig. 4.

Fig. 2
figure 2

\(n\times n\) SCRL gate

Fig. 3
figure 3

Fredkin gate

Fig. 4
figure 4

NML design of an \(n\times n\) SCRL gate

4.1 Motivation

The SCRL gate makes it possible to rearrange input bits into a predetermined pattern without needing to use any ancilla inputs and garbage outputs. There are certain SCRL gate layouts that can be implemented using Fredkin gates. For example, consider there are inputs \(c_{0}, a_{0}, a_{1}, a_{2}, a_{3}\) where \(c_{0}\) is the control bit and \(a_{0} ,a_{1}, a_{2}, a_{3}\) are the data inputs. The outputs are defined as follows:

$$\begin{aligned} o_{0}= & {} c_{0}\end{aligned}$$
(4)
$$\begin{aligned} o_{1}= & {} {\overline{c}}_{0}a_{0}+c_{0}a_{1}\end{aligned}$$
(5)
$$\begin{aligned} o_{2}= & {} {\overline{c}}_{0}a_{1}+c_{0}a_{0}\end{aligned}$$
(6)
$$\begin{aligned} o_{3}= & {} {\overline{c}}_{0}a_{2}+c_{0}a_{3}\end{aligned}$$
(7)
$$\begin{aligned} o_{4}= & {} {\overline{c}}_{0}a_{3}+c_{0}a_{2} \end{aligned}$$
(8)
Fig. 5
figure 5

Equivalent SCRL gate and Fredkin gate-based implementations. a \(5\times 5\) SCRL gate. b Fredkin gate implementation

The described \(5\times 5\) SCRL gate can be implemented with Fredkin gates without introducing any ancilla inputs and garbage outputs. Figure 5a contains the described SCRL gate and Fig. 5b contains the equivalent implementation using Fredkin gates. Now consider if the outputs were instead defined as:

$$\begin{aligned} o_{0}= & {} c_{0}\end{aligned}$$
(9)
$$\begin{aligned} o_{1}= & {} {\overline{c}}_{0}a_{0}+c_{0}a_{1}\end{aligned}$$
(10)
$$\begin{aligned} o_{2}= & {} {\overline{c}}_{0}a_{1}+c_{0}a_{2}\end{aligned}$$
(11)
$$\begin{aligned} o_{3}= & {} {\overline{c}}_{0}a_{2}+c_{0}a_{3}\end{aligned}$$
(12)
$$\begin{aligned} o_{4}= & {} {\overline{c}}_{0}a_{3}+c_{0}a_{0} \end{aligned}$$
(13)

The described behavior can still be implemented using only one \(5\times 5\) SCRL gate. However, it can no longer be implemented using only two Fredkin gates. The implementation using Fredkin gates would introduce garbage outputs and ancilla inputs and be more complex than using a single \(5\times 5\) SCRL gate.

Another strength of the SCRL gate is it can just as efficiently perform swaps with an odd number of data inputs as it can with an even number of data inputs. For example, consider there are inputs \(c_{0}, a_{0}, a_{1}, a_{2}\) where \(c_{0}\) is the control bit and \(a_{0}, a_{1}, a_{2}\) are the data inputs. The outputs are defined as follows:

$$\begin{aligned} o_{0}= & {} c_{0}\end{aligned}$$
(14)
$$\begin{aligned} o_{1}= & {} {\overline{c}}_{0}a_{0}+c_{0}a_{1}\end{aligned}$$
(15)
$$\begin{aligned} o_{2}= & {} {\overline{c}}_{0}a_{1}+c_{0}a_{2}\end{aligned}$$
(16)
$$\begin{aligned} o_{3}= & {} {\overline{c}}_{0}a_{2}+c_{0}a_{0} \end{aligned}$$
(17)
Fig. 6
figure 6

\(4\times 4\) SCRL gate

Fig. 7
figure 7

(nk) barrel shifter block diagram

The output equations can be implemented by a single \(4\times 4\) SCRL gate as shown in Fig. 6. The implementation using Fredkin gates would be much more complex and requires the use of ancilla inputs and garbage outputs.

5 Barrel shifter designs

A barrel shifter is a functional logic circuit with n inputs and n outputs [20]. The function of a barrel shifter is to shift the input information in a certain order to the left or to the right for up to \(n-1\) bits. A well-designed barrel shifter is able to implement bidirectional information shifting. Researchers have proposed various designs of barrel shifters. The logarithmic barrel shifter is one of the most popular designs since it saves area for the circuit and it is simple to implement. As shown in Fig. 7, a barrel shifter with q stages and n inputs is called an (nq) barrel shifter. The pth stage can shift the information for \(2^{q-p}\) bits where \(q=\log _2{n}\). A control signal is used for either executing or shutting down the shifting mode. The shifting mode can be executed with the control signal set to logic 1 and can be shut down with the control signal set to logic 0. Hence, the information can be shifted from 0 bits (all control signals set to logic 0) to \(n-1\) bits (all control signals set to logic 1). Logical right shift, logical left shift, right rotation, left rotation, arithmetic left shift and arithmetic right shift are the different functions that can be implemented using a barrel shifter. Suppose there is an (8, 3) barrel shifter with the inputs defined as \(a_{7}, a_{6}, a_{5}, a_{4}, a_{3}, a_{2}, a_{1}, a_{0}\), the six functions performing a 3 bit operation are shown in Table 1 [13].

Table 1 Operation functions and outputs

5.1 Design methodology of right rotator and implementation in SCRL gates

In the existing literature [12, 13, 23], barrel shifters are designed using the Fredkin gate and the Feynman gate. The use of the Fredkin gates and the Feynman gates produces a considerable NML cost, number of garbage outputs, and number of ancilla inputs. Since the SCRL gate can swap any of its two data inputs, the proposed reversible barrel shifters including the reversible rotator has less NML cost, garbage outputs, and ancilla inputs. Hence, the SCRL gate is an ideal platform to design a barrel shifter.

The function of a logical right rotator is to right rotate the bits depending on the control signals. Table 1 shows an example of an (8, 3) reversible right rotator. For an (nq) reversible right rotator, suppose the inputs are defined as \(a_{n-1}, a_{n-2}, \ldots , a_{2}, a_{1}, a_{0}\) and the outputs are defined as \(O_{q,n-1}, O_{q,n-2}, \ldots , O_{q,2}, O_{q,1}, O_{q,0}\). The rotator can be designed with q stages, where \(q=\log _2{n}\). For the pth stage, with a control signal \(b_{q-p}\), it can perform either no rotation (with control signal set to 0) or a \(2^{q-p}\) bit rotation (with control signal set to 1). Correspondingly, the pth stage will have the outputs as \(O_{p,m}\), where \(m=0, 1, 2, \ldots , n-1\). The methodology that can implement the reversible right rotator is as follows:

  1. 1.

    Stage 1

    For the first stage of an (nq) right rotator, the inputs are \(a_{n-1}\), \(a_{n-2}, \ldots , a_{2}\), \(a_{1}\), \(a_{0}\), and the outputs have the equation of:

    $$\begin{aligned} O_{1,m}= {\left\{ \begin{array}{ll} {\overline{b}}_{(q-1)}a_{m}+b_{(q-1)}a_{(m+2^{q-1}-n)}&{}\quad \text{ if }\quad m\ge n-2^{q-1}\\ {\overline{b}}_{(q-1)}a_{m}+b_{(q-1)}a_{(m+2^{q-1})}&{}\quad \text{ if }\quad m<n-2^{q-1} \end{array}\right. } \end{aligned}$$
    (18)

    where \(b_{q-1}\) is the control signal of the first stage, and m is the mth input or output in this stage. If \(b_{q-1}=0\), the outputs of the first stage will remain the same as the inputs:

    $$\begin{aligned} O_{1,m}=a_{m} \end{aligned}$$
    (19)

    else if \(b_{q-1}=1\), the outputs of the first stage will rotate for \(2^{q-1}\) bits to the right. That is:

    $$\begin{aligned} O_{1,m}= {\left\{ \begin{array}{ll} a_{m+2^{q-1}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-1}\\ a_{m+2^{q-1}}&{}\quad \text{ if }\quad m<n-2^{q-1} \end{array}\right. } \end{aligned}$$
    (20)
  2. 2.

    Stage p (\(1\le p\le q\))

    For the pth stage of an (nq) right rotator, the inputs are the outputs of the \((p-1)\)th stage, \(O_{p-1,n-1}, O_{p-1,n-2}, \ldots , O_{p-1,2}, O_{p-1,1}, O_{p-1,0}\). The outputs of the pth stage are:

    $$\begin{aligned} O_{p,m}= {\left\{ \begin{array}{ll} {\overline{b}}_{q-p}O_{p-1,m}+b_{q-p}O_{p-1,m+2^{q-p}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ {\overline{b}}_{q-p}O_{p-1,m}+b_{q-p}O_{p-1,m+2^{q-p}}&{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (21)

    where \(b_{q-p}\) is the control signal of the pth stage, and m is the mth input or output in this stage. If \(b_{q-p}=0\), the outputs of the first stage will remain the same as the inputs:

    $$\begin{aligned} O_{p,m}=O_{p-1,m} \end{aligned}$$
    (22)

    else if \(b_{q-p}=1\), the outputs of the first stage will rotate \(2^{q-p}\) bits to the right. The outputs are:

    $$\begin{aligned} O_{p,m}= {\left\{ \begin{array}{ll} O_{p-1,m+2^{q-p}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ O_{p-1,m-2^{q-p}}&{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (23)

In the qth stage (the last stage), the final outputs \(O_{q,n-1}\), \(O_{q,n-2}, \ldots , O_{q,2}\), \(O_{q,1}\), \(O_{q,0}\) can be rotated from 0 to \(n-1\) bits. Using the SCRL gate, each stage of the (nq) reversible right rotator needs one \((n+1)\times (n+1)\) SCRL gate. Thus, in total, the number of \((n+1)\times (n+1)\) SCRL gates is q.

Fig. 8
figure 8

Reversible (8, 3) SCRL right rotator

Throughout the design process for each of the six functions in the reversible barrel shifter, an (8, 3) logic circuit is first designed then the design is expanded to create an (nq) logic circuit. Figure 8 is an example of an (8, 3) reversible right rotator in \(9\times 9\) SCRL gates. An (8, 3) right rotator has three rotation stages. The first stage will either right rotate the inputs \(a_{7}, a_{6}, a_{5}, a_{4}, a_{3}, a_{2}, a_{1}, a_{0}\) for 4 bits with the control signal \(b_{2}=1\) or keep them unchanged with \(b_{2}=0\). For convenience, the outputs from the first stage, which are also the inputs to the second stage, are defined as \(k_{7}, k_{6}, k_{5}, k_{4}, k_{3}, k_{2}, k_{1}, k_{0}\). Stage 2 will either rotate its inputs for 2 bits with \(b_{1}=1\) or keep them unchanged with \(b_{1}=0\). The outputs from stage 2, which are also the inputs of the third stage, are defined as \(j_{7}, j_{6}, j_{5}, j_{4}, j_{3}, j_{2}, j_{1}, j_{0}\). Stage 3 will either rotate its inputs for 1 bit with \(b_{0}=1\) or keep them unchanged with \(b_{0}=0\). The final outputs are defined as \(O_{7}, O_{6}, O_{5}, O_{4}, O_{3}, O_{2}, O_{1}, O_{0}\). To implement the rotation in SCRL gates, the following equations are defined for each stage:

$$\begin{aligned} k_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{2}a_{i}+b_{2}a_{i-4}&{}\quad \text{ if }\quad i\ge 4\\ {\overline{b}}_{2}a_{i}+b_{2}a_{i+4}&{}\quad \text{ if }\quad i<4 \end{array}\right. }\end{aligned}$$
(24)
$$\begin{aligned} j_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{1}a_{i}+b_{1}a_{i-6}&{}\quad \text{ if }\quad i\ge 6\\ {\overline{b}}_{1}a_{i}+b_{1}a_{i+2}&{}\quad \text{ if }\quad i<6 \end{array}\right. }\end{aligned}$$
(25)
$$\begin{aligned} O_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{0}a_{i}+b_{0}a_{i-7}&{}\quad \text{ if }\quad i\ge 7\\ {\overline{b}}_{0}a_{i}+b_{0}a_{i+1}&{}\quad \text{ if }\quad i<7 \end{array}\right. } \end{aligned}$$
(26)

In this way, a reversible (8, 3) right rotator is designed using three \(9\times 9\) SCRL gates. Table 2 shows all possible rotating bits and the corresponding states of the 3 control signals. The serial binary number of the control signals corresponds to the decimal number of places the bits are rotated. Compared to a previous design with Fredkin gates and Feynman gates [12, 13, 23], this design with SCRL gates uses fewer NML gates which considerably minimizes the cost. The SCRL design also has no garbage outputs which extremely improves the efficiency.

Table 2 Rotating bits and corresponding states of control signals

5.2 Design methodology of logical right shifter and implementation in SCRL gates

A logical right shifter has the function of shifting the inputs to the right for n bits and sets the leftmost n bits to 0. Table 1 shows an example of a 3 bit logical right shift, which changes the inputs \(a_{7}, a_{6}, a_{5}, a_{4}, a_{3}, a_{2}, a_{1}, a_{0}\) into the outputs \(0, 0, 0, a_{7}, a_{6}, a_{5}, a_{4}, a_{3}\). Suppose an (nq) reversible logical right shifter with the inputs defined as \(a_{n-1}, a_{n-2}, \ldots , a_{2}, a_{1}, a_{0}\) is to be designed in SCRL gates. It requires q stages of shifting, where \(q=\log _2{n}\). Each one of the q shifting stages is operated by control signal \(b_{q-p}\), where p is the pth stage, \(p=1, 2, \ldots , q\). Thus, with \(b_{q-p}\) set to 0, the shifting mode is off so that the pth shifting stage will not shift the input data. With the control signal \(b_{q-p}\) set to 1, the right shifting mode is active and now the pth shifting stage will do a \(2^{q-p}\) bit right shift. An irreversible design of logical right shifter [31] can be constructed using multiplexers. Some previous reversible designs involve using Fredkin gates as a replacement for the multiplexers and Feynman gates as reversible data copiers. This will produce a large number of garbage outputs. For each stage, there are n garbage outputs. From another point of view, the two-bits-in-a-group way of switching data requires much more time to pass the control signals. Hence, as the number of inputs grows, the operation delay will increase rapidly. Using SCRL gates reduces the total number of gates, the number of garbage outputs, and minimizes the operation delay.

Fig. 9
figure 9

Selection unit of an (8, 3) logical right shifter

The difference between a logical right shifter and a right rotator is that a logical right shifter will set the leftmost shifted bits to 0 while a right rotator replaces them with the rightmost bits. As a result, it requires several selectors to select an input from 0 and any other original inputs. For example, the pth stage requires a total number of \(2^{q-p}\) selectors to satisfy a \(2^{q-p}\) bits right shift. In this design, Fredkin gates are used to achieve the selection function so that the design will remain reversible. Figure 9 is an example of an (8, 3) selection unit of a logical right shifter.

  1. 1.

    Stage 1

    The first stage has the inputs of a selected n bits data from \(0, a_{n-1}, a_{n-2}, \ldots , a_{2}, a_{1}, a_{0}\), and a control signal \(b_{q-1}\). Since it is a logical right shifter, the first stage will either perform a shift of \(2^{q-1}\) bits or no bits. No matter what state the control signal \(b_{q-1}\) is set to, the outputs will always keep the inputs from \(a_{n-1}\) to \(a_{n-2^{q-1}}\). Therefore, the other half of the inputs will be a selection from \(0, a_{n-1-2^{q-1}}, \ldots , a_{2}, a_{1}, a_{0}\). So the first stage requires a total number of \(2^{q-1}\) Fredkin gates and 1 \((n+1) \times (n+1)\) SCRL gate. Define the final outputs as \(O_{q,n-1}, O_{q,n-2}, \ldots , O_{q,2}, O_{q,1}, O_{q,0}\). Correspondingly, the pth stage will have the outputs expressed by \(O_{p,m}\), where m is the mth output of this stage. Thus, for stage 1, define the selected inputs as \(s_{n-1-2^{q-1}}, \ldots , s_{2}, s_{1}, s_{0}\). They have the following equation:

    $$\begin{aligned} s_{m}={\overline{b}}_{q-1}a_{m}\quad \text{ for }\quad 0\le m\le n-1-2^{q-1} \end{aligned}$$
    (27)

    where \(b_{q-1}\) is the control bit for all of the Fredkin gates. When \(b_{q-1}\) is set to 0, \(s_{m}=a_{m}\), the actual inputs will remain the same as the primary inputs \(a_{n-1}, a_{n-2}, \ldots , a_{2}, a_{1}, a_{0}\). If \(b_{q-1}\) is set to 1, then \(s_{m}=0\), it allows the proposed logical right shifter design to set the leftmost shifted bits to 0. Then the actual inputs are \(s_{n-1-2^{q-1}}, \ldots , s_{2}, s_{1}, s_{0}, a_{n-1}, a_{n-2}, \ldots , a_{n-2^{q-1}}\). The outputs have the following equation:

    $$\begin{aligned} O_{1,m}= {\left\{ \begin{array}{ll} {\overline{b}}_{q-1}a_{m}+b_{q-1}s_{m+2^{q-1}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-1}\\ {\overline{b}}_{q-1}s_{m}+b_{q-1}a_{m+2^{q-1}}&{}\quad \text{ if }\quad m<n-2^{q-1} \end{array}\right. } \end{aligned}$$
    (28)

    where \(b_{q-1}\) is the control signal of the first stage, and m is the mth input or output in this stage. If \(b_{q-1}=0\), the outputs of the first stage will remain the same as the inputs:

    $$\begin{aligned} O_{1,m}={\left\{ \begin{array}{ll} a_{m} &{}\quad \text{ if }\quad m\ge n-2^{q-1}\\ s_{m} &{}\quad \text{ if }\quad m<n-2^{q-1} \end{array}\right. } \end{aligned}$$
    (29)

    else if \(b_{q-1}=1\), the outputs of the first stage will rotate for \(2^{q-1}\) bits to the right. That is:

    $$\begin{aligned} O_{1,m}= {\left\{ \begin{array}{ll} s_{m+2^{q-1}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-1}\\ a_{m+2^{q-1}}&{}\quad \text{ if }\quad m<n-2^{q-1} \end{array}\right. } \end{aligned}$$
    (30)
  2. 2.

    Stage p (\(1\le p\le q\))

    At the pth stage of an SCRL logical right shifter, the primary inputs are \(O_{p-1,n-1}\), \(O_{p-1,n-2}, \ldots , O_{p-1,2}, O_{p-1,1}, O_{p-1,0}\), which are from the previous stage. The pth stage requires \(2^{q-p}\) Fredkin gates as selectors to allow the stage to select the actual inputs from 0 and the rightmost \(2^{q-p}\) primary inputs, \(O_{p-1,2^{q-p}-1}, O_{p-1,2^{q-p}-2}, \ldots , O_{p-1,2}, O_{p-1,1}, O_{p-1,0}\). Now define the selected inputs as \(s_{2^{q-p}-1}, s_{2^{q-p}-2}, \ldots , s_{2}, s_{1}, s_{0}\). The selected inputs have the following equation:

    $$\begin{aligned} s_{m}={\overline{b}}_{q-p}O_{p-1,m}\quad \text{ for }\quad 0\le m\le n-1-2^{q-p} \end{aligned}$$
    (31)

    When \(b_{q-p}\) is set to 0, \(s_{m}=O_{p-1,m}\), the actual inputs will remain the same as the primary inputs \(O_{p-1,n-1}, O_{p-1,n-2}, \ldots , O_{p-1,2}, O_{p-1,1}, O_{p-1,0}\). If \(b_{q-p}\) is set to 1, then \(s_{m}=0\) and this SCRL design of a logical right shifter can set the leftmost shifted bits to 0. Then the actual inputs are \(s_{n-1-2^{q-p}}, \ldots , s_{2}, s_{1}, s_{0}\), \(O_{p-1,n-1}, O_{p-1,n-2}, \ldots , O_{p-1,n-2^{q-p}}\). The output equations are as follows:

    $$\begin{aligned} O_{p,m}= {\left\{ \begin{array}{ll} {\overline{b}}_{q-p}O_{p-1,m}+b_{q-p}s_{m+2^{q-p}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ {\overline{b}}_{q-p}s_{m}+b_{q-p}O_{p-1,m+2^{q-p}}&{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (32)

    where \(b_{q-p}\) is the control signal of the pth stage, and m is the mth input or output in this stage. If \(b_{q-p}=0\), the outputs of the pth stage will be the same as the inputs:

    $$\begin{aligned} O_{p,m}= {\left\{ \begin{array}{ll} O_{p-1,m} &{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ s_{m} &{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (33)

    else if \(b_{q-p}=1\), the outputs of the pth stage will rotate for \(2^{q-p}\) bits to the right. That is:

    $$\begin{aligned} O_{p,m}={\left\{ \begin{array}{ll} s_{m+2^{q-p}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ O_{p-1,m+2^{q-p}}&{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (34)

Therefore, the primary inputs to the logical right shifter can be right shifted from 0 to \(n-1\) bits according to the different settings of the control bits \(b_{m}\). Consider the different combinations of control bits to be binary numbers. The corresponding decimal numbers are the number of right bit shifts from 0 to \(n-1\). The SCRL design of an (nq) logical right shifter requires a total number of q SCRL gates and \(n-1\) Fredkin gates. An (8, 3) SCRL logical right shifter was first designed and then derived into an (nq) SCRL logical right shifter. As it is shown in Fig. 10, an (8, 3) logical right shifter has 3 shifting stages with the primary inputs defined as \(a_{7}, a_{6}, a_{5}, a_{4}, a_{3}, a_{2}, a_{1}, a_{0}\). The corresponding final outputs are defined as \(O_{7}, O_{6}, O_{5}, O_{4}, O_{3}, O_{2}, O_{1}, O_{0}\). To simplify the design, define the outputs for stage 1 as \(k_{7}, k_{6}, k_{5}, k_{4}, k_{3}, k_{2}, k_{1}, k_{0}\) and define the outputs for stage 2 as \(j_{7}, j_{6}, j_{5}, j_{4}, j_{3}, j_{2}, j_{1}, j_{0}\). And to separate the selected inputs from each stage, define the selected inputs for stage 1 as \(s_{3}, s_{2}, s_{1}, s_{0}\), the selected inputs for stage 2 as \(x_{1}, x_{0}\), and the selected inputs for stage 3 as \(y_{0}\). The selected inputs are determined by the following equations:

$$\begin{aligned} s_{m}= & {} {\overline{b}}_{2}a_{m}\quad \text{ for }\quad 0\le m\le 3 \end{aligned}$$
(35)
$$\begin{aligned} x_{m}= & {} {\overline{b}}_{1}k_{m}\quad \text{ for }\quad 0\le m\le 1 \end{aligned}$$
(36)
$$\begin{aligned} y_{m}= & {} {\overline{b}}_{0}j_{m}\quad \text{ for }\quad m=0 \end{aligned}$$
(37)
Fig. 10
figure 10

(8, 3) SCRL logical right shifter (FR Frdkin gate, G garbage outputs)

The first stage will either right shift the actual inputs \(a_{7}, a_{6}, a_{5}, s_{4}, s_{3}, s_{2}, s_{1}, s_{0}\) by 4 bits when the control signal \(b_{2}=1\) or keep them unchanged when \(b_{2}=0\). Stage 2 will either right shift its inputs by 2 bits when \(b_{1}=1\) or keep them unchanged when \(b_{1}=0\). Stage 3 will either right shift its inputs by 1 bit when \(b_{0}=1\) or keep them unchanged when \(b_{0}=0\). The outputs for each stage are shown in the following equations:

$$\begin{aligned} k_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{2}a_{i}+b_{2}s_{i-4}&{}\quad \text{ if }\quad i\ge 4\\ {\overline{b}}_{2}s_{i}+b_{2}a_{i+4}&{}\quad \text{ if }\quad i<4 \end{array}\right. }\end{aligned}$$
(38)
$$\begin{aligned} j_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{1}a_{i}+b_{1}x_{i-6}&{}\quad \text{ if }\quad i\ge 6\\ {\overline{b}}_{1}x_{i}+b_{1}a_{i+2}&{}\quad \text{ if }\quad i<6 \end{array}\right. }\end{aligned}$$
(39)
$$\begin{aligned} O_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{0}a_{i}+b_{0}y_{i-7}&{}\quad \text{ if }\quad i\ge 7\\ {\overline{b}}_{0}y_{i}+b_{0}a_{i+1}&{}\quad \text{ if }\quad i<7 \end{array}\right. } \end{aligned}$$
(40)

In this way, a reversible (8, 3) logical right shifter is designed using 3 SCRL gates and 7 Fredkin gates. Since only 1 out of 2 outputs of each Fredkin gates is in use, there are a total of 7 garbage outputs in this design, which is much fewer than the previous reversible designs [12, 13, 18]. In addition, the operation delay is optimized since this design uses fewer Fredkin gates. Table 3 shows the possible right shift operations and the corresponding states of the control signal.

Table 3 Shifting bits and corresponding states of control signals

5.3 Design methodology of arithmetic right shifter and implementation in SCRL gates

An arithmetic right shifter shifts the inputs to the right by n bits while always keeping the leftmost n bits the same as the sign bit \(a_{n-1}\). Table 1 shows an example of a 3 bit arithmetic right shift. It changes the inputs \(a_{7}, a_{6}, a_{5}, a_{4}, a_{3}, a_{2}, a_{1}, a_{0}\) into the outputs \(a_{7}, a_{7}, a_{7}, a_{7}, a_{6}, a_{5}, a_{4}, a_{3}\). To design an (nq) reversible arithmetic right shifter in SCRL gates with the initial inputs defined as \(a_{n-1}, a_{n-2}, \ldots , a_{2}, a_{1}, a_{0}\), there must be q stages of shifting operations, where \(q=\log _2{n}\). Each one of the q shifting stages is operated by a control signal \(b_{q-p}\), where p is the pth stage, \(p=1, 2, \ldots , q\). When \(b_{q-p}\) is set to 0, the shifting mode is off so that the corresponding pth shifting stage will not shift the input data. When the control signal \(b_{q-p}\) is set to 1, a \(2^{q-p}\) bit right shift will be performed by the pth shifting stage. An irreversible design of an arithmetic right shifter [20] can be implemented using multiplexers. The arithmetic right shifter is slightly different from the logical right shifter. The arithmetic right shifter always keeps the leftmost shifted bits the same as the sign bit \(a_{n-1}\) while the logical right shifter always sets the leftmost shifted bits to 0. Therefore, the SCRL design of the arithmetic right shifter will be similar to the design of the logical right shifter in that Fredkin gates will be used to choose certain inputs to the SCRL gates. The main difference is the Fredkin gates will select between the sign bit \(a_{n-1}\) and the remaining initial inputs. Another difference is that in the design of the arithmetic right shifter, there will be a certain number of Feynman gates used in each stage to copy the sign bit \(a_{n-1}\). Since Feynman gates do not have any garbage outputs, the number of garbage outputs is the same as it was for the design of the reversible logical right shifter. Figure 11 shows the operation of the copy and selection of the first stage in an (8, 3) arithmetic right shifter. When the second input of a Feynman gate is set to 0, the Feynman gate performs as a copier such that the two outputs are the same as the first input. The copy-selection unit shown in Fig. 11 uses the Feynman gates to create copies of the sign bit and then pass them to the Fredkin gates. The rightmost output of the Feynman gate is not used by the current stage, but will be used in a future stage. The selection part of the arithmetic right shifter will either pass to the shifting operation unit the sign bit (when the control signal is set to 1) or the initial inputs (when the control bit is set to 0).

Fig. 11
figure 11

Example of copy-selection unit (FE Feynman gate, FR Fredkin gate, G garbage outputs)

The previous reversible design in [13] only uses Fredkin gates and Feynman gates. That design will produce n garbage outputs in each stage. The design proposed in this work offers a great improvement over the previous design using fewer Fredkin gates and Feynman gates. The use of fewer gates reduces the number of garbage outputs and minimizes the operation delay.

  1. 1.

    Stage 1

    Define the initial inputs of the first stage of an arithmetic right shifter as \(a_{n-1}, a_{n-2}, \ldots , a_{2}, a_{1}, a_{0}\) with a control signal \(b_{q-1}\). Since there are \(2^{q-1}\) selected inputs in the first stage, define the selected inputs as \(s_{n-1-2^{q-1}}, \ldots , s_{2}, s_{1}, s_{0}\). A total of \(2^{q-1}\) Fredkin gates are required to select \(2^{q-1}\) bits of input and \(2^{q-1}\) Feynman gates are required to make copies of the sign bit \(a_{n-1}\). If the final outputs are defined as \(O_{q,n-1}, O_{q,n-2}, \ldots , O_{q,2}, O_{q,1}, O_{q,0}\), then the pth stage will have its outputs defined by \(O_{p,m}\), where \(0\le m\le n-1\). For \(b_{q-1}=0\), the selection unit will select the serial inputs \(a_{n-1-2^{q-1}}, \ldots , a_{2}, a_{1}, a_{0}\) and no shifting will occur because the shifting unit will not be active. For \(b_{q-1}=1\), the selection unit will select the serial inputs \(s_{n-1-2^{q-1}}, \ldots , s_{2}, s_{1}, s_{0}\) and the shifting unit will perform a \(2^{q-1}\) bit right shift. In this way, the arithmetic right shifter performs an arithmetic right shift. The selection inputs have the following equation:

    $$\begin{aligned} s_{m}={\overline{b}}_{q-1}a_{m}+{b}_{q-1}a_{n-1}\quad \text{ for }\quad 0\le m\le n-1-2^{q-1} \end{aligned}$$
    (41)

    When \(b_{q-1}\) is set to 0, \(s_{m}=a_{m}\), the actual inputs will remain unchanged as the primary inputs \(a_{n-1}, a_{n-2}, \ldots , a_{2}, a_{1}\). If \(b_{q-1}\) is set to 1, \(s_{m}=a_{n-1}\), then the leftmost shifted bits will be set as the sign bit \(a_{n-1}\). The actual inputs are \(s_{n-1-2^{q-1}}, \ldots , s_{2}, s_{1}, s_{0}, a_{n-1}, a_{n-2}, \ldots , a_{n-2^{q-1}}\). The outputs have the following equation:

    $$\begin{aligned} O_{1,m}= {\left\{ \begin{array}{ll} {\overline{b}}_{q-1}a_{m}+b_{q-1}s_{m+2^{q-1}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-1}\\ {\overline{b}}_{q-1}s_{m}+b_{q-1}a_{m+2^{q-1}}&{}\quad \text{ if }\quad m<n-2^{q-1} \end{array}\right. } \end{aligned}$$
    (42)

    If \(b_{q-1}=0\), the outputs of the first stage will remain the same as the inputs:

    $$\begin{aligned} O_{1,m}={\left\{ \begin{array}{ll} a_{m} &{}\quad \text{ if }\quad m\ge n-2^{q-1}\\ s_{m} &{}\quad \text{ if }\quad m<n-2^{q-1} \end{array}\right. } \end{aligned}$$
    (43)

    else if \(b_{q-1}=1\), then the outputs of the first stage will rotate \(2^{q-1}\) bits to the right. That means the output equations will be:

    $$\begin{aligned} O_{1,m}= {\left\{ \begin{array}{ll} s_{m+2^{q-1}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-1}\\ a_{m+2^{q-1}}&{}\quad \text{ if }\quad m<n-2^{q-1} \end{array}\right. } \end{aligned}$$
    (44)
  2. 2.

    Stage p (\(1\le p\le q\))

    At the pth stage of an SCRL logical right shifter, the initial inputs are the outputs from the previous stage \(O_{p-1,n-1}\), \(O_{p-1,n-2}, \ldots , O_{p-1,2}, O_{p-1,1}, O_{p-1,0}\). The pth stage requires a total of \(2^{q-p}\) Fredkin gates and \(2^{q-p}\) Feynman gates to select the actual inputs from the sign bit \(a_{n-1}\) and the rightmost \(2^{q-p}\) initial inputs, \(O_{p-1,2^{q-p}-1}, O_{p-1,2^{q-p}-2}, \ldots , O_{p-1,2}, O_{p-1,1}, O_{p-1,0}\). If the selected inputs are defined as \(s_{2^{q-p}-1}, s_{2^{q-p}-2}, \ldots , s_{2}, s_{1}, s_{0}\), then they have the following equation:

    $$\begin{aligned} s_{m}={\overline{b}}_{q-p}O_{p-1,m}+b_{q-p}a_{n-1}\quad \text{ for }\quad 0\le m\le n-1-2^{q-p} \end{aligned}$$
    (45)

    When \(b_{q-p}\) is set to 0, \(s_{m}=O_{p-1,m}\), the actual inputs will remain the same as the primary inputs \(O_{p-1,n-1}, O_{p-1,n-2}, \ldots , O_{p-1,2}, O_{p-1,1}\). If \(b_{q-p}\) is set to 1, then \(s_{m}=a_{n-1}\). Therefore, the leftmost shifted bits can be set to either the initial inputs or the sign bit \(a_{n-1}\). The actual inputs are \(s_{n-1-2^{q-p}}, \ldots , s_{2}, s_{1}, s_{0}\), \(O_{p-1,n-1}, O_{p-1,n-2}, \ldots , O_{p-1,n-2^{q-p}}\). The outputs can be expressed as the following equations:

    $$\begin{aligned} O_{p,m}= {\left\{ \begin{array}{ll} {\overline{b}}_{q-p}O_{p-1,m}+b_{q-p}s_{m+2^{q-p}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ {\overline{b}}_{q-p}s_{m}+b_{q-p}O_{p-1,m+2^{q-p}}&{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (46)

    If \(b_{q-p}=0\), the outputs of the pth stage will remain the same as the inputs:

    $$\begin{aligned} O_{p,m}= {\left\{ \begin{array}{ll} O_{p-1,m} &{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ s_{m} &{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (47)

    else if \(b_{q-p}=1\), then the outputs of the pth stage will right shift for \(2^{q-p}\) bits to the right. That is:

    $$\begin{aligned} O_{p,m}= {\left\{ \begin{array}{ll} s_{m+2^{q-p}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ O_{p-1,m+2^{q-p}}&{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (48)

The initial inputs to the arithmetic right shifter can be right shifted from 0 bits to \(n-1\) bits depending on the combination of control bits \(b_{m}\). As previously mentioned, the combination of control bits can be thought of as binary numbers and the corresponding decimal numbers represent the number of right shifts performed from 0 to \(n-1\). In the SCRL design of an (nq) arithmetic right shifter, a total of q SCRL gates, \(n-1\) Fredkin gates and \(n-1\) Feynman gates are required. A design of an (8, 3) SCRL-based arithmetic right shifter is shown in Fig. 12. It has 3 shifting stages with the initial inputs defined as \(a_{7}, a_{6}, a_{5}, a_{4}, a_{3}, a_{2}, a_{1}, a_{0}\). The corresponding final outputs are defined as \(O_{7}, O_{6}, O_{5}, O_{4}, O_{3}, O_{2}, O_{1}, O_{0}\). To simplify the expressions, define the outputs for the first stage as \(k_{7}, k_{6}, k_{5}, k_{4}, k_{3}, k_{2}, k_{1}, k_{0}\), and define the outputs for stage 2 as \(j_{7}, j_{6}, j_{5}, j_{4}, j_{3}, j_{2}, j_{1}, j_{0}\). Define the selected inputs for stage 1 as \(s_{3}, s_{2}, s_{1}, s_{0}\), the selected inputs for stage 2 as \(x_{1}, x_{0}\), and the selected inputs for stage 3 as \(y_{0}\). The selected inputs are determined by the following equations:

$$\begin{aligned} s_{m}= & {} {\overline{b}}_{2}a_{m}+b_{2}a_{7}\quad \text{ for }\quad 0\le m\le 3 \end{aligned}$$
(49)
$$\begin{aligned} x_{m}= & {} {\overline{b}}_{1}k_{m}+b_{1}a_{7}\quad \text{ for }\quad 0\le m\le 1 \end{aligned}$$
(50)
$$\begin{aligned} y_{m}= & {} {\overline{b}}_{0}j_{m}+b_{0}a_{7}\quad \text{ for }\quad m=0 \end{aligned}$$
(51)
Fig. 12
figure 12

(8, 3) SCRL arithmetic right shifter (FR Frdkin gate, FE Feynman gate, G garbage outputs)

The first stage shift unit will either right shift the actual inputs \(a_{7}, a_{6}, a_{5}, a_{4}, s_{3}, s_{2}, s_{1}, s_{0}\) by 4 bits when the control signal \(b_{2}=1\) while setting the leftmost 4 bits of outputs to be the same as the sign bit \(a_{7}\), or the bits will be unchanged when \(b_{2}=0\). The shift unit of stage 2 will either right shift its inputs by 2 bits when \(b_{1}=1\) while setting the left most 2 bits of outputs to be the same as the sign bit \(a_{7}\), or the bits will be unchanged when \(b_{1}=0\). The shift unit of stage 3 will either right shift its inputs by 1 bit when \(b_{0}=1\) while it keeps the leftmost 1 bit of output as same as the sign bit \(a_{7}\) or the bits will be unchanged with \(b_{0}=0\). For each stage, the outputs are defined by the following equations:

$$\begin{aligned} k_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{2}a_{i}+b_{2}s_{i-4}&{}\quad \text{ if }\quad i\ge 4\\ {\overline{b}}_{2}s_{i}+b_{2}a_{i+4}&{}\quad \text{ if }\quad i<4 \end{array}\right. }\end{aligned}$$
(52)
$$\begin{aligned} j_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{1}a_{i}+b_{1}x_{i-6}&{}\quad \text{ if }\quad i\ge 6\\ {\overline{b}}_{1}x_{i}+b_{1}a_{i+2}&{}\quad \text{ if }\quad i<6 \end{array}\right. }\end{aligned}$$
(53)
$$\begin{aligned} O_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{0}a_{i}+b_{0}y_{i-7}&{}\quad \text{ if }\quad i\ge 7\\ {\overline{b}}_{0}y_{i}+b_{0}a_{i+1}&{}\quad \text{ if }\quad i<7 \end{array}\right. } \end{aligned}$$
(54)

Thus, a reversible (8, 3) arithmetic right shifter is achieved. As it is shown in Fig. 12, this design yields 7 garbage outputs, which is fewer garbage outputs than the previous reversible designs [12, 13, 18]. This design requires 7 Fredkin gates and 7 Feynman gates in total. The reduced number Feynman gates also contributes to the optimization of the operation delay. Table 4 shows the possible right shift operations and the corresponding states of the control signals.

Table 4 States of the control signals and the corresponding shift operations
Table 5 Different operations and the corresponding states of the control signals

5.4 Design methodology of universal right shifter and implementation in SCRL gates

A universal right shifter has the ability to perform any of the previous three functions: right rotation, logical right shift, and arithmetic right shift. The previously proposed SCRL designs of a right rotator, a logical right shifter, and an arithmetic right shifter were used as a basis for the design of the universal right shifter. Thus, the universal right shifter design uses Fredkin gates and Feynman gates to create the selection units that determine which inputs are passed to the SCRL gates. An (nq) universal right shifter consists of \(2^{q-1}\) stages. In Stage 1, the rightmost \(2^{q-1}\) input bits are affected. The initial inputs \(a_{2^{q-1}-1}, a_{2^{q-1}-2}, \ldots , a_{2}, a_{1}, a_{0}\) are passed to the selection unit along with 0 and the sign bit \(a_{n-1}\). The selection unit will choose which value is passed to the SCRL gate depending on which operation the universal right shifter is performing. The initial inputs \(a_{2^{q-1}-1}, a_{2^{q-1}-2}, \ldots , a_{2}, a_{1}, a_{0}\) are passed by the selection unit to the SCRL gate when a right rotation is performed. If the operation being performed is instead a type of right shift, then the selection unit will pass either copies of the sign bit \(a_{n-1}\) if the operation is an arithmetic right shift or 0 if the operation is a logical right shift. The selection is controlled by the control signals \(c_{0}\) and \(b_{q-1}\). The control signal \(c_{0}\) controls the selection between logical right shift and arithmetic right shift while \(t_{q-1}\) controls the selection between the right rotation and the previously selected type of shift. Table 5 shows the different operation that will be performed by each combination of the two control signals.

Fig. 13
figure 13

Selection unit of (8, 3) universal right shifter (FR Fredkin gate, FE Feynman gate, G garbage outputs)

  1. 1.

    Selection unit

    The selection unit consists of two parts. The first part is responsible for selecting between the sign bit \(a_{n-1}\) when performing an arithmetic right shift and 0 when performing a logical right shift. Define the output of the first part of the selection unit as F and with the following equation:

    $$\begin{aligned} F={\overline{c}}_{0}a_{n-1} \end{aligned}$$
    (55)

    The Fredkin gate selects the input \(a_{n-1}\) when \(c_{0}=1\) so that the shifter can perform an arithmetic right shift. The Fredkin gate selects the input 0 when \(c_{0}=0\), so that the shifter can perform a logical right shift. The second part of the selection unit will select between the initial inputs \(a_{2^{q-1}-1}, a_{2^{q-1}-2}, \ldots , a_{2}, a_{1}, a_{0}\), and the the first part’s selected input F. \(2^{q-1}\) Feynman gates are required to make copies of F for the second selection part of the unit. The final selected inputs \(s_{2^{q-1}-1}, s_{2^{q-1}-2}, \ldots , s_{2}, s_{1}, s_{0}\) are selected by:

    $$\begin{aligned} s_{m}={\overline{t}}_{q-1}a_{m}+t_{q-1}F \quad \text{ for }\quad m=0, 1, 2, \ldots , 2^{q-1}-1 \end{aligned}$$
    (56)

    When \(t_{q-1}=0\), the selection unit selects the inputs F that were chosen by the first Fredkin gate. In that way the logic circuit performs as either a logical right shifter or an arithmetic right shifter depending on the value of F. When \(t_{q-1}=1\), the selection unit selects the inputs as the initial inputs so that the logic circuit performs a right rotation. An example of a selection unit of an (8, 3) universal right shifter is shown in Fig. 13. F is the output of the first Fredkin gate. It has the equation of:

    $$\begin{aligned} F={\overline{c}}_{0}a_{7} \end{aligned}$$
    (57)

    For \(c_{0}=0\), \(F=0\), the first Fredkin gate selects the input 0 and for \(c_{0}=1\), \(F=a_{7}\), the first Fredkin gate selects the sign bit \(a_{7}\). Therefore, the first selection process of the selection unit is to make a selection between logical right shift and arithmetic right shift and the selected input F is propagated to the next Feynman gate. This Feynman gate together with the other three Feynman gates yields a total number of 5 copies of F. Four of the copies will be used by the second stage of the selection unit and the other copy will be used by the selection unit of the next stage. The second part of the selection unit will select between F and the initial inputs \(a_{3}, a_{2}, a_{1}, a_{0}\). The final selected inputs are defined as \(s_{3}, s_{2}, s_{1}, s_{0}\). For \(m=0, 1, 2, 3\) the final selected inputs have the following equation:

    $$\begin{aligned} s_{m}={\overline{t}}_{2}a_{m}+t_{2}F \end{aligned}$$
    (58)

    Thus, for \(t_{2}=0\), \(s_{m}=F\), the selection unit then selects to perform either a logical right shift or an arithmetic right shift depending on the the value of F. For \(t_{2}=1\), \(s_{m}=a_{m}\), the selection unit selects to perform a right rotation.

  2. 2.

    Stage 1

    For the first stage, its inputs are the inputs selected by the selection unit \(a_{n-1}, a_{n-2}, \ldots , a_{2^{q-1}}\), \(s_{2^{q-1}-1}, s_{2^{q-1}-2}, \ldots , s_{2}, s_{1}, s_{0}\). The outputs of the first stage are defined as \(O_{1,m}\):

    $$\begin{aligned} O_{1,m}= {\left\{ \begin{array}{ll} {\overline{b}}_{q-1}a_{m}+b_{q-1}s_{m+2^{q-1}-n}&{}\quad \text{ if }\quad m\ge 2^{q-1}-1\\ {\overline{b}}_{q-1}s_{m}+b_{q-1}a_{m+2^{q-1}}&{}\quad \text{ if }\quad m<2^{q-1}-1 \end{array}\right. } \end{aligned}$$
    (59)

    \(b_{q-1}\) is the control signal of the first stage. If \(b_{q-1}=0\) the outputs are:

    $$\begin{aligned} O_{1,m}= {\left\{ \begin{array}{ll} a_{m} &{}\quad \text{ if }\quad m\ge n-2^{q-1}\\ s_{m} &{}\quad \text{ if }\quad m<n-2^{q-1} \end{array}\right. } \end{aligned}$$
    (60)

    Else if \(b_{q-1}=1\), the outputs of the first stage will rotate \(2^{q-1}\) bits to the right. That is:

    $$\begin{aligned} O_{1,m}= {\left\{ \begin{array}{ll} s_{m+2^{q-1}-n}&{}\quad \text{ if }\quad m\ge 2^{q-1}-1\\ a_{m+2^{q-1}}&{}\quad \text{ if }\quad m<2^{q-1}-1 \end{array}\right. } \end{aligned}$$
    (61)
  3. 3.

    Stage p (\(1\le p\le q\))

    For the pth stage of the design of a universal right shifter, the initial inputs are the outputs from the previous stage \(O_{p-1,n-1}\), \(O_{p-1,n-2}, \ldots , O_{p-1,2}, O_{p-1,1}, O_{p-1,0}\). Since the primarily selected input F can be applied as a global variable and can be copied by Feynman gates the pth stage does not require a double selection unit like the first stage does. The selection unit of the pth stage (\(p\ne 1\)) will only select from F and the inputs \(O_{m}\), where \(0\le m\le n-2^{q-p}\). It requires \(2^{q-p}\) Fredkin gates and \(2^{q-p}\) Feynman gates to achieve the selection. The selected inputs are defined as \(s_{2^{q-p}-1}, s_{2^{q-p}-2}, \ldots , s_{2}, s_{1}, s_{0}\):

    $$\begin{aligned} s_{m}={\overline{t}}_{q-p}O_{p-1,m}+t_{q-p}F\quad \text{ for }\quad 0\le m\le n-1-2^{q-p} \end{aligned}$$
    (62)

    When \(t_{q-p}\) is set to 0, \(s_{m}=O_{p-1,m}\) , the actual inputs will remain as the previous outputs \(O_{p-1,n-1}, O_{p-1,n-2},\ldots , O_{p-1,2}, O_{p-1,1}\). If \(t_{q-p}\) is set to 1, then \(s_{m}=F\) and there are three possibilities for the leftmost \(2^{q-p}\) bits: 0, \(a_{n-1}\), and \(O_{p-1,m}\). The outputs are defined as:

    $$\begin{aligned} O_{p,m}= {\left\{ \begin{array}{ll} {\overline{b}}_{q-p}O_{p-1,m}+b_{q-p}s_{m+2^{q-p}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ {\overline{b}}_{q-p}s_{m}+b_{q-p}O_{p-1,m+2^{q-p}}&{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (63)

    If \(b_{q-p}=0\), then the outputs of the pth stage will remain the same as the inputs:

    $$\begin{aligned} O_{p,m}= {\left\{ \begin{array}{ll} O_{p-1,m} &{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ s_{m} &{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (64)

    Else if \(b_{q-p}=1\), then the outputs of the pth stage will right shift by \(2^{q-p}\) bits. That is:

    $$\begin{aligned} O_{p,m}={\left\{ \begin{array}{ll} s_{m+2^{q-p}-n}&{}\quad \text{ if }\quad m\ge n-2^{q-p}\\ O_{p-1,m+2^{q-p}}&{}\quad \text{ if }\quad m<n-2^{q-p} \end{array}\right. } \end{aligned}$$
    (65)

Thus, at the final outputs side the three different operations shown in Table 5 can be implemented. By having different control signals for each stage, the universal right shifter can complete the chosen operation for any number of bits from 0 bits to \(n-1\) bits depending on the combination of active control signals. A design of an (8, 3) SCRL-based universal right shifter is shown in Fig. 14. It consists of 3 stages with the initial inputs defined as \(a_{7}, a_{6}, a_{5}, a_{4}, a_{3}, a_{2}, a_{1}, a_{0}\). The corresponding final outputs are defined as \(O_{7}, O_{6}, O_{5}, O_{4}, O_{3}, O_{2}, O_{1}, O_{0}\). To simplify the expressions and equations, the outputs for stage 1 are defined as \(k_{7}, k_{6}, k_{5}, k_{4}, k_{3}, k_{2}, k_{1}, k_{0}\), and the outputs for stage 2 are defined as \(j_{7}, j_{6}, j_{5}, j_{4}, j_{3}, j_{2}, j_{1}, j_{0}\). The selected inputs for each stage are also uniquely defined to further simply the expression and equations. The selected inputs for stage 1 are defined as \(s_{3}, s_{2}, s_{1}, s_{0}\), the selected inputs for stage 2 are defined as \(x_{1}, x_{0}\), and the selected inputs for stage 3 are defined as \(y_{0}\). The selected inputs are selected by the following equations:

$$\begin{aligned} s_{i}= & {} {\overline{t}}_{2}a_{i}+t_{2}F\qquad \text{ for }\quad 0\le m\le 3 \end{aligned}$$
(66)
$$\begin{aligned} x_{m}= & {} {\overline{t}}_{1}k_{m}+t_{1}F\qquad \text{ for }\quad 0\le m\le 1 \end{aligned}$$
(67)
$$\begin{aligned} y_{m}= & {} {\overline{t}}_{0}j_{m}+t_{0}F\qquad \text{ for }\quad m=0 \end{aligned}$$
(68)

The first stage shift unit will either right shift the initial inputs \(a_{7}, a_{6}, a_{5}, s_{4}, s_{3}, s_{2}, s_{1}, s_{0}\) by 4 bits when the control signal \(b_{2}=1\) or leave them unchanged when \(b_{2}=0\). The shift unit of stage 2 will either right shift its inputs by 2 bits when \(b_{1}=1\) or leave them unchanged when \(b_{1}=0\). At last the shift unit of stage 3 will either right shift its inputs by 1 bit when \(b_{0}=1\) or leave them unchanged when \(b_{0}=0\). For each stage the outputs are defined as:

$$\begin{aligned} k_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{2}a_{i}+b_{2}s_{i-4}&{}\quad \text{ if }\quad i\ge 4\\ {\overline{b}}_{2}s_{i}+b_{2}a_{i+4}&{} \quad \text{ if }\quad i<4 \end{array}\right. }\end{aligned}$$
(69)
$$\begin{aligned} j_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{1}a_{i}+b_{1}x_{i-6}&{}\quad \text{ if }\quad i\ge 6\\ {\overline{b}}_{1}x_{i}+b_{1}a_{i+2}&{}\quad \text{ if }\quad i<6 \end{array}\right. }\end{aligned}$$
(70)
$$\begin{aligned} O_{i}= & {} {\left\{ \begin{array}{ll} {\overline{b}}_{0}a_{i}+b_{0}y_{i-7}&{}\quad \text{ if }\quad i\ge 7\\ {\overline{b}}_{0}y_{i}+b_{0}a_{i+1}&{}\quad \text{ if }\quad i<7 \end{array}\right. } \end{aligned}$$
(71)
Fig. 14
figure 14

An SCRL (8, 3) universal right shifter (FR Fredkin gate, FE Feynman gate, G garbage outputs)

Table 6 is a list of all possible bit operations that can be performed by the universal right shifter.

Table 6 All possible bit operations in an (8, 3) universal right shifter

Thus, no operation will be performed if \(b_{2}+b_{1}+b_{0}=0\). Otherwise if the control signal \(c_{0}=0\) and \(t_{2}+t_{1}+t_{0}\ne 0\) the universal right shifter performs as an arithmetic right shifter and if \(c_{0}=1\), \(t_{2}+t_{1}+t_{0}\ne 0\) it performs as a logical right shifter. If \(t_{2}+t_{1}+t_{0}=0\), then no matter what state of \(c_{0}\) is, the universal right shifter will perform as a right rotator.

5.5 Design methodology of universal bidirectional shifter and implementation in SCRL gates

The designs proposed in the previous sections are all exclusively right directional. Designing a logic circuit that is capable of performing each of the six functions in Table 1 is the ultimate goal of this work. This universal bidirectional shifter can be implemented by first designing a reverse unit that can perform an order reverse operation. Combining the proposed universal right shifter with a reverse unit will result in a universal bidirectional shifter. The idea is that if the order of the initial inputs are reversed and then reversed back after the universal right shifter, the final output will be that of a left directional operation with the exception of the arithmetic shift. It is not difficult to find out that ignoring the first bit of data, the arithmetic left shifter is operating as a logical left shifter. It is simple enough to modify the circuit to allow it to perform an arithmetic left shift. A Fredkin gate must be added to the second reverse operation to make a selection from the sign bit \(a_{n-1}\) and the output of the last function operation stage \(f_{0}\). Figure 15 is an example for a reverse unit of a universal shifter.

Fig. 15
figure 15

Reverse unit

Table 7 Operations of the universal bidirectional shifter

The variable r is the control signal of the reverse unit. Suppose the inputs to the reverse unit are defined as \(a_{n-1}, a_{n-2}, a_{n-3}, \ldots , a_{2}, a_{1}, a_{0}\). When the control signal \(r=1\), the first reverse operation will reverse the inputs into the order of \(a_{0}, a_{1}, a_{2}, \ldots , a_{n-3}, a_{n-2}, a_{n-1}\) and the second reverse operation will reverse the data from the last operation stage \(f_{n-1}, f_{n-2}, f_{n-3}, \ldots , f_{2}, f_{1}, f_{0}\) to \(f_{0}, f_{1}, f_{2}, \ldots , f_{n-3}, f_{n-2}, f_{n-1}\). When \(r=0\), the reverse unit will not reverse the order of its inputs. Thus, the logic circuit is able to make bidirectional operations. Stated another way, the logic circuit will perform a left operation when \(r=1\) or it will performed a right operation when \(r=0\). Table 7 is a list of the different operations that will be performed depending on the states of the control signals z, r, T, and \(c_{0}\). Note that \(T=t_{q-1}+t_{q-2}+\cdots +t_{1}+t_{0}\). The control signal z is a switch to select between the logical left shift and arithmetic left shift modes when the other control signal are all have a value of 1.

  1. 1.

    Right rotation

    When all of the control bits are at the state of 0, a right rotation will be performed. The control signal r will not turn the reverse units on and, therefore, the logic circuit is set for a right directional operation. The selection unit selects to perform a right rotation because the initial inputs are selected when \(T=0\). Suppose there is a serial input defined as \(a_{n-1}, a_{n-2}, \ldots , a_{2}, a_{1}, a_{0}\). After the first part of the reverse unit, the serial input will remain unchanged since the reverse unit is off (\(r=0\)). The selection unit selects an operation of right rotation, where the inputs of each stage will always be the outputs of the previous stage. The rotating bit is controlled by \(b_{q-1}\), which is discussed in the previous designs.

  2. 2.

    Arithmetic right shift

    When \(T=t_{q-1}+t_{q-2}+\cdots +t_{1}+t_{0}=1\), it means that at least one of \(t_{q-1}, t_{q-2}, \ldots , t_{1}, t_{0}\) is set to 1. At least one of the shifting stages will have the input from the first Fredkin gate which selects the input between 0 and the sign bit \(a_{n-1}\). Then with \(c_{0}=0\), the first Fredkin gate in the selection unit selects the input as the sign bit \(a_{n-1}\). Thus, the universal bidirectional shifter performs as an arithmetic right shifter.

  3. 3.

    Logical right shift

    As previously mentioned, when \(T=1\), at least one of the shifting stages will obtain the input from the first Fredkin gate which will be either 0 or the sign bit \(a_{n-1}\). At \(c_{0}=1\), the first part of the selection unit will select 0 as the input. Then controlled by \(b_{q-1}\), at least one of the shifting operation stages will have a certain number of inputs as 0. Therefore, the universal bidirectional shifter performs as a logical right shifter.

  4. 4.

    Left rotation

    To make a left rotation, it is required that \(r=1\) while \(T=0\). \(r=1\) means that the reverse unit is active and the operation performed will be the type of a left directional while \(T=0\) means that the operation is a rotation. If the initial inputs are \(a_{n-1}, a_{n-2}, \ldots , a_{2}, a_{1}, a_{0}\), then after the first part of the reverse unit they are reversed to \(a_{0}, a_{1}, a_{2}, \ldots , a_{n-2}, a_{n-1}\). As an example, suppose that the logic circuit will perform a 2 bit left rotation. After the operation unit, the outputs are \(a_{n-2}, a_{n-1}, a_{0}, a_{1}, \ldots , a_{n-4}, a_{n-3}\). After being reversed by the second part of the reverse unit, the final outputs are \(a_{n-3}, a_{n-4}, \ldots , a_{1}, a_{0}, a_{n-2}, a_{n-1}\). Thus, a 2 bit left rotation occurs and overall the circuit performs as a left rotator.

  5. 5.

    Logical left shift

    For a logical left shift, the direction control signal is required to be set to 1 while the rest of the control signals are set as \(T=1\) and \(c_{0}=1\) so that the operation unit should perform a logical right shift. For the same inputs as a left rotation, suppose the logic circuit is going to perform a 2 bit logical left shift. The inputs are first reversed by the first part of the reverse unit. The operation unit yields the outputs as \(0, 0, a_{0}, a_{1}, \ldots , a_{n-4}, a_{n-3}\). Those outputs are reversed by the reverse unit to yield the final outputs as \(a_{n-3}, a_{n-4}, \ldots , a_{1}, a_{0}, 0, 0\). Thus, a 2 bit logical left shift occurs and overall the circuit performs as a logical left shifter.

  6. 6.

    Arithmetic left shift

    For an arithmetic left shift, all control signals must be set to 1 as if an arithmetic right shift was being performed. In addition, control signal z is also required to be set to 1 so that at prior to the last reverse, the Fredkin gate selects the last bit, which will be reversed to the first after the reverse, and become the sign bit \(a_{n-1}\). Again, suppose the logic circuit is going to perform a 2-bit arithmetic left shift with the same inputs as before in the previous examples. After being reversed by the first part of the reverse unit, the operation unit performs a logical right shift and yields the outputs as \(0, 0, a_{0}, a_{1}, \ldots , a_{n-4}, a_{n-3}\). Next the z controlled Fredkin gate selects the last bit as \(a_{n-1}\). After the second reverse, the final outputs are \(a_{n-1}, a_{n-4}, a_{n-5}, \ldots , a_{1}, a_{0}, 0, 0\). Thus, the universal bidirectional shifter performs a 2-bit arithmetic left shift and overall the circuit performs as a logical left shifter.

Figure 16 is an example of an (8, 3) universal bidirectional shifter. The initial inputs are defined as \(i_{7}, i_{6}, i_{5}, i_{4}, i_{3}, i_{2}, i_{1}, i_{0}\). The first reverse operation yields \(a_{7}, a_{6}, a_{5}, a_{4}, a_{3}, a_{2}, a_{1}, a_{0}\). And the operation unit yields \(f_{7}, f_{6}, f_{5}, f_{4}, f_{3}, f_{2}, f_{1}, f_{0}\). The z controlled Fredkin yields h.

Fig. 16
figure 16

An (8, 3) universal bidirectional shifter (FR Fredkin gate, FE: Feynman gate, G garbage outputs)

6 Comparison of proposed universal bidirectional shifter to other designs

The designs proposed in this work make numerous improvements over existing designs. The proposed designs reduce NML cost, the number of ancilla inputs, and the number of garbage outputs. All of the proposed designs make improvements over existing designs. However, comparisons are only shown for the universal bidirectional shifter designs as it can perform the same operations as all of the other shifter types.

  1. 1.

    Ancilla inputs

    Table 8 shows the number of ancilla inputs in the SCRL design and some other existing designs of reversible universal bidirectional shifters [12, 13, 18]. The ancilla inputs for each design can be determined using the following equations:

    $$\begin{aligned} \mathrm{SCRL}= & {} n\end{aligned}$$
    (72)
    $$\begin{aligned} \mathrm{Kotiyal}= & {} n\times q+2^{q}+1\end{aligned}$$
    (73)
    $$\begin{aligned} \mathrm{Mitra}= & {} n+q\end{aligned}$$
    (74)
    $$\begin{aligned} \mathrm{Hoss.}= & {} 2^{q+2}+n\times (q+2) - 4 \end{aligned}$$
    (75)

    The proposed SCRL design has fewer ancilla inputs than other existing designs. The improvement ratio increases with complexity for the Kotiyal [13] and Hosseininia [12] designs. The improvement ratio over the Kotiyal design [13] increases from 75.76 to 85.75 % and the improvement ratio over the Hosseininia design [12] increases from 88.24 to 91.62 %. The improvement ratio over the Mitra design [18] decreases as the complexity increases from 27.27 to 8.57 %.

  2. 2.

    Garbage outputs

    Table 9 shows the comparison of garbage outputs of the SCRL design and some other existing designs [12, 13, 18].

    Table 8 Ancilla inputs in universal bidirectional shifter designs
    Table 9 Garbage outputs in universal bidirectional shifter designs

    The garbage outputs for each design can be determined using the following equations:

    $$\begin{aligned} \mathrm{SCRL}= & {} n+1\end{aligned}$$
    (76)
    $$\begin{aligned} \mathrm{Kotiyal}= & {} (n+1)\times q+n+3\end{aligned}$$
    (77)
    $$\begin{aligned} \mathrm{Mitra}= & {} n+2q+3\end{aligned}$$
    (78)
    $$\begin{aligned} \mathrm{Hoss.}= & {} 2^{q+2}+q\times (n+2)+2n-4 \end{aligned}$$
    (79)

    The proposed SCRL design has fewer garbage outputs than other existing designs. The improvement ratio increases with complexity for the Kotiyal and Hosseininia designs. The improvement ratio over the Kotiyal design [13] increases from 76.32 to 85.78 % and the improvement ratio over the Hosseininia design [12] increases from 87.84 to 91.62 %. The improvement ratio over the Mitra design [18] decreases as the complexity increases from 47.06 to 17.72 %.

  3. 3.

    Cost

    As the majority voter and the inverter form the basic building blocks of NML computing, the universal bidirectional shifter designs can be evaluated by their implementation cost in terms of inverters and majority voters. Table 10 shows the inverter cost evaluation of the SCRL universal bidirectional shifter and Table 11 shows the majority voter cost evaluation [12, 13, 18].

    Table 10 NML inverter cost in universal bidirectional shifter designs
    Table 11 NML MV cost in universal bidirectional shifter designs

    The proposed SCRL design has a lower cost than other existing designs. It uses both fewer inverters and fewer majority voters than existing designs. The improvement ratio of inverters increases with complexity for most of the existing designs. The improvement ratio of majority voters increases with complexity for each of the existing designs. In terms of inverters, the improvement ratio over the Kotiyal design [13] increases from 49.29 to 57.62 % and improvement ratio over the Hosseininia design [12] increases from 73.72 to 74.89 %. The improvement ratios of our design compared to Mitra design [18] in terms of inverters are in the range of 34 %. In terms of majority voters, the improvement ratio over the Kotiyal design [13] increases from 38.10 to 48.01 %, the improvement ratio over the Mitra design [18] increases from 10.55 to 22.58 %, and the improvement ratio over the Hosseininia design [12] increases from 59.21 to 62.33 %.

7 Related work

Reversible logic has attracted many meaningful attempts in proposed works such as [710, 15, 16, 30, 32, 33, 37]. Minimizing the garbage outputs and ancilla inputs is the common goal for the researchers to optimize the reversible logic circuit designs. We observed that some new technologies have been proposed and have contributed to reducing the ancilla inputs and the garbage outputs [17, 22]. Some of the practical designs are flip-flops, fast fourier transform, perfect shuffle, etc. [1, 4]. Some areas of research also included binary calculations such as half adders, full adders, ripple carry adders, carry look-ahead adders, etc. [3, 6, 11, 19, 2729]. Researchers have proposed new designs of ripple carry adders without ancilla inputs and optimized the operation delay in [25, 26]. We also observed several recent designs of reversible barrel shifters [12, 13, 18]. Among these designs, the Mitra design [18] is the newest having been proposed in March 2015. However, the existing designs are based on the Fredkin and the Feynman gate. The Fredkin gate is used to implement the 2:1 MUX while the Feynman gate is used to avoid the fanout. We observe that a single Fredkin gate is limited to only pairwise bit swap, hence the Fredkin gate-based designs of reversible barrel shifters have a significant overhead in terms of ancilla inputs and garbage outputs. Therefore, the proposed work utilizes a new conservative reversible gate called SCRL gate and optimizes the barrel shifter designs including right rotator, logical right shifter, arithmetic right shifter, universal right shifter and universal bidirectional shifter by reducing the number of ancilla inputs and garbage outputs. This work covers all six functions of the barrel shifters (right/left rotator, logical right/left shifter and arithmetic right/left shifter)and optimizes them in a satisfying range compared to the most recent designs. In addition, a universal bidirectional shifter that can implement all of the six listed functions has been proposed in this work.

8 Discussion and conclusions

In this work, we present a novel conservative reversible logic gate (SCRL gate) which is better compared to the Fredkin gate for implementing the boolean functions that are required to design the barrel shifters. Further, the design methodologies for (nq) reversible barrel shifters based on the SCRL gate are also proposed. The five proposed methodologies of reversible barrel shifters consist of right rotator, logical right shifter, arithmetic right shifter, universal right shifter, and universal bidirectional shifter. The proposed SCRL gate-based designs of reversible barrel shifters are mapped in NML computing and are compared with the existing designs in terms of the number of garbage outputs, the number of ancilla inputs, and the NML cost. The proposed designs of reversible barrel shifters based on the SCRL gate have shown significant improvement compared to the existing designs in the literature.