Novel RNS-to-binary converters for the three-moduli set {2m − 1, 2m, 2m + 1}

Phalguna, P S; Kamat, Dattaguru V; Mohan, P V Ananda

doi:10.1007/s12046-019-1078-0


Published: 27 March 2019

Sādhanā, Volume 44, article number 99 (2019)

Abstract

In this paper, Mixed Radix Conversion (MRC)-based Residue Number System (RNS)-to-binary converters for the three-moduli set {2m − 1, 2m, 2m + 1} are presented. The proposed reverse converters are evaluated and compared to reverse converters proposed earlier in literature using Chinese Remainder Theorem (CRT) and New CRT for this moduli set as well as two four-moduli sets {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁺¹ − 1} and {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁺¹ + 1} regarding hardware requirement and conversion time.


1 Introduction

The advantages of the Residue Number System (RNS), such as carry-free operation, modularity and fault tolerance, have made it attractive in applications like cryptography, digital signal processing (DSP) and communication systems [1,2,3,4]. Several moduli sets with three, four or more moduli have been described in the literature. They use powers-of-two-related moduli of the form 2^u, 2^u + 1, 2^u − 1, 2^v + 3, 2^v − 3. In addition, three-moduli sets that use consecutive numbers as moduli have also been investigated, viz., {2m − 1, 2m, 2m + 1} [5] and {2m, 2m + 1, 2m + 2} [6], the latter using two moduli that have a common factor. The moduli sets {2^α − 1, 2^α, 2^α + 1} [7,8,9,10,11,12] and {2^(β−1) − 1, 2^β − 1, 2^β} [13,14,15,16] are special cases of these two moduli sets. Note that the moduli set {2^(β−1) − 1, 2^β − 1, 2^β} is obtained by removing the common factor 2 from 2^β − 2, one of the two even moduli in the moduli set {2^β − 2, 2^β − 1, 2^β}, to make the moduli relatively prime. The moduli set {2^α − 1, 2^(α+γ), 2^α + 1} has also been investigated to give a variable dynamic range (DR) using the additional degree of freedom γ, where 0 ≤ γ ≤ α [17]. This increases the DR by γ bits over the moduli set {2^α − 1, 2^α, 2^α + 1}, with a resolution of 1 bit. On the other hand, the moduli set {2m − 1, 2m, 2m + 1} offers several other options for realizing a desired DR through proper choice of m. As an illustration, the DRs of the popular moduli set {2^α − 1, 2^α, 2^α + 1} for α = 3, 4 and 5 are, respectively, 504, 4080 and 32736. With α = 3, for example, the choices γ = 1, 2 and 3 lead to DRs of 1008, 2016 and 4032. In the case of {2m − 1, 2m, 2m + 1}, for m = 3, 4, 5, 6, 7, 8, 9, 10, etc. the DRs are 210, 504, 990, 1716, 2730, 4080, 5814, 7980, etc.
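The DR figures quoted above follow directly from the products of the moduli. A minimal Python sketch that reproduces them is given here; the function names are illustrative and are not taken from the paper.

```python
def dr_consecutive(m):
    """Dynamic range of the moduli set {2m - 1, 2m, 2m + 1}."""
    return (2 * m - 1) * (2 * m) * (2 * m + 1)


def dr_power_of_two(alpha, gamma=0):
    """Dynamic range of {2^alpha - 1, 2^(alpha + gamma), 2^alpha + 1}."""
    return (2**alpha - 1) * 2 ** (alpha + gamma) * (2**alpha + 1)


if __name__ == "__main__":
    print([dr_power_of_two(a) for a in (3, 4, 5)])
    print([dr_power_of_two(3, g) for g in (1, 2, 3)])
    print([dr_consecutive(m) for m in range(3, 11)])
```

The consecutive-moduli set gives a finer spacing of achievable DRs than the power-of-two set, which is the point made in the paragraph above.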

Premkumar [5] suggested the three-moduli set M1 {2m − 1, 2m, 2m + 1}, and several reverse converters for M1 have been reported in the literature [5, 18,19,20,21]. The first reverse converter for M1 was presented in [5] using CRT. Later, two reverse converters were presented using a modification of CRT for reducing the modulo reduction complexity [18]. Reverse converters for this moduli set using New CRT II [22] have also been investigated [19]. More recently, improved reverse converters for this moduli set using CRT have been presented [20, 21]. These converters can be considered to be similar to a Mixed Radix Conversion (MRC)-type design. The intermediate digits derived, however, do not facilitate comparison since one of the intermediate digits can be negative. It is interesting to note that the MRC technique has not been explored for the moduli set M1. It is well known that the MRC technique facilitates easy comparison of two RNS numbers as well as scaling by one modulus or by a product of two moduli [2]. In this paper, we consider the MRC technique for reverse conversion. Several architectures will be described that take advantage of the simpler multiplicative inverses in order to arrive at designs with a hardware requirement/conversion time trade-off. All the proposed architectures are compared to the state-of-the-art reverse converters reported earlier for the moduli set M1 in the literature as well as to converters for two representative four-moduli sets {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁺¹ − 1} and {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁺¹ + 1} regarding hardware requirement and conversion time.

In section 2, background material has been given in brief. The proposed MRC-based reverse converter architectures are presented in section 3. The performance evaluation and comparison of the proposed converters with converters for M1 reported earlier and implementation results are provided in section 4. Comparison with converters for two representative four-moduli sets is also presented in section 4. The concluding remarks are given in section 5.

2 Background material

The two popular approaches used for the reverse conversion process in RNS are Chinese Remainder Theorem (CRT) and MRC. In CRT, we compute decoded binary number X as

$$ X = \left( {\mathop \sum \limits_{i = 1}^{j} x_{i} M_{i} \left( {\frac{1}{{M_{i} }}} \right)_{{m_{i} }} } \right)\bmod \, M $$

(1)

where M is the product of all moduli m_i, M_i = M/m_i and x_i are the given residues defined such that x_i = X mod m_i. Note that $ y = \left( {\frac{1}{a}} \right)_{b} $ is known as a multiplicative inverse of a with respect to modulus b defined such that remainder of the computation (a × y)/b is 1. The main advantage of CRT is the parallel computation of various terms in (1) corresponding to the given residues followed by the summation of various terms mod M.
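As an illustration of Eq. (1), a minimal software sketch of CRT-based decoding is given below. It is a behavioral model only, not a hardware description; the function name crt is illustrative, and the three-argument pow with exponent −1 (Python 3.8 or later) is used to obtain the multiplicative inverse.

```python
from math import prod


def crt(residues, moduli):
    """Decode X from its residues following Eq. (1).

    The moduli are assumed to be pairwise relatively prime.
    """
    M = prod(moduli)
    total = 0
    for x_i, m_i in zip(residues, moduli):
        M_i = M // m_i
        inv = pow(M_i, -1, m_i)  # multiplicative inverse of M_i with respect to m_i
        total += x_i * M_i * inv
    return total % M  # the final modulo M reduction


if __name__ == "__main__":
    moduli = (7, 8, 9)  # the set {2m - 1, 2m, 2m + 1} with m = 4
    X = 300
    print(crt([X % q for q in moduli], moduli))
```

The per-residue terms inside the loop are independent of each other, which corresponds to the parallel computation mentioned above; only the final reduction mod M combines them.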

In MRC for three-moduli set {m₁, m₂, m₃}, the decoded number X corresponding to residues (x₁, x₂, x₃) is obtained as

$$ X = U_{3} m_{2} m_{1} + U_{2} m_{1} + U_{1} $$

(2)

where the mixed radix digits U_i (i = 1, 2, 3) are computed as follows:

$$ U_{1} = x_{1} ,\;\;U_{2} = \left( {\left( {x_{2} - x_{1} } \right)\left( {\frac{1}{{m_{1} }}} \right)_{{m_{2} }} } \right)\bmod \, m_{2} , $$

(3a)

$$ U_{3} = \left( {\left( {\left( {\left( {x_{3} - U_{1} } \right)\left( {\frac{1}{{m_{1} }}} \right)_{{m_{3} }} } \right)_{{m_{3} }} - U_{2} } \right)\left( {\frac{1}{{m_{2} }}} \right)_{{m_{3} }} } \right)\bmod \,m_{3} . $$

(3b)

Since MRC is a sequential process, in each step a single mixed radix digit is determined. The next step is to compute X in (2). Note that the cumbersome modulo M reduction needed in the case of CRT in (1) is not needed in MRC since 0 ≤ X < M. In the present paper, we use MRC technique for deriving various reverse converters.
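The sequential nature of Eqs. (2), (3a) and (3b) can be seen in the following minimal Python sketch. The function name mrc3 is illustrative; the inverses are obtained with the built-in pow (Python 3.8 or later) rather than being hard-wired as in a converter.

```python
def mrc3(residues, moduli):
    """Mixed radix conversion for three pairwise relatively prime moduli.

    Returns the decoded number X of Eq. (2) and the digits (U1, U2, U3)
    of Eqs. (3a) and (3b).
    """
    x1, x2, x3 = residues
    m1, m2, m3 = moduli
    inv_m1_m2 = pow(m1, -1, m2)  # (1/m1) mod m2
    inv_m1_m3 = pow(m1, -1, m3)  # (1/m1) mod m3
    inv_m2_m3 = pow(m2, -1, m3)  # (1/m2) mod m3

    U1 = x1
    U2 = ((x2 - x1) * inv_m1_m2) % m2
    U3 = ((((x3 - U1) * inv_m1_m3) % m3 - U2) * inv_m2_m3) % m3

    X = U3 * m2 * m1 + U2 * m1 + U1  # no modulo M reduction is required
    return X, (U1, U2, U3)


if __name__ == "__main__":
    print(mrc3((300 % 7, 300 % 8, 300 % 9), (7, 8, 9)))
```

Because each digit U_i lies in [0, m_i − 1], the weighted sum in the last step is always below M, in line with the remark that no final reduction is needed.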

In the implementation of MRC, we need modulo subtractors for computing (x_j − x_k) mod m_i. They use structures similar to cost-effective (CE) and high-speed (HS) modulo adders [2]. We can use the HS architecture of figure 1a in which we compute T₁ = (x_j − x_k) and T₂ = (x_j − x_k + m_i) using two parallel adders and, based on the sign of T₁, we select either T₁ or T₂ using a 2:1 multiplexer (2:1 MUX). Note that the one’s complement of x_k and a carry input of 1 are added to obtain the two’s complement of x_k. Thus the hardware requirement is two k-bit carry-propagate adders (CPA1 and CPA2), one k-bit carry-save adder (CSA1) and one k-bit 2:1 MUX where $ k = \left\lceil { \log }_{2} \left( {2m + 1} \right) \right\rceil $. The computation time is (k + 1)Δ_FA + Δ_MUX where Δ_FA and Δ_MUX are delays of a full adder and a 2:1 MUX, respectively. We denote this block as MODSUBA.

Note, however, that in the case of m_j = 2m − 1, m_k = 2m + 1, for computing (x_j − x_k) mod m_j for x_j = 0, x_k = 2m, two consecutive additions of m_j are needed since 0 − 2m + (2m − 1) = −1 and −1 mod (2m − 1) is (2m − 2). Instead, we compute T₃ = (x_j − x_k) and T₄ = (x_j − x_k + 2m − 1), and one among T₃, T₄ and T₅ = 2m − 2 is selected using a 3:1 MUX based on the signs of T₃ and T₄ as shown in the MODSUBB block in figure 1b. The computation time, however, is about the same as that of MODSUBA.

The CE version of a modulo subtractor (MODSUBC block) can be realized as shown in figure 1c, in which we compute T₆ = (x_j − x_k) followed by T₇ = (T₆ + m_i) using two adders and, based on the sign of T₆, we select either T₆ or T₇ using a 2:1 MUX. Thus the hardware requirement is two k-bit CPAs (CPA5 and CPA6) and one k-bit 2:1 MUX. The computation time needed is (2k)Δ_FA + Δ_MUX.
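The selection logic of the three modulo subtractor blocks can be modelled behaviorally as follows. This is a sketch of the arithmetic only: the function names are illustrative, and the parallel (HS) versus cascaded (CE) adder arrangement, which determines the delay, is not captured by software of this kind.

```python
def modsub_a(xj, xk, mi):
    """MODSUBA (HS): T1 and T2 are formed in parallel, the sign of T1 selects."""
    t1 = xj - xk
    t2 = xj - xk + mi
    return t1 if t1 >= 0 else t2


def modsub_b(xj, xk, m):
    """MODSUBB: (xj - xk) mod (2m - 1) for xj in [0, 2m - 2], xk in [0, 2m].

    T5 = 2m - 2 covers the single case xj = 0, xk = 2m, where T4 is still negative.
    """
    t3 = xj - xk
    t4 = xj - xk + (2 * m - 1)
    t5 = 2 * m - 2
    if t3 >= 0:
        return t3
    if t4 >= 0:
        return t4
    return t5


def modsub_c(xj, xk, mi):
    """MODSUBC (CE): T7 is formed from T6, so the two additions are cascaded."""
    t6 = xj - xk
    t7 = t6 + mi
    return t6 if t6 >= 0 else t7
```

MODSUBA and MODSUBC return the same value for the same inputs; they differ only in how the two sums are produced in hardware.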

The implementation of (H × m) mod (2m − 1) is also needed in the proposed reverse converter architectures. This can be carried out by considering H = 2Y + h₀, where Y is the word formed by (k − 1)-bit MSBs of the k-bit word H and h₀ is the LSB of H, as

$$ \left( {H \times m} \right) \bmod \left( {2m - 1} \right) = \left( {2Y \times m + h_{0} \times m} \right) \bmod \left( {2m - 1} \right) = \left( {Y + h_{0} \times m} \right) \bmod \left( {2m - 1} \right). $$

(4)

Note that Y is at most H_max/2 = (2m − 2)/2 = m − 1, in which case h₀ = 0, thus making Y + h₀ × m = m − 1. In the other cases, Y ≤ m − 2 and even if h₀ = 1, (Y + h₀m) ≤ (m − 2) + m = 2m − 2 < 2m − 1. Thus, (H × m) mod (2m − 1) can be realized by adding Y with h₀m (obtained by enabling m by h₀ using (k − 1) two-input AND gates) using CPA8 as shown in the MODMUL block of figure 1d. Note that m is available as the (k − 1) most significant bits of m₂ = 2m.
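Eq. (4) and the range argument above can be checked exhaustively with a short Python sketch. The function name modmul_m is illustrative; the shift and mask stand in for the bit selection of the MODMUL block.

```python
def modmul_m(H, m):
    """(H * m) mod (2m - 1) for H in [0, 2m - 2] using a single addition, Eq. (4)."""
    Y = H >> 1  # word formed by the MSBs of H
    h0 = H & 1  # LSB of H
    return Y + (m if h0 else 0)  # m is enabled by h0; no further reduction needed


if __name__ == "__main__":
    m = 21
    print(all(modmul_m(H, m) == (H * m) % (2 * m - 1) for H in range(2 * m - 1)))
```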

3 Proposed RNS-to-binary converters

In this section, we present new RNS-to-binary converters for the three-moduli set M1 {2m − 1, 2m, 2m + 1} using MRC technique. The MRC algorithm for the three-moduli set M1 is shown in figure 2. The various multiplicative inverses needed in the computation are as follows:

$$ a = \left( {\frac{1}{2m + 1}} \right)_{2m} = 1, $$

(5a)

$$ b = \left( {\frac{1}{2m + 1}} \right)_{2m - 1} = m, $$

(5b)

$$ c = \left( {\frac{1}{2m}} \right)_{2m - 1} = 1. $$

(5c)

They can be verified to be true since (2m + 1) × a = 1 mod 2m, ((2m + 1) × b) mod (2m − 1) = 1 and (2m) × c = 1 mod (2m − 1). We denote the residues corresponding to the three-moduli m₁ = 2m − 1, m₂ = 2m and m₃ = 2m + 1 as (x₁, x₂, x₃) and binary number corresponding to this residue set as X. The DR is M = 2m(4m² − 1). The implementation of the MRC algorithm of figure 2 using various multiplicative inverses Eq. (5a)–(5c) is presented in figure 3. This converter is denoted as D6.
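The closed forms of Eqs. (5a)–(5c) can be confirmed numerically for a range of m. The sketch below is a check, not part of any converter; the function name is illustrative and the built-in pow with exponent −1 requires Python 3.8 or later.

```python
def inverses_d6(m):
    """Return (a, b, c) of Eqs. (5a)-(5c) computed directly for a given m >= 2."""
    a = pow(2 * m + 1, -1, 2 * m)      # (1/(2m + 1)) mod 2m
    b = pow(2 * m + 1, -1, 2 * m - 1)  # (1/(2m + 1)) mod (2m - 1)
    c = pow(2 * m, -1, 2 * m - 1)      # (1/2m) mod (2m - 1)
    return a, b, c


if __name__ == "__main__":
    print(inverses_d6(21))
```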

The computation of (x₂ − x₃) mod 2m can be carried out using CE version of a modulo subtractor MODSUBC of figure 1c to obtain intermediate result U_A*. The mixed radix digit U_A is thus already available as U_A* since a is 1 (see Eq. (5a)).

The computation of U_B* = (x₁ − x₃) mod (2m − 1) can be realized using MODSUBB block shown in figure 1b. Next, the intermediate result U_B is computed from U_B* by performing multiplication with b modulo (2m − 1) in modulo multiplier block shown in figure 1d since b is m (see Eq. (5b)). Next, the modulo subtraction (U_B − U_A) mod (2m − 1) can be carried out using MODSUBA block to obtain U_C*. The mixed radix digit U_C is thus already available as U_C* since c is 1 (see Eq. (5c)).

The last stage in the converter computes X using Eq. (2) as

$$ X = \left( {x_{3} + U_{\text{A}} m_{3} } \right) + U_{\text{C}} m_{2} m_{3} . $$

(6)

Here the first term (x₃ + U_Am₃) is computed using a (k × k)-bit merged array multiplier MULT1 that multiplies two inputs U_A and m₃ and adds a third input x₃ in the carry save portion of the multiplier [23]. The second term U_Cm₂m₃ in Eq. (6) is computed using a (2k × k)-bit array multiplier MULT2. Thus the decoded integer can be obtained using 3k-bit CPA9 as shown in BLOCK1 of figure 3.
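The data flow of converter D6 described above can be summarized in a behavioral Python sketch. It follows the order of operations in the text (U_A, U_B*, U_B, U_C, then Eq. (6)) but uses the % operator in place of the modulo subtractor and multiplier blocks; the function name convert_d6 is illustrative.

```python
def convert_d6(x1, x2, x3, m):
    """Behavioral model of D6 for residues (x1, x2, x3) of {2m - 1, 2m, 2m + 1}."""
    m1, m2, m3 = 2 * m - 1, 2 * m, 2 * m + 1
    U_A = (x2 - x3) % m2       # a = 1, so U_A equals U_A*
    U_B_star = (x1 - x3) % m1
    U_B = (U_B_star * m) % m1  # multiplication by b = m
    U_C = (U_B - U_A) % m1     # c = 1, so U_C equals U_C*
    return (x3 + U_A * m3) + U_C * m2 * m3  # Eq. (6)


if __name__ == "__main__":
    m, X = 21, 50000  # moduli {41, 42, 43}
    print(convert_d6(X % 41, X % 42, X % 43, m))
```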

The design D6 is based on conventional MRC that requires sequential modulo reductions in the modulus m₁ channel to obtain the mixed radix digits. We explore techniques to reduce the number of cascaded modulo reductions next. For this purpose, we choose an ordering of moduli different from that shown in figure 2. The various multiplicative inverses needed for this approach shown in figure 4 are as follows:

$$ e = \left( {\frac{1}{2m}} \right)_{2m + 1} = - 1, $$

(7a)

$$ f = \left( {\frac{1}{2m}} \right)_{2m - 1} = 1, $$

(7b)

$$ g = \left( {\frac{1}{2m + 1}} \right)_{2m - 1} = m. $$

(7c)

The correctness of Eq. (7a)–(7c) can be easily verified.

The architecture of the converter D7 following figure 4 is shown in figure 5. The mixed radix digit P can be computed as (x₂ − x₃) + tm₃ since e = − 1 (see Eq. (7a)) where if x₂ ≥ x₃, t is 0, else t is 1. Note that (x₂ − x₃) + tm₃ is computed using MODSUBC block (see figure 5 with x_j = x₂ and x_k = x₃ and m_i = m₃). The sign bit of the result (the output of CPA5 in MODSUBC block in figure 1c) is considered as t.

Next, we consider computation of the mixed radix digit Q. We compute (x₁ − x₂) but we defer modulo m₁ reduction since the multiplicative inverse f with which we need to multiply mod m₁ is unity (see Eq. (7b)). Next, unlike in conventional MRC, we subtract ((x₂ − x₃) + tm₃) from (x₁ − x₂) to obtain the intermediate result:

$$ Q^{*} = \left( {x_{3} - 2x_{2} + x_{1} - tm_{3} } \right) \bmod \, m_{1} = \left( {x_{3} - 2x_{2} + x_{1} - 2t} \right) \bmod \,m_{1} . $$

(8)

Note that in the second equality, we have used the fact m₃ mod m₁ = 2. The subtraction of ((x₂ − x₃) + tm₃) instead of P has the advantage that t is available before P is available, saving one k-bit CPA delay.

The computation of Eq. (8) requires addition of x₃, x₁, (2x₂)_2C (realized as addition of the one’s complement of 2x₂ and a carry input of 1) and t × (2)_2C (the two’s complement of 2 enabled by t) using CSA3 and CSA4 followed by a modulo m₁ adder. The maximum positive and minimum negative values of (x₃ − 2x₂ + x₁ − 2t) are (4m − 4) and (−4m + 2), respectively. The maximum positive value occurs when x₁ = (2m − 2), x₂ = 0 and x₃ = 2m; since x₂ < x₃ in this case, t = 1, thus making the maximum positive value (4m − 4). On the other hand, when x₁ = x₃ = 0, t = 0 for all x₂ values, and x₂ = 2m − 1 yields the minimum negative value −2(2m − 1) = −4m + 2. Hence, for modulo m₁ reduction of the sum of the outputs S₄ and C₄ of CSA4, at most an addition of m₁ or 2m₁ or a subtraction of m₁ (addition of the two’s complement of m₁) is needed. This can be realized using an HS version of a parallel-type modulo m₁ adder that uses a 4:1 MUX to select the correct result Q* as shown in figure 5. CPA10 computes the sum T₉ = C₄ + S₄, whereas CSA5 followed by CPA11 computes T₈ = C₄ + S₄ − m₁, CSA6 followed by CPA12 computes T₁₀ = C₄ + S₄ + m₁ and CSA7 followed by CPA13 computes T₁₁ = C₄ + S₄ + 2m₁. Note that the one’s complement of m₁ is added with a carry input of C_i = 1 inserted in the free LSB of the CARRY vector C₅. The correct result Q* is selected using a 4:1 MUX as shown in BLOCK2 of figure 5. Next, the multiplication of Q* with g (= m) (see figure 5) is carried out to obtain the mixed radix digit Q using the MODMUL block shown in figure 1d with H = Q*. We next compute X as

$$ X = x_{2} + Pm_{2} + Qm_{2} m_{3} $$

(9)

using BLOCK3 (similar to BLOCK1 in figure 3). This uses multipliers MULT3 and MULT4 of sizes k × k and 2k × k, respectively, followed by CPA14. Note that MULT3 is a merged multiplier.
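A behavioral sketch of converter D7 is given below under the same caveats as before: it models the arithmetic of Eqs. (7a)–(7c), (8) and (9), not the adder-level structure, and the names are illustrative. The four candidates mirror the four parallel sums feeding the 4:1 MUX, and the last step before Eq. (9) applies the MODMUL identity of Eq. (4).

```python
def q_star_operand(x1, x2, x3):
    """Unreduced operand of Eq. (8): x3 - 2*x2 + x1 - 2*t, returned together with t."""
    t = 0 if x2 >= x3 else 1
    return x3 - 2 * x2 + x1 - 2 * t, t


def convert_d7(x1, x2, x3, m):
    """Behavioral model of D7 for residues (x1, x2, x3) of {2m - 1, 2m, 2m + 1}."""
    m1, m2, m3 = 2 * m - 1, 2 * m, 2 * m + 1
    s, t = q_star_operand(x1, x2, x3)
    P = (x2 - x3) + t * m3  # mixed radix digit P, since e = -1

    # Four parallel candidates; exactly one lies in [0, m1 - 1].
    candidates = (s - m1, s, s + m1, s + 2 * m1)
    q_star = next(c for c in candidates if 0 <= c < m1)

    # Multiplication by g = m modulo m1 as in Eq. (4).
    Q = (q_star >> 1) + (m if q_star & 1 else 0)

    return x2 + P * m2 + Q * m2 * m3  # Eq. (9)


if __name__ == "__main__":
    m, X = 21, 50000  # moduli {41, 42, 43}
    print(convert_d7(X % 41, X % 42, X % 43, m))
```

An exhaustive sweep over all residue combinations for small m is a convenient way to confirm the stated bounds (−4m + 2) and (4m − 4) on the unreduced operand.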

In the design of reverse converters following figure 4, we can notice that the computation of the mixed radix digit Q is the critical path. Hence we present some alternate designs for computing the mixed radix digit Q employing two different methods for the mod m₁ reduction of the sum of C₄ and S₄ in figure 5 to obtain Q*. In the design shown in figure 6a, we first add C₄ and S₄ in CPA15 to obtain T₁₂. Note that CPA15 has a carry input of 1 to realize the two’s complement of 2x₂. We reduce the result T₁₂ mod m₁ using one ADD/SUB unit realized by CPA16 and k exclusive-OR gates and one adder adding 2m₁ using CPA17. The correct result is selected using a 3:1 MUX based on the sign bits of the outputs of CPA16 and CPA17. Note that the exclusive-OR gates invert the bits of m₁ to facilitate subtraction and a carry input C_i = s′ is added, where s is the sign bit of T₁₂. This block can be used in the architecture of converter D7 in figure 5 in place of BLOCK2 to realize converter D8.

In an alternative converter design D9, we use a binary-to-RNS converter to reduce T₁₂ mod m₁ as shown in figure 6b. Since T₁₂ is (k + 2) bits wide, based on the two MSBs, we add a constant W to the k LSBs of T₁₂. Denoting x = 2^k mod m₁, it can be seen that the two MSBs correspond to the following four values before mod m₁ reduction: 00_b → 0, 01_b → 2^k (which reduces to x), 10_b → −2^(k+1), 11_b → −2^k. (Note that the subscript b indicates binary representation and that the (k + 1)th bit is the sign bit of T₁₂.) Thus, using a 4:1 MUX, the appropriate value among these can be selected, added to the k LSBs of T₁₂ and reduced mod m₁ using CPA18, CSA8, CPA19 and a 2:1 MUX to obtain Q*. Note that the sum of W and the word corresponding to the k LSBs of T₁₂ is at most (2^k − 1), so that a single subtraction of the modulus m₁ (addition of (m₁)_1C with a carry input of 1 to CPA19) is sufficient to obtain Q* as shown in figure 6b.

Next, we consider realizing the computation of Q* and the multiplication with m in a single block, instead of the cascade designs considered in figures 5 and 6. In the design D10, to determine the mixed radix digit Q, we need to compute [m × ((x₃ − 2x₂ + x₁ − 2t) mod m₁)] mod m₁ = (mx₃ + mx₁ − x₂ − t) mod m₁ in one step. Note that (m × 2x₂) and (m × 2t) are reduced modulo m₁ to x₂ and t, respectively, since (2m) mod m₁ = 1. We consider x₃ = 2x_3H + x₃₀ and x₁ = 2x_1H + x₁₀, where x_3H and x_1H are the words formed by the most significant (k − 1) bits of x₃ and x₁, respectively, and x₃₀ and x₁₀ are their LSBs. The computation of (mx₃ + mx₁) can be realized by adding m × x₃₀, m × x₁₀, x_3H and x_1H since (2m × x_3H) mod m₁ = x_3H and (2m × x_1H) mod m₁ = x_1H. Note that −x₂ − t is realized as (x₂)_1C + (1 − t) = (x₂)_1C + t′, where t′ is the inverted bit t. Thus, we need to compute (m × x₃₀ + m × x₁₀ + x_3H + x_1H + (x₂)_1C + t′) mod m₁ to obtain Q. Note that m × x₃₀ and m × x₁₀ can be obtained using a pair of (k − 1) AND gates enabled by x₃₀ and x₁₀, respectively, as shown in figure 7.

The five operands can be added using three-level CSA tree (CSA9–CSA11) and CPA20 followed by a mod m₁ adder. Note that the maximum positive and minimum negative values of the result of CPA20 are (4m − 4) and (−2m + 1), respectively. Hence, at most a single addition or subtraction of modulus m₁ is sufficient to obtain Q. Hence, a modulo m₁ adder using an ADD/SUB unit formed by CPA21, k exclusive-OR gates and a 2:1 MUX is used to compute Q.
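The merged computation of Q in D10 and its stated range can be checked with the following behavioral Python sketch. The names are illustrative; signed integer arithmetic replaces the one’s-complement operand and the CSA tree, so only the values, not the circuit, are modelled.

```python
def d10_sum(x1, x2, x3, m):
    """Unreduced sum m*x30 + m*x10 + x3H + x1H - x2 - t formed by the CSA tree and CPA20."""
    t = 0 if x2 >= x3 else 1
    x3H, x30 = x3 >> 1, x3 & 1
    x1H, x10 = x1 >> 1, x1 & 1
    return m * x30 + m * x10 + x3H + x1H - x2 - t


def q_digit_d10(x1, x2, x3, m):
    """Mixed radix digit Q with at most one addition or subtraction of m1."""
    m1 = 2 * m - 1
    s = d10_sum(x1, x2, x3, m)
    if s < 0:
        s += m1
    elif s >= m1:
        s -= m1
    return s


def q_digit_cascade(x1, x2, x3, m):
    """Reference value: Q* of Eq. (8) followed by multiplication with m, both mod m1."""
    m1 = 2 * m - 1
    t = 0 if x2 >= x3 else 1
    return (m * ((x3 - 2 * x2 + x1 - 2 * t) % m1)) % m1
```

Comparing q_digit_d10 with q_digit_cascade over all residue triples for small m confirms that the single-block form yields the same digit as the cascade of figure 5.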

4 Performance evaluation and comparison

The hardware requirement and conversion time for the various reverse converters described in [5, 18, 19, 21] for the moduli set M1 along with the proposed reverse converters have been presented in table 1. Note that FA, HA, AND and w:1 MUX stand for a full adder, half adder, two-input AND gate and w:1 multiplexer, respectively. The notations L1 and L2 are used to represent 2k × k and k × k multipliers, respectively, and LiM (for i = 1, 2) is used to represent merged multiplier [23]. Note that the hardware requirement of L1 and L2 is (2k² − 2k)FA and (k² − k)FA, respectively, considering that an array multiplier using (k − 2) carry save levels followed by a CPA is used and the delay of L1 and L2 is (3k − 2)Δ_FA and (2k − 2)Δ_FA, respectively.
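The multiplier cost expressions quoted above can be evaluated for a given word length k with a small helper. This is only a convenience sketch of those two formulas; the function name and the returned tuple layout are illustrative.

```python
def array_multiplier_cost(k, kind):
    """Return (number of full adders, delay in units of the full-adder delay).

    kind "L1" is the 2k x k multiplier and kind "L2" the k x k multiplier,
    using the expressions stated in the text.
    """
    if kind == "L1":
        return 2 * k * k - 2 * k, 3 * k - 2
    if kind == "L2":
        return k * k - k, 2 * k - 2
    raise ValueError("kind must be 'L1' or 'L2'")


if __name__ == "__main__":
    for k in (4, 6, 8):
        print(k, array_multiplier_cost(k, "L1"), array_multiplier_cost(k, "L2"))
```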

Table 1 Comparison of hardware requirement and conversion time of various reverse converters for the three- and four-moduli sets M1–M3.


The converter D1 due to Premkumar [5] uses CRT. It needs five two-input adders, three 2k × k multipliers each of the range 4m² and five 3k-bit 2:1 MUXs. Premkumar et al [18] later suggested two converters, D2 (Architecture A) and D3 (Architecture B), by simplifying the conventional CRT. In this method, the modulo M reduction needed in [5] is simplified to a modulo (m₁ × m₃) reduction. These converters need one 2k × k multiplier and one k × k multiplier of the range 4m² and 2m, respectively. Architecture A (D2) presented in [18] needs seven two-input adders and 6k-bit 2:1 MUXs, whereas Architecture B (D3) presented in [18], which is an HS version, needs nine adders and 5k-bit 2:1 MUXs. The converter D4 for M1 proposed by Wang et al [19], based on the New CRT II technique, needs one 2k × k multiplier and one k × k multiplier, a few adders and a few comparators. The recent converter D5 for M1 due to Gbolagade et al [21] is based on a modification of CRT. It realizes the modulo m₁ reduction using several MUXs and comparators and it needs one 2k × k multiplier and one k × k multiplier. The hardware requirement and conversion time for these five converters D1–D5 are presented as the first five entries in table 1. The proposed converters are presented as D6–D10 in table 1.

Among all the converters using two multipliers L1 and L2 for the moduli set M1, converter D5 needs the least area and D4 needs the highest area. However, it may be noted that the area of multipliers L1 and L2 has quadratic dependence on k and hence, for large k, the area of the multipliers dominates the total area. All the converters need similar conversion time except converters D2, D3, D8 and D9. The converters D2 and D3 need larger conversion time than converters D8 and D9. The m and n values needed for realizing DRs ranging from 8-bit to 64-bit for various moduli sets are presented in table 2. As an illustration, m = 21 for 16-bit DR of M1 implies use of the moduli set {41,42,43}. We have also considered the three reverse converters D11–D13 for the four-moduli set M2 {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁺¹ − 1} [24,25,26] and two reverse converters D14 and D15 for the four-moduli set M3 {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁺¹ + 1} [25, 27] for the purpose of comparison. Note that they use the efficient RNS to binary converters for the three-moduli set {2ⁿ − 1, 2ⁿ, 2ⁿ + 1} [9,10,11,12] followed by a two-moduli MRC to include the fourth modulus. The hardware resource and conversion time requirements in terms of basic gates for the proposed reverse converters along with the converters for M1–M3 are also presented using unit-gate model [28] in table 3 for the general case and for the six standard DRs in table 4. Note that the equivalent number of gates for full adder, half adder, 2:1 MUX, EXOR/EXNOR, AND and OR gates is considered as 7, 3, 3, 2, 1 and 1 and the delays are considered as 4Δ_g, 2Δ_g, 2Δ_g, 2Δ_g, Δ_g and Δ_g, respectively, where Δ_g is unit-gate delay.
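The example m = 21 for a 16-bit DR is consistent with choosing the smallest m whose DR covers 2^16. The sketch below applies that rule and also encodes the unit-gate weights listed above. The selection rule is an assumption inferred from this single example and is not stated as the construction of table 2; the word length uses k = ⌈log₂(2m + 1)⌉, and all names are illustrative.

```python
# Unit-gate weights as listed in the text (area in gates, delay in unit-gate delays).
GATE_AREA = {"FA": 7, "HA": 3, "MUX2": 3, "XOR": 2, "AND": 1, "OR": 1}
GATE_DELAY = {"FA": 4, "HA": 2, "MUX2": 2, "XOR": 2, "AND": 1, "OR": 1}


def unit_gate_area(counts):
    """Total gate count for a dict such as {"FA": 10, "MUX2": 6}."""
    return sum(GATE_AREA[cell] * n for cell, n in counts.items())


def smallest_m_for_bits(bits):
    """Smallest m with (2m - 1)(2m)(2m + 1) >= 2**bits (assumed selection rule)."""
    m = 1
    while (2 * m - 1) * (2 * m) * (2 * m + 1) < 2**bits:
        m += 1
    return m


def word_length(m):
    """k = ceil(log2(2m + 1)), the width of the largest modulus."""
    return (2 * m).bit_length()


if __name__ == "__main__":
    m = smallest_m_for_bits(16)
    print(m, (2 * m - 1, 2 * m, 2 * m + 1), word_length(m))
```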

Table 2 Values of ‘m’ and ‘n’ to be considered for various DRs of M1–M3.


Table 3 Hardware requirement and delay estimation based on unit gate model for various three- and four-moduli set reverse converters.


Table 4 Area and delay comparison for 8-, 16-, 24-, 32-, 48- and 64-bit DR three- and four-moduli set reverse converters using unit gate model.


From table 4, it can be observed that for all the standard DRs, among the considered three- and four-moduli sets the design D12 is preferable regarding lower hardware resource requirement and the converter D13 needs least conversion time among all the converters. Among the converters D5–D10 for moduli set M1, the proposed converter D10 needs the lowest hardware resources for DR 8 and 16 bits whereas converter D5 is better for 24-, 32-, 48- and 64-bit DRs. Regarding conversion time, for all considered standard DRs, D5 and D10 are better than other converters D6–D9. It can also be observed that the proposed converters D6–D10 need less conversion time than converters D14 and D15 for all the considered standard DRs.

The proposed converters D6–D10 as well as design D5 [21] were implemented using Cadence (Version 14.20), Compiler: RC 14.25 and synthesized using the Cadence Encounter tool using 180-nm technology. The post place and route results of area, conversion time and power dissipation for all these designs for DRs of 8, 16, 24, 32, 48 and 64 bits are presented in table 5.

Table 5 ASIC implementation results of various reverse converters for the three-moduli set M1.


Regarding hardware requirements, for 8- and 64-bit DRs, the design D5 is superior to the converters D6–D10, and the design D6 outperforms converters D5 and D7–D10 for 16-, 24-, 32- and 48-bit DRs. For 16-bit DR, converters D7 and D8, and for 32- and 48-bit DRs, converter D8, require fewer hardware resources than D5. The converter D9 is preferable to D5 regarding area for 16-, 24-, 32- and 48-bit DRs.

Regarding conversion time, for 8-, 32- and 64-bit DRs, D5 performs better than D6–D10, and for 16-, 24- and 48-bit DRs, D6 outperforms converters D5 and D7–D10. The converter D7 also requires less conversion time than D5 for 16-bit DR.

Regarding power dissipation, for 8-, 32-, 48- and 64-bit DRs, the converter D5 is superior to D6–D10, and for 16-bit DR, the converters D6, D7, D9 and D10 outperform converter D5. For 24-bit DR, the converter D7 dissipates less power than D5, D6 and D8–D10. Among the proposed converters, for 8-bit DR, the converter D6 is preferable, and for 16- and 32-bit DRs, the converter D10 is superior to the other converters regarding power dissipation. For 24-, 48- and 64-bit DRs, the converter D7 needs the least power dissipation among the proposed converters.

5 Conclusions

In this paper we have presented RNS-to-binary converters for the moduli set {2m − 1, 2m, 2m + 1} using the MRC technique. All the proposed converters were evaluated regarding hardware resource requirement and conversion time against the converters described in the literature for this moduli set. The proposed converters have also been compared to reverse converters for two four-moduli sets. All the proposed converters were implemented and compared to the area-efficient converter [21] for M1 regarding area and conversion time for different DRs. The proposed converters also need less conversion time than reverse converters for some four-moduli sets. The proposed converters for M1 were shown to be better than some of the other converters regarding area and conversion time while having the advantage of availability of mixed radix digits.

References

[1] Szabo N S and Tanaka R I 1967 Residue Arithmetic and Its Applications to Computer Technology. New York: McGraw-Hill
[2] Ananda Mohan P V 2016 Residue Number Systems: Theory and Applications. Basel: Birkhäuser
[3] Omondi A and Premkumar A B 2007 Residue Number System Theory and Implementation, vol. 2. London: Imperial College Press
[4] Soderstrand M A, Jullien G A, Jenkins W K and Taylor F (Eds.) 1986 Residue Number System Arithmetic: Modern Applications in Digital Signal Processing. Piscataway: IEEE Press
[5] Premkumar A B 1992 An RNS to binary converter in {2n + 1, 2n, 2n − 1} moduli set. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 39: 480–482
[6] Premkumar A B 1995 An RNS to binary converter in a three moduli set with common factors. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 42: 298–301
[7] Andraros S and Ahmad H 1988 A new efficient memory-less residue to binary converter. IEEE Trans. Circ. Syst. 35: 1441–1444
[8] Piestrak S J 1995 A high-speed realization of residue to binary system conversion. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 42: 661–663
[9] Dhurkadas A 1998 Comments on ‘A high-speed realization of a residue to binary number system converter’. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 45: 446–447
[10] Bhardwaj M, Premkumar A B and Srikanthan T 1998 Breaking the 2n-bit carry propagation barrier in residue to binary conversion for the {2ⁿ − 1, 2ⁿ, 2ⁿ + 1} moduli set. IEEE Trans. Circ. Syst. I: Fund. Theor. Appl. 45: 998–1002
[11] Wang Z, Jullien G A and Miller W C 2000 An improved residue to binary converter. IEEE Trans. Circ. Syst. I: Fund. Theor. Appl. 47: 1437–1440
[12] Wang Y, Song X, Aboulhamid M and Shen H 2002 Adder based residue to binary number converters for {2ⁿ − 1, 2ⁿ, 2ⁿ + 1}. IEEE Trans. Signal Process. 50: 1772–1779
[13] Hiasat A A and Abdel-Aty-Zohdy H S 1998 Residue to binary arithmetic converter for the moduli set {2^k, 2^k − 1, 2^(k−1) − 1}. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 45: 204–209
[14] Wang W, Swamy M N S, Ahmad M O and Wang Y 2000 A high-speed residue-to-binary converter for three moduli {2^k, 2^k − 1, 2^(k−1) − 1} RNS and a scheme for its VLSI implementation. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 47: 1576–1581
[15] Wang W, Swamy M N S, Ahmad M O and Wang Y 2002 A note on ‘A high-speed residue-to-binary converter for three moduli {2^k, 2^k − 1, 2^(k−1) − 1} RNS and a scheme for its VLSI implementation’. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 49: 230
[16] Ananda Mohan P V 2008 New residue to binary converters for the moduli set {2^k, 2^k − 1, 2^(k−1) − 1}. In: Proceedings of the IEEE Region 10 Conference (TENCON 2008), pp. 1–6
[17] Chaves R and Sousa L 2004 {2ⁿ + 1, 2^(n+k), 2ⁿ − 1}: a new RNS moduli set extension. In: Proceedings of the Euromicro Symposium on Digital System Design (DSD): Architectures, Methods and Tools, pp. 210–217
[18] Premkumar A B, Bhardwaj M and Srikanthan T 1998 High-speed and low-cost reverse converters for the {2n − 1, 2n, 2n + 1} moduli set. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 45: 903–908
[19] Wang Y, Swamy M N S and Ahmad M O 1999 Residue-to-binary number converters for three moduli sets. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 46: 180–183
[20] Gbolagade K A and Cotofana S D 2008 An efficient RNS to binary converter using the moduli set {2n − 1, 2n, 2n + 1}. In: Proceedings of the XXIII Conference on Design of Circuits and Integrated Systems (DCIS)
[21] Gbolagade K A, Voicu G R and Cotofana S D 2011 An efficient FPGA design of residue-to-binary converter for the moduli set {2n + 1, 2n, 2n − 1}. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. 19: 1500–1503
[22] Wang Y 2000 Residue to binary converters based on New Chinese Remainder theorems. IEEE Trans. Circ. Syst. II: Analog Digit. Signal Process. 47: 197–205
[23] Swartzlander Jr. E E 1980 Merged arithmetic. IEEE Trans. Comput. 29: 946–950
[24] Cao B, Srikanthan T and Chang C H 2005 Efficient reverse converters for the four-moduli sets {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁺¹ − 1} and {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁻¹ − 1}. IEE Proc. Comput. Digit. Tech. 152: 687–696
[25] Ananda Mohan P V and Premkumar A B 2007 RNS to binary converters for two four moduli sets {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁺¹ − 1} and {2ⁿ − 1, 2ⁿ, 2ⁿ + 1, 2ⁿ⁺¹ + 1}. IEEE Trans. Circ. Syst. I: Reg. Papers 54: 1245–1254
[26] Hosseinzadeh M, Molahosseini A and Navi K 2008 An improved reverse converter for the moduli set {2ⁿ + 1, 2ⁿ − 1, 2ⁿ, 2ⁿ⁺¹ − 1}. IEICE Electron. Exp. 5: 672–677
[27] Sousa L, Antao S and Chaves R 2013 On the design of RNS reverse converters for the four-moduli set {2ⁿ + 1, 2ⁿ − 1, 2ⁿ, 2ⁿ⁺¹ + 1}. IEEE Trans. VLSI Syst. 21: 1945–1949
[28] Bakalis D, Vergos H T and Spyrou A 2011 Efficient modulo 2ⁿ ± 1 squarers. Integr. VLSI J. 44: 163–174

Author information

Authors and Affiliations

Department of Electronics and Communication Engineering, Manipal Institute of Technology, Manipal Academy of Higher Education, Manipal, 576 104, India
P S Phalguna & Dattaguru V Kamat
R&D, Centre for Development of Advanced Computing, 1, Knowledge Park, Bayappanahalli, Bangalore, 560 038, India
P V Ananda Mohan


Corresponding author

Correspondence to P V Ananda Mohan.


Phalguna, P.S., Kamat, D.V. & Mohan, P.V.A. Novel RNS-to-binary converters for the three-moduli set {2m − 1, 2m, 2m + 1}. Sādhanā 44, 99 (2019). https://doi.org/10.1007/s12046-019-1078-0


Received: 14 December 2017
Revised: 23 August 2018
Accepted: 11 January 2019