Addition is a primitive operation for most arithmetic functions, so FPGA vendors have devoted particular attention to the design of optimized adders. As a consequence, in many cases the synthesis tools are able to generate fast and cost-effective adders from simple VHDL expressions. Only for relatively long operands can it be worthwhile to consider more complex structures such as carry-skip, carry-select and logarithmic adders.

Another important topic is the design of multi-operand adders. In this case, the key concept is that of the carry-save adder or, more generally, of the parallel counter.

Obviously, the general design methods presented in Chap. 3 (pipelining, digit-serial processing, self-timing) can be applied in order to optimize the proposed circuits. Numerous examples of practical FPGA implementations are reported in Sect. 7.9.

7.1 Addition of Natural Numbers

Consider two radix-B numbers

$$ x = x_{n - 1} \cdot B^{n - 1} + x_{n - 2} \cdot B^{n - 2} + \cdots + x_{1} \cdot B + x_{0} $$

and

$$ y = y_{n - 1} \cdot B^{n - 1} + y_{n - 2} \cdot B^{n - 2} + \cdots + y_{1} \cdot B + y_{0} , $$

where all digits \( x_i \) and \( y_i \) belong to {0, 1,…, B−1}, and an input carry \( c_0 \) belonging to {0, 1}. An n-digit adder generates a radix-B number

$$ z = z_{n - 1} \cdot B^{n - 1} + z_{n - 2} \cdot B^{n - 2} + \cdots + z_{1} \cdot B + z_{0} , $$

and an output carry \( c_n \), such that

$$ x + y + c_{0} = c_{n} \cdot B^{n} + z. $$

Observe that \( x + y + c_0 \le 2(B^n - 1) + 1 = 2B^n - 1 \), so that \( c_n \) belongs to {0, 1}.

The common way to implement an n-digit adder consists of connecting in series n 1-digit adders (Fig. 7.1). For each of them

$$ x_{i} + y_{i} + c_{i} = c_{i\; + \;1} \cdot B + z_{i} , $$

where \( c_i \) and \( c_{i+1} \) belong to {0, 1}. In other words

$$ z_{i} = \left( {x_{i} + y_{i} + c_{i} } \right)\bmod B,\;c_{i\, + \,1} = \left\lfloor {\left( {x_{i} + y_{i} + c_{i} } \right)/B} \right\rfloor . $$

The critical path is

$$ \left( {x_{0} ,y_{0} ,c_{0} } \right) \to c_{1} \to c_{2} \to \cdots \to c_{n - 1} \to \left( {z_{n - 1} ,c_{n} } \right), $$

so that the total computation time is approximately equal to \( n \cdot T_{carry} \), where \( T_{carry} \) is the computation time of \( c_{i+1} \) as a function of \( x_i \), \( y_i \) and \( c_i \).
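As an illustration of the digit recurrence above, the following Python sketch models the n-digit ripple-carry adder behaviourally. It is not one of the book's VHDL models: the function name `ripple_add` and the little-endian digit lists are assumptions made for this example.

```python
def ripple_add(x_digits, y_digits, c0, B):
    """Add two radix-B numbers given as little-endian digit lists.

    Returns (z_digits, c_n) with x + y + c0 = c_n * B**n + z.
    """
    assert len(x_digits) == len(y_digits) and c0 in (0, 1)
    z, c = [], c0
    for xi, yi in zip(x_digits, y_digits):
        s = xi + yi + c      # one 1-digit adder
        z.append(s % B)      # z_i = (x_i + y_i + c_i) mod B
        c = s // B           # c_{i+1} = floor((x_i + y_i + c_i) / B)
    return z, c


# 758 + 467 + 1 = 1226 in radix 10 (least significant digit first)
print(ripple_add([8, 5, 7], [7, 6, 4], 1, 10))  # ([6, 2, 2], 1)
```

The loop carries `c` from one iteration to the next, which mirrors the serial critical path through the carries.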

Fig. 7.1 n-digit adder

In order to reduce \( T_{carry} \), it is convenient to compute two binary functions p (propagate) and g (generate) of \( x_i \) and \( y_i \):

$$ p\left( {x_{i} ,y_{i} } \right) = 1{\text{ if}}\;x_{i} + y_{i} = B - 1,p\left( {x_{i} ,y_{i} } \right) = 0\;{\text{otherwise}}; $$

$$ g\left( {x_{i} ,y_{i} } \right) = 1{\text{ if}}\;x_{i} + y_{i} \ge B,\;g\left( {x_{i} ,y_{i} } \right) = 0{\text{ if}}\;x_{i} + y_{i} \le B - 2,\;{\text{any}}\;{\text{value}}\;{\text{otherwise}}. $$

So \( c_{i+1} \) can be expressed in the following way:

$$ c_{i\, + \,1} \, = p(x_{i} ,\;y_{i} ) \cdot c_{i} + \overline{{p(x_{i} ,\;y_{i} )}} \cdot g(x_{i} ,\;y_{i} ) $$

The last relation expresses that if \( x_i + y_i = B - 1 \), then \( c_{i+1} \) is equal to \( c_i \); if \( x_i + y_i \ge B \), then \( c_{i+1} = 1 \); if \( x_i + y_i \le B - 2 \), then \( c_{i+1} = 0 \). The corresponding implementation is shown in Fig. 7.2. It is made up of two 2-operand combinational circuits that compute \( p(x_i, y_i) \) and \( g(x_i, y_i) \), and a multiplexer. In an n-digit adder (Fig. 7.1), all functions \( p(x_i, y_i) \) and \( g(x_i, y_i) \) are computed in parallel, so that the value of \( T_{carry} \) is practically equal to the multiplexer delay \( T_{mux} \).
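The propagate/generate formulation can be checked exhaustively with a few lines of Python. This is only a behavioural sketch: the function names are hypothetical, and the "any value" case of g is arbitrarily resolved to 0 here.

```python
def propagate(xi, yi, B):
    # p = 1 exactly when the incoming carry is passed through
    return 1 if xi + yi == B - 1 else 0


def generate(xi, yi, B):
    # g is free when xi + yi == B - 1; this sketch picks 0 in that case
    return 1 if xi + yi >= B else 0


def next_carry(xi, yi, ci, B):
    # Multiplexer controlled by p: select c_i when p = 1, g otherwise
    return ci if propagate(xi, yi, B) else generate(xi, yi, B)


# The multiplexer form agrees with floor((x_i + y_i + c_i) / B) for all inputs
B = 10
assert all(next_carry(x, y, c, B) == (x + y + c) // B
           for x in range(B) for y in range(B) for c in (0, 1))
```

Since `propagate` and `generate` depend only on the operand digits, they can be evaluated for all positions at once, leaving only the multiplexer on the carry path.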

Fig. 7.2 Carry computation

7.2 Binary Adder

If B = 2, then \( p(x_i, y_i) = x_i \) XOR \( y_i \), and \( g(x_i, y_i) \) can be chosen equal to \( x_i \) (or \( y_i \)). A complete n-bit adder is shown in Fig. 7.3. Its computation time is equal to

$$ T_{adder} \left( n \right) = T_{xor} + \left( {n - 1} \right) \cdot T_{mux} + \max \left\{ {T_{mux} ,\;T_{xor} } \right\}, $$
(7.1)

and the delay from the input carry to the output carry is equal to

$$ T_{carry - to - carry} \left( n \right) = n \cdot T_{mux} . $$
(7.2)

Fig. 7.3 n-bit adder
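For B = 2 each cell reduces to an XOR gate that computes the propagate signal, a second XOR gate that computes the sum bit, and the carry multiplexer. The following Python sketch is a bit-level behavioural model of that structure; the name `binary_adder` and the use of Python integers as bit vectors are assumptions of the example, and it is not the VHDL expression referred to in the text.

```python
def binary_adder(x, y, c0, n):
    """n-bit adder built from XOR gates and carry multiplexers.

    Returns (z, c_n) with x + y + c0 = c_n * 2**n + z.
    """
    z, c = 0, c0
    for i in range(n):
        xi, yi = (x >> i) & 1, (y >> i) & 1
        p = xi ^ yi              # propagate: p_i = x_i XOR y_i
        z |= (p ^ c) << i        # sum bit: z_i = p_i XOR c_i
        c = c if p else xi       # multiplexer, with g_i chosen equal to x_i
    return z, c


print(binary_adder(0b1011, 0b0110, 1, 4))  # (2, 1): 11 + 6 + 1 = 18 = 16 + 2
```

Choosing g equal to \( x_i \) is safe because g is only selected when p = 0, that is, when \( x_i = y_i \).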

Comment 7.1

Most FPGA’s include the basic components to implement the structure of Fig. 7.3, and the synthesis tools automatically generate this optimized adder from a simple VHDL expression such as

7.3 Radix-\(2^k\) Adder

If \( B = 2^k \), then \( p(x_i, y_i) = 1 \) if \( x_i + y_i = 2^k - 1 \), that is, if the k least significant bits of \( s_i = x_i + y_i \) are all equal to 1, and \( g(x_i, y_i) = 1 \) if \( x_i + y_i \ge 2^k \), that is, if the most significant bit of \( s_i \) is equal to 1. The iterative cell of a radix-\(2^k\) adder is shown in Fig. 7.4. The critical path of the part of the circuit that computes \( g(x_i, y_i) \) and \( p(x_i, y_i) \) has been shaded. Its computation time is equal to \( T_{adder}(k) + T_{and} \). An m-digit radix-\(2^k\) adder is equivalent to an n-bit adder with n = m·k. The total computation time is

$$ T_{adder} \left( n \right) = T_{adder} \left( k \right) + T_{and} + \left( {m - \, 1} \right) \cdot T_{mux} + T_{half - adder} \left( k \right), $$
(7.3)

and the delay from the input carry to the output carry is equal to

$$ T_{carry - to - carry} \left( n \right) = m \cdot T_{mux} . $$
(7.4)

The following VHDL model describes the basic cell of Fig. 7.4.

An alternative way of computing \( p = s_0 \cdot s_1 \cdot \ldots \cdot s_{k-1} \) relies on a k-bit half adder that computes \( t = (s \bmod 2^k) + 1 \). The most significant bit \( t_k \) of t is equal to 1 if, and only if, all the bits of \( (s \bmod 2^k) \) are equal to 1. As mentioned above (Comment 7.1), most FPGAs include the basic components to implement the structure of Fig. 7.3. In the particular case where x = 0, y = s and \( c_0 = 1 \), the circuit of Fig. 7.5 is obtained. The apparently unnecessary XOR gates are included because there is generally no direct connection between the adder inputs and the multiplexer control inputs. Actually, the XOR gates are LUTs whose outputs are permanently connected to the carry-logic multiplexers.

Fig. 7.4 Radix-\(2^k\) adder

A complete generic model base_2 k_adder.vhd is available at the Authors’ web page and examples of FPGA implementations are given in Sect. 7.9.
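A behavioural Python sketch of the radix-\(2^k\) cell may help to see how p, g and the output digit are obtained from the k-bit sum. It is not the book's VHDL model; the name `radix_2k_adder` and the integer encoding of the operands are assumptions of the example.

```python
def radix_2k_adder(x, y, c0, k, m):
    """m-digit radix-2**k adder (n = m*k bits); returns (z, c_m)."""
    mask = (1 << k) - 1
    z, c = 0, c0
    for i in range(m):
        xi = (x >> (i * k)) & mask
        yi = (y >> (i * k)) & mask
        s = xi + yi                         # k-bit adder, (k+1)-bit result
        p = 1 if (s & mask) == mask else 0  # k low bits all equal to 1
        g = s >> k                          # most significant bit of s
        z |= ((s + c) & mask) << (i * k)    # k-bit half adder adds the carry
        c = c if p else g                   # carry multiplexer
    return z, c


print(radix_2k_adder(0xABCD, 0x5432, 1, 4, 4))  # (0, 1): sum is exactly 0x10000
```

The digit sums `s` do not depend on the carries, so only the final multiplexer line lies on the carry path from one digit to the next.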

According to (7.3), the non-constant terms of \( T_{adder}(n) \) are:

  • \( m \cdot T_{mux} \),

  • \( k \cdot T_{mux} \), included in \( T_{adder}(k) \) according to (7.1),

  • \( k \cdot T_{mux} \), included in \( T_{half-adder}(k) \) according to (7.1).

Thus, the sum of the non-constant terms of \( T_{adder}(n) \) is equal to \( (2k + m) \cdot T_{mux} \). The value of 2k + m, with m·k = n, is minimum when 2k ≅ m, that is, when \( k \cong (n/2)^{1/2} \). With this value of k, the sum of the non-constant terms of \( T_{adder}(n) \) is equal to \( (8n)^{1/2} \cdot T_{mux} \). Thus, the computation time is \( O(n^{1/2}) \) instead of O(n).
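The choice of k can be checked numerically. The sketch below, a hypothetical helper that is not part of the book's material, enumerates the divisors k of n and keeps the one that minimizes 2k + m, the number of multiplexer delays in the non-constant part of (7.3).

```python
def best_group_size(n):
    """Return (k, m, 2k + m) minimizing 2k + m over all k with k * m = n."""
    best = None
    for k in range(1, n + 1):
        if n % k == 0:
            m = n // k
            cost = 2 * k + m          # multiplexer delays, non-constant part
            if best is None or cost < best[2]:
                best = (k, m, cost)
    return best


print(best_group_size(128))  # (8, 16, 32)
```

For n = 128 the minimum is reached at k = 8 = (128/2)^{1/2}, and the cost 32 equals (8·128)^{1/2}, in agreement with the analysis above.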

Comments 7.2

  1. The circuit of Fig. 7.4 is an example of carry-skip adder. For every group of k bits, both the carry-propagate and carry-generate functions are computed. If the carry-propagate function is equal to 1, the input carry is directly propagated to the carry output of the k-bit group, thus skipping k bits.

    Fig. 7.5 FPGA implementation of a k-input AND

  2. A mixed-radix numeration system could be used. Assume that \( n = k_1 + k_2 + \cdots + k_m \); then a radix

    $$ (2^{{k}_{1}} ,\;2^{{k}_{2}} , \cdots ,2^{{k}_{m}} ) $$

    representation can be considered. The corresponding adder consists of m blocks, similar to that of Fig. 7.3, whose sizes are \( k_1, k_2, \ldots, k_m \), respectively. Nevertheless, within an FPGA it is generally better to use adders that fit within a single column. Assuming that the chosen device has r carry-logic cells per column, a good option could be a fixed-radix adder with k ≤ r. In order to minimize the computation time, k must be approximately equal to \( (n/2)^{1/2} \), so that n must be smaller than \( 2r^2 \), which is a very large number.

7.4 Carry Select Adders

Another way of reducing the computation time of a radix-\(2^k\) adder consists in computing, at each step, the next carry and the output digit for both values of the input carry. The corresponding circuit is shown in Fig. 7.6.

Fig. 7.6 Carry select adder

The critical path of the part of the circuit that computes the two possible values of the next carry and of the output digit has been shaded. Its computation time is equal to \( T_{adder}(k) + T_{adder}(2) \). The total computation time is (n = m·k)

$$ T_{adder} \left( n \right) = T_{adder} \left( k \right) + T_{half\_adder} \left( 2 \right) + \left( {m - \, 1} \right) \cdot T_{mux} + T_{mux} , $$
(7.5)

and the delay from the input carry to the output carry is equal to

$$ T_{carry - to - carry} \left( {m \cdot k} \right) = m \cdot T_{mux} . $$
(7.6)

The following VHDL model describes the basic cell of Fig. 7.6. A complete generic model carry_select_adder.vhd is available at the Authors’ web page and examples of FPGA implementations are given in Sect. 7.9.

The non-constant term of \( T_{adder}(n) \) is equal to \( (k + m) \cdot T_{mux} \). The minimum value is obtained when k ≅ m, that is, \( k \cong n^{1/2} \). With this value of k, the non-constant term of \( T_{adder}(n) \) is equal to \( (4n)^{1/2} \cdot T_{mux} \). Thus, the computation time is \( O(n^{1/2}) \) instead of O(n).
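The carry-select principle can be sketched behaviourally in Python: both candidate results of each k-bit group are computed without waiting for the carry, and the incoming carry only drives the selection. The function name and the integer encoding are assumptions of this example, not the book's VHDL.

```python
def carry_select_adder(x, y, c0, k, m):
    """m groups of k bits; returns (z, c_m) with x + y + c0 = c_m * 2**(m*k) + z."""
    mask = (1 << k) - 1
    z, c = 0, c0
    for i in range(m):
        xi = (x >> (i * k)) & mask
        yi = (y >> (i * k)) & mask
        s0 = xi + yi          # group result assuming an input carry of 0
        s1 = s0 + 1           # group result assuming an input carry of 1
        s = s1 if c else s0   # multiplexers controlled by the actual carry
        z |= (s & mask) << (i * k)
        c = s >> k            # selected output carry of the group
    return z, c


print(carry_select_adder(200, 100, 0, 4, 2))  # (44, 1): 300 = 256 + 44
```

In hardware all pairs `(s0, s1)` are available after one k-bit addition, so the carry only has to traverse m multiplexers.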

Comment 7.3

As before (Comments 7.2) a mixed-radix numeration system could be considered.

As a matter of fact, the FPGA implementation of a half-adder is generally not more cost-effective than the implementation of a full adder. So, the circuit of Fig. 7.6 could be slightly modified: instead of computing \( c_{i0} \) and \( c_{i1} \) with a full adder and a half adder, two independent full adders of any type can be used (Fig. 7.7).

The following VHDL model describes the modified cell:

The computation time of the modified circuit is

$$ T_{adder} \left( n \right) = T_{adder} \left( k \right) + \left( {m - \, 1} \right) \cdot T_{mux} + T_{mux} = T_{adder} \left( k \right) + m \cdot T_{mux} . $$
(7.7)

A complete generic model carry_select_adder2.vhd is available at the Authors’ web page and examples of FPGA implementations are given in Sect. 7.9.

Fig. 7.7 Carry-select adder (second version)

7.5 Logarithmic Adders

Several types of adders whose computation times are proportional to the logarithm of n have been proposed, for example carry-lookahead adders ([1], Chap. 6), Ling adders [2], Brent-Kung prefix adders [3] and Ladner-Fischer prefix adders [4]. Nevertheless, their FPGA implementations are generally not as fast as could be theoretically expected. There are two reasons for that. On the one hand, the special-purpose carry logic included in most FPGAs is very fast, so that ripple-carry adders are fast. Their computation time is approximately equal to a + b·n, where a and b are very small constants: a is the delay of a LUT and b is the delay of a multiplexer belonging to the carry logic. On the other hand, the structure of most logarithmic adders is not as regular as the structure of ripple-carry adders, so that they include long connections, which in turn introduce long additional delays. The practical result is that, except for very large values of n, the adders described in Sects. 7.2–7.4 are faster and more cost-effective.

Obviously, any optimization method that allows the dividing up of an n-bit adder into smaller k-bit and m-bit adders, with k·m = n, in such a way that

$$ T_{adder} \left( n \right) \cong T_{adder} \left( k \right) + T_{adder} \left( m \right), $$

can be recursively used in order to generate a logarithmic adder. As an example, consider again a carry-select adder. According to (7.7)

$$ T_{adder} \left( n \right) = T_{adder} \left( k \right) + m \cdot T_{mux} . $$

Assume that k = r·s. Then each k-bit adder (Fig. 7.7) can in turn be decomposed in such a way that

$$ T_{adder} \left( k \right) = T_{adder} \left( r \right) + s \cdot T_{mux} , $$

so that the computation time of the corresponding 2-level carry-select adder is

$$ T_{adder} \left( n \right) = T_{adder} \left( r \right) + (s + m) \cdot T_{mux} , $$

where n = r·s·m. Finally, if \( n = n_1 \cdot n_2 \cdot \ldots \cdot n_t \), then a (t−1)-level carry-select adder, whose computation time is equal to

$$ T_{adder} \left( {n_{1} \cdot n_{2} \cdot \ldots \cdot n_{t} } \right) = T_{adder} \left( {n_{1} } \right) + \left( {n_{2} + \ldots + n_{t} } \right) \cdot T_{mux} = O\left( {n_{1} + n_{2} + \ldots + n_{t} } \right), $$

can be generated.

Example 7.1

The following VHDL model describes an n-bit 2-level carry-select adder with \( n = n_1 \cdot n_2 \cdot n_3 \). First, define the basic cell carry_select_step3, in which two 1-level carry-select adders, with \( k = n_1 \) and \( m = n_2 \), are used:

The complete circuit is made up of \( n_3 \) basic cells:

A complete generic model carry_select_adder3.vhd is available at the Authors’ web page and examples of FPGA implementations are given in Sect. 7.9.
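The recursive decomposition can be mimicked with a short behavioural Python model. This is a sketch under stated assumptions, not a transcription of carry_select_adder3.vhd: the function `carry_select` and its `factors` argument (the list \( n_1, n_2, \ldots, n_t \)) are hypothetical names, and each cell uses two independent adders as in Fig. 7.7.

```python
import math


def carry_select(x, y, c_in, factors):
    """Multi-level carry-select adder; operand width is the product of factors."""
    if len(factors) == 1:
        n1 = factors[0]                   # innermost level: plain n1-bit adder
        s = x + y + c_in
        return s & ((1 << n1) - 1), s >> n1
    *inner, m = factors
    k = math.prod(inner)                  # width of each of the m groups
    mask = (1 << k) - 1
    z, c = 0, c_in
    for i in range(m):
        xi = (x >> (i * k)) & mask
        yi = (y >> (i * k)) & mask
        z0, c0 = carry_select(xi, yi, 0, inner)   # adder assuming carry 0
        z1, c1 = carry_select(xi, yi, 1, inner)   # adder assuming carry 1
        zi, c = (z1, c1) if c else (z0, c0)       # selection multiplexers
        z |= zi << (i * k)
    return z, c


print(carry_select(200, 100, 1, [2, 2, 2]))  # (45, 1): 301 = 256 + 45
```

With `factors = [n1, n2, n3]` the model corresponds to the 2-level structure of Example 7.1; longer lists add further levels.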

7.6 Long-Operand Adder

In the case of long-operand additions, the n-digit operands can be broken down into s-digit groups and the addition computed according to the following algorithm in which natural_addition is a procedure that computes

$$ z_{i} = \left( {x_{i} + y_{i} + c_{i} } \right)\bmod B^{s} \;{\text{and}}\;c_{i + 1} = \left\lfloor {\left( {x_{i} + y_{i} + c_{i} } \right) \, /B^{s} } \right\rfloor , $$

where \( x_i \), \( y_i \) and \( z_i \) are s-digit numbers, and \( c_i \) and \( c_{i+1} \) are bits.

Algorithm 7.1: Long-operand addition

The complete circuit (Fig. 7.8, with k = n/s) is made up of an s-digit adder, connection resources (k-to-1 multiplexers) giving access to the s-digit groups, a D flip-flop which stores the carries (\( c_i \) in Algorithm 7.1), an output register storing z, and a control unit whose main component is a k-state counter.
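The iteration over s-digit groups can be sketched in Python as follows. This is a behavioural reading of the description above, with one loop iteration per clock cycle; the function name and argument order are assumptions, and the book's own algorithm listing may differ in detail.

```python
def long_operand_add(x, y, c0, s, k, B=2):
    """Add two (k*s)-digit radix-B numbers, one s-digit group per step.

    Returns (z, c_k) with x + y + c0 = c_k * B**(k*s) + z.
    """
    group = B ** s
    z, c = 0, c0
    for i in range(k):
        xi = (x // group ** i) % group      # k-to-1 multiplexer selects x_i
        yi = (y // group ** i) % group      # k-to-1 multiplexer selects y_i
        total = xi + yi + c                 # s-digit natural addition
        z += (total % group) * group ** i   # z_i written to the output register
        c = total // group                  # carry kept in the flip-flop
    return z, c


print(long_operand_add(0xFFFFFFFF, 1, 0, 8, 4))  # (0, 1)
```

Only one s-digit adder is needed, at the price of k steps per addition.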

Fig. 7.8 Long-operand adder

The following VHDL model describes the circuit of Fig. 7.8 (B = 2).

A complete generic model long_operand_adder.vhd is available at the Authors’ web page.

7.7 Multioperand Adders

Consider m natural numbers \( x_0, x_1, \ldots, x_{m-1} \). A multioperand adder computes

$$ z = x_{0} + x_{1} + \cdots + x_{m - 1} . $$
(7.8)

7.7.1 Sequential Multioperand Adders

In order to compute (7.8), the following (obvious) algorithm can be used.

Algorithm 7.2: Basic multioperand addition

The corresponding sequential circuit (Fig. 7.9) is made up of an n-digit adder, an n-digit register, an m-to-1 n-digit multiplexer, and a control unit whose main component is an m-state counter.

Fig. 7.9 Multioperand addition

The following VHDL model describes the circuit of Fig. 7.9 (B = 2). The n·m-bit vector x is the concatenation of \( x_0, x_1, \ldots, x_{m-1} \).

A complete generic model multioperand_adder.vhd is available at the Authors’ web page.

The computation time of the preceding m-operand n-digit sequential adder is approximately equal to

$$ T_{sequential} \left( {m,n} \right) \cong m \cdot T_{adder} \left( n \right). $$
(7.9)

In order to reduce the computation time, a carry-save adder can be used. The basic component is shown in Fig. 7.10: it consists of n 1-digit adders working in parallel. Given two n-digit numbers x and y, and an n-bit number c, it expresses the sum \( (x + y + c) \bmod B^n \) under the form z + d, where z is an n-digit number and d an n-bit number. In other words, the carries are stored within the output binary vector d instead of being propagated (stored-carry encoding). As all cells work in parallel, the computation time is independent of n.

Let CSA be the function implemented by the circuit of Fig. 7.10, that is

$$ CSA\left( {x,\;y,\;c} \right) = \left( {z,\;d} \right), $$

where

$$ z_{i} = \left( {x_{i} + y_{i} + c_{i} } \right)\bmod B,d_{i} = \left\lfloor {\left( {x_{i} + y_{i} + c_{i} } \right) \, /B} \right\rfloor ,\forall i \in \left\{ {0, \, 1, \ldots ,n - 1} \right\}. $$

Assume that at every step of Algorithm 7.2 the value of accumulator is represented under the form u + v, where u is an n-digit number and v an n-bit number. Then, at step j, the following operation must be executed:

$$ \left( {u,v} \right):\; = CSA\left( {u,x_{j} ,v} \right). $$

Fig. 7.10 Carry-save adder

The following formal algorithm computes z.

Algorithm 7.3: Multioperand addition with stored-carry encoding

The sequential circuit corresponding to Algorithm 7.3 (Fig. 7.11) is made up of an n-digit carry-save adder (Fig. 7.10), an n-digit register, an n-bit register, an m-to-1 n-digit multiplexer, a conventional n-digit adder implementing the last step of Algorithm 7.3, and a control unit whose main component is an m-state counter.

The following VHDL model describes the circuit of Fig. 7.11 (B = 2). As before, x is the concatenation of \( x_0, x_1, \ldots, x_{m-1} \).

A complete generic model CSA_multioperand_adder.vhd is available at the Authors’ web page.
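For B = 2 the stored-carry accumulation can be sketched in a few lines of Python. The sketch is behavioural and makes one modelling assumption explicit: the carry bit produced at position i has weight \( 2^{i+1} \), so the carry vector is shifted one position to the left before being fed back. The function names are hypothetical.

```python
def csa(x, y, c, n):
    """Binary carry-save adder: (x + y + c) mod 2**n == (z + d) mod 2**n."""
    mask = (1 << n) - 1
    z = (x ^ y ^ c) & mask                            # bitwise sum, no propagation
    d = (((x & y) | (x & c) | (y & c)) << 1) & mask   # carries, moved one place left (assumption)
    return z, d


def csa_multioperand_add(operands, n):
    """Accumulate in stored-carry form, then resolve with one ordinary adder."""
    u, v = 0, 0
    for xj in operands:
        u, v = csa(u, xj, v, n)     # (u, v) := CSA(u, x_j, v)
    return (u + v) & ((1 << n) - 1) # final conventional n-bit addition


print(csa_multioperand_add([13, 200, 77, 5], 8))  # 39, i.e. 295 mod 256
```

Each call to `csa` costs one full-adder delay regardless of n; carry propagation occurs only once, in the final addition.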

Fig. 7.11 Carry save adder

Taking into account that the computation time of the carry-save adder of Fig. 7.10 is independent of the number n of digits, the computation time of the circuit of Fig. 7.11 is approximately equal to

$$ T_{sequential\_csa} \left( {m,n} \right) \cong m \cdot T_{adder} \left( 1 \right) + T_{adder} \left( n \right). $$
(7.10)

7.7.2 Combinational Multioperand Adders

The combinational circuit that corresponds to Algorithm 7.2 is an iterative circuit made up of m-1 2-operand n-digit adders. If every adder is a simple ripple-carry adder, then the complete circuit is a 2-dimensional array made up of (m-1)·n one-digit adders, as shown in Fig. 7.12 in which one of the critical paths has been shaded. The corresponding computation time is equal to

$$ T_{combinational} \left( {m,\;n} \right) = \left( {m + n - 2} \right) \cdot T_{adder} \left( 1 \right). $$
(7.11)

The following VHDL model describes the circuit of Fig. 7.12 (B = 2). As before, x is the concatenation of \( x_0, x_1, \ldots, x_{m-1} \).

A complete generic model comb_multioperand_adder.vhd is available at the Authors’ web page.

Fig. 7.12 Multioperand addition array

A more time-effective solution is a binary tree of 2-operand n-digit adders instead of an iterative circuit. An example, with n = 3 and m = 8, is shown in Fig. 7.13:

$$ x_{0} = x_{0,2} x_{0,1} x_{0,0} ,x_{1} = x_{1,2} x_{1,1} x_{1,0} , \ldots ,x_{7} = x_{7,2} x_{7,1} x_{7,0} . $$

The depth of the tree is equal to \( \lceil \log_2 m \rceil \) and its computation time (one of the critical paths has been shaded) is approximately equal to

$$ T_{adder - tree} \left( {m,n} \right) \cong \left( {n \, + \log_{2} m \, - 1} \right) \cdot T_{adder} \left( 1 \right). $$
(7.12)
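The tree organisation can be illustrated with a small Python sketch that reduces the operand list pairwise and reports the number of adder levels. It is a behavioural illustration for B = 2 with hypothetical names, not the eight_operand_adder.vhd model.

```python
def adder_tree(operands, n):
    """Sum the operands modulo 2**n with a binary tree of 2-operand adders.

    Returns (z, depth), where depth is the number of adder levels.
    """
    mask = (1 << n) - 1
    level = [x & mask for x in operands]
    depth = 0
    while len(level) > 1:
        nxt = [(level[i] + level[i + 1]) & mask
               for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:            # an odd operand passes to the next level
            nxt.append(level[-1])
        level = nxt
        depth += 1
    return level[0], depth


print(adder_tree(range(8), 8))  # (28, 3): eight operands need 3 levels
```

The reported depth equals \( \lceil \log_2 m \rceil \), for instance 3 levels for both m = 8 and m = 5.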

The following VHDL model describes the circuit of Fig. 7.13 (B = 2).

A complete generic model eight_operand_adder.vhd is available at the Authors’ web page.

Fig. 7.13 Multioperand addition tree

Another way to reduce the computation time, with an iterative architecture similar to that of Fig. 7.12, is to use the carry-save principle. An m-operand carry-save array (Algorithm 7.3) is shown in Fig. 7.14 (if B > 2, \( x_2 \) must be an n-bit number or an initial row that computes \( x_0 + x_1 + 0 \) must be added). The result is the sum of two n-digit numbers u and v. In order to get the actual result, an additional 2-operand n-digit adder is necessary for computing u + v (last instruction of Algorithm 7.3). The corresponding computation time is equal to

$$ T_{combinational\_csa} \left( {m,n} \right) = \left( {m - 2} \right) \cdot T_{adder} \left( 1 \right) + T_{adder} \left( n \right). $$
(7.13)

The following VHDL model describes a 2-operand carry-save adder, also called 3-to-2 counter (Sect. 7.7.3). It corresponds to a row of the circuit of Fig. 7.14.

The complete circuit is made up of m−2 3-to-2 counters:

A complete generic model comb_CSA_mutioperand_adder.vhd is available at the Authors’ web page and examples of FPGA implementations are given in Sect. 7.9.

Fig. 7.14 Combinational carry-save adder

Comment 7.4

In all of the previously described multioperand adders, the operands, as well as the result, were assumed to be n-digit numbers. If all of the operands belong to the same range, and the result is known to be an n-digit number, whatever the value of the operands, then the operands can be represented with (n − k) digits, where \( k \cong \log_B m \), and the previously described circuits can be pruned.

7.7.3 Parallel Counters

Given two n-digit numbers x and y, and an n-bit number c, the carry-save adder of Fig. 7.10 allows the expression of the sum \( (x + y + c) \bmod B^n \) under the form z + d, where z is an n-digit number and d an n-bit number. In other words, it reduces the sum of three digits x, y and c to the sum of two digits z and d. For that reason, it is also called a 3-to-2 counter.

This 3-to-2 counter can be used as a computation resource for reducing the sum of m digits \( x_0, x_1, \ldots, x_{m-1} \) to the sum of two digits u and v as shown in Fig. 7.14. Thus, the circuit of Fig. 7.14 could be considered as an m-to-2 counter.

This type of construction can be generalized. As an example, consider an adder that computes the sum of 6 bits \( x_0, x_1, \ldots, x_5 \). The result, smaller than or equal to 6, is a 3-bit number. Thus, this 6-operand 1-bit adder computes

$$ x_{0} + x_{1} + \cdots + x_{5} = 4z_{2} + 2z_{1} + z_{0} $$
(7.14)

and can be implemented by three 6-input Look Up Tables (LUT6) working in parallel:

Then, by connecting in parallel n circuits of this type, a binary 6-to-3 counter is obtained (Fig. 7.15):

The counter of Fig. 7.15 can in turn be used as a building block for generating more complex counters. As an example, the circuit of Fig. 7.16 is a 24-to-3 counter.
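The three look-up tables of (7.14) can be made concrete with a Python sketch in which each LUT6 is a 64-entry truth table addressed by the six input bits. The table contents follow directly from (7.14); the names `LUT6`, `six_to_three_bit` and `six_to_three` are assumptions of this example and do not reproduce six_to_three_counter.vhd.

```python
# One 64-entry truth table per output bit: LUT6[j][addr] is bit j of the
# number of ones in the 6-bit address (j = 0 for z0, 1 for z1, 2 for z2).
LUT6 = [[(bin(addr).count("1") >> j) & 1 for addr in range(64)]
        for j in range(3)]


def six_to_three_bit(bits):
    """6-operand 1-bit adder: returns (z2, z1, z0) for six input bits."""
    addr = sum(b << i for i, b in enumerate(bits))
    return LUT6[2][addr], LUT6[1][addr], LUT6[0][addr]


def six_to_three(xs, n):
    """n one-bit slices in parallel: six n-bit operands -> vectors (z2, z1, z0)."""
    z2 = z1 = z0 = 0
    for i in range(n):
        b2, b1, b0 = six_to_three_bit([(x >> i) & 1 for x in xs])
        z2 |= b2 << i
        z1 |= b1 << i
        z0 |= b0 << i
    return z2, z1, z0


z2, z1, z0 = six_to_three([5, 3, 7, 1, 6, 2], 3)
print(4 * z2 + 2 * z1 + z0)  # 24, the sum of the six operands
```

The three table look-ups of a slice are independent, which is why the delay of the whole counter is a single LUT6 delay.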

Fig. 7.15 6-to-3 counter

Fig. 7.16 24-to-3 counter

The computation time of the circuit of Fig. 7.16 is equal to \( 3 \cdot T_{LUT6} \). More generally, a tree made up of \( 2^k - 1 \) 6-to-3 counters generates a \( 6 \cdot 2^{k-1} \)-to-3 counter, with a computation time equal to \( k \cdot T_{LUT6} \). In the case of Fig. 7.16, k = 3 and \( 6 \cdot 2^{k-1} = 24 \).

Finally, with an additional 3-to-2 counter and an n-bit adder a 24-operand adder is obtained (Fig. 7.17). Complete VHDL models six_to_three_counter.vhd and twenty_four_operand_adder.vhd are available at the Authors’ web page and examples of FPGA implementations are given in Sect. 7.9.

Fig. 7.17 24-operand adder

To summarize, an m-operand adder, with \( m = 6 \cdot 2^{k-1} \), can be synthesized with \( 2^k - 1 \) 6-to-3 counters plus a 3-to-2 counter and an n-bit adder. Its computation time is

$$ T\left( {m,n} \right) \cong k \cdot T_{LUT6} + T_{FA} + T_{adder} \left( n \right), $$

where \( k = \log_2 m + 1 - \log_2 6 < \log_2 m \).
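The complete reduction (three levels of 6-to-3 counters, one 3-to-2 counter, one n-bit adder) can be sketched behaviourally in Python. The sketch assumes results are taken modulo \( 2^n \) and that the outputs \( z_1 \) and \( z_2 \) of each counter are shifted by one and two positions before entering the next level, to account for their weights 2 and 4; all names are hypothetical.

```python
def six_to_three(xs, n):
    """Column-wise 6-to-3 counter; outputs are aligned to weights 1, 2 and 4."""
    mask = (1 << n) - 1
    z0 = z1 = z2 = 0
    for i in range(n):
        ones = sum((x >> i) & 1 for x in xs)   # number of ones in column i
        z0 |= (ones & 1) << i
        z1 |= ((ones >> 1) & 1) << i
        z2 |= ((ones >> 2) & 1) << i
    return [z0, (z1 << 1) & mask, (z2 << 2) & mask]


def add_24_operands(operands, n):
    assert len(operands) == 24
    mask = (1 << n) - 1
    level = [x & mask for x in operands]
    while len(level) > 3:                      # 24 -> 12 -> 6 -> 3 operands
        level = [v for i in range(0, len(level), 6)
                 for v in six_to_three(level[i:i + 6], n)]
    a, b, c = level
    u = a ^ b ^ c                              # 3-to-2 counter: sum vector
    v = (((a & b) | (a & c) | (b & c)) << 1) & mask   # and carry vector
    return (u + v) & mask                      # final n-bit adder


print(add_24_operands(list(range(1, 25)), 16))  # 300
```

The `while` loop runs three times and uses 4 + 2 + 1 = 7 counters, in line with \( 2^k - 1 \) for k = 3.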

Comment 7.5

More complex types of counters have been proposed (see, for example, Chap. 8 of [1], Chap. 3 of [5], Chap. 11 of [6]). Nevertheless, they do not necessarily give high performance FPGA implementations. As a matter of fact, in many cases the best FPGA implementations are based on relatively simple algorithms, to which correspond regular circuits that allow taking advantage of the special purpose carry logic circuitry, and permit the use of efficient design techniques such as pipelining and digit-serial processing.

7.8 Subtractors and Adder–Subtractors

Given two radix-B naturals x and y, the difference z = x − y could be negative. So, the subtraction operation must be considered over the set of integers. A convenient way to represent integers is B’s complement: the vector

$$ x_{n} x_{n - 1} x_{n - 2} \ldots x_{1} x_{0} ,{\text{ with}}\;x_{n} \in \left\{ {0, \, 1} \right\}{\text{ and}}\;x_{i} \in \left\{ {0, \, 1, \ldots ,B - 1} \right\}\forall i < n, $$

represents

$$ x = - x_{n} \cdot B^{n} + x_{n - 1} \cdot B^{n - 1} + x_{n - 2} \cdot B^{n - 2} + \cdots + x_{1} \cdot B + x_{0} . $$

Thus, \( x_n \) is a sign bit: if \( x_n = 0 \), x is a non-negative integer (a natural), and if \( x_n = 1 \), x is a negative integer. The range of represented integers is

$$ - B^{n} \le x < B^{n} . $$

Let \( x_n x_{n-1} x_{n-2} \ldots x_1 x_0 \) and \( y_n y_{n-1} y_{n-2} \ldots y_1 y_0 \) be the B’s complement representations of x and y. If the sum \( z = x + y + c_{in} \), where \( c_{in} \) is an initial carry, belongs to the interval \( -B^n \le z < B^n \), then z is represented by the vector \( z_n z_{n-1} z_{n-2} \ldots z_1 z_0 \) generated by the mixed-radix adder of Fig. 7.18 (all digits are radix-B digits except the most significant ones, \( x_n \), \( y_n \) and \( z_n \), which are binary).

Fig. 7.18 Radix-B B’s complement adder

If the difference z = x − y belongs to the interval \( -B^n \le z < B^n \), then z is represented by the vector \( z_n z_{n-1} z_{n-2} \ldots z_1 z_0 \), generated by the circuit of Fig. 7.19 in which \( y_i' \) is the (B−1)’s complement of \( y_i \), ∀i < n.

Fig. 7.19 Radix-B B’s complement subtractor

The sum z = x + y or the difference z = x − y could lie outside the interval \( -B^n \le z < B^n \) (an overflow situation). In order to avoid overflows, both x and y should be represented with an additional digit. In the case of B’s complement representations, digit extension is performed as follows:

$$ x_{n} x_{n - 1} x_{n - 2} \ldots x_{1} x_{0} \to x_{n} w_{n} x_{n - 1} x_{n - 2} \ldots x_{1} x_{0} ,{\text{ with}}\;w_{n} = x_{n} \cdot \left( {B - 1} \right). $$

For example, if B = 10 and x = 249, then x is represented by

$$ 0249, \, 00249, \, 000249,{\text{ etc}}. $$

If B = 10 and x = −249, then x is represented by

$$ 1751, \, 19751, \, 199751,{\text{ etc}}. $$
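The decimal examples above can be reproduced with a short Python sketch of the B’s complement encoding. The helper `b_complement` is a hypothetical name introduced for illustration; it returns the sign bit followed by the n radix-B digits.

```python
def b_complement(x, n, B=10):
    """Digits x_n x_{n-1} ... x_0 of x in B's complement, -B**n <= x < B**n."""
    assert -B ** n <= x < B ** n
    sign = 1 if x < 0 else 0
    rest = x + sign * B ** n          # value carried by the n radix-B digits
    digits = [(rest // B ** i) % B for i in reversed(range(n))]
    return [sign] + digits


print(b_complement(249, 3))    # [0, 2, 4, 9]
print(b_complement(-249, 3))   # [1, 7, 5, 1]
print(b_complement(-249, 4))   # [1, 9, 7, 5, 1]
```

Going from n = 3 to n = 4 inserts the digit \( w_n = x_n \cdot (B - 1) \) right after the sign bit: 0 for the positive number, 9 for the negative one.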

Observe that if B = 2, then the bit extension operation amounts to repeating the most significant bit. In Fig. 7.20 a 2’s complement adder and a 2’s complement subtractor are shown. In both cases the comparison of bits \( z_{n+1} \) and \( z_n \) allows the detection of overflows: if \( z_{n+1} \ne z_n \), then the result does not belong to the interval \( -B^n \le z < B^n \).
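The overflow test can be tried out with a behavioural Python sketch of a 2’s complement adder and subtractor. It assumes (n+1)-bit operands extended by one sign bit, with subtraction performed by complementing the extended y and setting the input carry to 1; the function name and the `subtract` flag are assumptions of this example, not the interface of the book's VHDL models.

```python
def twos_comp_add_sub(x, y, n, subtract=False):
    """x, y in [-2**n, 2**n). Returns (z, overflow), z being the (n+2)-bit result vector."""
    width = n + 2
    mask = (1 << width) - 1
    xe = x & mask                    # sign extension to n+2 bits
    ye = y & mask
    c_in = 0
    if subtract:
        ye = ~ye & mask              # bitwise complement of the extended y
        c_in = 1                     # plus an input carry equal to 1
    z = (xe + ye + c_in) & mask
    overflow = ((z >> (n + 1)) & 1) != ((z >> n) & 1)   # z_{n+1} != z_n
    return z, overflow


print(twos_comp_add_sub(7, 7, 3))                   # (14, True): 14 is outside [-8, 8)
print(twos_comp_add_sub(-8, 1, 3, subtract=True))   # (23, True): 10111 encodes -9
print(twos_comp_add_sub(3, 4, 3))                   # (7, False)
```

Thanks to the extra bit the (n+2)-bit vector always holds the exact result, so the overflow flag only says whether it also fits in n+1 bits.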

Fig. 7.20 2’s complement adder and subtractor

The following VHDL models describe the circuits of Fig. 7.20.

Generic models two_s_comp_adder.vhd and two_s_comp_subtractor.vhd are available at the Authors’ web page.

7.9 FPGA Implementations

Several adders have been implemented within a Virtex 5-2 device. Throughout this section, the times are expressed in ns and the costs in numbers of Look Up Tables (LUTs) and flip-flops (FFs). All VHDL models as well as several test benches are available at the Authors’ web page.

7.9.1 Binary Adder

The circuit is shown in Fig. 7.3. The synthesis results for several numbers n of bits are given in Table 7.1 .

Table 7.1 Binary adders

7.9.2 Radix-\(2^k\) Adders

The circuit is shown in Fig. 7.4. The synthesis results for several numbers n = 2k of bits are given in Table 7.2. In these implementations, the carry propagation multiplexer muxcy has been explicitly instantiated within the VHDL description.

Table 7.2 Radix-\(2^k\) n-bit adders

7.9.3 Carry Select Adder

The circuit is shown in Fig. 7.6. The synthesis results for several numbers n = m·k of bits are given in Table 7.3.

Table 7.3 n-bit carry-select adders

The alternative circuit of Fig. 7.7 has also been implemented for several values of n. The results are given in Table 7.4.

Table 7.4 n-bit carry-select adders (version 2)

7.9.4 Logarithmic Adders

The synthesis results for several numbers n = n 1·n 2·n 3 of bits are given in Table 7.5.

Table 7.5 n-bit adders with n = n1·n2·n3

7.9.5 Long Operand Adder

The circuit is shown in Fig. 7.8. The synthesis results for several numbers n = k·s of bits are given in Table 7.6. Both the clock period T clk and the total delay (k·T clk ) are given.

Table 7.6 Long-operand adders

7.9.6 Sequential Multioperand Adders

The circuit is shown in Fig. 7.9. The synthesis results for several numbers n of bits and m of operands are given in Table 7.7. Both the clock period T clk and the total delay (m·T clk ) are given.

Table 7.7 Sequential multioperand adders

The carry-save adder of Fig. 7.10 has also been implemented. The results are given in Table 7.8.

Table 7.8 Sequential carry-save adders

7.9.7 Combinational Multioperand Adders

The circuit of Fig. 7.12 has been implemented. The synthesis results for several numbers n of bits and m of operands are given in Table 7.9.

Table 7.9 Multioperand addition array

The carry-save adder of Fig. 7.14 has also been implemented. The results are given in Table 7.10.

Table 7.10 Combinational carry-save adder

As an example of multioperand addition trees (Fig. 7.13), several 8-bit adders have been implemented, with the results given in Table 7.11.

Table 7.11 8-operand addition trees

Examples of implementation results for 24-operand adders based on 6-to-3 counters (Fig. 7.17) are given in Table 7.12.

Table 7.12 24-operand adders based on 6-to-3 counters

7.9.8 Comparison

A comparison between four types of 2-operand adders, namely binary (normal), radix-\(2^k\), carry-select and logarithmic adders, has been carried out: Fig. 7.21 gives the corresponding adder delays (ns) as a function of the number n of bits.

Fig. 7.21 Delay as a function of the number of bits for several 2-operand adders

7.10 Exercises

  1. Generate a generic model of a 2’s complement adder–subtractor with overflow detection.

  2. An integer x can be represented under the form \( (-1)^s \cdot m \) where s is the sign of x and m its magnitude (absolute value). Design an n-bit sign-magnitude adder–subtractor.

  3. Design several n-bit counters, for example

    • 7-to-3,

    • 31-to-3,

    • 5-to-2,

    • 26-to-2.

  4. Design a self-timed 64-bit adder with end of computation detection (done signal).

  5. Generate several generic models of an incrementer/decrementer, that is, a circuit that computes x ± 1 mod m under the control of an up/down binary signal.