

# A Low-Area, Low-Power, Wide Tuning Range Digitally Controlled Oscillator for Power Management Systems in 28 nm CMOS Technology

M. Mestice<sup>(⊠)</sup>, G. Biondi, G. Ciarpi, D. Rossi, and S. Saponara

Department of Information Engineering, University of Pisa, Pisa, Italy marco.mestice@phd.unipi.it

**Abstract.** Nowadays, in the world of high-performance computing, saving energy when great computing power is not needed is a must-have feature. This usually involves the implementation of Power Management Systems (PMS) to apply power saving policies such as frequency scaling. In particular, for this feature, the actuators of PMS are usually implemented with Phase- or Frequency-Locked Loops, which should occupy a small area and exhibit a low power consumption. Additionally, they should be able to generate a wide range of frequencies in the order of a few GHz with a fine granularity of a few hundred MHz. Since the core of such loops is a tunable oscillator, in this work we present a pseudo-differential Ring Digitally Controlled Oscillator (DCO) implemented with a standard 28 nm CMOS technology to be used in PMS. The proposed DCO features a well-balanced behavior between the noise performance and a wide tuning range, a low area, and a low power consumption.

# 1 Introduction

A requirement in today's world of consumer electronics is to generate high-performance high-frequency clock signals. Besides the actual clock generation, many modern applications need the clock to be tunable over a wide range of frequencies. Frequency hopping in wireless communication and power management policies in processors and digital systems are excellent examples of these kinds of applications [1, 2]. The tunability and controllability of the clock usually involve the use of a Controlled Oscillator, e.g., Voltage Controlled Oscillators (VCO) or Digitally Controlled Oscillators (DCO), inserted in feedback loops, such as Phase-Locked Loops (PLL) [3, 4] or Frequency-Locked Loops (FLL) [5, 6]. In this work, we present a pseudo-differential Ring DCO implemented with a standard 28 nm CMOS technology to be used in clock generators for the system level power management in computing applications. The target architecture is a Ring-DCO since, given the application, the area of the clock generator should be limited. Indeed, besides the small area that characterizes Ring-Oscillators (RO) in general, a digital control signal allows the feedback loop to be digital as well, leading to a further reduced area of the whole system. Moreover, together with the low area, a wide frequency range from hundreds of MHz to 3 GHz and a frequency resolution below 100 MHz were targeted to suit the chosen applications [6, 7]. Furthermore, the design effort was also focused on reducing the power consumption for the given frequency and frequency resolution constraints. The paper is organized as follows: in Sect. 2 the proposed DCO is described, the design choices are reported from the architectural ones to the schematic ones, and the results from PVT simulations are presented; in Sect. 3 the layout design is summarized, and the post-layout simulations are presented. Finally, in Sect. 4 a comparison with the state-of-the-art DCOs is reported, and in Sect. 5 the conclusions are drawn.

# 2 Proposed DCO

ROs consist of delay lines in which the last output is connected to the first input. The operating frequency of an N-stage RO is [8]:

$$f = \frac{1}{2Nt_{\rm D}} \tag{1}$$

where t<sub>D</sub> is the delay of the single delay element and can be written as:

$$t_{\rm D} = \frac{C_{\rm L} \Delta V}{I_{\rm D}} \tag{2}$$

where $C_L$ is the load capacitance, $\Delta V$ is the output voltage swing, and $I_D$ is the current driving the load. From Eqs. 1 and 2, three approaches to control the output frequency can be identified: i) varying the number of delay cells, i.e., changing the number of stages N by means of multiplexers or three-state buffers. Although easy to implement, this solution is generally characterized by a large frequency drift and a large area occupation; ii) varying the load of the single delay element, i.e., changing $C_L$ in Eq. 2 by changing the capacitance of the nodes [9]. The capacitor bank may be implemented with Metal-Insulator-Metal (MIM) capacitors or MOS capacitors, which can be easily controlled digitally; iii) varying the output current of the delay elements, i.e., changing $I_D$ in Eq. 2 [10]. Indeed, by changing the current that charges and discharges the capacitance of the nodes, the oscillation frequency can be controlled. Compared to the capacitor bank solution, this one requires less area. However, the design of a controllable current source with a monotonic behavior over a wide tuning range is challenging [11].
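As a rough illustration of Eqs. 1 and 2, the sketch below evaluates the oscillation frequency of an ideal N-stage RO and shows how it scales with the driving current and with the load capacitance. The numerical values (20 fF, 0.9 V swing, 100 µA) are hypothetical and chosen only for the example; they are not taken from the proposed design.

```python
def ring_oscillator_frequency(n_stages: int, c_load: float,
                              delta_v: float, i_drive: float) -> float:
    """Ideal RO frequency from Eq. 1, with the stage delay from Eq. 2."""
    t_delay = c_load * delta_v / i_drive      # Eq. 2: t_D = C_L * dV / I_D
    return 1.0 / (2 * n_stages * t_delay)     # Eq. 1: f = 1 / (2 * N * t_D)


# Hypothetical operating point (assumed values, not from the paper).
f_base = ring_oscillator_frequency(3, 20e-15, 0.9, 100e-6)

# Approach iii): doubling the driving current doubles the frequency.
f_double_current = ring_oscillator_frequency(3, 20e-15, 0.9, 200e-6)

# Approach ii): doubling the load capacitance halves the frequency.
f_double_cap = ring_oscillator_frequency(3, 40e-15, 0.9, 100e-6)

print(f"base: {f_base / 1e6:.1f} MHz")
print(f"2x current: {f_double_current / 1e6:.1f} MHz")
print(f"2x capacitance: {f_double_cap / 1e6:.1f} MHz")
```

In this first-order model the frequency is linear in $I_D$, which is why a current-controlled solution can cover a wide range as long as the current source itself remains monotonic.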

Apart from the way the frequency is controlled, ROs can also differ in the implementation of the delay elements. Indeed, an RO can be single-ended, differential, or pseudo-differential. Single-ended Ring Oscillators offer many advantages, such as design simplicity, rail-to-rail output swing, and low area occupation, but show a great sensitivity to PVT variations, common mode noise, and ripple of the supply voltage. On the other hand, differential [12] or pseudo-differential [13] ROs exhibit higher noise immunity, even though they need more area and power. In this work, a pseudo-differential RO was selected as it represents a good trade-off between the single-ended and the fully differential solutions. Indeed, the pseudo-differential solution shares the advantages in terms of noise and disturbance rejection with the fully differential one.



Fig. 1. Proposed DCO architecture.

However, it features lower power consumption and area occupation, similar to a single-ended solution. Figure 1 illustrates the proposed DCO. A Digital to Analog Converter (DAC) generates a current that is mirrored in the delay elements of the RO. As anticipated in the introduction, the DCO was designed to reach three primary goals: i) low power, ii) low area, and iii) a frequency resolution below 100 MHz over a frequency range between 500 MHz and 3 GHz. The main goals in designing the DAC, instead, were to obtain a monotonic characteristic, to reduce the power consumption, and to contain the PVT variations. The next subsections describe in detail the DAC and the Current Controlled Oscillator that compose the DCO.

### 2.1 Pseudo Differential Ring Oscillator



Fig. 2. Delay element of the RO: (a) digital logic view; (b) schematic level view.

Since the technology targeted for the implementation of the DCO (i.e., the 28 nm TSMC technology) can work at frequencies well above the frequency range needed for this work, the design of the RO is focused on the power and area reduction, achieving good noise performance and, most of all, a wide tuning range. Therefore, a three-stage solution was selected. As shown in Fig. 2a, every basic element of the RO is composed of two inverters and one latch that implement a pseudo-differential architecture. Apart from implementing the delay element, the positive feedback formed by the latch sharpens the rising and falling edges of the output signal, leading to a better noise performance. The schematic of the basic elements of the DCO is shown in Fig. 2b. The input signals drive the main inverters (M1, M2 and M3, M4), which in turn drive the positive feedback latch composed of M5, M6, M7, and M8. The output frequency of the RO realized with these delay elements is determined by the strength ratio between the input inverters and the latch. The strength of the input inverters is controlled by the current generated by the DAC, while the strength of the latch is determined by the W/L ratio of M5, M6, M7, and M8. A high ratio between the latch strength and the inverter strength would lead to an improved noise performance thanks to the increased strength of the positive feedback. On the other hand, it would also narrow the tuning range. Therefore, the DAC's current and the sizes of the main inverters and of the latch were chosen to obtain a wide tuning range without degrading the noise performance, while minimizing the power consumption. In particular, concerning the power consumption, the current value was chosen to achieve the highest target frequency (above 3 GHz) in all the technology corners and to compensate for the parasitic effects occurring after the layout phase [14, 15]. The DAC's current (I<sub>DAC</sub> in Fig. 1) chosen with these considerations was, therefore, about 10 mA, shared between all the stages.

### 2.2 DAC Architecture



Fig. 3. DAC schematic.

Since the oscillator core is controlled by a current, a current steering DAC was designed, as shown in Fig. 3. It consists of 10 weighted current sources (M2–M11), each one controlled by one bit of a 10-bit digital word. An additional current source (M1) was added to implement a dithering feature, since this technique [6] is often exploited in all-digital loops to further increase the DCO's resolution. The output current is given by Eq. 3. Ideally, each current I<sub>n</sub> is given by Eq. 4, where w<sub>n</sub> equals $2^n$ and I<sub>0</sub> is a unit current. To enhance the matching between the current sources, the weight of every transistor was implemented with w<sub>n</sub> equal transistors connected in parallel, obtaining the output current of Eq. 5, where I<sub>0</sub> is the unit current produced by the unit nMOS (i.e., M1 and M2). This current is then collected by a pMOS transistor (M12), which belongs to the mirror that, in turn, supplies the current to the elements of the RO.

$$I_{DAC} = \sum b_n I_n \tag{3}$$

$$I_n = w_n I_0 \tag{4}$$

$$I_{DAC} = I_0 + b_{DITH}I_0 + b_0I_0 + b_12^{1}I_0 + \dots + b_92^{9}I_0 \tag{5}$$

The output of the DAC is not connected directly to the ring oscillator, in order to improve the monotonicity of the characteristic. Indeed, if the DAC's output had been connected directly to the RO, the effective capacitance seen from the transistors of the RO would have depended on which transistors of the DAC were on [11]. Therefore, the delay of the elements of the RO would also have been influenced by the capacitance of the DAC's transistors switched on, leading to a possible non-monotonic behavior. Transistor M0, which is always ON, determines the smallest current the DAC can supply and, therefore, establishes the lower limit of the output frequency range, while the maximum output frequency is determined by all the other transistors together. Therefore, M0 was sized to obtain the current needed by the RO for the lower limit of the frequency range. All the other transistors, instead, were sized to obtain the maximum current chosen during the design of the RO as described in the previous subsection (i.e., 10 mA). In particular, they were divided into binary weighted current sources composed of identical replicas of the unit-current transistor, whose channel is 200 nm wide and 1 µm long with two fingers. With this sizing, the unit-current transistor's contribution is 10 µA, and the DAC current vs input digital word characteristic shows a minimum slope of 2 µA/code, an average slope of 5 µA/code, and a maximum slope of 10 µA/code, depending on the Digital Control Word.
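The ideal transfer characteristic of Eq. 5 can be sketched as follows. This is only a first-order model: it assumes that every source, including the always-on one, delivers exactly the 10 µA unit current reported above, and it ignores the finite output resistance, so its slope is a constant 10 µA/code rather than the 2 to 10 µA/code observed in simulation. The function and constant names are hypothetical.

```python
I_UNIT = 10e-6   # unit current reported in the text (10 uA)
N_BITS = 10      # width of the Digital Control Word


def dac_current(code: int, dither: int = 0, i_unit: float = I_UNIT) -> float:
    """Ideal DAC output current following Eq. 5 (no output-resistance effects)."""
    if not 0 <= code < 2 ** N_BITS:
        raise ValueError("code must fit in 10 bits")
    if dither not in (0, 1):
        raise ValueError("dither must be 0 or 1")
    # Sum of b_n * 2^n over the 10 control bits, written out as in Eq. 5.
    weighted = sum(((code >> n) & 1) * 2 ** n for n in range(N_BITS))
    # Always-on source + dither source + binary weighted sources.
    return i_unit * (1 + dither + weighted)


i_min = dac_current(0)                            # only the always-on source
i_max = dac_current(2 ** N_BITS - 1, dither=1)    # every source switched on

print(f"I_min = {i_min * 1e6:.0f} uA, I_max = {i_max * 1e3:.2f} mA")
```

Under these assumptions the full-scale current is 1025 unit currents, i.e., 10.25 mA, which agrees with the maximum current of about 10 mA chosen for the RO.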

### 2.3 Simulation Results

In Fig. 4, the tuning range is shown for both the typical case, i.e., typical process corner, 27 °C, and 0.9 V of supply voltage, and the worst case, i.e., slow process corner, 125 °C, and 10% reduction in the supply voltage. Although the maximum frequency is lower, the required tuning range is also achieved in the worst case. The characteristic is compressed for higher values of the input digital word. This is mostly due to the DAC and, in particular, to the limited resistance of M12, as well as of M0–M11 [16]. Indeed, varying the Digital Control Word causes the output node resistance to vary from $r_P \parallel r_0$ for the LSB to $r_P \parallel (r_0/N)$ for N activated transistors, where $r_P$ is the resistance seen towards the drain of M12, i.e., $1/g_m$, and $r_0$ is the resistance seen towards the output node of the DAC. For values of N above 400, $r_0/N$ is comparable with $1/g_m$ and, therefore, the described effect becomes visible.
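The dependence of the output node resistance on the number of active unit sources can be illustrated with a short numerical sketch. The values of $r_P$ and $r_0$ below are purely hypothetical: they were picked so that $r_0/N$ equals $1/g_m$ at N = 400, mirroring the statement above, and do not come from the simulated circuit.

```python
def parallel(r_a: float, r_b: float) -> float:
    """Equivalent resistance of two resistors in parallel."""
    return r_a * r_b / (r_a + r_b)


def dac_output_resistance(n_active: int, r_p: float, r_0: float) -> float:
    """Output node resistance r_P || (r_0 / N) with N active unit sources."""
    return parallel(r_p, r_0 / n_active)


R_P = 250.0    # assumed 1/gm of the diode-connected pMOS, in ohms
R_0 = 100e3    # assumed output resistance of one unit source, in ohms

for n_active in (1, 100, 400, 1023):
    r_out = dac_output_resistance(n_active, R_P, R_0)
    print(f"N = {n_active:4d}: r_out = {r_out:6.1f} ohm")
```

With these assumed numbers the node resistance stays close to $1/g_m$ for small N and drops to half of it at N = 400, which is the regime where the current delivered to the mirror starts to deviate noticeably from the ideal sum of unit currents.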

In Fig. 5a, the power consumption over the tuning range (i.e., as a function of the digital input word) is shown. As expected, the power consumption reaches its maximum of about 7 mW at the highest frequency. In Fig. 5b, the frequency resolution is plotted as a function of the input digital word.

The resolution is better than the 100 MHz constraint over the entire tuning range, and it can be further refined at system level by exploiting the dithering feature. Since the tuning range characteristic is compressed for high values of the Digital Control Word, the derivative of the frequency resolution likewise tends to zero at higher frequencies. Finally, the phase noise at an output frequency of 1.5 GHz is -83.4 dBc/Hz @ 1 MHz.
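How dithering can refine the resolution is sketched below under simplifying assumptions. The paper does not describe the modulator that drives the dithering bit, so a first-order accumulator is assumed here, together with an ideal DAC following Eq. 5 and a loop that responds only to the time-averaged current. All names are hypothetical.

```python
I_UNIT = 10e-6   # unit current reported in the text (10 uA)


def dither_bits(numerator: int, denominator: int, length: int) -> list[int]:
    """First-order accumulator: emits 1 on overflow, so the mean of the
    bit stream approaches numerator / denominator."""
    acc = 0
    bits = []
    for _ in range(length):
        acc += numerator
        if acc >= denominator:
            acc -= denominator
            bits.append(1)
        else:
            bits.append(0)
    return bits


def average_current(code: int, bits: list[int], i_unit: float = I_UNIT) -> float:
    """Time-averaged ideal DAC current while the dither bit follows `bits`."""
    duty = sum(bits) / len(bits)
    return i_unit * (1 + duty + code)


bits = dither_bits(1, 4, 8)           # dither bit high for 1/4 of the cycles
i_avg = average_current(100, bits)    # lands between code 100 and code 101

print(bits)
print(f"average current = {i_avg * 1e3:.4f} mA")
```

In this model a duty cycle of 1/4 places the average current a quarter of a unit step above the undithered code, which is the sense in which the extra current source adds fractional resolution.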



Fig. 4. DCO's tuning range in the typical and worst corner.



Fig. 5. (a) Power consumption vs Digital Control Word; (b) Frequency resolution vs Digital Control Word in the typical and worst case.

# 3 DAC and RO Layout

### 3.1 Layout Design



Fig. 6. Layout of the DCO.

The layout of the DCO is depicted in Fig. 6. The layout of the DAC was designed following a common-centroid approach. While standard transistors were used for the DAC, RF transistors, which include a deep N-well and guard rings to reduce substrate disturbances, were used for the RO, given their superior performance and their more detailed model at high operating frequencies. The total area of the DCO is 5083.7 µm<sup>2</sup>.

### 3.2 Post Layout Simulation Results

Figure 7 shows the tuning range of the entire DCO for pre-layout and post-layout simulations. There is a noticeable frequency loss between the two cases. This is due to the parasitic capacitances and resistances of the metal paths, which play an important role in the frequency reduction. Indeed, according to Eqs. 1 and 2, if the capacitance seen from the output of a single cell of the RO increases, the output frequency decreases. In our design, the DCO shows a reduction of the maximum frequency of 36%, from 5.28 GHz to 3.35 GHz, while still reaching the required tuning range.
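The reported frequency loss can be related to an equivalent increase of the load capacitance through Eqs. 1 and 2. The short calculation below is only an estimate: it assumes that the number of stages, the driving current, and the voltage swing are unchanged after layout, so that all parasitic effects (including the resistive ones) are lumped into a single effective $C_L$.

```python
F_PRE_LAYOUT = 5.28e9    # maximum frequency before layout, from the text (Hz)
F_POST_LAYOUT = 3.35e9   # maximum frequency after layout, from the text (Hz)

# Relative loss of the maximum frequency.
reduction = (F_PRE_LAYOUT - F_POST_LAYOUT) / F_PRE_LAYOUT

# From Eqs. 1 and 2, f is proportional to 1 / C_L when N, I_D and dV are fixed,
# so the effective capacitance scales with the inverse of the frequency ratio.
cap_ratio = F_PRE_LAYOUT / F_POST_LAYOUT

print(f"frequency reduction: {reduction * 100:.1f} %")
print(f"equivalent C_L increase: x{cap_ratio:.2f}")
```

Under these assumptions, the layout parasitics are equivalent to an increase of the effective load capacitance of nearly 60%, which is why the pre-layout design had to target a maximum frequency well above 3 GHz.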



Fig. 7. Pre- and Post-layout simulated tuning range of the DCO.

# 4 State-of-the-Art Comparison

Table 1 summarizes the characteristics of the proposed DCO compared to state-of-the-art DCOs operating in similar frequency ranges and realized in 28 nm and 65 nm CMOS technologies. The DCO proposed in this work shows a good trade-off between area, power, and noise performance, and it has a wide tuning range of 2.8 GHz with 10 bits of resolution. Compared to the DCO presented in [17], the noise performance and power consumption are worse, but in [17] an LC-tank topology has been exploited, leading to an area 8 times greater. Moreover, only 730 MHz of tuning range has been achieved there, which is almost 4 times smaller than the tuning range obtained by the DCO presented in this work. In [18], instead, a Ring-DCO in 28 nm is presented with a slightly lower power consumption and an area comparable to our DCO. However, even though it has a good 9-bit resolution, its tuning range is limited to a few hundred MHz, and it shows worse noise and jitter performance. In [13], a DCO in 65 nm is reported. It achieves a lower power consumption with a 1.9 GHz tuning range, a 9-bit resolution, and good noise performance. However, no layout is proposed there, and no post-layout simulations have been performed. Finally, the DCO in [19] has 16 bits of resolution and occupies an astonishingly small area compared to the DCO proposed in this work, but its tuning range is limited to only 400 MHz, and it shows worse noise performance. In conclusion, this work shows the widest tuning range among the compared state-of-the-art DCOs. Even though this tuning range has been obtained at the expense of the area occupation, the power consumption, and the phase noise, the proposed solution remains competitive with the state-of-the-art DCOs from these points of view as well.

|                                  | This work <sup>a</sup> | [17] <sup>b</sup> | [18] <sup>a</sup> | [13] <sup>a</sup> | [19] <sup>a</sup> |
|----------------------------------|------------------------|-------------------|-------------------|-------------------|-------------------|
| Technology                       | 28 nm                  | 28 nm             | 28 nm             | 65 nm             | 65 nm             |
| Supply voltage                   | 0.9 V                  | 0.8 V             | 1 V               | 1.2 V             | 1.2 V             |
| Control bits                     | 10 + 1 dithering       | 12 <sup>c</sup>   | 9 <sup>d</sup>    | 9                 | 16                |
| Frequency range (GHz)            | 0.5–3.3                | 3.95–4.68         | 1.13–1.54         | 2.7–4.6           | 1.8–2.2           |
| Area (µm<sup>2</sup>)            | 5083.7                 | 42000             | 5616              | /                 | 370               |
| Simulated/measured frequency     | 1.5 GHz                | 4.6 GHz           | 1.3 GHz           | 4 GHz             | 2.2 GHz           |
| Power (mW)                       | 1.476                  | 0.35              | 0.84              | 1                 | 1.6               |
| Phase noise @ 1 MHz (dBc/Hz)     | -83.4                  | -109.5            | -74               | /                 | -78.9             |
| Cycle-to-cycle jitter (rms)      | 0.46 ps                | 1                 | 2.3               | 0.53 ps           | /                 |

Table 1. State-of-the-art Comparison

<sup>a</sup> Simulated - <sup>b</sup> Measured – <sup>c</sup> 4 bits for coarse tuning in binary code and 255 bits for fine tuning in thermometric code – <sup>d</sup> 7 bits for coarse tuning in thermometric code and 6 bits for fine tuning in binary code.
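The tuning-range figures quoted in the comparison follow directly from the frequency ranges listed in Table 1. The sketch below recomputes them; the dictionary simply transcribes the table, and the variable names are hypothetical.

```python
# Frequency ranges in GHz, transcribed from Table 1.
ranges_ghz = {
    "This work": (0.5, 3.3),
    "[17]": (3.95, 4.68),
    "[18]": (1.13, 1.54),
    "[13]": (2.7, 4.6),
    "[19]": (1.8, 2.2),
}

# Absolute tuning range (upper limit minus lower limit) of each design.
widths_ghz = {name: high - low for name, (low, high) in ranges_ghz.items()}
widest = max(widths_ghz, key=widths_ghz.get)

for name, width in widths_ghz.items():
    ratio = widths_ghz["This work"] / width
    print(f"{name:>9}: {width:.2f} GHz (this work is x{ratio:.1f} wider)")
print(f"widest tuning range: {widest}")
```

The computed widths match the values discussed in the text: 2.8 GHz for this work, 0.73 GHz for [17], 1.9 GHz for [13], and 0.4 GHz for [19].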

# 5 Conclusions and Future Work

In this work, we have presented a DCO designed for the power management policies in computing applications. For these kinds of applications, a wide frequency range is needed, and, indeed, the target frequency range for this work was from hundreds of MHz to 3 GHz. As shown by the post-layout simulation results, this constraint has been comfortably met with a frequency resolution below the required 100 MHz, which can be further refined by exploiting the dithering feature implemented in the DAC. The layout of the DAC and of the RO has also been presented. The layout was a critical design step because of the parasitic capacitances that reduce the oscillation frequency. The design choices at architectural and circuit levels have been shown to be a good trade-off in terms of noise performance, area, and power consumption with respect to the current state-of-the-art DCOs. Indeed, the use of an RO has led to a low area and low power solution, and the pseudo-differential implementation has yielded good noise performance, resulting in a well-balanced solution between a single-ended and a fully differential implementation.

Given the push towards ever more advanced technology nodes in digital applications and in high-performance processors, where PMS are usually employed, a future development will be the implementation of the proposed solution in a 12 nm FinFET technology. This upgrade may lead not only to a reduction of the area occupation, but also to an improvement of the performance. Moreover, besides the technology scaling, further work could involve replacing the mirror between the DAC and the RO (transistors M12, M13, M14, M15, and M16 in Fig. 1) with a current magnifier. Even though a well-matched current magnifier would need more design effort than a simple current mirror, this solution could lower the DAC's current, leading to a reduction of the total power consumption as well.

Acknowledgments. Work partially supported by the European Union's Horizon 2020 research and innovation program under grant agreement No 101036168 (*European Processor Initiative SGA2*), and by MIUR with the *Dipartimenti di Eccellenza 2018-2022 Crosslab* project.

# References

- 1. Lojko, B.: A contribution to the design of a frequency synthesizer for fast frequency-hopped spread-spectrum systems. In: 17th International Conference Radioelektronika (2007)
- 2. Benini, L., et al.: A survey of design techniques for system-level dynamic power management. IEEE Trans. Very Large Scale Integr. Syst. **8**(3), 299–316 (2000)
- 3. Gardner, F.M.: Phaselock Techniques. Wiley, New York (2005)
- 4. Mestice, M., et al.: Analysis and design of integrated blocks for a 6.25 GHz Spacefibre PLL. Sensors **20**, 4013 (2020)
- 5. Golestan, S., et al.: Single-phase frequency-locked loops: a comprehensive review. IEEE Trans. Power Electron. **34**(12), 11791–11812 (2019)
- 6. Bellasi, D.E., et al.: Smart energy-efficient clock synthesizer for duty-cycled sensor SoCs in 65 nm/28 nm CMOS. IEEE Trans. Circuits Syst. I Regul. Pap. **64**(9), 2322–2333 (2017). https://doi.org/10.1109/TCSI.2017.2694322
- 7. Zhang, X., et al.: An evaluation of per-chip nonuniform frequency scaling on multicores. In: 2010 USENIX Annual Technical Conference (USENIX ATC 2010) (2010)
- 8. Rabaey, J.M.: Digital Integrated Circuits: A Design Perspective. Prentice-Hall, Inc., Upper Saddle River (1996)
- 9. Andreani, P., et al.: A digitally controlled shunt capacitor CMOS delay line. Analog Integr. Circ. Sig. Process **18**, 89–96 (1999)
- 10. Suman, S., Sharma, K.G., Ghosh, P.K.: Analysis and design of current starved ring VCO. In: 2016 International Conference on Electrical, Electronics, and Optimization Techniques (ICEEOT) (2016)
- 11. Maymandi-Nejad, M., et al.: A digitally programmable delay element: design and analysis. IEEE Trans. Very Large Scale Integr. (VLSI) Syst. **11**(5), 871–878 (2003)
- 12. Jalil, J., et al.: CMOS differential ring oscillators: review of the performance of CMOS ROs in communication systems. IEEE Microwave Mag. **14**(5), 97–109 (2013)
- 13. Gorji, J., et al.: A 2.7 to 4.6 GHz multi-phase high resolution and wide tuning range digitally-controlled oscillator in CMOS 65 nm. In: 2016 24th Iranian Conference on Electrical Engineering (ICEE), pp. 1694–1699 (2016)
- 14. Ciarpi, G., et al.: Design and characterization of 10 Gb/s and 1 Grad TID-tolerant optical modulator driver. IEEE Trans. Circ. Syst. I Regul. Pap. **69**(8), 3177–3189 (2022)
- 15. Monda, D., et al.: Design and verification of a 6.25 GHz LC-Tank VCO integrated in 65 nm CMOS technology operating up to 1 Grad TID. IEEE Trans. Nucl. Sci. **68**(10), 2524–2532 (2021)
- 16. Razavi, B.: The current-steering DAC [A Circuit for All Seasons]. IEEE Solid-State Circ. Mag. **10**(1), 11–15 (2018). https://doi.org/10.1109/MSSC.2017.2771102
- 17. Levinger, R., et al.: A 3.9-4.7 GHz 0.35 mW DCO with -187.4 dBc FoM in 28nm CMOS. In: 2018 13th European Microwave Integrated Circuits Conference (EuMIC), pp. 194–197 (2018)
- 18. Bisiaux, P., et al.: Design of a 1.5 GHz low jitter DCO ring in 28 nm CMOS process. In: 2020 European Conference on Circuit Theory and Design (ECCTD), pp. 1–5 (2020)
- 19. Selvaraj, S., et al.: Low phase noise, high resolution digitally-controlled ring oscillator operating at 2.2 GHz. In: 2020 9th International Conference on Modern Circuits and Systems Technologies (MOCAST), pp. 1–4 (2020)