## **Ultra-Low-Voltage Clock References**



Ka-Meng Lei, Pui-In Mak, and Rui P. Martins

## 1 Introduction

An Internet of Things (IoT) network is a crucial component of revolutionary concepts such as Industry 4.0 [1] and smart homes/smart cities [2]. The IoT devices within these networks gather vast amounts of data for dedicated processors/AI models, which boosts the precision of analyses. An essential criterion for an IoT device is low power consumption. Ultra-low-power (ULP) radios are popular in this respect: turning the radio on only briefly for data transmission reduces the power drawn by power-hungry blocks such as the transceiver (TRX), lowers the average power of the device, and extends its lifetime [3]. The system places the device into sleep mode for a specific period, with only critical blocks such as memory and wakeup timers powered on for timing purposes.

On the other hand, there is a trend to power the IoT device with energy harvesters to realize perpetual operation. As a battery has a finite lifetime, the IoT device may miss critical data if it runs out of battery. Also, replacing batteries will be a tremendous task considering that there will be trillions of IoT devices. Further, the battery may pose environmental issues and create safety risks if not handled properly. By replacing the batteries with energy harvesters (EH), the lifetime of the device increases, and we can obviate the substantial labor of replacing the batteries. EH, such as solar cells (typical available power indoors: 10–100 µW/cm<sup>2</sup>) and thermoelectric generators (typical available power: 10–1000 µW/cm<sup>2</sup>), are promising in this respect [4–6]. Yet, they usually output only ~0.3–0.4 V, and this voltage varies with environmental factors (temperature and light intensity) [4]. We can use a boost converter to stabilize and step up the voltage to the standard I/O voltage, but this increases the footprint (cost) and power consumption of the IoT device. These constraints open a prospective research direction for ultra-low-voltage (ULV) circuits that are powered directly by the energy harvesters and avoid the penalties of the interim converters.

K.-M. Lei (🖂) · P.-I. Mak

State-Key Laboratory of Analog and Mixed-Signal VLSI/IME and FST-ECE, University of Macau, Macao SAR, China

e-mail: kamenglei@um.edu.mo; pimak@um.edu.mo

R. P. Martins

State-Key Laboratory of Analog and Mixed-Signal VLSI/IME and FST-ECE, University of Macau, Macao SAR, China

Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal

e-mail: rmartins@um.edu.mo

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023 R. Paulo da Silva Martins, P.-I. Mak (eds.), *Analog and Mixed-Signal Circuits in Nanoscale CMOS*, Analog Circuits and Signal Processing, https://doi.org/10.1007/978-3-031-22231-3\_3

Clock references are indispensable parts of the TRX. Wide-ranging purposes such as the low-power wakeup timer, the phase-locked loop, the data converters, etc. require different clock references. Hence, this chapter elaborates on the design and measurement results of two ultra-low-voltage clock references in deep-submicron silicon processes. Section 2 introduces the regulation-free sub-0.5 V 16/24 MHz crystal oscillator for energy-harvesting Bluetooth Low Energy (BLE) radios implemented in 65 nm CMOS [7], whereas Sect. 3 demonstrates a fully integrated 0.35-V 2.1 MHz temperature-resilient relaxation oscillator using an asymmetric swing-boosted RC network implemented in 28 nm CMOS [8].

## 2 Regulation-Free Sub-0.5 V 16/24 MHz Crystal Oscillator for Energy-Harvesting BLE

### 2.1 Motivation

The crystal oscillator (XO) is an essential circuit module for modern TRXs. It provides a stable clock reference for different parts such as data converters, phase-locked loops, sensors, etc. Despite its excellent frequency stability, the XO can take a few milliseconds to settle into the steady state [9–11] without any fast startup technique [12], due to the high quality factor of the crystal (~10<sup>5</sup>). This startup time ($t_s$) dominates the "on" latency of the radio, and its startup energy ($E_s$) may significantly degrade the effectiveness of duty-cycling an ultra-low-power radio. If the active energy ($E_{TRX}$) of a TRX is 1280 nJ (on-time of 128 µs [13] and active power of 10 mW [14]), the percentage of energy spent on starting the XO in every working cycle is ~42% for an $E_s$ of 1000 nJ for a conventional XO and a duty cycle of 0.1%. This percentage rises further as recent circuit techniques suppress the active power of the TRX ($P_{TRX}$) [15–17]. Hence, reducing $E_s$ is of paramount importance for lowering the average power consumption of ULP radios. Recent efforts in both academia and industry succeeded in shortening the $t_s$ and $E_s$ of the XO [13, 14, 18–23].



**Fig. 1** Overview of the proposed XO and illustration of  $t_{\rm S}$  improvement by two techniques: SSCI and inductive three-stage  $g_{\rm m}$ . The  $L_{\rm M}$ ,  $C_{\rm M}$ , and  $R_{\rm M}$  are the modeled inductance, capacitance, and resistance of the crystal, respectively, whereas  $C_{\rm S}$  is the crystal's stray capacitance

This section reports a regulation-free sub-0.5 V XO according to the system aspect of the EH BLE radios described in [24–27]. Unlike the existing fast startup XOs based on standard or I/O voltages to power up their inverter-like or active-load amplifiers [13, 18–21], the proposed XO is ULV-enabled by using single-/multi-stage resistive-load amplifiers [28]. This architecture circumvents the ineluctable voltage headroom limit, rendering it compatible with the ULV application. Specifically, we propose a *dual-mode*  $g_m$  scheme and a *Scalable Self-reference Chirp Injection (SSCI)* technique for the XO to surmount the operating challenges in both startup and steady state (Fig. 1). The reported XO includes load capacitors of 6 pF and suits common commercially available crystals. Yet, we can also apply the technique to crystals with different load capacitances.

### 2.2 Fast Startup XO Using Dual-Mode g<sub>m</sub> Scheme and SSCI

For a crystal's resonant frequency  $(f_m)$  at tens of MHz, its  $t_s$  (milliseconds) dominates the "on" latency of a duty-cycled radio, raising the average power consumption. In addition, for energy-limited EH sources, the  $E_S$  of the XO is crucial as it may demand a large instant current from the EH source or reservoir. Recent XOs [13, 18–22] succeeded in reducing both  $t_s$  and  $E_S$ . Herein, we propose two techniques, the dual-mode  $g_m$  and the SSCI, for balancing the XO performances in both startup (i.e.,  $t_s$  and  $E_S$ ) and steady state [i.e., power consumption and phase noise (PN)]. The envelope of the XO during startup at the time t is

$$A_{\rm env}(t) = A_i \cdot e^{\frac{R_{\rm N} - R_{\rm M}}{2L_{\rm M}}t},\tag{1}$$

where $A_i$ is the initial amplitude and $R_N$ is the negative resistance of the overall impedance viewed from the crystal core. The $L_M$ and $R_M$ are the motional inductance and resistance of the crystal, respectively. The aim of the SSCI is to increase $A_i$ instantly after enabling the XO, while the dual-mode $g_m$ allows a boosted $R_N$ afterward. Together, they bring down $t_S$ without momentarily raising the startup power, culminating in a lower $E_S$ and a relaxed power-source design.
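To make the roles of $A_i$ and $R_N$ in Eq. (1) concrete, the short Python sketch below solves the envelope equation for the time needed to reach a target amplitude. The crystal values, the amplitudes, and the helper name `startup_time` are hypothetical placeholders chosen for illustration, not data from this chapter, and the model ignores the amplitude limiting that occurs near the steady state.

```python
import math

def startup_time(a_init, a_final, r_n, r_m, l_m):
    """Time (s) for the Eq. (1) envelope to grow from a_init to a_final."""
    if r_n <= r_m:
        raise ValueError("the envelope only grows when R_N > R_M")
    tau = 2.0 * l_m / (r_n - r_m)  # envelope time constant of Eq. (1)
    return tau * math.log(a_final / a_init)

# Hypothetical illustration values (assumptions, not measurements).
L_M = 10e-3    # motional inductance in henries
R_M = 20.0     # motional resistance in ohms
A_FINAL = 0.1  # target envelope amplitude in volts

baseline = startup_time(1e-6, A_FINAL, r_n=250.0, r_m=R_M, l_m=L_M)
larger_rn = startup_time(1e-6, A_FINAL, r_n=2000.0, r_m=R_M, l_m=L_M)
larger_ai = startup_time(1e-3, A_FINAL, r_n=250.0, r_m=R_M, l_m=L_M)

for label, t in [("baseline", baseline),
                 ("larger R_N", larger_rn),
                 ("larger A_i", larger_ai)]:
    print(f"{label:>10s}: {t * 1e6:7.1f} us")
```

Because $A_i$ enters only through a logarithm while $R_N$ sets the exponent, the two knobs are complementary, which is the motivation for combining the SSCI with the dual-mode $g_m$.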

#### Scalable Self-Reference Chirp Injection (SSCI)

Signal injection to the XO can bring down $t_s$ if the injection frequency is close to $f_m$ of the crystal [19]. Instead of waiting for the XO to build up its oscillation amplitude, we can use an auxiliary oscillator (AO) to excite the crystal. Yet, due to the high-Q nature of the crystal, such signal injection is only effective if its frequency error from $f_m$ is <0.5% [13]. Several signal injection techniques for kick-starting the XO have been reported, which we can categorize into three groups: constant frequency injection (CFI) [18, 21, 22], dithering injection [13], and chirp injection (CI) [19].

CFI injects a clock signal into the crystal with a constant frequency precisely matching  $f_{\rm m}$ . Albeit this scheme is very efficient and simple in concept, the AO requires calibration as well as a delicate design that will be challenging in a sub-0.5 V design. As an example, the XO in [21] achieves  $t_{\rm s}$  values of 58/10/2 µs from 1.84/10/50 MHz crystals. Yet, it has a supply voltage of 1 V. Also, the ring oscillator entails frequency calibration after fabrication.

Dithering injection toggles the AO frequencies to compensate for the frequency deviation caused by temperature and voltage variations. As such, the injection signal can cover a wider frequency range than that of CFI. Still, trimming is necessary to compensate for the process variation. When compared with CFI, its effect on shortening  $t_s$  is lower since the signal power spreads to a wider spectrum. For instance, the XO in [13] exhibits a slashed  $t_s$  of <400 µs by using dithered-signal injection (dithered step size: 2%).

Here, we consider CI to be more robust and low cost, as it relies on a frequency-rich signal to excite the crystal and avoids frequency calibration. The principle is similar to dithering but covers a wider frequency range. It gradually sweeps the oscillating frequency, progressively decreasing or increasing it. As such, the chirping sequence generates a spectrum spanning the lowest frequency $f_{\rm L}$ to the highest frequency $f_{\rm H}$, as evinced by its Fourier transform [29]. If $f_{\rm L} < f_{\rm m} < f_{\rm H}$ regardless of PVT variations, the crystal will always receive power. Despite its weaker effectiveness on $t_{\rm S}$ reduction since the power spreads over a wider band, CI has the benefit of requiring no trimming of the AO. It is especially suitable for low-cost and ULV radios, where the frequency variation of the AO against voltage and temperature is likely to be exacerbated. In [19], an $R_{\rm N}$-boosting technique is applied together with CI, showing a $t_{\rm S}$ of 158 µs without trimming or calibration of the AO. Still, the related RC sweeping unit for modulating the frequency of the AO is area hungry (estimated ~90% of the chip area) due to its large time constant (on the order of 10 µs) for generating the chirping sequence. Table 1 summarizes the key features of the three signal injection techniques.

Table 1 Overview of different signal injection techniques to kick-start the XO

| Characteristic of the injecting signal | Constant frequency | Dithering | Chirping           |
|----------------------------------------|--------------------|-----------|--------------------|
| $t_{\rm S}$ and $E_{\rm S}$ reduction  | ✓✓✓                | ✓✓        | ✓                  |
| Excitation bandwidth                   | Narrow             | Moderate  | Wide               |
| Trimming on AO                         | Required           | Required  | Not required       |
| Precision of AO                        | Very critical      | Critical  | Relaxed            |
| Literature                             | [20, 21]           | [13]      | [19] and this work |

**Fig. 2** Proposed SSCI. It generates a chirping signal to kick-start the XO using an untrimmed RO with *relaxed* precision. The FSM (finite state machine) provides feasibility to scale $t_{CI}$, accommodating different crystal packages (i.e., $L_{M}$ and $C_{S}$)

Herein, we introduce the SSCI (Fig. 2) that only entails an untrimmed oscillator with relaxed precision. Its frequency range can easily cover  $f_m$  variation against PVT. Unlike the RC-based chirping [19], we incorporate a five-stage RO with a finite state machine (FSM) to control the oscillating frequency of the RO via a cap-bank. Subsequently, the circuit can generate the chirping sequence by referencing its own signal and requiring no area-hungry RC units to modulate the oscillating frequency. The FSM counts the number of pulses and sequentially raises  $C_{\text{OSC}}$  by sending the control signal  $f_{\text{ctrl}}$  to the RO. Additionally, compared to the analog sweeping technique in [19], the FSM can digitally scale the total injection time ( $t_{\text{CI}}$ ), decided by the number of exciting cycles at each cap-bank value  $C_{\text{OSC}}$ :

$$t_{\rm CI} = N \times \sum_{i} t_i, \tag{2}$$

where *N* is the number of cycles to repeat at each  $C_{OSC}$  and  $t_i$  is the period of a single cycle at *i*-th  $C_{OSC}$ . The average amplitude of oscillation on the crystal after the chirping sequence is proportional to  $\sqrt{t_{CI}}$  [19, 29]. Thus, *N* can be programmed to adjust  $t_{CI}$ , rendering the XO easily compatible with different crystal parameters (i.e., an optimum  $t_{CI}$  depends on  $L_M$ ,  $R_M$  and  $R_N$  ( $C_S$ ) [19]). This digital-intensive architecture is more area-efficient. The oscillation signal at the RO has a varying duty cycle with VT variation. To maximize the injection energy (i.e., 50% duty cycle), the chirp-modulated signal is a div-by-2 output of the RO. This output serves as both the exciting signal for the crystal via the output driver and the trigger signal for the FSM. After the injection, the FSM automatically powers down the RO.
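As a rough numerical illustration of Eq. (2), the sketch below sums the per-code periods for the values of $N$ used in this work. The 36 MHz/12 MHz sweep limits and the 64 codes of a 6-bit cap-bank follow the design values given in Sect. 2.3. The linear period-versus-code law and the helper name `chirp_injection_time` are assumptions made only for this example, so the result indicates the order of magnitude rather than the exact $t_{\rm CI}$ of the fabricated circuit.

```python
def chirp_injection_time(n_cycles, f_high, f_low, n_codes=64):
    """Evaluate Eq. (2): t_CI = N * sum_i t_i over all cap-bank codes.

    Assumption: the period of the injected (div-by-2) signal rises linearly
    with the cap-bank code, from 1/f_high (code 0) to 1/f_low (last code).
    """
    t_first, t_last = 1.0 / f_high, 1.0 / f_low
    step = (t_last - t_first) / (n_codes - 1)
    periods = [t_first + step * i for i in range(n_codes)]
    return n_cycles * sum(periods)

F_H, F_L = 36e6, 12e6  # sweep limits of the injected signal in Hz

for n in (1, 2, 4, 8):
    t_ci = chirp_injection_time(n, F_H, F_L)
    print(f"N = {n}: t_CI = {t_ci * 1e6:.2f} us")
```

Under this assumption a single sweep ($N = 1$) lasts a few microseconds, and $t_{\rm CI}$ scales linearly with $N$, which is what allows the FSM to adapt the injection time to different crystals.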

#### Dual-Mode g<sub>m</sub> Scheme

The XO using a one-stage  $g_{\rm m}$  ( $A_{\rm XO-1}$ ), especially for the Pierce oscillator, is popular as it can optimize the steady-state PN [13, 19–21]. The  $g_{\rm m}$  offers a negative resistance compensating for the equivalent resistance of the crystal. Its value also determines the growth of the oscillation amplitude before the XO reaches the steady state.

From Fig. 3a, by omitting the resistive loss induced by  $A_{XO-1}$ , the impedance between the I/O ( $Z_{amp-1}$ ) becomes

$$Z_{\rm amp-1} = -\frac{g_{\rm m}}{4\omega_0^2 C_{\rm L}^2} + \frac{1}{j\omega_0 C_{\rm L}},\tag{3}$$



**Fig. 3** XO using (**a**) a one-stage $g_m$ ($A_{XO-1}$) for the steady state and (**b**) a three-stage $g_m$ ($A_{XO-3}$) for the startup

where  $C_{\rm L}$  is the designated crystal's load capacitance and  $\omega_0$  is the angular oscillating frequency  $2\pi f_0$ . With  $Z_{\rm amp}$  shunted by the crystal's stray capacitance ( $C_{\rm S}$ ), it affects the negative resistance ( $R_{\rm N}$ ) of the overall impedance looking from the crystal core ( $Z_{\rm C}$ ):

$$R_{\rm N} \equiv -\operatorname{Re}\left(Z_{\rm C}\right) = \frac{-\operatorname{Re}\left(Z_{\rm amp}\right)}{\left[\omega_0 C_{\rm S} \operatorname{Re}\left(Z_{\rm amp}\right)\right]^2 + \left[1 - \omega_0 C_{\rm S} \operatorname{Im}\left(Z_{\rm amp}\right)\right]^2}.\tag{4}$$

If $\omega_0 C_S |Z_{amp}| \ll 1$, the denominator of Eq. (4) approaches unity and we have $R_N \approx -\text{Re}(Z_{amp})$, which matches the expression in [13] for $A_{\text{XO-1}}$. A large $R_N$ favors more $t_S$ reduction according to Eq. (1). Yet, once $|Z_{amp}|$ becomes comparable with $1/\omega_0 C_S$ [i.e., a higher $g_m$ and thus a larger $|\text{Re}(Z_{amp})|$ to speed up the startup], we have to consider the effect of $C_S$. Then, we can deduce the specific $R_N$ of $A_{\text{XO-1}}$ (i.e., $R_{N,1}$) from Eq. (4) as

$$R_{\rm N,1} = \frac{4g_{\rm m}C_{\rm L}^2}{\left(g_{\rm m}C_{\rm s}\right)^2 + 16C_{\rm L}^2\omega_0^2\left(C_{\rm L} + C_{\rm S}\right)^2},\tag{5}$$

Taking the derivative of Eq. (5), we can obtain the maximum value of  $R_{N,1}$  with respect to  $g_m$  at a fixed  $C_L$ :

$$R_{\rm N,1,\,max} = \frac{C_{\rm L}}{2\omega_0 C_{\rm s} (C_{\rm L} + C_{\rm s})},\tag{6}$$

where we apply $g_m = 4\omega_0 C_L(1 + C_L/C_S)$. Evidently, $\text{Im}(Z_{\text{amp-1}})$ can only be negative (capacitive) for $A_{\text{XO-1}}$, and $R_{\text{N,1}}$ has an upper limit if $g_m$ is the only sizing parameter [19, 20]. For instance, $R_{\text{N,1}}$ is limited to 1.2 k$\Omega$ with $C_S = 2$ pF, $f_0 = 24$ MHz, and $C_L = 6$ pF, even if we apply an oversized $g_m = 14.5$ mS. There were efforts to raise $R_{\text{N,1}}$ by increasing $g_m$ or tuning $C_L$ temporarily during the startup [20, 30, 31]. Yet, increasing $g_m$ incurs larger power consumption and is unfavorable for reducing $E_S$. Further, Eq. (6) bounds $R_{\text{N,1}}$ to a maximum of $1/(2\omega_0 C_S)$ (i.e., 1.66 k$\Omega$ in the above example), approached when $C_L \gg C_S$ and $g_m \approx 4\omega_0 C_L^{2}/C_S$.
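The numbers quoted above can be checked directly from Eqs. (5) and (6). The following Python sketch is only a numerical restatement of those two equations with the example values $C_S = 2$ pF, $C_L = 6$ pF, and $f_0 = 24$ MHz; the function names are chosen here for illustration.

```python
import math

def r_n1(g_m, c_l, c_s, f0):
    """Negative resistance of the one-stage amplifier, Eq. (5)."""
    w0 = 2.0 * math.pi * f0
    den = (g_m * c_s) ** 2 + 16.0 * c_l**2 * w0**2 * (c_l + c_s) ** 2
    return 4.0 * g_m * c_l**2 / den

def g_m_opt(c_l, c_s, f0):
    """g_m that maximizes Eq. (5) at a fixed C_L."""
    return 4.0 * 2.0 * math.pi * f0 * c_l * (1.0 + c_l / c_s)

def r_n1_max(c_l, c_s, f0):
    """Upper limit of R_N,1 with respect to g_m, Eq. (6)."""
    return c_l / (2.0 * 2.0 * math.pi * f0 * c_s * (c_l + c_s))

C_L, C_S, F0 = 6e-12, 2e-12, 24e6   # example values from the text

g_opt = g_m_opt(C_L, C_S, F0)
r_max = r_n1_max(C_L, C_S, F0)
r_bound = 1.0 / (2.0 * 2.0 * math.pi * F0 * C_S)  # limit for C_L >> C_S

print(f"optimum g_m        : {g_opt * 1e3:.1f} mS")
print(f"R_N,1 at optimum   : {r_n1(g_opt, C_L, C_S, F0):.0f} ohm")
print(f"R_N,1,max (Eq. 6)  : {r_max:.0f} ohm")
print(f"1/(2*w0*C_S) bound : {r_bound:.0f} ohm")
```

Sweeping `g_m` above or below `g_opt` in `r_n1` lowers the result, which illustrates why adding transconductance alone cannot push $R_{N,1}$ past the ceiling set by $C_S$.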

Inspecting Eq. (4), if a positive $\text{Im}(Z_{\text{amp}})$ is possible to counteract the effect of $C_{\text{S}}$, we can boost $R_{\text{N}}$ to surmount the aforesaid $R_{\text{N}}$ limit. The idea is to mimic a µH-range inductor on-chip for this purpose. Interestingly, a three-stage $g_{\text{m}}$ ($A_{\text{XO-3}}$) with designated capacitive loads ($Z_{\text{o}1-2}$) can effectively mimic an inductive effect during the startup (Fig. 3b). Although [32] applied a multistage $g_{\text{m}}$ to save the XO's steady-state power, here we first explore its inductive feature for $t_{\text{S}}$ reduction. For $A_{\text{XO-3}}$, we define its $Z_{\text{amp}}$ as $Z_{\text{amp-3}}$. We can steer both $\text{Re}(Z_{\text{amp-3}})$ and $\text{Im}(Z_{\text{amp-3}})$ between positive and negative values by adjusting the inter-stage impedances, as demonstrated in [7]. For instance, if we set $g_{\text{m}1,2} = 0.4$ mS, $g_{\text{m},3} = 1.5$ mS, $r_{\text{o}1,2} = 7$ k$\Omega$, $C_{\text{L}} = 6$ pF, $\omega_0 = 2\pi \times 24$ MHz, and $C_{\text{o}1} = C_{\text{o}2} = 0.5$ pF, we obtain $Z_{\text{amp-3}} = -1.6 + j1.2$ k$\Omega$. Since $\text{Im}(Z_{\text{amp-3}}) > 0$, $Z_{\text{amp-3}}$ is inductive, which we can utilize to mitigate $C_{\text{S}}$ and break the limitation of Eq. (6). With these values, $\text{Re}(Z_{\text{C-3}}) = -2.4$ k$\Omega$ due to the inductive $A_{\text{XO-3}}$. We can thus achieve a higher $R_{\rm N}$ even with similar power consumption when compared with $A_{\rm XO-1}$, enabling an energy-efficient startup. Due to the intricate expression of $R_{\rm N,3}$, we optimize it numerically before proceeding to the transistor-level implementation. Besides, the technique is also applicable to different $f_0$.

For the same power budget, however, $A_{\rm XO-3}$ is inferior to $A_{\rm XO-1}$ in terms of the steady-state PN, as each stage shares a smaller bias current and the noise accumulates. Also, $\text{Im}(Z_{\rm C-3})$, which determines the XO's oscillating frequency, deviates from the designated value due to the presence of $C_{\rm o1}$ and $C_{\rm o2}$. This affects the accuracy of $f_0$. Consequently, it is desirable to implement a dual-mode $g_{\rm m}$ scheme that balances the startup and steady-state performances. During the startup, when the PN and accuracy of $f_0$ are irrelevant, we enable $A_{\rm XO-3}$ and connect it to the crystal to attain a larger $R_{\rm N}$ for fast startup. When the crystal gains sufficient energy for oscillation, $A_{\rm XO-3}$ is turned off and disconnected from the crystal while $A_{\rm XO-1}$ takes over. The scheme thereby combines the advantages of $A_{\rm XO-3}$ (fast startup) and $A_{\rm XO-1}$ (low PN and accurate $f_0$).
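The boosting effect of an inductive $Z_{\rm amp}$ can be reproduced with a few lines of complex arithmetic. The sketch below evaluates $R_{\rm N} = -\operatorname{Re}(Z_{\rm C})$ by placing the amplifier impedance in parallel with $C_{\rm S}$, which is equivalent to Eq. (4). It uses the example $Z_{\text{amp-3}}$ from the text and, for comparison, an ideal one-stage amplifier from Eq. (3) with the same total $g_{\rm m}$ of 2.3 mS. It is an idealized check that ignores transistor parasitics, and the function name is hypothetical.

```python
import math

def negative_resistance(z_amp, c_s, f0):
    """R_N = -Re(Z_C), with Z_C the amplifier impedance in parallel with C_S."""
    z_cs = 1.0 / (1j * 2.0 * math.pi * f0 * c_s)
    z_c = z_amp * z_cs / (z_amp + z_cs)
    return -z_c.real

F0, C_S, C_L = 24e6, 2e-12, 6e-12   # example values from the text
W0 = 2.0 * math.pi * F0

# Three-stage example impedance quoted in the text (ohms).
z_amp3 = complex(-1.6e3, 1.2e3)

# Ideal one-stage impedance from Eq. (3) with the same total g_m.
g_m_total = 0.4e-3 + 0.4e-3 + 1.5e-3
z_amp1 = complex(-g_m_total / (4.0 * W0**2 * C_L**2), -1.0 / (W0 * C_L))

r_n3 = negative_resistance(z_amp3, C_S, F0)
r_n1 = negative_resistance(z_amp1, C_S, F0)
r_limit = C_L / (2.0 * W0 * C_S * (C_L + C_S))  # Eq. (6)

print(f"R_N,3 (inductive Z_amp-3) : {r_n3:.0f} ohm")
print(f"R_N,1 (ideal, same g_m)   : {r_n1:.0f} ohm")
print(f"R_N,1,max from Eq. (6)    : {r_limit:.0f} ohm")
```

Under these idealized assumptions the three-stage impedance yields roughly 2.5 k$\Omega$, in line with the $-2.4$ k$\Omega$ quoted above and about twice the ceiling of Eq. (6), while the ideal one-stage amplifier with the same total $g_{\rm m}$ stays below 0.4 k$\Omega$.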

### 2.3 Transistor-Level Implementation

We design the core elements of the XO (e.g.,  $A_{XO-1}$ ,  $A_{XO-3}$ , and RO) to operate below a 0.5 V  $V_{DD}$ . Only the static and DC circuits (digital logics and constant- $g_m$  bias circuit) operate at 0.7 V to facilitate the design. These circuits, mostly powered off during the steady state, consume <5  $\mu$ A. Thus, an on-chip switched capacitor charge pump can easily generate the 0.7 V supply and share it with other blocks at the system level as described in [26].

Subthreshold common-source (CS) amplifiers with *resistive loads* (Fig. 4a, b) constitute the basis of both $A_{XO-1}$ and $A_{XO-3}$. Unlike other solutions that use current-source loads [13, 20, 21], the resistive load aids in preserving a moderate $g_m$ even with $V_{DD} < 0.35$ V, for a small bias current (simulated at $I_{dc} = 100\ \mu$A). For instance, the simulated $g_m$ of $A_{XO-1}$ is 1.3 mS at $V_{DD} = 0.3$ V and -40 °C, being four times higher than that of the current-source load (assuming an identical $g_m$ with $V_{\rm DD} = 0.35$ V at 20 °C). Further, at high temperature, the intrinsic output resistance of the transistor decreases rapidly. This affects the stability of $R_{\rm N}$ and causes variation in $t_{\rm s}$, especially for $A_{\rm XO-3}$. The resistive load in $A_{\rm XO-1}$ comes with a trade-off of lower immunity to power supply noise (the noise power from $V_{\rm DD}$ modulated to the XO output is 3 dB larger with the resistive load than with its current-source-load counterpart at 1 kHz offset). Also, it has a larger $f_0$ variation since the $g_{\rm m}$ of $A_{\rm XO-1}$ is not fixed. Still, this is manageable for the BLE standard (< ±50 ppm [33]), as well as other IoT protocols (e.g., ZigBee: ±40 ppm). A small nominal $I_{\rm dc}$ of 100 µA is adequate for the expected PN.

Fig. 4 Circuit implementation of (a) $A_{XO-1}$ and (b) $A_{XO-3}$

A feedback resistor $R_{\rm F}$ self-biases $A_{\rm XO-1}$, whereas $A_{\rm XO-3}$ is an AC-coupled three-stage CS amplifier aided by a constant-$g_{\rm m}$ bias circuit. As the $g_{\rm m}$ of $A_{\rm XO-3}$ has a considerable impact on $R_{\rm N,3}$, the constant-$g_{\rm m}$ bias circuit keeps $A_{\rm XO-3}$ inductive and $R_{\rm N,3}$ stable for a robust and fast startup against PVT. We choose the channel lengths of the transistors such that their output resistances $r_{o1-3}$ are ~10× larger than the resistors $R_{1-3}$. This reduces the temperature dependency of $R_{\rm N,3}$, as the load of each stage is then dominated by $R_{1-3}$ rather than $r_{o1-3}$. We design $A_{\rm XO-3}$ to have a similar current consumption (~100 µA) to $A_{\rm XO-1}$. As such, the power consumption does not vary instantaneously, easing the design and layout of the power supply. Each current branch includes CMOS switches with which we can isolate $A_{\rm XO-1}$ or $A_{\rm XO-3}$ from the crystal, while lowering their leakage power (simulated <14 nW at 0.35 V and 20 °C) when disabled. The switches are sized such that their on-resistances are negligible when compared with $R_{1-3}$.

Both the parasitic capacitances of the transistors and the finite I/O resistance of  $A_{XO-3}$  affect the  $R_{N,3}$ . Thus, we should further optimize  $R_{N,3}$  via simulation. The total  $g_m$  budget is 2.3 mS (total bias current: 100 µA, assuming a  $g_m/I_D = 23 \text{ V}^{-1}$ ), with  $r_{o1-3}$  set according to the  $g_m$  of each gain stage. Figure 5a shows the locus plots of  $Z_{\text{amp-1}}$  and  $Z_{\text{amp-3}}$  implemented with practical transistors and integrated passives.



**Fig. 5** (a) Locus plot of the  $Z_{\text{amp-1,-3}}$  against frequency. (b) Simulated  $R_{\text{N,1}}$  and  $R_{\text{N,3}}$  with a fixed total  $g_{\text{m}}$  budget of 2.3 mS and the boosting ratio against frequency

$Z_{\text{amp-1}}$ is capacitive over all frequencies, while $Z_{\text{amp-3}}$ is inductive over the 13–46 MHz range, which is compatible with different $f_0$. Optimized at the most popular XO frequency of 24 MHz, the optimum $R_{\text{N},3}$ is 2.4 k$\Omega$ after paralleling it with a $C_{\text{S}}$ of 2 pF. This result is ~9× higher than $R_{\text{N},1}$ under the same $g_{\text{m}}$ budget and surpasses $R_{\text{N},1,\text{max}}$ (Fig. 5b). The boosting effect is insensitive to frequency between 15 and 34 MHz, over which $R_{\text{N},3}/R_{\text{N},1} > 6$.

Ideally, we should enable $A_{XO-3}$ during the entire startup phase. Yet, the $g_m$'s of $M_{1-3}$ deviate from their small-signal values as the oscillation amplitude grows. This results in a degraded $R_{N,3}$. As a consequence, the optimum active time of $A_{XO-3}$, $t_{sw}$, is the time when $R_{N,3} \approx R_{N,1}$, beyond which $A_{XO-3}$ no longer helps $t_s$ reduction. We can find the optimal $t_{sw}$ via simulations with measured crystal parameters to avoid any extra detection and control mechanism.

To realize the SSCI, we implement a five-stage RO constituted by CS amplifiers with source degeneration. Compared to an RO with inverters or a relaxation oscillator, an RO with CS amplifiers balances the frequency stability and compatibility with the sub-0.5 V $V_{\rm DD}$. The source resistor ($R_{\rm S}$ in Fig. 2) also reduces the variation of the oscillating frequency against $V_{\rm DD}$. From simulation, the frequency variation of the RO reduces by ~20% over a 0.3–0.5 V $V_{\rm DD}$. We set $R_{\rm D}$ as 36 k$\Omega$. The current consumption of the RO is 20 $\mu$A. We implemented the div-by-2 unit and FSM with standard logic.

We designed the $f_{\rm H}$ and $f_{\rm L}$ of the SSCI module as 36 and 12 MHz, respectively, chosen to satisfy $f_{\rm L} < f_{\rm m} < f_{\rm H}$ even with PVT variation (Fig. 6). The total size of $C_{\rm OSC}$, simulated to be 135 fF, yields an $f_{\rm L}$ of 12 MHz (after div-by-2). Then, we determine the resolution of the cap-bank, which is decided by the minimum duration of $t_{\rm CI}$: since a complete chirping sequence must sweep all of the states at least once, we set the minimum $t_{\rm CI}$ (i.e., N = 1) as the resolution (number of pulses), defined in Eq. (2). The optimum $t_{\rm CI}$, according to [19] and the measured crystal parameters, is 4.6 µs. Thus, we set $C_{\rm OSC}$ as a binary-coded 6-bit cap-bank (unit cap: 2.14 fF), corresponding to a minimum $t_{\rm CI}$ of 4 µs with the designated $f_{\rm H}$ and $f_{\rm L}$. Even though there is a discrepancy between the applied and optimum $t_{\rm CI}$, it hardly affects $t_{\rm s}$, as $t_{\rm CI}$ is short when compared with $t_{\rm s}$. As the amplitude of oscillation after the CI is proportional to $\sqrt{t_{\rm CI}}$, even though the applied $t_{\rm CI}$ is 13% shorter than the optimum, the amplitude is only 7% smaller. Owing to the fast growth of the oscillation amplitude with $A_{\rm XO-3}$ (time constant in Eq. (1): 9.33 µs), $A_{\rm XO-3}$ quickly compensates for this discrepancy: within the 0.6 µs difference, the amplitude grows by ~1.07×, which offsets the 7% deficit. No significant difference in $t_{\rm s}$ will emerge, even with PVT variation on $t_{\rm CI}$ (Fig. 7).

**Fig. 6** (a) Monte Carlo-simulated $f_L$ with $V_{DD} = 0.4$ V and T = 90 °C; (b) Monte Carlo-simulated $f_H$ with $V_{DD} = 0.3$ V and T = -40 °C. N = 30 for both cases
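The tolerance to a non-optimal $t_{\rm CI}$ argued above follows from two short calculations: the square-root dependence of the post-injection amplitude on $t_{\rm CI}$, and the exponential growth of Eq. (1) with the 9.33 µs time constant. The Python sketch below reproduces them using only the values quoted in the text; the variable names are chosen here for illustration.

```python
import math

T_CI_APPLIED = 4.0e-6   # minimum t_CI of the 6-bit cap-bank (N = 1), seconds
T_CI_OPTIMUM = 4.6e-6   # optimum t_CI quoted in the text, seconds
TAU_XO3 = 9.33e-6       # envelope time constant with A_XO-3, seconds

# Amplitude after the chirp scales with sqrt(t_CI).
amp_ratio = math.sqrt(T_CI_APPLIED / T_CI_OPTIMUM)

# Envelope growth of Eq. (1) during the 0.6 us that was not spent injecting.
growth = math.exp((T_CI_OPTIMUM - T_CI_APPLIED) / TAU_XO3)

# Time the exponential growth needs to recover the amplitude deficit.
catch_up = TAU_XO3 * math.log(1.0 / amp_ratio)

print(f"amplitude vs. optimum : {amp_ratio * 100:.1f} %")
print(f"growth over 0.6 us    : {growth:.3f}x")
print(f"time to recover       : {catch_up * 1e6:.2f} us")
```

The recovery time of well under 1 µs is negligible against a $t_{\rm s}$ of several hundred microseconds, which is why a coarse $t_{\rm CI}$ resolution is acceptable.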

With $C_{\rm OSC} = 0$ fF, the RO oscillates at $2f_{\rm H}$, with its frequency governed by the parasitic capacitances. The FSM then increases $C_{\rm OSC}$ bit by bit, according to N, up to $C_{\rm OSC} = 135$ fF, at which the RO oscillates at $2f_{\rm L}$. In this work, N is digitally configurable among 1, 2, 4, and 8.

### 2.4 Experimental Results and Comparison with State of the Art

The XO, fabricated in 65 nm CMOS with fixed on-chip  $C_{\rm L}$  of 6 pF, occupied an active area of 0.023 mm<sup>2</sup> (Fig. 8a), of which 36% corresponds to the  $C_{\rm L}$  (Fig. 8b). The target  $f_0$  can be flexible between 16 and 24 MHz. We first verify the SSCI functionality. Figure 9a exhibits the measurement of the oscillating frequency of the RO (after div-by-2) against  $C_{\rm OSC}$ , which is consistent with the post-layout simulation. The average  $f_{\rm L}$  and  $f_{\rm H}$  across five dies at room temperature are 10.93 MHz ( $\sigma$ : 0.32 MHz) and 35.96 MHz ( $\sigma$ : 1.21 MHz), respectively. Figure 9b confirms the chirping sequence with N = 1, and Fig. 9c plots the duration of  $t_{\rm CI}$  against N.

Then, we tested the XO with a 24 MHz crystal (package: $3.2 \times 2.5 \text{ mm}^2$) without any startup aid at room temperature (20 °C) and $V_{DD} = 0.35 \text{ V}$. The measured crystal parameters $L_M$, $R_M$, $C_M$, and $C_S$ are 11.1 mH, 19 $\Omega$, 3.95 fF, and 1.3 pF, respectively. Under these conditions, we have $t_s = 1.3$ ms (Fig. 10a). The $t_s$ decreases to 530 µs with $A_{XO-3}$ enabled during the startup.

Fig. 8 (a) Chip micrograph. (b) Area breakdown of the XO

Fig. 9 (a) Measured and simulated oscillating frequencies of the RO versus $C_{\text{OSC}}$ at different conditions, robust to cover $f_0$ of the crystal even with $V_{\text{DD}}$ and temperature variations. (b) Measured chirping sequence (N = 1). (c) Injection duration $t_{\text{CI}}$ against N. For the latter two figures, $V_{\text{DD}} = 0.35 \text{ V}$, T = 20 °C

Fig. 10 Measured startup waveform (a) without startup aid and (b) with SSCI and $A_{XO-3}$ enabled

We estimate  $R_{N,1}$  and  $R_{N,3}$  from the growth of the oscillation amplitude according to Eq. (1), which we can write as

$$\ln\left(\frac{A_{\rm env}(t_0 + \Delta t)}{A_{\rm env}(t_0)}\right) = \frac{R_{\rm N} - R_{\rm M}}{2L_{\rm M}} \cdot \Delta t.\tag{7}$$

By measuring the growth of the oscillation amplitude within a specific time interval, we can estimate the $R_{\rm N}$ of the XO. For $A_{\rm XO-1}$, the growth of oscillation is $1.01\times/\mu$s, and thereby we calculate $R_{\rm N,1}$ as 230 $\Omega$ (Fig. 11), which is close to the prediction (as described in Sect. 2.3). Similarly, we find $R_{\rm N,3} \approx 2.2\ \text{k}\Omega$. For two reasons, the reduction of $t_{\rm s}$ is not commensurate with the $R_{\rm N}$-boosting ratio between $A_{XO-3}$ and $A_{XO-1}$. Firstly, as described in Sect. 2.3, $M_{1-3}$ deviate from their nominal operating points as the amplitude grows, which deteriorates $R_{N,3}$. We can reveal this by measuring $t_s$ against $t_{sw}$ (Fig. 12). When $t_{sw}$ is short (<60 µs) and $M_{1-3}$ remain in the subthreshold region, the small-signal model is still valid to estimate $t_s$ against $t_{sw}$ [i.e., the slope of the curve (~ -10) closely matches $-R_{N,3}/R_{N,1} + 1$]. As $t_{sw}$ further increases, the oscillation drives $M_{1-3}$ away from their original operating points and worsens $R_{N,3}$. Hence, the slope of the curve declines and eventually reaches zero, where $A_{XO-3}$ no longer aids $t_s$ reduction. Secondly, the XO entails an overhead time to enter the steady state after switching to $A_{XO-1}$: after the switch, the XO still takes ~380 µs to enter the steady state. Here, the nonideality of the ULV $A_{XO-3}$ limits the improvement in $t_s$. In fact, for amplifiers with a standard I/O voltage and a higher output swing, the reduction of $t_s$ should be more profound and better matched with the $R_N$-boosting ratio.
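The extraction of $R_{\rm N}$ from the measured envelope growth via Eq. (7) can be sketched as follows. The example uses the measured crystal parameters ($L_{\rm M} = 11.1$ mH, $R_{\rm M} = 19\ \Omega$) and the $1.01\times/\mu$s growth quoted above; the function names are hypothetical, and the inputs are the rounded figures from the text rather than raw measurement data.

```python
import math

def estimate_r_n(growth_ratio, delta_t, l_m, r_m):
    """Solve Eq. (7) for R_N given the envelope growth over delta_t."""
    return r_m + 2.0 * l_m * math.log(growth_ratio) / delta_t

def growth_per_interval(r_n, delta_t, l_m, r_m):
    """Envelope growth factor over delta_t predicted by Eq. (1)."""
    return math.exp((r_n - r_m) * delta_t / (2.0 * l_m))

L_M, R_M = 11.1e-3, 19.0   # measured crystal parameters from the text

# One-stage amplifier: growth of 1.01x per microsecond.
r_n1 = estimate_r_n(1.01, 1e-6, L_M, R_M)

# Three-stage amplifier: growth per microsecond implied by R_N,3 = 2.2 kOhm.
growth_xo3 = growth_per_interval(2.2e3, 1e-6, L_M, R_M)

print(f"R_N,1 from 1.01x/us growth : {r_n1:.0f} ohm")
print(f"growth with R_N,3 = 2.2 k  : {growth_xo3:.3f}x per us")
```

With the rounded $1.01\times/\mu$s growth the sketch gives roughly 240 $\Omega$, within a few percent of the ~230 $\Omega$ reported. The logarithm makes the estimate very sensitive to the last digit of the growth factor, so closer agreement should not be expected from rounded inputs.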

With both $A_{XO-3}$ and SSCI enabled, $t_S$ further decreases to 400 µs (3.3× reduction), and the corresponding $E_S$ is 14.2 nJ (2.8× reduction) (Fig. 10b). When switching from $A_{XO-3}$ to $A_{XO-1}$, which have different output impedances and, consequently, operating frequencies, there is an instantaneous change in the output swing, since the magnitude of the current passing through the crystal does not change abruptly. The percentages of energy consumed in the startup phase by the SSCI, $A_{XO-3}$, and $A_{XO-1}$ are 7%, 39%, and 53%, respectively. We verified that $t_{sw}$ can tolerate ±50% uncertainty for <10% $t_S$ variation, implying that we can obtain an adequate $t_s$ even with a nonoptimal $t_{sw}$ (e.g., due to PVT variation and the spread of crystal parameters). This also justifies that the existing RO is good enough to control $t_{sw}$, avoiding any external detection and control mechanism.

Fig. 14 Measured XO ($f_0 = 24$ MHz) performances. (a) Startup time against $V_{DD}$. (b) Startup time against temperature

The transient frequency of the XO takes ~300 $\mu$s to settle to a $\pm 20$ ppm $f_0$ accuracy (i.e., 50 kHz drift from the center frequency of 2.44 GHz in a packet, as defined in [33]). This result is 3.5× faster than the case without startup aid (Fig. 13). The steady-state power is 31.8 $\mu$W at 0.35 V, and the PN is -134 dBc/Hz at 1 kHz offset, being adequate for most IoT applications and comparable to other state-of-the-art XOs with a standard voltage (e.g., PN of -136 dBc/Hz at 1 kHz and $f_0 = 26$ MHz in [10]).
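To make the ±20 ppm figure concrete, the sketch below converts a fractional frequency error into an absolute offset. It assumes the 2.44 GHz carrier is derived from the XO by an ideal frequency multiplication, so the fractional error carries over unchanged; the helper name `ppm_to_hz` is illustrative.

```python
def ppm_to_hz(ppm: float, carrier_hz: float) -> float:
    """Absolute frequency offset caused by a fractional error given in ppm."""
    return ppm * 1e-6 * carrier_hz

CARRIER_HZ = 2.44e9   # center frequency quoted in the text
XO_HZ = 24e6          # crystal frequency of the measured XO

drift_at_carrier_hz = ppm_to_hz(20, CARRIER_HZ)  # offset seen at the RF carrier
drift_at_xo_hz = ppm_to_hz(20, XO_HZ)            # same fractional error at the XO

print(f"20 ppm at 2.44 GHz: {drift_at_carrier_hz / 1e3:.1f} kHz")
print(f"20 ppm at 24 MHz:   {drift_at_xo_hz:.0f} Hz")
```

The carrier offset comes out just under the 50 kHz quoted above, which is the rounded form of the same limit.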

The XO can uphold a steady-state output swing >80% of $V_{DD}$ for $V_{DD} = 0.3$–0.5 V. The $t_S$ varies <25% from its mean (400 µs) for $V_{DD} = 0.3$–0.5 V (Fig. 14a). If $V_{DD}$ drops to 0.25 V, only the RO of the SSCI fails to start, and $A_{XO-3}$ is still in place to aid the $t_S$ reduction. Over -40 to 90 °C, the $t_S$ variation is <7.5% (Fig. 14b). We obtained similar results for a 16 MHz crystal (i.e., $\Delta f_0/f_0 = 13.4$ ppm over 0.3–0.5 V, $\Delta f_0/f_0 = 21.9$ ppm over -40 to 90 °C, and a $t_S$ variation of 9.8%).

Table 2 benchmarks the performance of the XO against the prior art. In terms of $E_S$, this work is >2.6× better than [20] and slightly higher than [21]. Furthermore, we can consider this circuit in the vanguard, since it proves the feasibility of regulation-free operation under a wide range of sub-0.5 V $V_{DD}$, while conforming to the frequency-stability specification of the BLE (Bluetooth Low Energy) standard.

| | This work | JSSC'16 [3.19] | ISSCC'16 [3.13] | ISSCC'17 [3.20] | JSSC'18 [3.26]<sup>e</sup> |
|---|---|---|---|---|---|
| Applications | BLE | Bluetooth | BLE | BLE | N/A |
| Fast startup techniques | ULV inductive three-stage $g_m$ + SSCI | Chirp injection + $g_m$-boosting | Dithered injection | Dynamic load + $g_m$-boosting | Precisely-timed CFI |
| Steady-state techniques | ULV one-stage $g_m$ + resistive load | One-stage inverter | One-stage $g_m$ + current-source load | | |
| CMOS process (nm) | 65 | 180 | 65 / 90 | 90 | 65 |
| Active area (mm<sup>2</sup>) | 0.023 | 0.12 | 0.08 | 0.072 | 0.09 (per XO) |
| Supply voltage, $V_{DD}$ (V) | 0.35<sup>a</sup> | 1.5 | 1.68 | 1.0 | 1.0 |
| Temperature, $T_{Range}$ (°C) | -40 to 90 | -30 to 125 | -40 to 90 | -40 to 90 | -40 to 85 |
| $C_L$ (pF) | 6 | 8 (off-chip) | 6 / 9 | 10 | 9 / 8 |
| Frequency, $f_0$ (MHz) | 16 / 24 | 39.25 | 24 / 24 | 24 | 50 / 10 |
| Startup energy, $E_S$ (nJ) | 15.8 / 14.2 | 349 | - / - | 36.7 | 13.3 / 12 |
| Startup time, $t_S$ (µs) | 460 / 400 | 158 | 64 / 435 | 200<sup>d</sup> | 2.2 / 10 |
| $\Delta t_S/t_S$ over $T_{Range}$ | 9.8% / 7.5% | 7% | ±35% / ±20% | 26.6% | 7% / 3% |
| $\Delta f_0/f_0$ (ppm) versus $T_{Range}$ | 21.9<sup>b</sup> / 14.1<sup>b</sup> | ±5.5 | N/A | N/A | N/A |
| $\Delta f_0/f_0$ (ppm) versus $V_{DD}$ | 13.4<sup>c</sup> / 17.9<sup>c</sup> | ±0.6 (1.2–1.8 V) | N/A | N/A | N/A |
| Steady-state power (µW) | 31.6 / 31.8 | 181 | 393 / 693 | 95 | 195 / 45.5 |

Table 2 Performance summary and comparison with recent art. Entries of the form a / b list the two reported configurations of a design, in the same order as in the $C_L$ and $f_0$ rows

<sup>a</sup>Digital and constant-$g_m$ bias circuits are at 0.7 V (current budget: 5 $\mu$A), generated by an on-chip charge pump as in [29]

<sup>b</sup>@ 0.35 V

<sup>c</sup>Across 0.3–0.5 V @ 20 °C

<sup>d</sup>Amplitude >90% and  $\Delta f_0/f_0 < \pm 20$  ppm

<sup>e</sup>Only results from similar crystal packages compared

## 3 A 0.35 V 5200 μm<sup>2</sup> 2.1 MHz Temperature-Resilient Relaxation Oscillator with 667 fJ/cycle Energy Efficiency Using an Asymmetric Swing-Boosted RC Network and a Dual-Path Comparator

## 3.1 Motivation

For the crystal-less IoT node [34] and wakeup receiver [35], low-power and fully integrated kHz-to-MHz clock sources with moderate frequency inaccuracy are pivotal to their operations. For instance, [35] requires a frequency reference with ~2.5% frequency accuracy to calibrate the digitally controlled oscillator of the wakeup receiver. Although the crystal oscillator offers better frequency stability, a typical MHz-range crystal oscillator can consume tens of $\mu$W, which is impermissible for the always-on module of an IoT node. In fact, we expect a $\mu$W-range power budget in the standby mode [23]. Also, the presence of an off-chip crystal can restrict the volume miniaturization of the IoT nodes.

The ring oscillator is a viable solution among the fully integrated oscillators due to its outstanding power efficiency, tuning range, and compactness [36]. Yet, its oscillating frequency is prone to PVT variations, which require extra circuitry for compensation. The LC oscillator strikes a proper balance between the integration level and frequency stability [37, 38]. Yet, the LC tank is too bulky for MHz-range applications.

Recent relaxation oscillators (RxOs) [39–47] proved their potential by attaining fast settling time, moderate intrinsic frequency stability, tiny footprint, and high energy efficiency. A typical RxO consists of a period-defining network, amplifiers, and logic gates. The period-defining network periodically (dis)charges the capacitors therein, and the amplifiers compare the voltages on the capacitors with a reference voltage. The logic gates read the output from the amplifiers and generate the required output correspondingly.

For IoT nodes powered by sub-0.5 V energy-harvesting sources such as the thermoelectric generator and solar cell, ULV operation adds to the RxO design constraints. Existing RxO architectures [39–44] do not favor sub-0.5 V operation, which severely confines the voltage headroom. Hence the linearity and accuracy of the current and voltage references are inferior, and their degraded precisions can affect the RxO's stability. Also, at high temperature, the transistor's leakage current ( $I_{\text{Leak}}$ ) limits the performance of the current/voltage reference.

Recently, a swing-boosted differential RxO proposed in [45] featured a symmetric swing-boosted RC network to define the period of the RxO, eliminating the need for a current or voltage reference while delivering a swing-boosted output to improve the noise performance. As this architecture does not entail a current or voltage reference, it allows scaling down of the $V_{\rm DD}$ without affecting the RC network precision. Nevertheless, the common-mode voltage ($V_{\rm CM}$) of its RC network is restricted to mid-$V_{\rm DD}$, which implies $V_{\rm CM} < 0.25$ V for sub-0.5 V operation, thereby hindering the operation of its subsequent comparator.

This section proposes an RxO that surmounts the challenges of sub-0.5 V operation and achieves high area and energy efficiencies. The key techniques are (1) an asymmetric RC network to free the $V_{\rm CM}$ restriction while preserving a swing-boosted output and (2) a dual-path comparator with delay compensation to allow temperature resilience. Prototyped in 28 nm CMOS, the RxO occupied a tiny area (5200 µm<sup>2</sup>) and attained superior energy efficiency (667 fJ/cycle) and figure of merit (FoM<sub>1</sub> = 181 dB) with respect to the prior art.

## 3.2 Asymmetric Swing-Boosted RC Network

Figure 15a depicts the schematic of the swing-boosted RC network. As demonstrated in [45], the RxO utilizing this RC network exhibits low jitter ( $\sigma_{jit}$ ) attributed to its swing-boosted output voltages ( $V_{x,y}$ ) from the symmetric RC network (k = 1).

Considering $\emptyset_1$ (Fig. 15b), $V_x$ is initially at ground with $V_{top}$ connected to $V_{DD}$, whereas $V_y$ is initially at $V_{DD}$ with $V_{bot}$ connected to ground. $V_x$ charges toward $V_{DD}$ and $V_y$ discharges toward ground with time constant ($\tau$) RC. When they cross at $V_{CM}$ such that $V_y < V_x$, the comparator inverts its outputs. Consequently, the chopper swaps the connections, so that $V_{top}$ now connects to ground and $V_{bot}$ connects to $V_{DD}$. As the charge across the capacitors is conserved, $V_x$ and $V_y$ change to $V_{CM} + V_{DD}$ and $V_{CM} - V_{DD}$ after the transition. The process in $\emptyset_2$ is complementary, and the operation repeats $\emptyset_1$ after another transition. Hence, the differential signal $V_{x,y}$ has a swing of $2 \times V_{DD}$. Since the $\sigma_{jit}$ of the RxO is inversely proportional to the slope of $V_{x,y}$ at the threshold ($S_{xy}$), raising the swing of $V_{x,y}$ increases $S_{xy}$ and improves the $\sigma_{jit}$.

Fig. 15 (a) Simplified schematic of the swing-boosted differential RxO. (b) Timing diagram of the output of the RC network with k = 1, with $V_{CM}$ fixed to 0.5 $V_{DD}$. (c) Timing diagram of the output of the RC network with k > 1 such that $V_{CM,U}$ and $V_{CM,D}$ suit the design of the subsequent ULV comparator (this work)

The symmetry of the RC network restricts $V_{\rm CM}$ to mid-$V_{\rm DD}$ regardless of the oscillation phase ($\emptyset_{1,2}$). As $V_{\rm DD}$ decreases to <0.5 V, $V_{\rm CM}$ shrinks to <0.25 V, which is insufficient to properly bias a differential pair with a tail current source. To break this limit, we propose an asymmetric RC network (k > 1), in which one RC branch has a larger $\tau$. As Fig. 15c shows, this lets $V_x$ and $V_y$ (dis)charge with different $\tau$. The leaps on $V_x$ and $V_y$ after the chopping are still $\pm V_{\rm DD}$, whereas the $V_{\rm CM}$ of $V_x$ and $V_y$ alternates between $V_{\rm CM,U}$ and $V_{\rm CM,D}$ in $\emptyset_1$ and $\emptyset_2$, respectively. As such, we can choose a k that yields a suitable $V_{\rm CM,U}$ ($V_{\rm CM,D}$) and thereby favors the operation of the subsequent ULV comparator.

Analyzing the waveform in Fig. 15c, we can derive four equations governing the (dis-)charge of the asymmetric RC network:

$$(V_{\rm CM,D} + V_{\rm DD})e^{-\frac{T_1}{kRC}} = V_{\rm CM,U},\tag{8}$$

$$(V_{\rm CM,D} - 2V_{\rm DD})e^{-\frac{T_1}{RC}} + V_{\rm DD} = V_{\rm CM,U},\tag{9}$$

$$(V_{\rm CM,U} + V_{\rm DD})e^{-\frac{T_2}{RC}} = V_{\rm CM,D},\tag{10}$$

$$(V_{\rm CM,U} - 2V_{\rm DD})e^{-\frac{T_2}{kRC}} + V_{\rm DD} = V_{\rm CM,D}.\tag{11}$$

Assuming that  $T_1 = T_2$ , solving Eqs. (8)–(11) leads to

$$\left(\frac{V_{\rm DD} - V_{\rm CM,D}}{V_{\rm DD} + V_{\rm CM,D}}\right)^k = \frac{V_{\rm CM,D}}{2V_{\rm DD} - V_{\rm CM,D}},\tag{12}$$

$$\left(\frac{V_{\rm CM,U}}{2V_{\rm DD} - V_{\rm CM,U}}\right)^{k} = \frac{V_{\rm DD} - V_{\rm CM,U}}{V_{\rm DD} + V_{\rm CM,U}},\tag{13}$$

$$k = \frac{T}{2\text{RC}} / \ln\left(\frac{1 + 3e^{-T/2\text{RC}}}{1 - e^{-T/2\text{RC}}}\right),\tag{14}$$

where  $T_1 = T_2 = T/2$ . Therefore, we can calculate the required k to achieve a sufficient separation of  $V_{\text{CM},\text{U}}$  ( $V_{\text{CM},\text{D}}$ ) by numerically solving Eqs. (12) and (13), as well as the corresponding T by Eq. (14). Figure 16a illustrates the  $V_{\text{CM},\text{U}}$ ,  $V_{\text{CM},\text{D}}$ , and T versus k.
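The numerical solution mentioned above can be sketched as follows. This is a minimal illustration under the ideal model of Eqs. (8)–(14), not the authors' design script: it solves Eq. (12) for the normalized $V_{\rm CM,D}/V_{\rm DD}$ by bisection, uses the fact that substituting $V_{\rm CM,U} = V_{\rm DD} - V_{\rm CM,D}$ turns Eq. (13) into Eq. (12), and obtains T from Eq. (10) with $T_2 = T/2$. Function names and the chosen k values are illustrative.

```python
import math

def solve_vcm_d(k: float, tol: float = 1e-12) -> float:
    """Solve Eq. (12) for v = V_CM,D / V_DD by bisection on (0, 1)."""
    def residual(v: float) -> float:
        return ((1 - v) / (1 + v)) ** k - v / (2 - v)

    lo, hi = 1e-9, 1 - 1e-9  # residual is positive at lo and negative at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def solve_vcm_u(k: float) -> float:
    """V_CM,U / V_DD; Eq. (13) is Eq. (12) with V_CM,U = V_DD - V_CM,D."""
    return 1 - solve_vcm_d(k)

def period_over_rc(k: float) -> float:
    """T / RC from Eq. (10) with T2 = T/2: exp(-T/2RC) = v / (2 - v)."""
    v = solve_vcm_d(k)
    return -2 * math.log(v / (2 - v))

V_DD = 0.35  # supply voltage used in this section
for k in (1.0, 1.5, 2.0, 2.4, 3.0):
    print(f"k={k:.1f}  V_CM,D={solve_vcm_d(k) * V_DD:.3f} V  "
          f"V_CM,U={solve_vcm_u(k) * V_DD:.3f} V  T={period_over_rc(k):.3f} RC")
```

For k = 1 the sketch returns the mid-$V_{\rm DD}$ common mode and $T = 2RC\ln 3$, and the resulting T can be substituted back into Eq. (14) to recover k as a consistency check.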

The  $S_{xy}$  around the threshold crossing determines the  $\sigma_{jit}$  with the following equation [48]:

$$\sigma_{jit} = \alpha \frac{V_{n,xy}}{S_{xy}},\tag{15}$$

where $\alpha$ is a constant of proportionality and $V_{n,xy}$ is the equivalent noise from the RC network and the subsequent comparator appearing at its output. We can determine $S_{xy}$ by solving for the difference between the derivatives of $V_X$ and $V_Y$ at t = T/2 (the time when the crossing occurs),

$$S_{\rm xy} = \frac{dV_{\rm x,y}}{dt} \left(t = \frac{T}{2}\right). \tag{16}$$

**Fig. 16** (a) The simulated $V_{CM,U}$, $V_{CM,D}$, and the oscillating frequency versus k. Choosing k > 1 enables a lower $V_{CM,D}$ (higher $V_{CM,U}$), facilitating the ULV operation. (b) The $S_{xy}$ from mathematical modeling and the simulated $1/\sigma_{jit}$ from an ideal RxO with an asymmetric RC network versus k. Overdesigning k decreases $S_{xy}$ and thus aggravates $\sigma_{jit}$

For instance, in  $\emptyset_2$ ,  $V_X$  and  $V_Y$  become

$$V_X(t) = (V_{\rm CM,U} + V_{\rm DD})e^{-\frac{t}{RC}},\tag{17}$$

$$V_Y(t) = (V_{\rm CM,U} - 2V_{\rm DD})e^{-\frac{t}{kRC}} + V_{\rm DD},\tag{18}$$

where we set t = 0 as the beginning of  $\emptyset_2$ . Taking the derivative of  $V_X$  with respect to t and substituting t = T/2, we can get

$$\frac{dV_X}{dt}\left(t = \frac{T}{2}\right) = -\frac{1}{\mathrm{RC}}\left(V_{\mathrm{CM},\mathrm{U}} + V_{\mathrm{DD}}\right)e^{-\frac{T}{2\mathrm{RC}}},\tag{19}$$

and substituting Eq. (10) into Eq. (19):

$$\frac{dV_X}{dt}\left(t = \frac{T}{2}\right) = -\frac{1}{\mathrm{RC}}V_{\mathrm{CM,D}}.\tag{20}$$

Similarly, we can obtain the slope of  $V_Y$  at t = T/2:

$$\frac{dV_Y}{dt}\left(t = \frac{T}{2}\right) = -\frac{1}{kRC}(V_{\rm CM,D} - V_{\rm DD}).\tag{21}$$

Then,  $S_{xy}$  in  $Ø_2$  is

$$S_{xy} = -\frac{1}{\mathrm{RC}} \left( V_{\mathrm{CM},\mathrm{D}} - \frac{V_{\mathrm{CM},\mathrm{D}}}{k} + \frac{V_{\mathrm{DD}}}{k} \right), \tag{22}$$

where we can find the relationship between $V_{\text{CM,D}}$ and k from Eq. (12). Note in Eq. (22) that when k = 1 (symmetric RC network as in [45]), $S_{xy} = -V_{\text{DD}}/\text{RC}$, showing that a higher $V_{\text{DD}}$ improves $S_{xy}$ and thus $\sigma_{jit}$. Figure 16b shows the $S_{xy}$ as a function of k. Under identical RC and $V_{\text{DD}}$, increasing k decreases $S_{xy}$. We can calculate $S_{xy}$ similarly in $\emptyset_1$; provided that $T_1 = T_2$, $S_{xy}$ in $\emptyset_1$ should be equal in magnitude and opposite in sign to $S_{xy}$ in $\emptyset_2$.

Based on Fig. 16a, b, we can draw the following takeaway: a large k allows $V_{\rm CM,U}$ ($V_{\rm CM,D}$) to approach $V_{\rm DD}$ (ground), easing the use of an NMOS (n-channel metal-oxide semiconductor) (PMOS [p-channel metal-oxide semiconductor])-input amplifier for comparisons. Yet, upsizing k penalizes $\sigma_{jit}$ since $\sigma_{jit} \propto 1/S_{xy}$. Besides, pushing $V_{\rm CM,U}$ ($V_{\rm CM,D}$) close to $V_{\rm DD}$ (ground) saturates the input pairs of the subsequent amplifiers. Thus, there is a trade-off between the minimum $V_{\rm DD}$ and $\sigma_{jit}$ for the RxO utilizing the asymmetric RC network. The minimum gate voltage at the NMOS-input amplifier is ~0.2 V (i.e., 0.1 V for the tail current source + 0.1 V for the gate-source voltages of the differential pair), and the minimum $V_{\rm DD}$ of the comparator is ~0.35 V (explained in Sect. 3.3). To meet the minimum $V_{\rm CM,U}$ of 0.2 V needed to drive the NMOS-input amplifier with a 15% margin, we choose k = 2.4 such that $V_{\rm CM,U}$ is 0.23 V (0.66 × $V_{\rm DD}$). During fabrication, the mismatch between the resistors shifts $V_{\rm CM,U}$ ($V_{\rm CM,D}$) from their desired values. Nevertheless, since k is the ratio between the resistors, we can minimize its variation through a careful layout and a common-centroid technique. This means that a 15% margin is adequate to safeguard the operation of the RxO. Correspondingly, we positioned $V_{\rm CM,D}$ at 0.33 × $V_{\rm DD}$ to favor the PMOS-input amplifier.

With k = 2.4 in Fig. 16b, $S_{xy}$ reduces by 39%. To verify the degradation of $\sigma_{jit}$, we built an ideal RxO utilizing the asymmetric RC network with a noise source and simulated the $\sigma_{jit}$ with different values of k. We juxtapose the simulated $1/\sigma_{jit}$ of this RxO in Fig. 16b. As k increases, $1/\sigma_{jit}$ decreases (hence $\sigma_{jit}$ increases) at a rate similar to that of $S_{xy}$. The $1/\sigma_{jit}$ at k = 2.4 decreases by 36%, thus verifying our analysis.
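The 39% figure can be checked against Eq. (22) with a short sketch. It assumes the ideal model only: $V_{\rm CM,D}$ is obtained from Eq. (12) by bisection, and the slope is normalized to $V_{\rm DD}/RC$. The function names are illustrative.

```python
def vcm_d_ratio(k: float, tol: float = 1e-12) -> float:
    """Solve Eq. (12) for v = V_CM,D / V_DD by bisection on (0, 1)."""
    lo, hi = 1e-9, 1 - 1e-9
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ((1 - mid) / (1 + mid)) ** k - mid / (2 - mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def slope_ratio(k: float) -> float:
    """|S_xy| from Eq. (22), normalized to V_DD / RC."""
    v = vcm_d_ratio(k)
    return v - v / k + 1 / k

# Relative loss of crossing slope when moving from k = 1 to k = 2.4.
reduction = 1 - slope_ratio(2.4) / slope_ratio(1.0)
print(f"S_xy reduction at k = 2.4: {reduction * 100:.1f}%")
```

Sweeping k in `slope_ratio` gives the modeled curve against which a simulated $1/\sigma_{jit}$ can be compared.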

## 3.3 Circuit Implementation

#### **ULV Comparator with Dual-Path Amplifiers**

In [45], the RxO utilizes an inverter-based amplifier for voltage comparison. Although this amplifier has excellent noise performance, it is not suitable for ULV operation as it requires a minimum voltage headroom of $2(V_{GS} + V_{DS})$. We proposed the asymmetric RC network in Sect. 3.2 for ULV operations, where we can adjust the $V_{CM,U}$ ($V_{CM,D}$) according to k. To cope with different $V_{CM}$ at two phases of oscillations under a ULV headroom, we utilize a comparator with dual-path amplifiers to handle the voltage comparisons across $V_{x,y}$. The comparator consists of an NMOS-input amplifier, a PMOS-input amplifier, and logic gates to generate the CLK signal. The NMOS-input amplifier, enabled in $\emptyset_1$, is capable of handling a higher input $V_{\text{CM}}$, where $V_X$ and $V_Y$ cross at $V_{\text{CM},U}$, with the PMOS-input amplifier disabled. The complementary operation happens in $\emptyset_2$. As such, both amplifiers can perform comparisons under the ULV headroom. When compared with the case using k = 1 and only a PMOS-input amplifier, the variation of the RxO's oscillating period ($T_{\text{OSC}}$) reduces by ~40%.

Figure 17a, b presents the proposed ULV RxO, with each amplifier built by cascading three gain stages, each formed by a fully differential common-source (CS) amplifier (Fig. 18a), to boost the overall voltage gain. The simulated gains of the cascaded amplifiers are >27 dB. Following the amplifiers, the logic gates generate the CLK signals and operate the chopper of the RC network after boosting to CLK<sub>H</sub> (explained below).

Since we can adjust the $V_{\rm CM,U}$ ($V_{\rm CM,D}$) of the RC network between $V_{\rm DD}$ and ground by choosing an appropriate k, the minimum $V_{\rm DD}$ of the RxO is limited by two factors: the dual-path amplifier and the logic gates. Assuming all transistors are biased in the subthreshold region with the gate voltages bounded between $V_{\rm DD}$ and ground, the minimum $V_{\rm DD}$ of the differential CS amplifier is $V_{\rm SD,1} + V_{\rm DS,3} + V_{\rm DS,5}$ (in Fig. 18a) if we assume the $V_{\rm DS}$ drop on M<sub>6</sub>, the power-gating transistor, is negligible. To maintain operation in the subthreshold region, the $|V_{\rm DS}|$ of a transistor should be $>3 \times V_{\rm T}$, where $V_{\rm T}$ is the thermal voltage. The $V_{\rm T}$ reaches 34 mV at 120 °C. Hence, with three stacked transistors, the minimum $V_{\rm DD}$ of the differential CS amplifier is $3 \times 3 \times 34$ mV = 306 mV in theory. We allow ~10% margin for the design and choose a $V_{\rm DD}$ of 0.35 V. On the other hand, the necessary $V_{\rm DD}$ for the logic gates to operate under the desired oscillating frequency also limits the minimum $V_{\rm DD}$. In the selected 28 nm CMOS process, the delay of the logic gates at a $V_{\rm DD}$ of 0.35 V varies by <1% of $T_{\rm OSC}$ from -20 to 120 °C, showing that a $V_{\rm DD}$ of 0.35 V is sufficient to power the logic gates.
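The headroom estimate above follows from the thermal voltage $k_B T/q$. The sketch below reproduces it using the exact SI values of the physical constants; the function name and the stack count of three (M<sub>1</sub>, M<sub>3</sub>, and M<sub>5</sub>) follow the text, and the result differs slightly from 306 mV only because the text rounds $V_{\rm T}$ to 34 mV.

```python
K_B = 1.380649e-23       # Boltzmann constant, J/K
Q_E = 1.602176634e-19    # elementary charge, C

def thermal_voltage(temp_c: float) -> float:
    """Thermal voltage k_B*T/q in volts at a temperature given in Celsius."""
    return K_B * (temp_c + 273.15) / Q_E

v_t_hot = thermal_voltage(120.0)   # worst case within the -20 to 120 C range
min_vds = 3 * v_t_hot              # per-transistor |V_DS| requirement from the text
min_vdd = 3 * min_vds              # three stacked devices: M1, M3, and M5
headroom = 0.35 / min_vdd - 1      # relative headroom left by the chosen 0.35 V

print(f"V_T at 120 C: {v_t_hot * 1e3:.1f} mV")
print(f"minimum V_DD: {min_vdd * 1e3:.0f} mV, headroom at 0.35 V: {headroom * 100:.0f}%")
```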

The comparator's delay ($t_{delay}$) affects the $T_{OSC}$ stability. As described later, a delay generator compensates for $t_{delay}$ under different operating conditions. Here, we target a maximum $\Delta t_{delay}$ of ~25% of $T_{OSC}$ across -20 to 120 °C such that the resultant $T_{OSC}$ variation after compensation is <2.5%, reserving a 10% mismatch margin between $t_{delay}$ and the delay generator. The simulated $t_{delay}$ (N + P channel) ranges from 17 ns at 120 °C to 146 ns at -20 °C under a power consumption of 500 nW (at 27 °C), with a variation ~10% above the target.
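The delay budget can be tallied as below. This is a back-of-the-envelope sketch that assumes the nominal 2.1 MHz from the section title as the oscillating frequency; the variable names are illustrative.

```python
F_OSC_HZ = 2.1e6                    # nominal frequency (from the section title)
T_OSC_NS = 1e9 / F_OSC_HZ           # oscillation period in ns

T_DELAY_HOT_NS = 17.0               # simulated t_delay at 120 C
T_DELAY_COLD_NS = 146.0             # simulated t_delay at -20 C

TARGET_DELTA = 0.25                 # targeted delay spread, fraction of T_OSC
MISMATCH = 0.10                     # reserved mismatch versus the delay generator

delta_frac = (T_DELAY_COLD_NS - T_DELAY_HOT_NS) / T_OSC_NS
residual_target = TARGET_DELTA * MISMATCH      # budgeted T_OSC variation
residual_simulated = delta_frac * MISMATCH     # same budget with the simulated spread
over_target = delta_frac / TARGET_DELTA - 1    # how far the spread exceeds the target

print(f"T_OSC = {T_OSC_NS:.1f} ns, delay spread = {delta_frac * 100:.1f}% of T_OSC")
print(f"residual: target {residual_target * 100:.1f}%, "
      f"with simulated spread {residual_simulated * 100:.2f}%")
print(f"spread exceeds target by {over_target * 100:.0f}%")
```

With these inputs the spread lands slightly above the 25% target, in line with the remark that the simulated variation exceeds it by roughly a tenth.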

The gate voltages of $M_3$ and $M_4$ determine the operating region of $M_5$ (Fig. 18a). To guarantee that $M_5$ operates in the subthreshold region, $V_{DS,5}$ needs to be higher than $3 \times V_T$. We can either increase $V_{in,P}$ ($V_{in,N}$), which is the RC network output for the first amplifier, by upsizing k, or decrease the $V_{GS}$ of $M_3$ and $M_4$. As explained in Sect. 3.2, upsizing k deteriorates the $\sigma_{jit}$. On the other hand, under the same bias current and channel length, decreasing $V_{GS}$ requires a wider $M_3$ ($M_4$), thus exacerbating the $t_{delay}$ and the RxO's frequency stability. From the simulation, the amplifier's delay rises by 26% when the $V_{GS}$ of $M_3$ ($M_4$) is reduced by 10 mV (with the width of $M_3$ ($M_4$) enlarged). We aim for a $V_{GS}$ of 0.1 V for $M_3$ ($M_4$) to achieve a proper trade-off between the $t_{delay}$ and $\sigma_{jit}$.



**Fig. 17** (a) Proposed ULV swing-boosted RxO featuring an asymmetric RC network and a dual-path comparator. We track the delays of the amplifiers to tackle the frequency fluctuation against temperature and voltage variations. (b) Schematic of the logic gates. The SR latch, together with the delay unit, guarantees that the RxO only generates the desired oscillating signal without glitches

Since each amplifier is only responsible for comparing  $V_x$  and  $V_y$  in one phase, we can have them power-gated based on the CLK state to reduce the power consumption. For instance, in  $\emptyset_1$  where CLK is high and the common-mode voltage of  $V_x$  and  $V_y$  is at  $V_{CM,U}$ , we enable the NMOS-input amplifier for comparison, while powering down the PMOS-input amplifier. The operation reverses in  $\emptyset_2$ . This duty-cycling scheme saves 26% of the total RxO power budget.

To ensure that $M_1$ and $M_2$ operate in the subthreshold region, a common-mode feedback (CMFB) circuit generates their gate voltages (Fig. 18b). The CMFB circuit compares the common-mode output voltage of the amplifier to $V_{ref}$ and corrects $V_{FB}$. We scaled the transistor sizes of the CMFB circuit from the main amplifier such that the PVT variations have the same effect on the amplifier and the CMFB circuit, enhancing its robustness.

Fig. 18 (a) Schematic of the differential CS amplifier (NMOS). (b) CMFB circuit for the NMOS CS amplifier

We utilized an SR latch to read the results from the amplifiers and yield the desired state of CLK. Also, we used a delayed CLK ($\overline{\text{CLK}}$) signal, $\text{CLK}_{\text{D}}$ ($\overline{\text{CLK}_{\text{D}}}$), to mask out the glitches from the amplifiers during the switching and avert undesired transitions of CLK. For instance, as illustrated in Fig. 17b, before the end of $\emptyset_1$ (CLK and CLK<sub>D</sub> are high), both S and R of the SR latch are high and maintain the state of CLK. In this phase, the NMOS-input amplifier is enabled and the PMOS-input amplifier is disabled. Once $V_X > V_Y$, R becomes low while S is still high (since $\overline{\text{CLK}_{D}}$ is low), which forces CLK low. Then, the circuit enables the PMOS-input amplifier, while disabling the NMOS-input amplifier. During the switching of the amplifiers, we may have an undesired transition on $V_{\text{out,N}}/V_{\text{out,P}}$. The CLK<sub>D</sub> signal and the NAND gates guarantee that these undesired glitches do not affect the state of CLK. After a delay of $\tau_d$, CLK<sub>D</sub> goes low. Both S and R are high again, and the SR latch maintains the state of CLK until $V_{out,P}$ goes high ($V_X < V_Y$). The operation repeats itself after another transition of CLK. A simple RC circuit and inverters with a $\tau_d$ of ~80 ns implement the delay unit. We selected $\tau_d$ to leave sufficient margin before the zero-crossing point of $V_{XY}$ without affecting the comparison, while still being long enough to filter out the glitches from the amplifiers during the switching amid PVT variation.
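The masking behavior described above can be mimicked with a small behavioral model. The gate mapping here is an assumption inferred from the description, not taken from Fig. 17b: an active-low SR latch whose reset input is NAND($V_{out,N}$, CLK<sub>D</sub>) and whose set input is NAND($V_{out,P}$, $\overline{\text{CLK}_D}$). All names are illustrative.

```python
def nand(a: bool, b: bool) -> bool:
    return not (a and b)

def next_clk(clk: bool, clk_d: bool, out_n: bool, out_p: bool) -> bool:
    """One evaluation of the assumed glitch-masking logic.

    s_n and r_n are the active-low set and reset inputs of the latch.
    They cannot both be low, since one needs CLK_D low and the other CLK_D high.
    """
    s_n = nand(out_p, not clk_d)   # set path listens only after CLK_D has gone low
    r_n = nand(out_n, clk_d)       # reset path listens only while CLK_D is high
    if not r_n:
        return False               # reset: CLK forced low
    if not s_n:
        return True                # set: CLK forced high
    return clk                     # both high: hold the state

# Walk through the sequence described in the text, starting near the end of phase 1.
clk = True
trace = []
clk = next_clk(clk, clk_d=True, out_n=False, out_p=False); trace.append(clk)   # hold high
clk = next_clk(clk, clk_d=True, out_n=True, out_p=False); trace.append(clk)    # V_X > V_Y: CLK low
clk = next_clk(clk, clk_d=True, out_n=False, out_p=True); trace.append(clk)    # glitch on out_P masked
clk = next_clk(clk, clk_d=False, out_n=False, out_p=False); trace.append(clk)  # after tau_d: hold low
clk = next_clk(clk, clk_d=False, out_n=False, out_p=True); trace.append(clk)   # V_X < V_Y: CLK high
clk = next_clk(clk, clk_d=False, out_n=True, out_p=False); trace.append(clk)   # glitch on out_N masked
print(trace)
```

In this model, a glitch on the amplifier that has just been enabled cannot flip CLK until CLK<sub>D</sub> has followed CLK, which is the role the text assigns to the delay unit.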

A constant-$g_m$ bias circuit aids the amplifiers in withstanding voltage and temperature variations [49]. A switched-capacitor voltage doubler (Fig. 19a) powers the bias circuit, which extends the voltage headroom ($2 \times V_{DD} \approx 0.7$ V). As we can reuse the CLK signal from the RxO itself to operate the voltage doubler, the power overhead (11%) is low. During the start-up, there is no CLK signal yet to drive the voltage doubler, and hence there would be no output from the bias circuit without any auxiliary signal. Thus, a start-up pulse (duration ~1 µs, generated on-chip after $V_{DD}$ rises) enables an auxiliary ring oscillator (RO) to operate the voltage doubler in this start-up phase (Fig. 19b, c). With the $V_{2X}$ boosted up to ~2 × $V_{DD}$, the bias circuit functions properly within this period. Then, we disable the start-up pulse and the auxiliary RO, with the RxO starting to operate. In this way, the RO neither interferes with the RxO nor affects the accuracy of the RxO's frequency. The RO's frequency ranges from 15.2 to 35.1 MHz across -20 to 120 °C.

Fig. 19 (a) Schematic of the switched capacitor voltage doubler. (b) The auxiliary RO that drives the voltage doubler during the startup. (c) Timing diagram of the auxiliary RO and the voltage doubler

#### **Delay Generators**

The temperature dependency of $t_{delay}$ affects the RxO's $T_{OSC}$. Ideally, $T_{OSC}$ is only dependent on the RC network. However, the $t_{delay}$ after the zero-crossings of $V_{x,y}$ prolongs the duration of each phase. As $t_{delay}$ is temperature-dependent, it deteriorates the RxO's frequency stability. Raising the amplifiers' power budget can diminish the ratio $t_{delay}/T_{OSC}$, but it penalizes the RxO energy efficiency. In [42], a period controller compensates $t_{delay}$ by doubling the current injected into the period-defining capacitors, in which the current injection duration tracks $t_{delay}$. As such, it can correct $T_{OSC}$ to minimize its temperature sensitivity. Yet, the period controller entails an extra comparator for copying $t_{delay}$, penalizing the power budget.

**Fig. 20** (a) Proposed delay generator to track the $t_{delay}$ at different operating conditions and its timing diagram. (b) Matching between $t_{delay}$ and $t_{DN} + t_{DP}$ against temperature variation (under nominal case). (c) Principle of the delay compensation: when $\emptyset_{FH}$ is high, the $\tau$ of the RC branches is halved, so $V_{x,y}$ (dis)charge at a double rate to compensate $t_{delay}$. (d, e) The Monte Carlo-simulated $t_{DP}$ and $t_{DN}$ (100 runs) at 27 °C with different input codes for the capacitor banks

Since the delay of an amplifier relates to its bias current, we introduce a delay generator to create a pulse, with its width inversely proportional to the bias current. As demonstrated in Fig. 20a, two delay generators (for NMOS- and PMOS-input amplifiers) with scaled currents from the main amplifiers generate the pulses after the edges of CLK<sub>H</sub>. From the simulation, the width of the pulses $Ø_F$ closely tracks $t_{delay}$ (error <7.6% of $t_{delay}$ or <2.3% of $T_{OSC}$). To compensate $t_{delay}$, we halve the $\tau$ of the RC branches when $Ø_{FH} = 1$ by closing switches S<sub>1</sub> and S<sub>2</sub> in Fig. 17a. The open-loop compensation scheme avoids a long settling time of the oscillator. Furthermore, this compensation method can also offset the temperature dependency of the resistors in the RC network, avoiding area-hungry composite resistors to obtain a zero temperature coefficient (TC) [42, 46].
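The principle of halving $\tau$ for a duration equal to $t_{delay}$ can be illustrated on a single decaying branch. This is a simplified first-order sketch with hypothetical component values: it ignores that the nodes keep moving during the comparator delay, which slightly changes the starting levels of the next phase.

```python
import math

def crossing_time(v0: float, v_th: float, tau: float, t_fast: float = 0.0) -> float:
    """Time for v0*exp(-t/tau) to decay to v_th when tau is halved for the first t_fast seconds."""
    t_nominal = tau * math.log(v0 / v_th)
    # While tau is halved, the waveform advances two seconds of nominal time per second.
    if t_nominal <= 2 * t_fast:
        return t_nominal / 2
    return t_nominal - t_fast

TAU = 100e-9        # hypothetical RC time constant
V0, V_TH = 1.0, 0.3 # hypothetical start level and crossing level (normalized)
T_DELAY = 30e-9     # hypothetical comparator delay

ideal_phase = crossing_time(V0, V_TH, TAU)                           # RC-defined phase
uncompensated = crossing_time(V0, V_TH, TAU) + T_DELAY               # delay adds directly
compensated = crossing_time(V0, V_TH, TAU, t_fast=T_DELAY) + T_DELAY # pulse width matches delay

print(f"ideal {ideal_phase * 1e9:.1f} ns, uncompensated {uncompensated * 1e9:.1f} ns, "
      f"compensated {compensated * 1e9:.1f} ns")
```

When the pulse width equals the comparator delay, the crossing is advanced by exactly that delay, so the phase duration in this simplified model returns to its RC-defined value. Any mismatch between the two leaves a residual equal to their difference.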

We implemented the delay-controlling capacitors $C_{\rm N}$ and $C_{\rm P}$ as four-bit capacitor banks, with their values programmed once after fabrication to balance the process variation. The tuning ranges of the capacitances are designed to cover the variations of $t_{delay}$ amid process variations. The $t_{delay}$ of the NMOS-input and PMOS-input amplifiers vary from 15 to 45 ns and 36 to 60 ns, respectively, from the Monte Carlo simulation (100 runs, at 27 °C). Consequently, we design the delay generator and the capacitor banks to be capable of generating pulses of width in this range by adjusting their codes correspondingly (Fig. 20d, e). With the proposed compensation scheme, the simulated variation of $T_{\rm OSC}$ decreases from 25% to 2.1% over -20 to 120 °C. For the constant-$g_m$ biasing, the current decreases as the temperature drops. Hence, both $I_{BN}$ and $I_{BP}$, the biasing currents of the NMOS-input and PMOS-input amplifiers, are minimum at -20 °C. Consequently, the $t_{\rm DN}$ and $t_{\rm DP}$ are largest at -20 °C and decrease to their minimum toward 120 °C. Therefore, the overall resolutions of $t_{\rm DN}$ and $t_{\rm DP}$ are limited by the low-temperature case (7 ns and 13 ns). Still, these resolutions are sufficient to uphold the 2.5% frequency error requirement. In case a finer resolution is necessary, the number of bits of the capacitor banks can increase.
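One way to see why these resolutions suffice is a worst-case quantization estimate. The sketch below is an interpretation rather than the authors' analysis: it assumes the quoted 7 ns and 13 ns are the code step sizes at low temperature, that choosing the nearest code leaves at most half a step of error per delay generator, and that the nominal period corresponds to 2.1 MHz.

```python
T_OSC_NS = 1e9 / 2.1e6      # nominal period, assuming 2.1 MHz operation
STEP_DN_NS = 7.0            # assumed step of t_DN at low temperature
STEP_DP_NS = 13.0           # assumed step of t_DP at low temperature
REQUIREMENT = 0.025         # 2.5% frequency error requirement from the text

# Nearest-code selection leaves at most half a step of error in each half period.
worst_err_ns = 0.5 * (STEP_DN_NS + STEP_DP_NS)
worst_err_frac = worst_err_ns / T_OSC_NS

print(f"worst-case quantization error: {worst_err_ns:.1f} ns "
      f"({worst_err_frac * 100:.2f}% of T_OSC)")
print("within requirement" if worst_err_frac < REQUIREMENT else "exceeds requirement")
```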

#### **CLK Boosters**

The non-idealities of the switches influence the performance of the RxO. For example, the nonzero on-resistances ($R_{ON}$) of the transistors that constitute switches $S_{1-6}$ (in Fig. 17a) affect the $\tau$ of the RC network. At a sub-0.5 V supply, the transistors operate in the subthreshold region, where $R_{ON}$ increases exponentially with $-(V_{GS} - V_{TH})$; without any boosting technique, the worst-case $|V_{GS}|$ is $0.5 \times V_{DD}$. Further, as $R_{ON}$ is prone to temperature variations ($R_{ON}$ increases with a decreasing temperature), it inevitably affects the frequency stability of the RxO. To alleviate the impact, we should minimize $R_{ON}$ in comparison with R in the RC network. One possibility is reducing $R_{ON}$ by upscaling the widths of the transistors that compose the switches. Yet, this leads to another problem: in a deep-submicron CMOS process, the $I_{Leak}$ in the off-state, especially at high temperature, restricts the RxO's performance and operation range. Considering the switches $S_{1-2}$ in Fig. 17a again, at high temperature, the transistors with high $I_{Leak}$ equivalently reduce $\tau$. Altogether, there is a trade-off between their $R_{ON}$ at low temperature and $I_{Leak}$ at high temperature.
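As a rough illustration of this exponential dependence, the relation below uses a textbook subthreshold model rather than the device model behind the simulations; the symbols $I_0$ (a process- and size-dependent current), $n$ (subthreshold slope factor), and $\phi_t = kT/q$ (thermal voltage) are introduced here only for this sketch. For an NMOS switch with $|V_{DS}| \ll \phi_t$:

$$I_D \approx I_0 \, e^{(V_{GS}-V_{TH})/(n\phi_t)}\left(1 - e^{-V_{DS}/\phi_t}\right) \;\Rightarrow\; R_{ON} = \frac{V_{DS}}{I_D} \approx \frac{\phi_t}{I_0}\, e^{-(V_{GS}-V_{TH})/(n\phi_t)}$$

Under this model, every $n\phi_t \ln 10$ of additional gate drive (on the order of 60 to 90 mV at room temperature for $n$ between 1 and 1.5) lowers $R_{ON}$ by a factor of ten. The model only covers the subthreshold case; once the gate drive exceeds $V_{TH}$, the dependence becomes much weaker.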

**Fig. 21** (a) $R_{\rm ON}$ of an NMOS from -20 to 120 °C with different $V_{\rm G}$. For both cases, $V_{\rm D} = V_{\rm S} = 0.175$ V. The increased swing on $V_{\rm G}$ reduces the variations of $R_{\rm ON}$ by 8600×. (b) $I_{\rm Leak}$ of the same NMOS in (a) in the off-state. With a negative $V_{\rm G}$, the $I_{\rm Leak}$ reduces by 389× at 120 °C. For both cases, $V_{\rm D} = 0.35$ V and $V_{\rm S} = 0$ V

To tackle this challenge, we employ clock boosters [50] to triple the swing of the digital signals (CLK<sub>H</sub>, $\overline{\text{CLK}_{\text{H}}}$, and $\emptyset_{\text{FH}}$). The clock booster, powered from $V_{\text{DD}}$, increases the swing of the periodic signal (high, $2 \times V_{\text{DD}}$; low, $-V_{\text{DD}}$) without an additional power supply. With the boosted swing, the worst-case $|V_{\text{GS}}|$ for the transistors becomes $1.5 \times V_{\text{DD}}$. Besides, the negative voltage $(-V_{\text{DD}})$ at the logic-low level effectively suppresses $I_{\text{Leak}}$, even at 120 °C. For example, this scheme not only tightens the variations of the $R_{\text{ON}}$ of an NMOS switch across -20 to 120 °C by 8600× ($V_{\text{D}} = V_{\text{S}} = 0.5 \times V_{\text{DD}}$, Fig. 21a) but also shrinks $I_{\text{Leak}}$ in the off-state at 120 °C from 307 to 0.8 nA (Fig. 21b), rendering the RxO robust in extreme environments.
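The voltage levels in this paragraph can be checked with a few lines of arithmetic. The sketch below assumes $V_{DD}$ = 0.35 V (the supply used in the measurements) and a switched node sitting at $0.5 \times V_{DD}$; the variable names are illustrative only, and the leakage figures are simply the two values quoted in the text.

```python
# Illustrative arithmetic for the boosted gate drive (not a circuit simulation).
V_DD = 0.35  # V, supply voltage assumed for this sketch

# Logic levels at the clock-booster output, as stated in the text.
v_high = 2.0 * V_DD    # boosted logic high
v_low = -1.0 * V_DD    # boosted logic low
swing = v_high - v_low

# Worst-case gate drive of an NMOS switch whose source sits at 0.5 * V_DD.
v_node = 0.5 * V_DD
vgs_unboosted = V_DD - v_node    # gate driven to V_DD only
vgs_boosted = v_high - v_node    # gate driven to 2 * V_DD

# Off-state leakage at 120 degC quoted in the text (Fig. 21b), in nA.
i_leak_unboosted_nA = 307.0
i_leak_boosted_nA = 0.8
leak_reduction = i_leak_unboosted_nA / i_leak_boosted_nA

print(f"swing           = {swing:.3f} V ({swing / V_DD:.1f} x V_DD)")
print(f"|V_GS| unboosted = {vgs_unboosted:.3f} V, boosted = {vgs_boosted:.3f} V")
print(f"leakage reduction = {leak_reduction:.0f}x")
```

The ratio of the two quoted leakage currents comes out slightly below the 389× read from Fig. 21b, which is expected given that 307 nA and 0.8 nA are rounded values.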

#### 3.4 Measurement Results

We fabricated a prototype of the RxO in 28 nm CMOS 1P10M technology. It occupies a core area of 5200  $\mu$m<sup>2</sup>, dominated by the comparator (28%) and the RC network (26%) (Fig. 22a, b). The RxO consumes 1.4  $\mu$W at 22 °C on average (N = 7) (Fig. 23a, b), with the comparator dominating (49%, from simulation) (Fig. 22c). After fabrication, we applied a three-point trim to the capacitor banks of the delay generator based on the measured frequency of the RxO.

**Fig. 22** (a) Chip micrograph of the fabricated RxO in 28 nm CMOS. (b) Area breakdown of the RxO. (c) Power breakdown of the RxO (from simulation)

Peripheral equipment such as the oscilloscope (for observing the waveform in real time) and the frequency counter (for measuring the frequency f) has a high input capacitance. Digital buffers with a $V_{DD}$ of 0.35 V and reasonable sizing cannot drive this equipment. Thus, we use on-chip level shifters to raise the swing of the output signals to 0.9 V. Afterward, we feed these signals to digital buffers with a $V_{DD}$ of 0.9 V (supplied independently of the RxO's $V_{DD}$) to drive the peripheral equipment.

Fig. 23 Measured performance of the RxO from seven chip samples. (a) Power consumption versus temperature. (b) Power consumption versus  $V_{DD}$. (c) Frequency stability versus temperature. (d) Frequency stability versus  $V_{DD}$

Fig. 24 (a) Measured period jitter of the RxO (52,000 hits on the oscilloscope). (b) Accumulated jitter of the RxO

The mean oscillating frequency of the RxO is 2.1 MHz. It has an energy efficiency of 667 fJ/cycle, rendering it the most energy-efficient RxO reported in the MHz range. After calibration, the deviations of the RxOs' frequencies are <2.5% from -20 to 120 °C (Fig. 23c). The resulting TC is 158 ppm/°C on average. The mean variation of the RxO's frequency from 0.35 to 0.38 V (~9% of $V_{DD}$) is 2.3% (Fig. 23d). The line sensitivity, which normalizes this variation to the relative change in supply voltage $\left[\left(\frac{\Delta f}{f}\right)/\left(\frac{\Delta V}{V}\right)\right]$, is 26.8%. The large sensitivity of the RxO to voltage variation is attributable to the subthreshold operation and the low $V_{DS}$ across the transistors of the amplifiers. From the simulation, the bias current of the NMOS-input amplifier increases by 25% from 0.35 to 0.38 V, hence affecting $t_{delay}$ and the RxO's frequency. Still, the 0.35–0.38 V range is sufficient for IoT devices powered by solar cells and installed in a typical indoor environment (e.g., home and office), as the open-circuit voltage of a solar cell varies by 30 mV amid a ~3× change in light intensity [51, 52]. If we relax the requirement on frequency stability, or if recalibrating the frequency at different $V_{DD}$ is feasible, the working range of the RxO can extend to 0.5 V, at which point it is limited by the breakdown voltage of the CMOS process (1 V) because of the voltage doubler and clock booster.
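The energy efficiency and line sensitivity quoted above follow from simple ratios. The sketch below reproduces them under two stated assumptions: the frequency variation across $V_{DD}$ is the 2.3% listed in Table 3, and $\Delta V / V$ is referred to the lower supply bound (0.35 V), which matches the "~9% of $V_{DD}$" in the text. The function names are hypothetical.

```python
def energy_per_cycle(power_w: float, freq_hz: float) -> float:
    """Energy drawn per oscillation cycle, in joules."""
    return power_w / freq_hz


def line_sensitivity(df_over_f: float, v_min: float, v_max: float) -> float:
    """(delta f / f) / (delta V / V), with delta V / V referred to v_min (assumption)."""
    dv_over_v = (v_max - v_min) / v_min
    return df_over_f / dv_over_v


# Measured values quoted in the text and in Table 3.
e_cycle = energy_per_cycle(power_w=1.4e-6, freq_hz=2.1e6)
sens = line_sensitivity(df_over_f=0.023, v_min=0.35, v_max=0.38)

print(f"energy efficiency: {e_cycle * 1e15:.0f} fJ/cycle")
print(f"line sensitivity:  {sens * 100:.1f} %")
```

Referring $\Delta V / V$ to a different voltage (for example the midpoint of the range) would shift the result by a few percent, so the choice of reference should be stated whenever line sensitivities of different oscillators are compared.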

**Fig. 25** (a) Startup waveform of the RxO, with  $V_{DD}$  switched on at t = 0 s. (b) Transient frequency during startup. The RxO reaches steady state within three clock cycles or 3.6 µs after enabling  $V_{DD}$. (c) The startup time of the RxO at different temperatures

The RMS period jitter of the RxO is 800 ps (0.15% of  $T_{OSC}$) (Fig. 24a). The accumulated jitter increases at a rate of  $\sqrt{N}$  up to ~60 cycles, over which thermal noise is the dominant noise source (Fig. 24b). When compared with [45], the higher period jitter is attributable to the low supply voltage, the low power, and the different amplifiers handling the comparison in  $\emptyset_1$  and  $\emptyset_2$. Still, the RxO is appropriate for devices in which ULV and ultra-low power are the priorities (e.g., wakeup receivers [35]). The long-term stability is 210 ppm (gating time >0.1 s). To characterize the supply noise rejection of the RxO, we superimpose a sinusoidal signal on  $V_{\rm DD}$  and measure the corresponding period jitter. In the presence of a 20 mV<sub>pp</sub> sinusoidal signal (1 kHz) at the supply, the period jitter of the RxO is 2 ns.
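The $\sqrt{N}$ growth of the accumulated jitter is what one expects when the period errors of successive cycles are uncorrelated, as with thermal noise. A minimal sketch of that idealized model follows; it assumes white, independent period errors and uses a hypothetical helper name, so it illustrates the trend rather than reproducing the measured curve in Fig. 24b.

```python
import math


def accumulated_jitter(sigma_period_s: float, n_cycles: int) -> float:
    """RMS timing error after n_cycles, assuming independent period errors."""
    return sigma_period_s * math.sqrt(n_cycles)


SIGMA_PERIOD = 800e-12  # s, measured RMS period jitter quoted in the text

for n in (1, 4, 16, 60):
    sigma_n = accumulated_jitter(SIGMA_PERIOD, n)
    print(f"N = {n:2d}: accumulated jitter = {sigma_n * 1e9:.2f} ns")
```

Beyond the range where thermal noise dominates (about 60 cycles in the measurement), correlated noise such as flicker noise makes the accumulated jitter grow faster than this model predicts.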

We also characterize the startup time of the RxO, which is crucial if the RxO is power-gated to further suppress the power consumption of the IoT node. As the asymmetric RC network requires a few clock cycles to produce a consistent output signal, the RxO's frequency settles after the third clock pulse (Fig. 25a, b). Over the entire temperature range, the RxO enters the steady state within 3.6  $\mu$s after enabling  $V_{\text{DD}}$  (Fig. 25c).

We benchmark the RxO using two figures of merit (FoMs). First, we evaluate the RxO using the FoM proposed in [44]

$$FoM_{1} = 10 \log \left( \frac{f \cdot T_{range}}{Power \cdot TC} \right), \tag{23}$$

where  $T_{\text{range}}$  is the temperature range. This FoM takes into account the trade-off among f, power,  $T_{\text{range}}$, and TC. The FoM<sub>1</sub> of the RxO is 181 dB, which is comparable to the state of the art in spite of the ULV  $V_{\text{DD}}$  of 0.35 V. Then, we evaluate the RxO using the conventional FoM:

|  | Koo, ISSCC'17 [43] | Mikulić, ESSCIRC'17 [40] | Liu, JSSC'19 [44] | Savanth, JSSC'19 [41] | Lee, JSSC'20 [45] | This work |
|---|---|---|---|---|---|---|
| Process (nm) | 180 | 350 | 65 | 65 | 180 | 28 |
| Frequency (MHz) | 0.44 | 1 | 1.05 | 1.2 | 10.5 | 2.1 |
| $V_{\rm DD}$ (V) | 1.4–3.3 | 3–4.5 | 0.98–1.02 | 0.9–1.8 | 1.4–2.0 | 0.35–0.38 |
| Power (µW) | 21.3 | 210 | 69 | 0.82 | 219.8 | 1.4 |
| Energy efficiency (pJ/cycle) | 48.4 | 210 | 65.7 | 0.68 | 20.9 | 0.67 |
| $T_{\rm range}$ (°C) | -20–100 | -40–125 | -15–55 | -20–125 | -40–125 | -20–120 |
| TC (ppm/°C) | 169 | 24.3 | 4.3 | 100 | 137 | 158 |
| Variation across $V_{\rm DD}$ | 0.04% | 0.42% | 0.17% | ±0.54% | 2.64% | 2.3% |
| Line sensitivity $\left(\frac{\Delta f}{f}\right)/\left(\frac{\Delta V}{V}\right)$ | 0.03% | 0.84% | 4.25% | ±0.54% | 6.16% | 26.8% |
| Area (µm<sup>2</sup>) | 58,000 | 40,000 | 51,000 | 5000 | 15,000 | 5200 |
| Period jitter (ps) | 1060 | – | 160 | – | 9.86 | 800 |
| Startup time (µs) | – | 1<sup>a</sup> | 8 | 10 | – | 3.6 |
| No. of samples | 100 | 5 | – | 170<sup>b</sup> | 15 | 7 |
| FoM<sub>1</sub> (dB) | 162 | 165 | 174 | 183 | 168 | 181 |
| FoM<sub>2</sub> (dBc/Hz) | -152.7 (@ 10 kHz) | – | – | – | -157.7 (@ 1 kHz) | -143.4 (@ 10 kHz) |

Table 3 Performance summary and comparison with the state-of-the-art RxOs

<sup>a</sup>Deduced from the numbers of cycles to start, which may underestimate the true startup time

<sup>b</sup>For temperature stability measurement

$$FoM_2 = PN - 20 \log\left(\frac{f}{f_{offset}}\right) + 10 \log\left(\frac{Power}{1 \text{ mW}}\right), \tag{24}$$

where PN is the phase noise at the offset frequency from the carrier  $f_{\text{offset}}$ . The PN of the RxO at 10 kHz offset is -68.4 dBc/Hz, resulting in an FoM<sub>2</sub> of -143.4 dBc/Hz.
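Both figures of merit can be evaluated directly from the numbers in Table 3. The sketch below is one way to do so; it assumes SI units inside the logarithm of Eq. (23) (f in Hz, power in W, TC converted from ppm/°C to a fraction per °C), a unit convention chosen here because it reproduces the tabulated FoM<sub>1</sub> values. The function and dictionary names are hypothetical.

```python
import math


def fom1_db(freq_hz, t_range_c, power_w, tc_ppm_per_c):
    """Eq. (23), with TC converted from ppm/degC to 1/degC (assumed convention)."""
    tc = tc_ppm_per_c * 1e-6
    return 10 * math.log10(freq_hz * t_range_c / (power_w * tc))


def fom2_dbc_hz(pn_dbc_hz, freq_hz, f_offset_hz, power_w):
    """Eq. (24): phase noise normalized to carrier frequency and 1 mW."""
    return (pn_dbc_hz
            - 20 * math.log10(freq_hz / f_offset_hz)
            + 10 * math.log10(power_w / 1e-3))


# (frequency in Hz, temperature span in degC, power in W, TC in ppm/degC),
# taken from Table 3; the span is the width of the listed T_range.
TABLE3 = {
    "Koo [43]":     (0.44e6, 120, 21.3e-6, 169),
    "Mikulic [40]": (1.0e6,  165, 210e-6,  24.3),
    "Liu [44]":     (1.05e6, 70,  69e-6,   4.3),
    "Savanth [41]": (1.2e6,  145, 0.82e-6, 100),
    "Lee [45]":     (10.5e6, 165, 219.8e-6, 137),
    "This work":    (2.1e6,  140, 1.4e-6,  158),
}

fom1 = {name: fom1_db(*row) for name, row in TABLE3.items()}
for name, value in fom1.items():
    print(f"{name:13s} FoM1 = {value:.1f} dB")

# FoM2 of this work: PN = -68.4 dBc/Hz at a 10 kHz offset.
fom2 = fom2_dbc_hz(pn_dbc_hz=-68.4, freq_hz=2.1e6, f_offset_hz=10e3, power_w=1.4e-6)
print(f"This work     FoM2 = {fom2:.1f} dBc/Hz")
```

Because FoM<sub>2</sub> depends on the offset frequency at which the phase noise is read, entries quoted at different offsets (1 kHz versus 10 kHz in Table 3) are not directly comparable.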

Table 3 summarizes the performance of the RxO and compares it with recent art. This work is the first sub-0.5 V temperature-resilient (<2.5%) RxO achieving a high power efficiency of 667 fJ/cycle (Fig. 26). When compared with the RxO with a symmetric swing-boosted RC network [45], this RxO operates at a  $4 \times$  lower  $V_{DD}$, while achieving a comparable TC after compensation.

Fig. 26 Comparison with state-of-the-art fully integrated oscillators. Red circle, relaxation oscillator; blue circle, frequency-locked-loop type oscillator. A larger circle implies a relatively higher oscillating frequency. The figure only shows selected oscillators with frequencies between 0.1 and 10 MHz

#### 4 Conclusions

This chapter detailed the analysis and design of two ULV MHz-range clock references for different purposes, both implemented in deep-submicron CMOS and validated by measurements. The first is a regulation-free sub-0.5 V XO for energy-harvesting BLE radios. We introduced two circuit techniques, *dual-mode*  $g_m$  and *SSCI*, to reduce the startup time  $t_s$  and energy  $E_s$. The dual-mode  $g_m$  exploits the inductive feature of the three-stage  $g_m$  ( $A_{XO-3}$ ) to counteract the crystal's  $C_s$  during the startup and the low-noise feature of the one-stage  $g_m$  ( $A_{XO-1}$ ) to preserve the PN in the steady state. The XO prototyped in 65 nm CMOS has a compact area (0.023 mm<sup>2</sup>) that is >3.1× smaller than the prior art. The measured  $t_s$  and  $E_s$  of the XO, with a 24 MHz crystal, are 400 µs and 14.2 nJ, respectively. The frequency stability is 17.9 ppm against voltage (0.3–0.5 V) and 14.1 ppm against temperature (-40 to 90 °C); both conform to the BLE standard.

The second clock reference is a 2.1 MHz temperature-resilient RxO with a 0.35 V supply voltage for ultra-low-power IoT nodes. We jointly design an asymmetric swing-boosted RC network and a dual-path comparator to tackle the challenges of ULV (<0.5 V) operation. The open-loop delay generator compensates for the temperature-sensitive delay of the comparator. Fabricated in 28 nm CMOS, it has an active area of only 5200  $\mu$m<sup>2</sup> and achieves the best energy efficiency of 667 fJ/cycle among the previously reported MHz-range RxOs. Further, it has a high figure of merit of 181 dB in spite of the ULV headroom and can settle within 3.6  $\mu$s after enabling the supply voltage.

### References

- 1. Wollschlaeger, M., Sauter, T., & Jasperneite, J. (2017, March). The future of industrial communication. *IEEE Industrial Electronics Magazine*, 11, 17–27.
- 2. Ahmed, E., Yaqoob, I., Gani, A., Imran, M., & Guizani, M. (2016, November). Internet-of-things-based smart environments: State of the art, taxonomy, and open research challenges. *IEEE Wireless Communications*, 23(5), 10–16.
- 3. Bahai, A. (2016, September). Ultra-low energy systems: Analog to information. In *Proceedings of the European Solid-State Circuits Conference (ESSCIRC)* (pp. 3–6).
- 4. Bandyopadhyay, S., & Chandrakasan, A. P. (2012, September). Platform architecture for solar, thermal, and vibration energy combining with MPPT and single inductor. *IEEE Journal of Solid-State Circuits*, 47(9), 2199–2215.
- 5. Weng, P. S., Tang, H. Y., Ku, P. C., & Lu, L. H. (2013, April). 50 mV-input batteryless boost converter for thermal energy harvesting. *IEEE Journal of Solid-State Circuits*, 48(4), 1031–1041.
- 6. Bito, J., Bahr, R., Hester, J. G., Nauroze, S. A., Georgiadis, A., & Tentzeris, M. M. (2017, May). A novel solar and electromagnetic energy harvesting system with a 3-D printed package for energy efficient Internet-of-Things wireless sensors. *IEEE Transactions on Microwave Theory and Techniques*, 65(5), 1831–1842.
- 7. Lei, K.-M., Mak, P.-I., Law, M.-K., & Martins, R. P. (2018, September). A regulation-free sub-0.5-V 16-/24-MHz crystal oscillator with 14.2-nJ startup energy and 31.8-µW steady-state power. *IEEE Journal of Solid-State Circuits*, 53(9), 2624–2635.
- 8. Lei, K.-M., Mak, P.-I., & Martins, R. (2021, September). A 0.35-V 5,200-μm<sup>2</sup> 2.1-MHz temperature-resilient relaxation oscillator with 667 fJ/cycle energy efficiency using an asymmetric swing-boosted RC network and a dual-path comparator. *IEEE Journal of Solid-State Circuits*, 56(9), 2701–2710.
- 9. Tsai, M.-D., Yeh, C.-W., Cho, Y.-H., Ke, L.-W., Chen, P.-W., & Dehng, G.-K. (2008, June). A temperature-compensated low-noise digitally-controlled crystal oscillator for multi-standard applications. In *Proceedings of the IEEE Radio Frequency Integrated Circuits Symposium (RFIC)* (pp. 533–536).
- 10. Chang, Y., Leete, J., Zhou, Z., Vadipour, M., Chang, Y.-T., & Darabi, H. (2012, February). A differential digitally controlled crystal oscillator with a 14-bit tuning resolution and sine wave outputs for cellular applications. *IEEE Journal of Solid-State Circuits*, 47(2), 421–434.
- 11. Iguchi, S., Sakurai, T., & Takamiya, M. (2017, November). A low-power CMOS crystal oscillator using a stacked-amplifier architecture. *IEEE Journal of Solid-State Circuits*, 52(11), 3006–3017.
- 12. Lei, K.-M., Mak, P.-I., & Martins, R. P. (2021, January). Startup time and energy reduction techniques for crystal oscillators in the IoT era. *IEEE Transactions on Circuits and Systems II: Express Briefs*, 68(1), 30–35.

- 13. Griffith, D., Murdock, J., & Røine, P. T. (2016, February). A 24MHz crystal oscillator with robust fast start-up using dithered injection. In *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers* (pp. 104–105).
- 14. Nordic Semiconductor. (2018). nRF52840 Data Sheet [Online]. Available: http://infocenter.nordicsemi.com/pdf/nRF52840\_PS\_v1.0.pdf
- 15. Liu, Y.-H., Bachmann, C., Wang, X., Zhang, Y., Ba, A., Busze, B., et al. (2015, February). A 3.7 mW-RX 4.4 mW-TX fully integrated Bluetooth Low-Energy/IEEE802.15.4/proprietary SoC with an ADPLL-based fast frequency offset compensation in 40nm CMOS. In *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers* (pp. 236–237).
- 16. Kuo, F. W., Ferreira, S. B., Chen, H. N. R., Cho, L. C., Jou, C. P., Hsueh, F. L., et al. (2017, April). A bluetooth low-energy transceiver with 3.7-mW all-digital transmitter, 2.75-mW high-IF discrete-time receiver, and TX/RX switchable on-chip matching network. *IEEE Journal of Solid-State Circuits*, 52(4), 1144–1162.
- 17. Liu, H., Sun, Z., Tang, D., Huang, H., Kaneko, T., Deng, W., et al. (2018, February). An ADPLL-centric bluetooth low-energy transceiver with 2.3mW interference-tolerant hybrid-loop receiver and 2.9mW single-point polar transmitter in 65nm CMOS. In *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers* (pp. 444–445).
- 18. Blanchard, S. A. (2003, June/July). Quick start crystal oscillator circuit. In *Proceedings of the IEEE University/Government/Industry Microelectronics Symposium* (pp. 78–81).
- 19. Iguchi, S., Fuketa, H., Sakurai, T., & Takamiya, M. (2016, February). Variation-tolerant quick-start-up CMOS crystal oscillator with chirp injection and negative resistance booster. *IEEE Journal of Solid-State Circuits*, 51(2), 496–508.
- 20. Ding, M., Liu, Y.-H., Zhang, Y., Lu, C., Zhang, P., Busze, B., et al. (2017, February). A 95µW 24MHz digitally controlled crystal oscillator for IoT applications with 36nJ start-up energy and >13× start-up time reduction using a fully-autonomous dynamically-adjusted load. In *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers* (pp. 90–91).
- 21. Esmaeelzadeh, H., & Pamarti, S. (2018, March). A quick startup technique for high-Q oscillators using precisely timed energy injection. *IEEE Journal of Solid-State Circuits*, 53(3), 692–702.
- 22. Kwon, Y.-I., Park, S.-G., Park, T.-J., Cho, K.-S., & Lee, H.-Y. (2012, February). An ultra low-power CMOS transceiver using various low-power techniques for LR-WPAN applications. *IEEE Transactions on Circuits and Systems I: Regular Papers*, 59(2), 324–336.
- 23. Texas Instruments. (2013). CC2541 Data Sheet [Online]. Available: http://www.ti.com/lit/ds/symlink/cc2541.pdf
- 24. Zhang, F., Miyahara, Y., & Otis, B. P. (2013, December). Design of a 300-mV 2.4-GHz receiver using transformer-coupled techniques. *IEEE Journal of Solid-State Circuits*, 48(12), 3190–3205.
- 25. Babaie, M., Kuo, F. W., Chen, H. N. R., Cho, L. C., Jou, C. P., Hsueh, F. L., et al. (2016, July). A fully integrated Bluetooth Low-Energy transmitter in 28 nm CMOS with 36% system efficiency at 3 dBm. *IEEE Journal of Solid-State Circuits*, 51(7), 1547–1565.
- 26. Yu, W.-H., Yi, H., Mak, P.-I., Yin, J., & Martins, R. P. (2017, February). A 0.18 V 382µW bluetooth low-energy (BLE) receiver with 1.33 nW sleep power for energy-harvesting applications in 28nm CMOS. In *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers* (pp. 414–415).
- 27. Yin, J., Yang, S., Yi, H., Yu, W.-H., Mak, P.-I., & Martins, R. P. (2018, February). A 0.2V energy-harvesting BLE transmitter with a micropower manager achieving 25% system efficiency at 0dBm output and 5.2nW sleep power in 28nm CMOS. In *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers* (pp. 450–451).
- 28. Lei, K.-M., Mak, P.-I., Law, M.-K., & Martins, R. (2018, February). A regulation-free sub-0.5V 16/24MHz crystal oscillator for energy-harvesting BLE radios with 14.2nJ startup energy and 31.8µW steady-state power. In *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers* (pp. 52–53).

- 29. Klauder, J. R., Price, A. C., Darlington, S., & Albersheim, W. J. (1960, July). The theory and design of chirp radars. *The Bell System Technical Journal*, 39(4), 745–808.
- 30. Vittoz, E. A., Degrauwe, M. G., & Bitz, S. (1988, March). High-performance crystal oscillator circuits: Theory and application. *IEEE Journal of Solid-State Circuits*, 23(3), 774–783.
- 31. Lei, K.-M., Mak, P.-I., & Martins, R. P. (2017, May). A 0.4 V 4.8 μW 16MHz CMOS crystal oscillator achieving 74-fold startup-time reduction using momentary detuning. In *IEEE International Symposium on Circuits and Systems (ISCAS)* (pp. 2791–2794).
- 32. Iguchi, S., Saito, A., Zheng, Y., Watanabe, K., Sakurai, T., & Takamiya, M. (2013, June). 93% power reduction by automatic self power gating (ASPG) and multistage inverter for negative resistance (MINR) in 0.7 V, 9.2 μW, 39MHz crystal oscillator. *IEEE Proceedings of the Symposium on VLSI Circuits*, C142–C143.
- 33. Bluetooth Core Specification v5.0 [Online]. Available: https://www.bluetooth.com/specifications/bluetooth-core-specification
- 34. Khan, O., et al. (2016, May). Frequency reference for crystal free radio. In *IEEE International Frequency Control Symposium* (pp. 1–2).
- 35. Pletcher, N. M., Gambini, S., & Rabaey, J. (2009, January). A 52 μW wakeup receiver with -72 dBm sensitivity using an uncertain-IF architecture. *IEEE Journal of Solid-State Circuits*, 44(1), 269–280.
- 36. Sundaresan, K., Allen, P., & Ayazi, F. (2006, February). Process and temperature compensation in a 7-MHz CMOS clock oscillator. *IEEE Journal of Solid-State Circuits*, 41(2), 433–442.
- 37. Zhang, L., Kuo, N.-C., & Niknejad, A. (2019, October). A 37.5–45 GHz superharmonic-coupled QVCO with tunable phase accuracy in 28 nm CMOS. *IEEE Journal of Solid-State Circuits*, 54(10), 2754–2764.
- 38. Ding, X., Wu, J., & Chen, C. (2019, February). A low-power 0.6-V quadrature VCO with a coupling current reuse technique. *IEEE Transactions on Circuits and Systems II: Express Briefs*, 66(2), 202–206.
- 39. Meng, X., Li, X., Cheng, L., Tsui, C.-Y., & Ki, W.-H. (2019, December). A low power relaxation oscillator with switched-capacitor frequency-locked loop for wireless sensor node applications. *IEEE Solid-State Circuits Letters*, 2(12), 281–284.
- 40. Mikulić, J., Schatzberger, G., & Barić, A. (2017, September). A 1-MHz on-chip relaxation oscillator with comparator delay cancelation. In *Proceedings of the European Conference on Solid-State Circuits (ESSCIRC)* (pp. 95–98).
- 41. Savanth, A., Weddell, A., Myers, J., Flynn, D., & Al-Hashimi, B. (2019, November). A sub-nW/kHz relaxation oscillator with ratioed reference and sub-clock power gated comparator. *IEEE Journal of Solid-State Circuits*, 54(11), 3097–3106.
- 42. Tokairin, T., et al. (2012, June). A 280nW, 100kHz, 1-cycle start-up time, on-chip CMOS relaxation oscillator employing a feedforward period control scheme. *IEEE proceedings of the Symposium VLSI Circuits*, 16–17.
- 43. Koo, J., Moon, K.-S., Kim, B., Park, H.-J., & Sim, J.-Y. (2017, February). A quadrature relaxation oscillator with a process-induced frequency-error compensation loop. In *IEEE International Solid-State Circuits Conference (ISSCC) Digest of Technical Papers* (pp. 94–95).
- 44. Liu, N., et al. (2019, July). A 2.5 ppm/°C 1.05-MHz relaxation oscillator with dynamic frequency-error compensation and fast start-up time. *IEEE Journal of Solid-State Circuits*, 54(7), 1952–1959.
- 45. Lee, J., George, A. K., & Je, M. (2020, September). An ultra-low-noise swing-boosted differential relaxation oscillator in 0.18-μm CMOS. *IEEE Journal of Solid-State Circuits*, 55(9), 2489–2497.
- 46. Lu, S.-Y., & Liao, Y.-T. (2019, February). A low-power, differential relaxation oscillator with the self-threshold-tracking and swing-boosting techniques in 0.18-μm CMOS. *IEEE Journal of Solid-State Circuits*, 54(2), 392–402.

- 47. Zhou, W., Goh, W. L., & Gao, Y. (2020, October). A 3-MHz 17.3-μW 0.015% period jitter relaxation oscillator with energy efficient swing boosting. *IEEE Transactions on Circuits and Systems II: Express Briefs*, 67(10), 1745–1749.
- 48. Abidi, A. A., & Meyer, R. G. (1983, December). Noise in relaxation oscillators. *IEEE Journal of Solid-State Circuits*, 18(6), 794–802.
- 49. Razavi, B. (2001). *Design of analog CMOS integrated circuits*. McGraw-Hill.
- 50. Ho, Y., Yang, Y.-S., Chang, C., & Su, C. (2013, November). A near-threshold 480 MHz 78 μW all-digital PLL with a bootstrapped DCO. *IEEE Journal of Solid-State Circuits*, 48(11), 2805–2814.
- 51. Lee, H., Li, Z., Durrant, J. R., & Tsoi, W. C. (2016, June). Is organic photovoltaics promising for indoor applications? *Applied Physics Letters*, 108(25), 1–5.
- 52. Liao, W., et al. (2016, November). Lead-free inverted planar formamidinium tin triiodide perovskite solar cells achieving power conversion efficiencies up to 6.22%. *Advanced Materials*, 28(42), 9333–9340.