1 Introduction

Three-dimensional (3-D) machine vision allows a robotic or autonomous system to dynamically interact with its physical surroundings, identify objects, and navigate in working environments. Enabling this requires improvements in high-resolution depth image sensing technology. While millimeter-wave (MMW) radars [1] have been the sensor of choice in automotive ranging detection for years, their spatial resolution is limited by their relatively long electromagnetic wavelength. In contrast, light detection and ranging (LIDAR) systems with imaging arrays, or so-called time-of-flight (ToF) sensors, are a fast-growing class of depth sensors that offer higher spatial resolution due to the shorter optical wavelength.

Figure 1 shows a block diagram of a pulse-based LIDAR sensor. A laser diode (LD) outputs an optical pulse which is collimated through a transmitter focal lens in front of the laser head. Return pulses, reflected by remote objects, are collected by the receiver lens and focused on the sensor array. Receiver sensitivity is important in these systems. Due to the Lambertian reflectance of most natural surfaces, the reflected ToF signal captured by a sensor is strongly attenuated in proportion to 1/R², where R is the range of the reflector [2]. The situation is potentially worse in outdoor environments, where the optical filter used to suppress strong background noise can further attenuate the reflected ToF signals. This motivates the use of highly sensitive solid-state photodiodes, such as avalanche photodiodes (APDs) and single-photon avalanche diodes (SPADs), which can be implemented with micrometer-scale footprints in CMOS technologies and enable low-power, low-cost, large-array LIDAR sensor systems-on-a-chip (SoCs). These devices are generally utilized in the sensor array pixels to capture the low-intensity returned pulses. The time of flight is then digitized with a time-to-digital converter (TDC) array and often computed with time-correlated single photon counting (TCSPC) algorithms implemented in the DSP.

Fig. 1 Block diagram of pulse-based LIDAR sensor
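The two relationships in this paragraph, the round-trip time of flight and the 1/R² attenuation, can be illustrated with a minimal Python sketch. The function names and the 1 m reference range are assumptions made for illustration only; the attenuation function captures just the inverse-square trend and ignores optics, reflectivity, and filter losses.

```python
# Speed of light in vacuum (m/s); propagation in air is approximated by this value.
C_M_PER_S = 299_792_458.0


def tof_to_range_m(tof_s: float) -> float:
    """Convert a round-trip time of flight (s) into a one-way range (m)."""
    return C_M_PER_S * tof_s / 2.0


def relative_return_power(range_m: float, ref_range_m: float = 1.0) -> float:
    """Return power relative to a reference range under the 1/R^2 model.

    ref_range_m is a hypothetical normalization point, not a value from the text.
    """
    return (ref_range_m / range_m) ** 2


# A 100 ns round trip corresponds to roughly 15 m of range.
print(f"{tof_to_range_m(100e-9):.2f} m")
# Moving a reflector from 1 m to 10 m reduces the return power by a factor of 100.
print(f"{relative_return_power(10.0):.4f}")
```

The factor of two in `tof_to_range_m` accounts for the pulse travelling to the reflector and back.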

At the system level, laser scanning approaches with collimated laser beams are common in many LIDAR sensors to achieve a higher signal to noise ratio. This allows for an expandable image resolution with a small pixel array through assembling images over the scanning viewing angle. For example, in [3] a mechanically-rotated mirror was designed to distribute a laser beam over a horizontal field-of-view (FoV) of 55° and vertical FoV of 9°. This allowed an image resolution of 202 × 96 with only a 16 × 1 ToF macro pixel array. A more advanced laser scanning with micro-electromechanical system (MEMS) mirrors achieved horizontal and vertical FoV of 45° and 11°, respectively [4].

The embedded TDC array is an important block that sets the accuracy of the time-resolved image sensing in LIDAR sensors. A key challenge in TDC design is the generation of the counting signals, for which two main architectures are used. In global-based counting [3, 5], the counting signal is distributed from an external source to the entire TDC array. Since a common counting source is shared, this architecture achieves high channel uniformity and well-controlled resolution. However, as the array size increases, this approach consumes significant power in the counting clock distribution due to the large parasitic capacitance of the long distribution wires. Alternatively, pixel-based counting [6, 7] saves power by integrating the counting source, i.e., a gated delay line or gated ring oscillator, within each TDC channel. This allows for further power reductions via a “reverse start-stop scheme” that turns off (gates) the counting sources while no photon flux is detected. Nevertheless, this local counting source approach generally yields poor channel uniformity due to channel mismatch. Moreover, the discontinuous current profile drawn by gated operation can induce a time-varying IR drop, which can make the TDC resolution image-dependent.

This paper presents a LIDAR sensor SoC that utilizes an APD receiver front-end followed by a TDC design based on local injection-locked oscillator (ILO) counting clock generation, which both minimizes clock distribution power and achieves good channel uniformity [8]. Section 2 provides an analysis of the power and timing accuracy of TDC array clock distribution techniques. An overview of the proposed LIDAR sensor prototype, which contains a 32 × 1 TDC array that provides 31 TCSPC channels, is given in Sect. 3. Section 4 details the key circuitry that makes up the TCSPC channels, including the APD front-end receivers, TDC array, and phase-locked loop (PLL) ILO biasing scheme. Section 5 presents both electrical characterization results of the LIDAR sensor system, fabricated in a GP 65 nm CMOS process, and pulsed-laser testing with an APD array hybrid-integrated with the CMOS SoC. Finally, Sect. 6 concludes the paper.

2 TDC array clock distribution techniques

Sub-ns time-resolved measurement systems often utilize multiple clock phases derived from a relatively low-frequency signal to improve resolution. For instance, a time conversion with 312.5 ps resolution is possible with 16-phase counting clocks generated from a 200 MHz 16-stage delay-locked loop (DLL) [5]. While this lowers the maximum clock frequency requirements, distributing these multi-phase counting clocks amongst a TDC array can consume significant power. Moreover, depending on the distribution scheme and array size, the jitter accumulated along the distribution path can degrade the TDC timing accuracy. Thus, it is important to analyze counting clock distribution architectures in terms of power and timing accuracy in the context of a TDC array implementation. Table 1 lists three state-of-the-art architectures in common use (Type-1 to Type-3) and a novel TDC clocking scheme based on injection-locked oscillators (Type-4).
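The resolution figure quoted for the DLL example follows from dividing one clock period into equally spaced phases. A minimal Python sketch of that arithmetic is given below; the function name is hypothetical, and ideal, uniformly spaced phases are assumed.

```python
def multiphase_resolution_s(clock_hz: float, phases: int) -> float:
    """Time step between adjacent phases of an ideal multi-phase clock (s)."""
    if clock_hz <= 0 or phases <= 0:
        raise ValueError("clock_hz and phases must be positive")
    return 1.0 / (clock_hz * phases)


# 16 phases of a 200 MHz clock: one 5 ns period split into 16 equal steps.
step = multiphase_resolution_s(200e6, 16)
print(f"{step * 1e12:.1f} ps")
```

Phase mismatch in a real delay line or oscillator makes the individual steps unequal, which is why the distribution scheme matters for linearity as well as for power.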

Table 1 Clock distribution comparison

Type-1 is a global counting scheme where a single buffering stage distributes multi-phase counting clocks from the clock source to the TDC channels through a global routing bus [3]. As there is only a single buffer stage, this clock distribution scheme performs well in terms of channel uniformity and system reliability. The power of the clock distribution is computed considering TDC wire routing, with parasitic capacitance C0 and resistance R0, and TDC input capacitance Ctdc. Assuming this is dominated by dynamic power, the power to distribute B-phase clocks amongst an N × 1 TDC array is

$$ P = BN \cdot C_{0t} V^{2} f $$
(1)

where C0t is the sum of C0 and Ctdc, V is the operating voltage, and f is the clock frequency. While minimizing the clock signals’ transition times reduces supply-induced jitter, this must be balanced against the overall power consumption. Here the transition time is approximated as the 10%-90% signal transition time of a first-order RC circuit, which for the Type-1 architecture is [9]

$$ \tau = 2.2\left( {\frac{N}{2}} \right)^{2} R_{0} C_{0t} $$
(2)

when the clocks are distributed from the center of the TDC array. This shows that while the Type-1 architecture is popular in many designs for its simplicity, it is poorly suited to large TDC arrays or to designs that require high timing accuracy, since the signal transition time is proportional to N². Overall, the large parasitic RC of the long distribution wires degrades the signal slew rate and increases jitter sensitivity, which impacts timing accuracy.

In order to address these long global distribution wire issues, the Type-2 clock distribution scheme uses symmetrical multi-stage buffers to strengthen the driving capability [5]. The effective transition time with X-stage buffering is

$$ \tau = 2.2 \cdot \left( {\mathop \sum \limits_{i = 1}^{X} \frac{{R_{i} C_{i} }}{4}} \right), $$
(3)

where Ri and Ci are the parasitic resistance and capacitance seen by the i-th buffer stage, respectively. Assuming symmetric distribution, the parasitic loading of the stages is

$$ R_{i} = \frac{{NR_{0} }}{{2^{i} }}\quad \quad C_{i} = \frac{{NC_{0} }}{{2^{i} }}. $$
(4)

Therefore, (3) can be rewritten as:

$$ \begin{aligned} \tau_{1,X - 1} = 2.2 \cdot \left( {\mathop \sum \limits_{i = 1}^{X - 1} \frac{{N^{2} R_{0} C_{0} }}{{4^{i + 1} }}} \right) = 2.2\left( {\frac{N}{2}} \right)^{2} \left[ {\frac{{1 - \left( {\frac{1}{4}} \right)^{X - 1} }}{3}R_{0} C_{0} } \right],\;\tau_{X} = 2.2 \cdot \frac{{N^{2} }}{{4^{X + 1} }}R_{0} C_{0t} \hfill \\ \tau = \tau_{1,X - 1} + \tau_{X} = 2.2\left( {\frac{N}{2}} \right)^{2} \cdot \left[ {\frac{{1 - \left( {\frac{1}{4}} \right)^{X - 1} }}{3}R_{0} C_{0} + \frac{{R_{0} C_{0t} }}{{4^{X} }}} \right] \hfill \\ \end{aligned} $$
(5)
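The two limiting cases of (5) can be written out as a quick check. The final approximation assumes that C_tdc is small compared with C_0, so that C_0t ≈ C_0; this is a simplification added here, not a stated design condition.

$$ \tau \big|_{X = 1} = 2.2\left( \frac{N}{2} \right)^{2} \frac{R_{0} C_{0t}}{4}, \qquad \lim_{X \to \infty} \tau = 2.2\left( \frac{N}{2} \right)^{2} \frac{R_{0} C_{0}}{3} \approx \frac{1}{3}\,\tau_{\text{Type-1}} $$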

Notice that τ never exceeds 1/3 of the Type-1 value, a bound that it approaches as X tends to infinity. Key advantages of the Type-2 scheme are that it distributes clocks to each TDC with nominally the same propagation delay and reduces the effective wire RC time constant and deterministic jitter. However, the cost is the additional power consumed by the extra buffer stages:

$$ \begin{aligned} P &= B\left[ \left( \sum_{i = 1}^{X} 2^{i - 1} \cdot C_{i} \right)V^{2} f + \sum_{i = 1}^{X} 2^{i - 1} \cdot P_{b} \right] \\ &= B\left[ X\frac{N}{2} \cdot \left( C_{0} V^{2} f \right) + \left( 2^{X} - 1 \right)P_{b} \right], \end{aligned} $$
(6)

where Pb is the power of one clock buffer.

An alternative to the Type-1 and Type-2 global counting schemes is offered by the Type-3 scheme, which utilizes local free-running ring oscillators (ROs) to generate multi-phase counting clocks inside the TDC array. These local ring oscillators can be shared by multiple TDC channels. Based on simulation results, the four-channel-shared architecture shown in Table 1 is optimum in terms of power and driving capability. PVT variations are compensated by a global control voltage signal that tunes the oscillation frequency. This design benefits from the local clock generation approach, which avoids the high power of clock distribution over long wire traces and improves the signal transition times, since the parasitic wire RC is confined to a local region. The signal transition time for a four-channel-shared ring oscillator is

$$ \tau = 2.2 \cdot 4R_{0} C_{0t} , $$
(7)

which is independent of the TDC array size. However, a serious issue encountered in this scheme is the large phase noise of the free-running ROs, which impacts the integral nonlinearity (INL) of the TDC conversion. In addition, variation of the oscillation frequency due to process mismatch degrades the uniformity of the resolution across an array-based TDC, which usually requires background calibration to correct the final TDC value. While many state-of-the-art designs reduce phase noise accumulation and power by utilizing gated ROs (GROs) and “reverse start-stop schemes” to reset the oscillators [10], the discontinuous current drawn by the GROs can induce a time-varying IR drop, which can make the TDC resolution image-dependent.
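To see how (1), (2), (5), (6), and (7) scale with array size, they can be evaluated numerically. The Python sketch below does this with placeholder parasitics: the values of R0, C0, Ctdc, and Pb are hypothetical, chosen only to make the trends visible, and are not taken from the design or from Fig. 2.

```python
def type1_power(b, n, c0t, v, f):
    """Eq. (1): dynamic power to distribute B phases over an N x 1 array."""
    return b * n * c0t * v**2 * f


def type1_tau(n, r0, c0t):
    """Eq. (2): 10%-90% transition time with center-fed global wires."""
    return 2.2 * (n / 2) ** 2 * r0 * c0t


def type2_tau(n, x, r0, c0, c0t):
    """Eq. (5): transition time with X symmetric buffer stages."""
    upstream = (1 - 0.25 ** (x - 1)) / 3 * r0 * c0  # stages 1 .. X-1
    last = r0 * c0t / 4**x                          # stage X drives the TDCs
    return 2.2 * (n / 2) ** 2 * (upstream + last)


def type2_power(b, n, x, c0, v, f, p_buf):
    """Eq. (6): wire power plus the power of the 2^X - 1 buffers."""
    return b * (x * (n / 2) * c0 * v**2 * f + (2**x - 1) * p_buf)


def type3_tau(r0, c0t):
    """Eq. (7): one local oscillator shared by four TDC channels."""
    return 2.2 * 4 * r0 * c0t


# Hypothetical per-channel parasitics and buffer power (placeholders only).
R0, C0, CTDC, P_BUF = 10.0, 5e-15, 2e-15, 20e-6
C0T = C0 + CTDC
B, V, F, X = 16, 1.0, 1.2e9, 3

for n in (8, 16, 32, 64):
    print(
        f"N={n:2d}  "
        f"P1={type1_power(B, n, C0T, V, F) * 1e3:6.2f} mW  "
        f"P2={type2_power(B, n, X, C0, V, F, P_BUF) * 1e3:6.2f} mW  "
        f"tau1={type1_tau(n, R0, C0T) * 1e12:7.2f} ps  "
        f"tau2={type2_tau(n, X, R0, C0, C0T) * 1e12:7.2f} ps  "
        f"tau3={type3_tau(R0, C0T) * 1e12:5.2f} ps"
    )
```

With any positive parameters, the Type-1 and Type-2 transition times grow quadratically with N while the Type-3 value stays fixed, which is the qualitative behavior described above. The absolute numbers printed depend entirely on the placeholder values.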

A solution to the Type-3 phase noise issues is possible with the proposed Type-4 scheme, which utilizes injection-locked oscillators (ILOs) [11]. This design takes advantage of the low phase noise of ILOs, whose oscillation frequency and phase are locked to an external clock. While a global clock distribution network with multi-stage buffers, similar to the Type-2 scheme, is needed to distribute the external clock to the ILOs, less power is consumed because only one clock phase is distributed. Although this global clock distribution can potentially induce additional deterministic jitter, the ILOs reject high-frequency jitter induced by power supply noise in the global clock distribution due to their inherent first-order jitter filtering. This approach with four-channel-shared ILOs also offers the same clock transition rates as the Type-3 scheme.

Figure 2 shows the simulated power consumption and 10%-90% signal transition times of the four architectures listed in Table 1, utilizing a testbench that produces 16 phases of 1.2 GHz counting clocks. This corresponds to a system targeting a 52 ps TDC resolution. All designs are simulated with the TDC array size varying from 8 × 1 to 64 × 1. These simulation results show that the local counting schemes (Type-3 and Type-4) have significant advantages in terms of power consumption relative to the global counting schemes (Type-1 and Type-2). Moreover, since the loading of the local oscillators is constant (four TDCs share one oscillator), the signal transition times are independent of the TDC array size in the local counting schemes. This is in contrast to the global counting schemes, where the transition times ramp up dramatically as the TDC array expands. Although the same ring oscillator design is utilized in the Type-3 and Type-4 schemes, the Type-4 approach consumes an additional 25% power to distribute the injection-locking clock to the ILOs. However, given that the reduced phase noise of the ILO-based approach translates into a significant TDC performance improvement, the Type-4 scheme is chosen for the presented LIDAR sensor prototype.

Fig. 2 Simulated power and 10–90% counting clock transition times of the four TDC clock distribution architectures

3 LIDAR sensor architecture

A block diagram of the LIDAR sensor SoC prototype is shown in Fig. 3. The SoC includes a 31 × 2 pixel channel array, with each pixel containing a transimpedance amplifier (TIA) front-end receiver to sense APD photocurrent and amplify it to a full-swing ToF signal. A PMOS open-drain buffer at the receiver output allows column-based bit-line sharing, resulting in 31 ToF outputs from the pixel array. For electrical characterization, photodiode emulators are embedded in each column of the channel array to emulate the transient current profile of APD devices. The emulators are designed to generate a 5-ns current pulse with programmable current amplitude, providing the capability to measure the sensitivity of the receiver front-end and characterize the TCSPC channels. The 32 × 1 TDC array resolves time events by utilizing the START signal and the 31 ToF signals. An on-chip third-order PLL synthesizes a 1.2 GHz TDC reference frequency for generation of coarse counting signals and for process, voltage, and temperature (PVT) tolerant biasing of the ILOs used for fine counting clock phase generation. After conversion is finished, a digital timing control block (TCON) synchronizes a 32-to-1 time-multiplexed readout circuit to output the digitized ToF data at a frequency of 18.75 MHz. This results in a minimum readout time, and hence a latency, of 1.7 µs for each row.

Fig. 3 LIDAR sensor prototype with embedded 32 × 1 TDC array
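The quoted row readout time follows from the multiplexing ratio and the output rate. The expression below assumes that one channel word is shifted out per readout clock cycle, which is an assumption made here for the arithmetic.

$$ T_{\text{row}} = \frac{32}{18.75\ \text{MHz}} \approx 1.71\ \mu\text{s} $$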

4 Key circuits

4.1 APD front-end receiver

As shown in Fig. 4, the receiver front-end utilizes a three-inverter-stage TIA. The TIA has been modified from the design in [12, 13] for lower area and power in order to be suitable for the arrayed implementation. By utilizing resistive feedback RF1 in the first gain stage I1, a low input impedance is developed to achieve sufficient TIA bandwidth with the input pad and hybrid-integrated APD capacitance. The second and third gain stages (I2 and I3) operate as limiting amplifiers to further amplify the signal. A feedback loop from the I3 output adjusts the average input current through I5 to control the common-mode level in a self-biased manner. The first-order low-pass filter (LPF) in this feedback loop utilizes Miller multiplication to reduce area and achieve a 5.3 MHz cut-off. The receiver front-end achieves a simulated 78.3 dBΩ gain, 1.5 GHz bandwidth, and 860 nA rms input-referred noise. A subsequent inverter-based comparator (I6 and I7) amplifies the front-end TIA output to a full-swing signal to drive a PMOS open-drain buffer for bit-line sharing. The bit line is pulled to ground by resistor RPD when no photon is captured. In order to prevent an incorrect decision due to noise or PVT-related trip point variation in the comparator, the gate width ratio of the PMOS and NMOS in I4 is carefully selected so that a 70 mV threshold is induced between the output common-mode level at I3 and the trip point at I6. Although this threshold reduces the TIA sensitivity to 50 µA peak-to-peak, the receiver sensitivity with a conventional APD that has 0.8 A/W responsivity and an optical gain of 10 is −22 dBm, which is suitable for LIDAR sensing with a laser-scanning approach. In order to reduce power supply noise coupling from other on-chip sources, the front-end supply is separated from the other receiver circuitry. This front-end receiver occupies an area of 25 × 21 µm² and has a power consumption of 300 µW. In rolling-shutter operation, only one row of receivers is turned on at a given time to minimize system power.

Fig. 4 APD front-end receiver and bit-line buffer with overflow detection
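The conversion from the 50 µA current sensitivity to the quoted optical sensitivity can be reproduced with a short Python sketch. It assumes that the required photocurrent equals the optical power multiplied by the APD responsivity and gain, with no coupling loss; the function name is hypothetical.

```python
import math


def optical_sensitivity_dbm(i_min_a: float, responsivity_a_per_w: float, gain: float) -> float:
    """Optical power (dBm) needed to produce the minimum detectable photocurrent."""
    power_w = i_min_a / (responsivity_a_per_w * gain)
    return 10.0 * math.log10(power_w / 1e-3)


# 50 uA threshold, 0.8 A/W responsivity, optical gain of 10.
print(f"{optical_sensitivity_dbm(50e-6, 0.8, 10.0):.1f} dBm")
```

The result is about 6.25 µW of optical power, consistent with the −22 dBm figure.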

The pulse width of the ToF signals after the receiver front-end can depend on the incident optical power. When the power is near the sensor’s sensitivity level, the receiver may not provide a full-swing ToF signal at the bit-line output, which can induce metastability and degrade the TDC mean time between failures (MTBF). This issue is addressed with the bit-line buffer shown in Fig. 4, which regenerates a constant 5-ns-wide pulse from the rising edge of the original ToF signal and preserves signal integrity at the TDC input. An overflow detector is also implemented with the DFF2 register to detect the first rising edge at the output of the front-end receiver. If a rising edge is detected (photon captured), the PhC_b signal gates the overflow signal that is asserted at the end of each TDC conversion period, so the TDC register keeps the real ToF value and is not overwritten by the overflow. If no rising edge is detected (no photon captured), the overflow signal triggers Vo at the end of the TDC measurement and forces the TDC register to capture the last digital number.

4.2 ILO-based time-to-digital converter array

Figure 5 shows the 32 × 1 TDC array block diagram and timing diagram. A two-stage flash TDC is utilized to support high dynamic range conversion. A sliding scale technique [14] is utilized to improve conversion linearity. This involves having the START signal, which is synchronized with the laser emission signal, serve as the TDC[0] input, while the TDC[31:1] inputs are the 31 ToF signals triggered from the pixel channel array. The final ToF value of TDC[N] (N = [31:1]) is the time difference between TDC[N] and TDC[0]. Each TDC includes two 10-bit coarse-TDC (CTDC) registers and one 16-phase fine-TDC (FTDC) edge detector. Once the TDC STOP is triggered, the TRIG signal latches the 10-bit digital counting codes (CTDC0/CTDC1) and the 16-phase FTDC phase information (fILO[15:0]) into the two CTDC registers and the FTDC edge detector, respectively. The FTDC phase information is then encoded into a 4-bit binary code as the FTDC output. The FTDC edge detector is composed of 8 differential sense amplifiers (SAs), similar to the design in [15].

Fig. 5 32 × 1 TDC array and ILO distribution circuitry: (a) block diagram and (b) timing diagram

A double counting scheme [16] with two counters operating at a 180° phase shift (CTDC0 and CTDC1) is utilized to avoid any missing code induced by phase misalignment between the coarse and fine TDCs. The choice of CTDC is decided by the most significant bit (MSB) of the 4-bit FTDC output. As shown in Fig. 5(b), CTDC0 is chosen as the output when FTDC[3] is equal to 0 (T1 period), and CTDC1 is chosen when FTDC[3] is equal to 1 (T2 period). Since the register values latched at the transition edge are discarded, the transition edge of the CTDC counting signal does not affect the TDC conversion. 10-bit ripple counters clocked by the 1.2 GHz PLL clock are utilized for CTDC0 and CTDC1. These counter outputs are distributed across the entire TDC array by a symmetric four-stage 1-to-8 CMOS buffer network. In order to minimize the CTDC power, the counter values are Gray-coded before distribution.
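The code assembly described above (Gray-coded coarse counts, counter selection by FTDC[3], and the sliding-scale subtraction of TDC[0]) can be modeled behaviorally. The Python sketch below is illustrative only: it assumes that the selected coarse count is already aligned to the fine code so that the 14-bit result is simply the coarse count concatenated with the 4-bit fine code, and that the START/STOP difference wraps modulo 2^14. Neither detail is specified in the text, and all function names are hypothetical.

```python
def gray_encode(n: int) -> int:
    """Binary to Gray code: adjacent counts differ in exactly one bit."""
    return n ^ (n >> 1)


def gray_decode(g: int) -> int:
    """Gray code back to binary by cumulative XOR of right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def tdc_code(ctdc0_gray: int, ctdc1_gray: int, ftdc: int) -> int:
    """Combine the latched coarse counts with the 4-bit fine code.

    CTDC0 is used when FTDC[3] = 0 and CTDC1 when FTDC[3] = 1, so the
    counter that may be mid-transition is never the one selected.
    """
    if not 0 <= ftdc < 16:
        raise ValueError("ftdc must be a 4-bit value")
    coarse_gray = ctdc1_gray if (ftdc >> 3) & 1 else ctdc0_gray
    # Assumption: the selected count needs no extra offset correction.
    return (gray_decode(coarse_gray) << 4) | ftdc


def tof_code(stop_code: int, start_code: int, bits: int = 14) -> int:
    """Sliding-scale result: channel code minus the TDC[0] (START) code."""
    return (stop_code - start_code) % (1 << bits)  # wrap-around is assumed


start = tdc_code(gray_encode(5), gray_encode(6), 0b0011)   # FTDC[3] = 0 -> CTDC0
stop = tdc_code(gray_encode(5), gray_encode(6), 0b1010)    # FTDC[3] = 1 -> CTDC1
print(start, stop, tof_code(stop, start))
```

Because only one bit of a Gray-coded count changes per increment, fewer distribution wires toggle per clock cycle than with a binary count, which is the power benefit noted above.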

Eight ILOs are utilized to generate the 16-phase 1.2 GHz clocks (fILO[15:0]) used by local clusters of four FTDCs. The FTDCs slice time utilizing these clock edges and achieve a resolution defined by the spacing between two adjacent phases, which is 52 ps in nominal operation. These ILOs are locked by a globally distributed 1.2 GHz locking clock, fLOCK, generated by the PLL. This local multi-phase clock generation scheme saves significant power and offers less phase skew relative to the global distribution of 16 clock phases to the 32 TDCs. The ILO, shown in Fig. 6, is an 8-stage differential ring oscillator (RO) that generates 16-phase clock outputs. Current-starved delay cells are utilized in the ILO for improved power supply noise rejection, with the free-running oscillation frequency controlled by a tail current source that is split into two parts. One current source is controlled by the PLL to compensate for global PVT variations, while the other is set by an independent 5-bit programmable code to calibrate for mismatch-induced frequency offsets. AC-coupled inverters with resistive feedback are utilized to injection-lock the ILO to the global 1.2 GHz clock, providing improved phase spacing uniformity [17]. The ILO consumes 1.9 mW, including 1.5 mW from the core RO and 0.4 mW from the 16 output level shifters.

Fig. 6 Injection-locked oscillator (ILO)

4.3 PLL-based PVT-tolerant ILO biasing

Adaptive control of the ILOs’ free-running oscillation frequency is achieved with a PLL-based replica-biasing scheme that compensates for global PVT variations. As shown in Fig. 7, a voltage-to-current converter (VIC) in the PLL converts the control voltage, Vc, to 9 replicated currents, Iref[8:0]. Iref[0] controls the frequency of the RO in the PLL, while the remaining Iref[8:1] currents control the frequency of the 8 ILOs utilized by the FTDCs. Since the same RO design is shared by the PLL and ILOs, correlated global PVT variations are sensed by the PLL, and the ILO free-running frequency is compensated by the replica-bias Iref[8:1] signals.

Fig. 7 PLL-based replica-biasing scheme for the ILOs

5 Experimental Results

The LIDAR sensor prototype was fabricated in a GP 65 nm CMOS process. As shown in the chip micrograph in Fig. 8, the chip area is 1.43 × 1.6 mm². The 31 × 2 pixel channel array is located in the center of the chip, with the 8 FTDC ILOs and clock distribution buffers below. On the right side is the global PLL, which clocks the CTDC counters and drives the global distribution buffers. In order to electrically characterize the proposed TDC array, on-chip photodiode emulators were embedded in each pixel channel to emulate APD photocurrent with a programmable amplitude. LIDAR sensor operation is verified with a hybrid-integration approach that utilizes a stacked photonic chip with a 4 × 2 InP APD array connected to eight of the TIA receivers on the CMOS prototype. This integration approach allows for short wirebonds to achieve suitable operating bandwidth.

Fig. 8 LIDAR sensor chip micrograph

TDC single-shot precision (SSP) is characterized with a time input triggered by external START and STOP signals. The measurement includes both the front-end receiver and the TDC channel. Figure 9 shows the measured SSP of a single TDC channel (TDC[1]) with time inputs of 151 and 502 ns, and Fig. 10 shows the SSP across 31 TDC channels (TDC[31:1]) for time inputs of 151, 502, and 703 ns. These measurement results show a uniform SSP profile, implying uniform jitter performance amongst the 8 ILOs. The SSP ranges from roughly 32 ps rms at a time input of 151 ns to 36 ps rms at a time input of 703 ns, with a maximum value of 37.2 ps rms. Although the rms value increases slightly with larger time inputs, it remains below one LSB.

Fig. 9 Single-shot precision of one TDC channel with time input at (a) 151 ns and (b) 502 ns

Fig. 10 Single-shot precision across 31 TDC channels with time input at 151, 502, and 703 ns

TDC channel uniformity is measured by buffering a global STOP signal through a symmetric clock buffer into the 31 TDC channels simultaneously. Jitter effects are filtered out by averaging 2000 measurement results for each time input. Figure 11 shows the channel uniformity of the 31 TDC channels with time inputs of 151 ns and 502 ns. The worst-case channel-to-channel deviation is 11 ps rms (0.21 LSB), which is close to the 0.13 LSB of the global distribution scheme utilized in [5].

Fig. 11 Uniformity of 31 TDC channels with time input at (a) 151 ns and (b) 502 ns

Linearity is measured through a code density approach to minimize any jitter and noise effects. Two clock domains, 375 kHz and (375 kHz − 1 Hz), are utilized to generate a time ramp input for the measured DNL and INL shown in Fig. 12. The measurements are taken over 200 ramp periods to ensure an even code density distributed among the 2^14 TDC codes. The maximum DNL and INL throughout the range of 830 ns are 29.1 ps (0.56 LSB) and 81.1 ps (1.56 LSB), respectively. Due to the double counting scheme, there are no missing codes in the TDCs.

Fig. 12 TDC DNL and INL. The maximum DNL is 0.56 LSB, and no missing codes occur in the linearity measurements
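For readers unfamiliar with the code density method, the following Python sketch shows how DNL and INL are commonly derived from a histogram of output codes collected with a uniformly distributed (ramp) input. It is a generic textbook formulation, not the authors' processing script; the function name and the toy histogram are hypothetical.

```python
def dnl_inl_from_histogram(hist):
    """Return (dnl, inl) in LSB from per-code hit counts.

    With a uniform input, every code should collect the same number of
    hits; a wider code bin collects more and a narrower one fewer.
    """
    if not hist or sum(hist) == 0:
        raise ValueError("histogram must contain at least one hit")
    ideal = sum(hist) / len(hist)          # expected hits per code
    dnl = [h / ideal - 1.0 for h in hist]  # bin width error per code
    inl, acc = [], 0.0
    for d in dnl:                          # INL is the running sum of DNL
        acc += d
        inl.append(acc)
    return dnl, inl


# Toy 4-code example: code 0 is 20% wide, code 1 is 20% narrow.
dnl, inl = dnl_inl_from_histogram([12, 8, 10, 10])
print([round(d, 2) for d in dnl], [round(i, 2) for i in inl])
```

A code that never occurs yields a DNL of −1 LSB, which is how a missing code would appear in such a measurement.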

Figure 13 shows the mean and single-shot precision of the TCSPC channel response at different photocurrent levels. The photocurrent is emulated by an on-chip photodiode emulator that provides a 5-ns current pulse with a programmable peak current from 25 to 90 µA. The channel response deviates significantly at 25 µA: the mean is 350 ps higher and the single-shot precision is 50 ps worse than at the other three levels. This deviation is due to the TIA sensitivity limitation mentioned in Sect. 4: the TIA output does not swing rail-to-rail, which causes metastability in the front-end receiver and results in a longer propagation delay and higher time uncertainty in the TCSPC time-resolving chain. The response is stable once the photocurrent is higher than 50 µA.

Fig. 13 TCSPC channel output mean and single-shot precision at different photocurrent levels

Figure 14 shows the TCSPC channel response and single-shot ToF error measured in the hybrid-integrated LIDAR module over a range of 700 ns. In this test, a laser fiber is mounted on top of the sensor prototype, and a 1550 nm pulsed optical beam is directed perpendicularly onto the APD array with a 375 kHz repetition rate, 10 ns full-width at half maximum (FWHM) pulse duration, and 40 W peak power. The laser emission time, referred to the TDC start time, is controlled by the phase delay function of the function generator. Figure 14(a) shows a linear response from the TCSPC channel. Figure 14(b) shows that the maximum single-shot error is 3.2 ns (0.46%), which is larger than the measured SSP in Fig. 9. This is believed to be due to the resolution of the phase delay function of the function generator employed.

Fig. 14 TCSPC channel response: (a) measured ToF and (b) error from the actual emission time

Table 2 shows the SoC power breakdown. The chip consumes 39 mW from the nominal 1 V supply with the PLL operating at 1.2 GHz. This power is dominated by the TDC array, with a per-channel TDC consumption of 788 μW. Table 3 summarizes the LIDAR sensor performance and compares this work against recent designs. Relative to the global-based counting designs of [3, 5, 15], the proposed TDC allows for much higher resolution and much lower channel power. While the pixel-based design of [6] consumes less power, the proposed design achieves a much higher maximum range and an order of magnitude improvement in channel uniformity.

Table 2 SoC power breakdown
Table 3 Performance summary and comparison

6 Conclusion

This paper presented a LIDAR sensor SoC prototype that consists of 31 × 2 pixel channels and an embedded 32 × 1 TDC array. The TIA-based receiver front-end achieves a simulated 78.3 dBΩ gain and 1.5 GHz bandwidth for APD photocurrent conversion, while a PMOS open-drain buffer allows column-based bit-line sharing. A power-down control allows for turning on only one row of receivers at a given time to support rolling-shutter operation and minimize system power. The proposed TDC uses a double counting scheme and a Gray-code counter to avoid any missing code in the TDC conversion and increase system reliability. Global clock distribution power is minimized by utilizing 8 replica-biased multi-phase 1.2 GHz ILOs in the TDC array, which allows for good timing accuracy and channel uniformity. A PLL-based biasing scheme allows for adaptive compensation of PVT variations of the ILOs’ free-running oscillation frequency. The CMOS SoC is hybrid-integrated with a 4 × 2 APD chip and achieves time-correlated single photon counting over a 700 ns counting range. Based on the presented electrical TDC characterization results, the SoC can support a range precision of 0.78 cm and a maximum distance of 124 m in terrestrial and automotive LIDAR applications.