Abstract
A dual-time resolution differential-time signaling (DTR-DTS) architecture is proposed in this paper. The number of transmitted bits per symbol in a time-based serial link can be increased, without significantly affecting the transmitted signal bandwidth, by using dual-time resolution pulse-position modulation at the transmitter side instead of conventional pulse-position modulation. The proposed architecture also simplifies the receiver design, because it uses time-to-digital converter (TDC) circuits with fewer bits than the differential-time signaling (DTS) architecture requires for the same link rate. The design details are presented in this paper. An 8-bit 12 Gb/s DTR-DTS link has been designed and simulated in a 65 nm mixed-signal CMOS process using Cadence tools. Four TDC circuits, 2 bits each, are used to recover the 8 bits from the received signal. The simulated DTR-DTS link consumes 3.6 mW, excluding the driver circuit.
1 Introduction
With the increasing demand for high-capacity data transmission, high-speed serial links are in greater demand than parallel links. Serial links reduce the number of traces and cables, the cost, the power dissipation and the on-chip area compared to parallel links. The link rate is limited by the bandwidth of both the channel and the link circuitry. The circuitry bandwidth improves with technology scaling; however, the channel bandwidth does not.
Serializer/Deserializer (SerDes) architectures are commonly used in the market as serial links [1–3]. An accurate, high-frequency input clock signal is required at the transmitter side to serialize the parallel input data and generate the transmitted signal. As a result, the transmitted signal bandwidth becomes much higher than in other serial links. Very sophisticated equalization techniques [4] are used at the receiver side to compensate for channel attenuation and reflection. Several pulse-amplitude modulation (PAM) techniques have been presented [5–7] that encode a certain number of input bits into one transmitted symbol and occupy a smaller bandwidth than SerDes architectures. In PAM links, the number of transmitted bits per symbol cannot easily be increased because the voltage resolution is reduced with technology scaling. Pulse-amplitude modulation and pulse-width modulation (PWM) have been combined in [8] to increase the number of transmitted bits per symbol by applying two different types of modulation to the input clock signal. PAM, PWM and the combined pulse-amplitude and pulse-width modulation (PAWM) require the input clock signal to be transmitted along with the modulated signal so that it can be used in the recovery process.
In [9], the authors presented the differential-time signaling (DTS) architecture. In this architecture, the edges of the input clock signal are modulated independently in order to increase the number of transmitted bits per symbol. A reference clock pulse is embedded in the transmitted signal and serves as a reference edge for the receiver circuit, which avoids transmitting the clock on a separate channel. For the same link rate, the DTS architecture reduces the transmitted signal bandwidth compared to other serial links such as the SerDes architecture, and it uses a much lower input clock frequency [9]. At the receiver side, the transmitted bits are recovered using time-to-digital converter (TDC) circuits. Increasing the number of transmitted bits with the architecture proposed in [9] increases the number of bits required of the TDC circuit in the receiver. A TDC circuit with a high number of bits is very hard to design. Delay-line-based flash TDCs operating at high frequency do not support high bit resolution [10], which prevents the serial link rate from being increased.
In this work, a dual-time resolution DTS (DTR-DTS) architecture is presented. The proposed DTR-DTS transmitter circuit is based on the dual-time resolution pulse-position modulator (DTR-PPM) presented by the authors in [11]. The transmitter side uses two DTR-PPM circuits in order to dual modulate both the rising and the falling edges of the input clock signal.
Designing TDC circuits with a high number of bits is challenging, whereas TDC circuits with a low number of bits are much easier and simpler to design. Using dual-time resolution modulation in a time-based serial link therefore allows the receiver to use multiple TDC circuits with a low number of bits instead of one TDC circuit with a high number of bits. As a result, DTR-PPM allows more bits to be transmitted while simplifying the receiver circuit, in particular the TDC design, and only slightly affecting the transmitted signal bandwidth.
2 System architecture
2.1 Transmitter side
2.1.1 DTR-DTS architecture
The transmitter circuit block diagram for the proposed DTR-DTS serial link is shown in Fig. 1. It is similar to the block diagram of the DTS architecture [9], except that it uses a DTR-PPM instead of a conventional PPM circuit in order to increase the number of transmitted bits per symbol. As shown in Fig. 1, the transmitter circuit uses two DTR-PPM circuits. Assuming four sets of input bits (A0, A1,…, AN1), (B0, B1,…, BN2), (C0, C1,…, CN3) and (D0, D1,…, DN4), the first DTR-PPM modulates the positive edge of the input clock signal according to N1 + N2 bits. The second DTR-PPM modulates the negative edge of the input clock signal according to N3 + N4 bits. The outputs of the DTR-PPM circuits are then combined to generate the data signal. A reference clock signal is generated and embedded in the transmitted signal, as shown in Fig. 2, to be used as a reference edge at the receiver side.
The block diagram of the DTR-PPM is shown in Fig. 3 [11]. The upper part represents the low-time resolution pulse-position modulator (LTR-PPM), while the lower part is the high-time resolution pulse-position modulator (HTR-PPM). The circuit uses two time resolutions, T1 and ΔT, where T1 is the low time resolution and ΔT is the high time resolution. The HTR-PPM circuit uses a tunable delay element, as proposed in [12], in order to achieve a differential delay value of ΔT. The circuit diagram of the tunable delay element is shown in Fig. 4. The DC voltages Vgp and Vgn are used to control the rise and fall times of the positive and negative edges of the input signal.
The signal combination circuit is a D-type flip-flop whose D input is connected to Vcc, whose clock input is connected to the output of the upper DTR-PPM shown in Fig. 1, and whose reset input is connected to the output of the lower DTR-PPM. The output stage is a driver that drives the channel and matches the transmitter output to the channel input impedance. Figure 5 shows the circuit diagram of the 5-stage CML driver used in the DTR-DTS link design.
2.1.2 Transmitted signal
As shown in Fig. 1, the transmitter circuit modulates a total of N bits, where N = N1 + N2 + N3 + N4. N1 + N2 bits modulate the positive edge of the input clock signal and N3 + N4 bits modulate the negative edge of the input clock signal. According to Fig. 1, the input bits (A0, A1,…, AN1) modulate the positive edge of the input clock signal using T1 as the time resolution, while the input bits (B0, B1,…, BN2) modulate the LTR-PPM output signal using ΔT as the time resolution. Similarly, the input bits (C0, C1,…, CN3) modulate the negative edge of the input clock signal using T1 as the time resolution, while the input bits (D0, D1,…, DN4) modulate the LTR-PPM output signal using ΔT as the time resolution.
The delay time assigned to the positive edge of the input clock signal caused by the LTR-PPM circuit according to the input bits pattern is shown in Table 1.
From Table 1, the total delay assigned to the positive edge of the input clock signal caused by the A’s input bits can be obtained from Eq. (1).
where TM1 is the total static delay caused by the multiplexer circuits.
As a result, the total delay assigned to the positive edge of the input clock signal caused by the A’s and B’s input bits can be obtained from Eq. (2).
where TM1 and TM2 are the static delay caused by the LTR-PPM and HTR-PPM circuits respectively.
The total delay assigned to the negative edge of the input clock signal caused by the C’s and D’s input bits can be obtained in the same way from Eq. (3).
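As an illustration of the delay relationships described above, the following Python sketch models the delay added to one clock edge by an idealized DTR-PPM. It assumes binary-weighted codes (the low-resolution code scaled by T1, the high-resolution code scaled by ΔT) plus the static delays TM1 and TM2; this weighting is an assumption and may differ from the exact form of Eqs. (1)–(3). The function names are hypothetical, and the values of T1 and ΔT are illustrative only and are not taken from the design tables.

```python
def bits_to_code(bits):
    """Convert an LSB-first bit list, e.g. [A0, A1], to an integer code."""
    return sum(bit << i for i, bit in enumerate(bits))


def edge_delay_ps(coarse_bits, fine_bits, t1_ps, dt_ps, tm1_ps=0.0, tm2_ps=0.0):
    """Delay added to one clock edge by an idealized DTR-PPM.

    Assumes binary weighting: coarse code * T1 + fine code * dT,
    plus the static delays TM1 (LTR-PPM) and TM2 (HTR-PPM).
    """
    coarse = bits_to_code(coarse_bits)
    fine = bits_to_code(fine_bits)
    return coarse * t1_ps + fine * dt_ps + tm1_ps + tm2_ps


# Hypothetical resolutions, chosen so that 2**N2 * dT equals T1 for N2 = 2.
T1_PS, DT_PS = 40.0, 10.0

# Positive edge with A = (A0, A1) = (1, 0) and B = (B0, B1) = (1, 1).
print(edge_delay_ps([1, 0], [1, 1], T1_PS, DT_PS))  # 70.0
```

The same function applies to the negative edge by passing the C and D bits in place of the A and B bits.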
Figure 6 presents the eye diagrams of the signals indicated in Figs. 1 and 3 and illustrates how the transmitted signal is generated. Figure 6(a) shows the input clock to the transmitter circuit, with a period T and a duty cycle of approximately 50%, and Fig. 6(b) shows the eye diagram of the LTR-PPM output signal, indicating the low-time resolution T1. Figure 6(c) shows the eye diagram of the upper DTR-PPM output signal, indicating the dual-time modulation of the positive edge of the input clock signal, and Fig. 6(d) shows the eye diagram of the lower DTR-PPM output signal, indicating the dual-time modulation of the negative edge of the input clock signal. Figure 6(e) shows the eye diagram of the data signal, which is generated from the main clock signal and has both edges modulated independently. Figure 6(f) shows the eye diagram of the DTR-DTS transmitted signal after the reference clock pulse has been embedded in it.
2.1.3 Design steps
In order to design a DTR-DTS link for a certain number of input bits, the time spacing values shown in Fig. 7 have to be assigned. From Fig. 7, Eq. (4) can be obtained.
In order to simplify the design, TP, TD and TM can be chosen to be equal to a certain value TF. In this case Eq. (4) can be written as:
TF represents the minimum pulse width in the transmitted signal. Note that TF should be large enough for the pulse to propagate through the transmitter circuits. Given the clock period, it is recommended to choose the value of TF first and then obtain the value of TB, which is used to obtain the dual resolutions of the link as indicated in Eq. (6).
The high resolution value ΔT should be chosen carefully according to the capability of the TDC circuit, which is used at the receiver side, to recognize any two consecutive edges as different codes. From Eq. (6), the designer can choose the value of T1 and calculate TB or vice versa.
2.2 Receiver side
2.2.1 The receiver circuit
The receiver circuit block diagram is shown in Fig. 8. The received signal is detected and amplified by the comparator circuit, which is shown in Fig. 9. The comparator circuit is followed by a separation circuit, which separates the reference clock pulse signal and the data pulse signal from the received signal. The circuit diagram of the separation circuit is shown in Fig. 10. It consists of two JK-flip flops with their J and K inputs connected to Vcc.
Figure 11 shows the eye diagrams of the signals indicated in Fig. 10. Figure 11(a) shows the eye diagram of the differential output signal of the comparator circuit. Figure 11(b) and (c) show the eye diagrams of the output signals of the upper and lower flip-flops. Figure 11(d) shows the eye diagram of the reference clock pulse signal.
After separating the clock signal and the data signals from the received signal, the TDC stage is used to convert the time difference between the positive edge of the clock pulse signal and the positive edge of the data pulse signal into a binary code corresponding to the transmitted bits (A0, A1,…, AN1, B0, B1,…, BN2). The TDC stage also converts the time difference between the positive edge of the clock pulse signal and the negative edge of the data pulse signal into a binary code corresponding to the transmitted bits (C0, C1,…, CN3, D0, D1,…, DN4) as will be shown in the following sub-section.
2.2.2 TDC stage
A two-step TDC approach, such as the one in [13], is used in the proposed receiver circuit. As shown in Fig. 8, the conversion is done in two steps or stages. The first stage demodulates the low-time resolution bits (A0, A1,…, AN1) and (C0, C1,…, CN3), and the second stage demodulates the high-time resolution bits (B0, B1,…, BN2) and (D0, D1,…, DN4).
The low-time resolution TDC (LTR-TDC) is shown in Fig. 12. The circuit is based on a Vernier delay line TDC. The input clock signal and the input data signal are obtained from the signal separation circuit shown in Fig. 10. One of the output clock signals (Clk-1, Clk-2,…, Clk-2^N1) is selected to be used as the input clock signal of the high-time resolution TDC (HTR-TDC). The differential time between T3 and T4 equals the low-time resolution T1. Figure 13 shows the eye diagrams of the signals in Fig. 12. Figure 13(a) shows the input clock signal and Fig. 13(b) shows the eye diagram of the input data signal, which has dual time resolution. Figure 13(c) shows the output clock signals. The low-time resolution input bits can be recovered by the LTR-TDC circuit, while the high-time resolution modulation appears as jitter to the LTR-TDC circuit.
The high-time resolution TDC (HTR-TDC) is shown in Fig. 14. The input data signal is obtained from the signal separation circuit as shown in Fig. 10 while the input clock signal is obtained from the selector circuit as shown in Fig. 8. The differential time between T5 and T6 is the high-time resolution ΔT. Figure 15 shows the eye diagram of the input data signal and the input clock signal with different possible positive edge locations for the data signal. The Figure indicates that the selection of the input clock signal that is done by the selector circuit changes with the location of the positive edge of the data signal according to the low-time resolution. As indicated in Fig. 15, the HTR-TDC circuit is not affected by the low-time resolution of the data input signal since the selection of the input clock signal compensates for the low-time resolution modulation of the data signal. The high-time resolution input bits can be recovered by the HTR-TDC.
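The two-step conversion described above can be sketched behaviorally in Python. This is an idealized model, not the Vernier delay line circuit itself: it assumes the static delays have been calibrated out, that the fine modulation range fits within one coarse step (2^N2 · ΔT ≤ T1), and that each stage uses a half-ΔT decision offset so that an edge sitting exactly on a code boundary is resolved consistently. The function name and the numeric values of T1 and ΔT are hypothetical.

```python
def two_step_decode(delay_ps, t1_ps, dt_ps, n_coarse_bits, n_fine_bits):
    """Idealized two-step TDC: return (coarse_code, fine_code).

    delay_ps is the time from the reference clock edge to the data edge,
    with static delays already removed (an assumption of this sketch).
    """
    half_lsb = dt_ps / 2.0  # decision offset, assumed for robustness

    # Step 1 (LTR-TDC): the fine modulation looks like jitter here.
    coarse = int((delay_ps + half_lsb) // t1_ps)
    coarse = min(coarse, 2 ** n_coarse_bits - 1)

    # Selecting the matching delayed clock removes the coarse component.
    residue_ps = delay_ps - coarse * t1_ps

    # Step 2 (HTR-TDC): resolve the remaining delay with resolution dT.
    fine = int((residue_ps + half_lsb) // dt_ps)
    fine = min(fine, 2 ** n_fine_bits - 1)
    return coarse, fine


# Hypothetical resolutions: T1 = 40 ps, dT = 10 ps, two bits per stage.
print(two_step_decode(70.0, 40.0, 10.0, 2, 2))  # (1, 3)
```

In this model, subtracting `coarse * t1_ps` plays the role of the selector circuit, which picks the delayed clock that compensates for the low-time resolution modulation.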
3 Simulation results
As an example, an 8-bit 12 Gb/s DTR-DTS link has been designed and simulated using Cadence tools in a 65 nm mixed-signal CMOS process with a 1.5 GHz input clock signal. Four input bits modulate the positive edge and the other four bits modulate the negative edge of the input clock signal. The numbers of modulated bits have been chosen as N1 = N2 = N3 = N4 = 2. The design values have been chosen as shown in Table 2. The transmitter is similar to the circuit shown in Fig. 1.
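The link figures quoted above follow from simple arithmetic, sketched below in Python. The sketch assumes one symbol is transmitted per input clock period, which is consistent with the stated 12 Gb/s at 1.5 GHz; the function names are hypothetical.

```python
def link_rate_gbps(bits_per_symbol, clock_ghz):
    """Link rate, assuming one symbol per input clock period."""
    return bits_per_symbol * clock_ghz


def tdc_levels(n_bits):
    """Number of distinct time codes an n-bit TDC must resolve."""
    return 2 ** n_bits


CLOCK_GHZ = 1.5
n1 = n2 = n3 = n4 = 2

# Dual-time resolution link: all four bit groups are transmitted.
print(link_rate_gbps(n1 + n2 + n3 + n4, CLOCK_GHZ))  # 12.0

# Single-time resolution link: only the low-resolution groups remain.
print(link_rate_gbps(n1 + n3, CLOCK_GHZ))  # 6.0

# Each 2-bit TDC in the receiver resolves four codes.
print(tdc_levels(2))  # 4
```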
From Table 2, TD can then be calculated from Eq. (4) as TD = 102.67 ps.
Figure 16(a) shows the simulated eye diagram of the upper DTR-PPM output signal shown in Fig. 1. The figure shows that the positive edge is double modulated and the negative edge is not modulated. Figure 16(b) shows the eye diagram of the lower DTR-PPM output signal shown in Fig. 1. The figure shows that the negative edge is double modulated and the positive edge has a fixed position for all input codes. Figure 16(c) shows the output of the signal combination circuit after combining the two signals shown in Fig. 16(a) and (b), which represents the data pulse signal. Figure 16(d) shows the eye diagram of the transmitted signal at the driver output after combining the reference clock pulse signal with the data signal. The figure indicates that each period consists of a reference clock pulse that has a fixed position and a data pulse that has both edges modulated independently.
Figure 17 presents the receiver circuit of the designed link, in which four TDC circuits with a resolution of 2 bits each are used to recover 8 bits from the received signal. In contrast, an 8-bit TDC circuit is required to recover 8 bits from the received signal when a single-time resolution modulation technique or the DTS architecture is used. Figure 18 shows the eye diagrams of the separation circuit output signals. Figure 18(a) shows the eye diagram of the output clock signal. The figure indicates that the positive edge has a fixed position for all possible input codes; this edge is used as a reference edge in order to recover the transmitted bits. Figure 18(b) shows the eye diagram of the positive edge data signal, which is the inversion of the output clock signal. Figure 18(c) shows the eye diagram of the negative edge data signal. The input data stored in the positive edge of the data pulse of the received signal (A0, A1, B0, B1) can be recovered by converting the time difference between the positive edge of the output clock signal shown in Fig. 18(a) and the positive edge of the positive edge data signal shown in Fig. 18(b) into a binary code. Similarly, the input data stored in the negative edge of the data pulse of the received signal (C0, C1, D0, D1) can be recovered by converting the time difference between the positive edge of the output clock signal shown in Fig. 18(a) and the positive edge of the negative edge data signal shown in Fig. 18(c) into a binary code.
Figure 19 shows the eye diagram of the output clock signals (Clk-1, Clk-2, Clk-3, Clk-4), which are indicated in Fig. 12. The time difference between the four clock signals compensates for the low-time resolution modulation in order to recover the high-time resolution modulation bits.
The HTR-PPM circuit has been excluded from the designed link in order to obtain a 4-bit 6 Gb/s DTS link using single-time resolution modulation. The transmitted signal spectra of the 6 Gb/s and the 12 Gb/s links have been compared. Figure 20 shows the DFT of the 12 Gb/s transmitted signal and Fig. 21 shows the spectrum of the 6 Gb/s transmitted signal. From Figs. 20 and 21, it can be noted that the dual-time modulation approach can double the link rate without a significant effect on the bandwidth of the transmitted signal.
4 PVT variations
The delay line is the main building block in time-based serial data link design. Process, voltage and temperature (PVT) variations affect the delay time of each delay line. The effect of PVT variations on a 100 ps delay line has been studied in a 65 nm mixed-signal CMOS technology. The temperature has been varied from 0 to 120°C. The simulation results indicate that the delay time of the designed 100 ps delay line changes from 98 to 104 ps.
The supply voltage has been varied from 0.95 to 1.05 V, which represents ±5% of the nominal value, in order to study the effect of supply variation on the delay time of the designed delay line. The delay time of the designed 100 ps delay line changes from 106 to 92 ps when the supply voltage changes from 0.95 to 1.05 V.
Monte-Carlo simulations that account for the mismatch between transistors have been carried out in order to study the effect of process variations on the delay time of the designed delay line. Twenty different simulations indicated a maximum delay of 114 ps and a minimum delay of 84.5 ps, with an average of 92 ps. These simulations show that process variation affects the delay lines more than temperature and voltage do; however, it is easier to correct, because process variation is a static error rather than a dynamic error, as is the case for voltage and temperature effects.
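To compare the three effects on a common scale, the reported delay extremes can be expressed as percentage deviations from the 100 ps design value. The short Python sketch below does this using only the figures quoted in this section; the function name and dictionary labels are hypothetical.

```python
NOMINAL_PS = 100.0  # designed delay of the delay line


def deviation_percent(delay_ps, nominal_ps=NOMINAL_PS):
    """Deviation of a simulated delay from the nominal delay, in percent."""
    return 100.0 * (delay_ps - nominal_ps) / nominal_ps


# Delay extremes reported in this section, in picoseconds.
reported_ps = {
    "temperature, 0 to 120 C": (98.0, 104.0),
    "supply, 0.95 to 1.05 V": (106.0, 92.0),
    "mismatch, 20 Monte-Carlo runs": (84.5, 114.0),
}

for name, extremes in reported_ps.items():
    low, high = sorted(deviation_percent(d) for d in extremes)
    print(f"{name}: {low:+.1f}% to {high:+.1f}%")
```

With these figures, the mismatch spread is the widest of the three, which matches the observation that process variation dominates.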
The simulation results in this section show that a calibration technique is needed to keep the delay lines used in the link at their designed delay values under PVT variations. A calibration technique suitable for time-based serial links has been presented by the author in [14]; it can be used in the designed link to compensate for the effect of PVT variations on the delay lines.
5 Conclusion
A dual-time resolution differential-time signaling (DTR-DTS) architecture based on a dual-time resolution PPM circuit has been presented in this paper. The proposed architecture increases the number of transmitted bits per symbol while simplifying the receiver circuit design by reducing the number of bits required of the TDC circuits in the receiver. Applying the dual-time modulation approach only slightly affects the transmitted signal bandwidth. An example 65 nm CMOS 8-bit 12 Gb/s DTR-DTS link has been designed and simulated using a 1.5 GHz input clock signal. Four TDC circuits with a resolution of 2 bits each have been used to recover 8 bits from the received signal.
References
[1] Harwood, M., et al. (2012). A 225mW 28 Gb/s SerDes in 40 nm CMOS with 13 dB of analog equalization for 100GBASE-LR4 and optical transport lane 4.4 applications. In Proceedings of ISSCC digest of technical papers, San Francisco, CA, USA, (pp. 326–327).
[2] Boecker, C., et al. (2014). A 8.125–15.625 Gb/s SerDes using sub-sampling ring-oscillator phase-locked loop. In Proceedings of IEEE custom integrated circuits conference (CICC), San Jose, CA, USA, (pp. 1–4).
[3] Aziz, P., et al. (2014). 28 Gb/s 560mW multi-standard SerDes with single-stage analog front-end and 14-tap decision-feedback equalizer in 28 nm CMOS. In IEEE international solid-state circuits conference digest of technical papers (ISSCC), San Francisco, CA, USA, (pp. 38–39).
[4] Nazari, M. H., & Emami-Neyestanak, A. (2012). A 15 Gb/s 0.5mW/Gbps two-tap DFE receiver with far-end crosstalk cancelation. IEEE Journal of Solid-State Circuits (JSSC), 47(10), 2420–2432.
[5] Bongsub, S., et al. (2013). A 0.18-um CMOS 10-Gb/s dual-mode 10-PAM serial link transceiver. IEEE Transactions on Circuits and Systems I (TCAS I), 60(2), 457–468.
[6] Schwager, L., et al. (2012). Design of CMOS 5 Gb/s 4-PAM transceiver frontend for low-power memory interface. In International SoC design conference (ISOCC), Jeju Island, South Korea, (pp. 531–534).
[7] Vijaya, S., & Pradip, M. (2010). A new power efficient current-mode 4-PAM transmitter interface for off-chip interconnect. In IEEE Asia Pacific conference on circuits and systems (APCCAS), Kuala Lumpur, Malaysia, (pp. 959–962).
[8] Ghederi, N., & Hadidi, K. (2009). A CMOS 3.2 Gb/s serial link transceiver, using PWM and PAM scheme. In European conference on circuit theory and design (ECCTD), Antalya, Turkey, (pp. 205–208).
[9] Rashdan, M., Yousif, A., Haslett, J. W., & Maundy, B. (2013). Differential time-signaling data-link architecture. Journal of Signal Processing Systems (Springer), 70, 21–37.
[10] Chung, H., Ishikuro, H., & Kuroda, T. (2012). A 10-bit 80-MS/s decision-select successive approximation TDC in 65 nm CMOS. IEEE Journal of Solid-State Circuits (JSSC), 47(5), 1232–1241.
[11] Rashdan, M., & Haslett, J. (2015). Dual-time resolution pulse-position modulator for time-based serial communication links. International Journal of Electronics Letters. doi:10.1080/21681724.2015.1077526.
[12] Townsend, K. A., Macpherson, A. R., & Haslett, J. (2010). A fine-resolution time-to-digital converter for a 5GS/s ADC. In IEEE international symposium on circuits and systems (ISCAS), (pp. 3024–3027).
[13] Kim, K., et al. (2013). A 7 bit, 3.75 ps resolution two-step time-to-digital converter in 65 nm CMOS using pulse-train time amplifier. IEEE Journal of Solid-State Circuits, 48(4), 1009–1017.
[14] Rashdan, M. (2017). Pulse amplitude-modulated time-based interface for off-chip interconnect. International Journal of Electronics, 104(1), 16–33.
Acknowledgements
This work was supported by the Electrical Engineering Department, University of Calgary.
Cite this article
Rashdan, M. Dual-time resolution time-based transceiver for low-power serial interfaces. Analog Integr Circ Sig Process 92, 81–89 (2017). https://doi.org/10.1007/s10470-017-0977-4
DOI: https://doi.org/10.1007/s10470-017-0977-4