# On-Chip Measurement of Clock and Data Jitter With Sub-Picosecond Accuracy for 10 Gb/s Multilane CDRs

Joshua Liang, Mohammad Sadegh Jalali, Student Member, IEEE, Ali Sheikholeslami, Senior Member, IEEE, Masaya Kibune, and Hirotaka Tamura, Fellow, IEEE

*Abstract*—On-chip jitter measurement can be used to optimize the performance of wireline transceivers. In this work, the jitter of random data is measured on-chip by correlating the phase detector outputs from two adjacent CDR lanes. This allows the jitter's autocorrelation function to be estimated, from which the jitter's RMS value and power spectral density are extracted without using any external reference clock. The RMS value of random jitter ranging from 0.85 ps to 1.89 ps, and sinusoidal jitter from 0.89 ps to 5.1 ps is measured in PRBS31 data with less than 0.6 ps of error compared to measurements by an 80 GS/s real-time oscilloscope. Correlating the phase detectors in the CDRs with a third phase detector, which measures the phase difference between the clocks recovered by the two CDRs, allows measurement of the recovered clock jitter. Sinusoidal jitter from 1.8 ps to 5.3 ps is measured in the recovered clock with an error of less than 1 ps.

Index Terms—Clock and data recovery, CDR, jitter, jitter measurement, on-chip measurement.

### I. INTRODUCTION

ON-CHIP jitter measurement can be used to diagnose performance issues, monitor for device failures and aging effects, or help reduce jitter's adverse effects by adapting circuit parameters. Accurate on-chip measurement can also assist designers by providing a better understanding of jitter's impact on existing designs. In clock and data recovery (CDR) circuits, the jitter of both the received data and the recovered clock contributes to receiver performance and should be characterized. As will be discussed in this paper, existing techniques for measuring clock jitter [1]–[4] and data jitter [5], [6] often require low-jitter external reference clocks, which may not always be available in a design. The goal of this work is an on-chip jitter measurement system able to characterize both clock and data jitter, without external reference clocks or measurement equipment such as oscilloscopes or spectrum analyzers. It should also add minimal area overhead to circuit layouts.

Manuscript received August 28, 2014; revised November 06, 2014; accepted November 24, 2014. Date of publication December 30, 2014; date of current version March 24, 2015. This paper was approved by Guest Editor Masato Motomura. This work was supported in part by NSERC.

J. Liang, M. S. Jalali, and A. Sheikholeslami are with the Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, M5S 3G4, Canada.

M. Kibune and H. Tamura are with Fujitsu Laboratories Limited, Kawasakishi, Japan.

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2014.2378280



Fig. 1. Definitions of (a) absolute jitter (b) relative jitter and (c) period jitter.

To characterize jitter, we estimate the autocorrelation functions of data and clock jitter by correlating the phase detector (PD) outputs of two 10 Gb/s CDRs [7]. We then extract the RMS jitter and estimate the jitter's power spectral density (PSD). The measurements can be used on-chip, or processed off-chip.

The remainder of this paper is organized as follows. Section II reviews jitter terminology and existing jitter measurement schemes. Section III presents the proposed correlation-based jitter measurement technique and Section IV provides analysis. Sections V and VI describe the circuit implementation and measurement results of the fabricated chip.

### II. BACKGROUND

#### A. Jitter Terminology

We distinguish between three types of jitter: absolute jitter, relative jitter and period jitter. As depicted in Fig. 1, we define *absolute jitter* ($\psi_D$) as the time difference between the zero-crossings of the signal of interest, in this case *Data*, and the rising edges of an ideal reference clock. It is the absolute jitter of a CDR's recovered clock that is typically specified in wireline transceiver standards. Note that we use $\psi$ to represent absolute jitter in seconds, to avoid confusion with $\phi$, which is often used to represent phase in radians. In this work, we seek to measure the absolute jitter of both the input data ($\psi_D$) and recovered clock ($\psi_{CK}$) of a CDR.

In practical cases, jitter must generally be measured compared to some jittery, non-ideal clock source. We refer to jitter measured against a jittery clock as *relative jitter*, which is the difference between the absolute jitter of the signal of interest and that of the reference clock. Relative jitter $(\psi_D - \psi_{CK})$ only approaches $\psi_D$ when $\psi_{CK} \ll \psi_D$. In other words, the absolute jitter $(\psi_D)$ can only be measured with a reference clock whose jitter $(\psi_{CK})$ is much less than $\psi_D$.

Fig. 2. (a) TDC-based (b) self-referenced and (c) PD-based jitter measurement techniques.

Alternatively, we can measure the time between adjacent zero-crossings of a clock signal to measure its *period jitter*, which is the deviation of a signal's period from its ideal value. Because period jitter is essentially the first-difference of absolute jitter [8], its spectrum is high-pass filtered compared to that of absolute jitter. Period jitter therefore has less low-frequency content compared to absolute jitter. Having established this terminology, we now review some of the existing techniques for jitter measurement.
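The high-pass nature of period jitter can be illustrated numerically. The following is a minimal sketch, assuming sinusoidal absolute jitter sampled once per clock cycle; the helper names, amplitude and periods are illustrative choices and are not taken from the measurements in this paper. For a sinusoid with a period of $P$ cycles, the first difference scales the amplitude by $2\sin(\pi/P)$.

```python
import math

def period_jitter(absolute_jitter):
    """First difference of absolute jitter: psi[n] - psi[n-1]."""
    return [b - a for a, b in zip(absolute_jitter, absolute_jitter[1:])]

def rms(values):
    """Root-mean-square value of a sequence."""
    return math.sqrt(sum(v * v for v in values) / len(values))

def sinusoidal_jitter(amplitude_ps, period_cycles, n_cycles):
    """Absolute jitter (ps) of a clock with a single sinusoidal jitter tone."""
    return [amplitude_ps * math.sin(2 * math.pi * n / period_cycles)
            for n in range(n_cycles)]

# Assumed test tones: one slow (1000-cycle period), one fast (4-cycle period).
slow = sinusoidal_jitter(1.0, 1000, 10000)
fast = sinusoidal_jitter(1.0, 4, 10000)

# Ratio of period-jitter RMS to absolute-jitter RMS for each tone.
slow_ratio = rms(period_jitter(slow)) / rms(slow)
fast_ratio = rms(period_jitter(fast)) / rms(fast)

print(f"slow tone: period/absolute RMS ratio = {slow_ratio:.4f}")
print(f"fast tone: period/absolute RMS ratio = {fast_ratio:.4f}")
```

The slow tone is attenuated by more than two orders of magnitude while the fast tone is not, which is why a period-jitter measurement says little about low-frequency absolute jitter.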

#### B. Clock Jitter Measurement

Previous works on jitter measurement have focused on clock jitter and often fall into three categories, shown in Fig. 2: time-to-digital converter (TDC)-based, self-referenced and phase detector (PD)-based. TDC-based circuits measure the relative jitter between a signal and a reference clock. Using delay lines [2], [4] or other circuits, they effectively oversample the zero-crossing of a signal with a fine time-resolution, converting zero-crossing times to digital codes. This approach has two main drawbacks. Firstly, since it can only measure relative jitter, the reference clock jitter must be much lower than that of the signal under measurement. Secondly, achieving a high time-resolution from TDC circuits generally limits their operating speed, due to the latency of delay line structures [2] or cascading of circuits such as time amplifiers [3]. Furthermore, since TDCs have multi-bit outputs, the data generated by a TDC at tens of GHz requires very high data throughput to process. For these reasons, the operating frequency of high-resolution TDC-based jitter measurement circuits is limited to several GHz.

The second category, self-referenced designs, avoids the need for a reference clock by measuring the jitter between a clock and its delayed version, effectively measuring period jitter. This approach only works for clock signals as random data does not have a transition in every unit interval (UI). As noted above, period jitter also has less content at low frequencies, limiting its usefulness for jitter measurement.

Finally, PD-based circuits use an analog phase detector to convert the relative jitter between the signal of interest and the reference clock to an analog output [1]. The analog output allows for high-resolution jitter measurement, but is also sensitive to noise. Because analog PDs may also output very small signals, on the order of mV in [1], the output must be measured with either an oscilloscope or spectrum analyzer [1]. Alternatively, an on-chip high-speed, high-resolution ADC would be required to produce a digital output, transforming the PD into a TDC. As in the TDC-based approach, the reference clock jitter must again be much lower than that of the signal to be measured. In this work we also want to measure the jitter of data.

TABLE I

LIMITATIONS OF EXISTING JITTER MEASUREMENT TECHNIQUES

| Approach                     | Works without<br>clean ref. clock | Works for<br>random data |
|------------------------------|-----------------------------------|--------------------------|
| TDC-based                    | N                                 | Y                        |
| Self-referenced<br>TDC-based | Y                                 | N                        |
| PD-based                     | N                                 | Y                        |
| Goal of this work            | Y                                 | Y                        |

#### C. Data Jitter Measurement

As mentioned, the self-referenced technique is not applicable to data jitter. Because the TDC and the PD-based [5] approaches both use a clean reference clock, they are also not suitable in plesiochronous links, where CDRs receive data without a reference clock. Generating the required low-jitter clock could be costly in terms of power and area. In CDRs, eye-monitor circuits [9] can generate the relative jitter histogram of a data signal by sampling it with a variable phase. Asynchronous clocking [10] can also be used to sweep the data eye with an external clock having a frequency offset compared to the data. However, both of these methods measure relative jitter and therefore require low-jitter reference clocks for accuracy.

Table I summarizes the main limitations of the jitter measurement techniques discussed. In this work, we seek a jitter measurement solution applicable to both data and clock jitter, which does not require a clean reference clock. We propose a method to do this by estimating the jitter autocorrelation using phase detectors in a CDR.

### III. PROPOSED JITTER MEASUREMENT SCHEME

#### A. Basic Concept for RMS Jitter Extraction

The goal of this work is the extraction of absolute jitter and its power spectral density. As discussed, phase detectors can measure the relative jitter between two signals, but the absolute jitter of each signal is not observable. In this work, we add a third signal. By measuring the relative jitter between each pair of signals using three PDs, we determine the jitter of each source. As shown in Fig. 3, with three sources A, B and C, the outputs of three ideal PDs are

$$e_1 = \psi_A - \psi_B \tag{1}$$

$$e_2 = \psi_A - \psi_C \tag{2}$$

$$e_3 = \psi_B - \psi_C \tag{3}$$



Fig. 3. Basic concept.

Note that  $e_3$  can be determined from  $e_1$  and  $e_2$  as  $e_3 = e_2 - e_1$ . However, this requires subtracting the outputs of two linear PDs. As will be discussed, since bang-bang PDs are used in this work, this is not feasible to implement. We instead use three separate PDs.

Assuming the $\psi$'s are zero-mean random processes $(E[\psi] = 0)$ that are uncorrelated with each other, we multiply each pair of PD outputs and take the expected value $(E[\cdot])$. This eliminates the uncorrelated jitter components, allowing the variance $(\sigma^2)$ and RMS value $(\sigma)$ of each jitter component to be identified. We assume the jitter is ergodic and approximate $E[\cdot]$ using the time-average, implemented by a low-pass filter.

$$E[e_1e_2] = E\left[\psi_A^2\right] = \sigma_{\psi_A}^2 \tag{4}$$

$$E[e_1e_3] = -E\left[\psi_B^2\right] = -\sigma_{\psi_B}^2 \tag{5}$$

$$E[e_2e_3] = E\left[\psi_C^2\right] = \sigma_{\psi_C}^2 \tag{6}$$

Unlike prior PD-based jitter measurement approaches, no clock is used as an ideal reference, eliminating the need for a clean reference clock.

This approach relies on the jitter of the clocks being uncorrelated. Any correlated jitter in the two clock sources would add an offset error to the measurement that would have to be calibrated out. If, for example, clocks *B* and *C* both contain some jitter $\psi_{\text{CORR}}$, (4) would become $E[\psi_A^2 + \psi_{\text{CORR}}^2]$. To minimize any such correlation, the two clock sources should be well isolated from each other through careful layout and separation of their power grids using regulators or separate supply pads. This isolation should ensure that any correlated jitter caused by mutual coupling contributes only a fraction of the total clock jitter. In this work, we assume any correlated jitter is negligible compared to the total jitter being measured.
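Equations (1)–(6) and the effect of correlated clock jitter can be checked with a short simulation. This is a minimal sketch, assuming ideal unity-gain linear PDs and independent zero-mean Gaussian jitter on each source; the RMS values (1, 2 and 3 ps, plus 0.5 ps of correlated jitter) and function names are illustrative assumptions only.

```python
import math
import random

def mean_product(x, y):
    """Time-average of x[n] * y[n], used as an estimate of E[x y]."""
    return sum(a * b for a, b in zip(x, y)) / len(x)

def estimate_rms_jitter(psi_a, psi_b, psi_c):
    """Estimate the RMS jitter of sources A, B, C from pairwise PD outputs."""
    e1 = [a - b for a, b in zip(psi_a, psi_b)]   # (1)
    e2 = [a - c for a, c in zip(psi_a, psi_c)]   # (2)
    e3 = [b - c for b, c in zip(psi_b, psi_c)]   # (3)
    var_a = mean_product(e1, e2)                 # (4)
    var_b = -mean_product(e1, e3)                # (5)
    var_c = mean_product(e2, e3)                 # (6)
    # Clamp at zero so estimation noise cannot produce a negative variance.
    return tuple(math.sqrt(max(v, 0.0)) for v in (var_a, var_b, var_c))

rng = random.Random(1)
n = 200_000
psi_a = [rng.gauss(0.0, 1.0) for _ in range(n)]  # assumed 1 ps RMS
psi_b = [rng.gauss(0.0, 2.0) for _ in range(n)]  # assumed 2 ps RMS
psi_c = [rng.gauss(0.0, 3.0) for _ in range(n)]  # assumed 3 ps RMS
sigma_a, sigma_b, sigma_c = estimate_rms_jitter(psi_a, psi_b, psi_c)
print(f"uncorrelated clocks: {sigma_a:.2f}, {sigma_b:.2f}, {sigma_c:.2f} ps")

# Add the same jitter psi_corr to clocks B and C (assumed 0.5 ps RMS).
psi_corr = [rng.gauss(0.0, 0.5) for _ in range(n)]
psi_b_corr = [b + x for b, x in zip(psi_b, psi_corr)]
psi_c_corr = [c + x for c, x in zip(psi_c, psi_corr)]
sigma_a_biased, _, _ = estimate_rms_jitter(psi_a, psi_b_corr, psi_c_corr)
print(f"with correlated clock jitter: sigma_A estimate = {sigma_a_biased:.2f} ps")
```

With uncorrelated sources the three estimates approach the true RMS values, while the correlated case yields approximately $\sqrt{\sigma_{\psi_A}^2 + \sigma_{\text{CORR}}^2}$ for source A, which is the offset error described above.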

In the remainder of this section, we first discuss how the described method can be extended to measure the jitter's autocorrelation and PSD. We then describe how it can be implemented using bang-bang PDs and applied to multilane CDRs by locking two CDRs to the same data.

#### B. Jitter Autocorrelation and PSD Measurement

The above scheme provides the measured RMS jitter of the signal of interest but does not provide information about its frequency content. To extract spectral information, we estimate the jitter's autocorrelation function. By delaying  $e_1$  by k before correlating it with  $e_2$ , (4) becomes

$$E[e_1(n-k)e_2(n)] = E[\psi_A(n-k)\psi_A(n)] = R_{\psi_A}(n-k,n) \tag{7}$$



Fig. 4. PD autocorrelation measurement with two PDs.

where  $R_{\psi_A}(n-k,n)$  represents the autocorrelation function of the jitter  $\psi_A$ . Here, we have assumed uncorrelated jitter sources as before. If the jitter is wide-sense stationary (WSS), which is true in oscillators [11], then  $R_{\psi_A}(n-k,n)$  is not a function of time *n* and can be replaced by  $R_{\psi_A}(k)$ . The Fourier transform of  $R_{\psi_A}(k)$  gives the PSD [12] of the jitter, providing information about the jitter's frequency content. We approximate  $E[\cdot]$  in (7) by taking the average of  $e_1(n-k)e_2(n)$  over time *n*, for different values of *k*. This method of estimating  $R_{\psi_A}(k)$  is similar to the Blackman-Tukey [13] method for spectral estimation.

If the jitter is not WSS but cyclostationary, which is true for example when jitter is caused by periodic noise from clocked digital circuits [14], the autocorrelation function becomes a periodic function of n. In this case, our averaging approach gives an estimate of the time-averaged autocorrelation [15] with respect to n.

$$R_{\psi_A,\text{average}}(k) = \frac{1}{N} \sum_{n=1}^{N} R_{\psi_A}(n-k,n) \tag{8}$$

As a result of this averaging, the measured autocorrelation preserves the amplitude and frequency of periodic jitter, but not its phase. In the remainder of this paper, we assume jitter is wide-sense stationary. If the jitter is cyclostationary, the results will be subject to this averaging effect.
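The time-averaged estimate of (7) can be sketched as follows. This is a minimal illustration, assuming ideal unity-gain PDs and a hypothetical colored jitter $\psi_A(n) = w(n) + w(n+1)$ built from unit-variance white noise $w$, for which $R_{\psi_A}(0) = 2$, $R_{\psi_A}(\pm 1) = 1$ and $R_{\psi_A}(k) = 0$ otherwise. The function name and jitter model are assumptions made for the example.

```python
import random

def lagged_correlation(e1, e2, k):
    """Time-average of e1[n-k] * e2[n] over all valid n, for lag k >= 0."""
    n_valid = len(e2) - k
    return sum(e1[n - k] * e2[n] for n in range(k, len(e2))) / n_valid

rng = random.Random(7)
n = 100_000

# Assumed jitter model: a two-tap moving average of white noise for source A,
# and independent white jitter for sources B and C.
w = [rng.gauss(0.0, 1.0) for _ in range(n + 1)]
psi_a = [w[i] + w[i + 1] for i in range(n)]
psi_b = [rng.gauss(0.0, 1.0) for _ in range(n)]
psi_c = [rng.gauss(0.0, 1.0) for _ in range(n)]

e1 = [a - b for a, b in zip(psi_a, psi_b)]
e2 = [a - c for a, c in zip(psi_a, psi_c)]

# Sweep the delay k to build up the autocorrelation estimate of psi_A.
r_hat = [lagged_correlation(e1, e2, k) for k in range(4)]
for k, value in enumerate(r_hat):
    print(f"R_hat({k}) = {value:.3f}")
```

Only the jitter common to both PD outputs survives the averaging, so the estimate follows the autocorrelation of $\psi_A$ even though each PD output is dominated by the combined jitter of two sources.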

#### C. Application to Bang-Bang PD

So far, we have ignored the gain of the PDs. The ideal PDs in Fig. 3 are replaced with PDs having gains  $K_{P1}$  and  $K_{P2}$  in Fig. 4. If bang-bang PDs are used, the correlation and filtering can be done on-chip using logic and counters. To measure autocorrelation, the phase offset k can be adjusted using FIFOs. Accounting for the PD gains, and assuming  $\psi_A$  is WSS, (7) becomes

$$E[e_{12}] = K_{P1} K_{P2} R_{\psi_A}(k) \tag{9}$$

To estimate  $\sigma_{\psi_A}$ , we set k to zero giving

$$\sigma_{\psi_A} = \sqrt{E\left[\psi_A^2\right]} = \sqrt{\frac{E[e_{12}]}{K_{P1}K_{P2}}} \tag{10}$$

Fig. 5. PD correlation-based jitter measurement using two CDRs in a multilane configuration.

For a bang-bang PD, however, $K_{P1}$ and $K_{P2}$ depend on the distribution of the relative input jitter ($\psi_A - \psi_B$ for PD1 and $\psi_A - \psi_C$ for PD2) [16]. We estimate the PD gain using an edge-monitor circuit, which consists of an auxiliary edge sampler driven by a clock with a variable phase offset. Comparing the edge-monitor samples to the edge samples from the PD allows the PD output to be measured as a function of the phase offset, without affecting the lock position of the CDR. This allows the cumulative distribution function (CDF) of the PD output to be measured on-chip using counters. The PD gain can then be measured as the slope of this CDF, and the RMS jitter calculated (off-chip) from (10). Because linearizing the PD response is an approximation, the value of $\sigma_{\psi_A}$ as determined by (10) must be divided by a constant that depends on the type of the jitter distribution. MATLAB simulations estimate this constant to be 1.97 for sinusoidal jitter (SJ) and 1.34 for Gaussian jitter.
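The CDF-slope gain estimate can be sketched in software. The following is a simplified model, assuming Gaussian relative jitter with an RMS of 2 ps, an ideal edge-monitor sampler, and a phase step of 0.8 ps; these values and the function names are illustrative assumptions, and the sketch does not attempt to reproduce the distribution-dependent scaling constants quoted above.

```python
import math
import random

def measure_cdf(relative_jitter_ps, phase_offsets_ps):
    """Fraction of edges that arrive before each edge-monitor phase offset."""
    n = len(relative_jitter_ps)
    return [sum(1 for x in relative_jitter_ps if x < phi) / n
            for phi in phase_offsets_ps]

def slope_at_center(phase_offsets_ps, cdf):
    """Central-difference slope at the sweep point where the CDF is nearest 0.5."""
    i = min(range(1, len(cdf) - 1), key=lambda j: abs(cdf[j] - 0.5))
    return (cdf[i + 1] - cdf[i - 1]) / (phase_offsets_ps[i + 1] - phase_offsets_ps[i - 1])

rng = random.Random(3)
sigma_rel_ps = 2.0                       # assumed RMS of the relative jitter
jitter = [rng.gauss(0.0, sigma_rel_ps) for _ in range(50_000)]

# Sweep the edge-monitor phase from -8 ps to +8 ps in 0.8 ps steps.
phases = [-8.0 + 0.8 * i for i in range(21)]
cdf = measure_cdf(jitter, phases)
gain_estimate = slope_at_center(phases, cdf)   # units: 1/ps

# For a Gaussian distribution the CDF slope at the center is 1/(sigma*sqrt(2*pi)).
expected = 1.0 / (sigma_rel_ps * math.sqrt(2.0 * math.pi))
print(f"estimated slope = {gain_estimate:.4f} /ps, Gaussian theory = {expected:.4f} /ps")
```

The finite phase step makes the central difference slightly underestimate the true slope, which is one reason a finer interpolator resolution helps the gain measurement.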

#### D. Complete System for Multilane CDR

In summary, the proposed technique characterizes jitter using the correlation between each pair of PDs. The CDF of each PD output is used to extract the PD gain. Spectral information is obtained by sweeping the delay of one of the PD outputs, to produce the autocorrelation function. This data can be sent off-chip, and its FFT taken, to obtain the jitter's PSD.

This jitter measurement scheme can be applied to CDRs by replacing the signals A, B and C with the input Data and two clocks CK1 and CK2, respectively, having jitters of $\psi_D$, $\psi_{CK1}$ and $\psi_{CK2}$. In this work, CK1 and CK2 are generated by two adjacent analog CDRs both locked to Data. Fig. 5 shows an example system applied in a multilane CDR where an adjacent lane could be taken offline and reconfigured using a MUX, to provide CK2 in a diagnostic mode. To maintain full operation of the link, a redundant diagnostic lane could also be added to the system, amortizing the cost of circuits across many lanes. The PD outputs from each CDR are used for jitter measurement. Adding PD3 allows the jitter of CK1 and CK2 to also be measured. Edge-monitors are added for PD gain measurement. Since PD3 is not part of a CDR loop, its sampling phase is adjustable and can serve as its own edge-monitor. The outputs of all of the PDs can be correlated and analyzed digitally. The CDRs have a filtering effect on the jitter being measured, which is analyzed in the following section.

### IV. ANALYSIS OF JITTER MEASUREMENT WITH TWO CDRS

#### A. Linear Model

To determine the effect of the CDRs on the PD correlation signal, we examine the frequency content of the PD signals using a linear phase model of the CDR as shown in Fig. 6. In this model,  $\psi_{N1}$  and  $\psi_{N2}$  represent all of the jitter contributions from the VCO, charge pump (CP) and loop filter (LF) of CDR1 and CDR2 respectively. The CDR loops filter  $\psi_D$ ,  $\psi_{N1}$  and  $\psi_{N2}$  and produce  $e_1$  and  $e_2$  with corresponding Laplace transforms  $E_1(s)$  and  $E_2(s)$ 

$$E_1(s) = \frac{K_{P1}}{1 + K_{P1}H1(s)} [\Psi_D(s) - \Psi_{N1}(s)] \tag{11}$$

$$E_2(s) = \frac{K_{P2}}{1 + K_{P2}H2(s)} [\Psi_D(s) - \Psi_{N2}(s)] \tag{12}$$

where H1(s) and H2(s) represent the combined transfer functions of the CP, LF and VCO of CDR1 and CDR2 respectively. In CDRs, H1(s) and H2(s) have a low-pass response; therefore, each PD output contains high-pass filtered versions of the corresponding data jitter $\psi_D$ and CDR jitter $\psi_{N1}$ or $\psi_{N2}$. The measurements of data and recovered clock jitter are analyzed separately.
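The high-pass shaping in (11) and (12) can be visualized with a simple numerical model. This is only a sketch: it assumes a generic type-II loop whose loop gain is $K_P H(s) = \omega_c (s + \omega_z)/s^2$, with an illustrative 5 MHz crossover and 0.5 MHz zero. Neither the loop shape nor these frequencies are taken from the fabricated CDR.

```python
import math

def error_transfer(f_hz, f_c_hz=5e6, f_z_hz=0.5e6):
    """Magnitude of 1 / (1 + K_P H(s)) at s = j*2*pi*f for an assumed type-II loop.

    This is the PD output of (11) normalized by K_P, i.e. the factor applied
    to (Psi_D - Psi_N) before it appears at the phase detector output.
    """
    s = 2j * math.pi * f_hz
    w_c = 2 * math.pi * f_c_hz               # assumed loop crossover (rad/s)
    w_z = 2 * math.pi * f_z_hz               # assumed loop-filter zero (rad/s)
    loop_gain = w_c * (s + w_z) / (s * s)    # K_P * H(s), assumed shape
    return abs(1 / (1 + loop_gain))

for f in (1e4, 1e5, 1e6, 1e7, 1e8):
    print(f"{f / 1e6:8.2f} MHz: |1/(1 + K_P H)| = {error_transfer(f):.5f}")

low_band = error_transfer(1e4)    # well inside the loop bandwidth
high_band = error_transfer(1e8)   # well outside the loop bandwidth
```

Jitter well inside the loop bandwidth is tracked by the CDR and is therefore strongly attenuated at the PD output, whereas jitter well above the bandwidth passes through almost unchanged.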

#### B. Analysis of Data Jitter Measurement

We first examine the correlation signal  $e_{12}$ , used to measure data jitter. In the general case, one of the PD signals is delayed by k, as in (7).

$$E[e_{12}] = E[e_1(n-k)e_2(n)] \tag{13}$$

Assuming that the CDR jitter sources  $\psi_{N1}$  and  $\psi_{N2}$  can be modelled as uncorrelated, zero-mean Gaussian random processes, multiplying the outputs of PD1 and PD2 and taking the expected value cancels out the uncorrelated jitter sources, leaving only the data jitter. If  $\psi_D$  is wide-sense stationary then (13) is only a function of k. Consequently, if the PSD of the jitter  $\psi_D$  is  $S_{\psi_D}(f)$ , then using the Wiener-Khinchin theorem [12], it can be shown that the Fourier transform of  $E[e_{12}]$  (taken with respect to time k) can be written as

$$\mathcal{F}\{E[e_{12}]\} = \frac{K_{P1}K_{P2}S_{\psi_D}(f)}{(1+K_{P2}H2(f))(1+K_{P1}H1(f))^*} \tag{14}$$

The Fourier transform of the averaged PD autocorrelation signal is therefore a high-pass filtered version of  $S_{\psi_D}(f)$ , the PSD of  $\psi_D$ . Since in-band jitter is suppressed by the CDR loops, this scheme characterizes the out-of-band data jitter responsible for performance degradation in the CDR. This method is suitable for measuring wideband jitter such as jitter on PRBS data. Next, we consider PD3, which allows the CDR's recovered clock jitter to be estimated.

#### C. Analysis of Clock Jitter Measurement

PD3 measures the phase difference between CK1 and CK2 and when combined with PD1 and PD2, allows us to measure the jitter in CK1 and CK2. The output of PD3 is

$$e_3 = K_{P3}(\psi_{CK1} - \psi_{CK2}) \tag{15}$$

CK1 and CK2 both contain filtered versions of the data jitter so  $\psi_{CK1}$  and  $\psi_{CK2}$  can be written as

$$\psi_{CK1} = (\psi'_{D1} + \psi'_{N1}), \ \psi_{CK2} = (\psi'_{D2} + \psi'_{N2}) \tag{16}$$



Fig. 6. Linear model of PD correlation with two CDRs.

where  $\psi'_{D1}$  and  $\psi'_{N1}$  represent the contributions to the recovered clock jitter of CDR1 from  $\psi_D$  and  $\psi_{N1}$  respectively, and  $\psi'_{D2}$  and  $\psi'_{N2}$ , are the corresponding terms for CDR2.

$$\Psi_{D1}'(s) = \frac{K_{P1}H1(s)\Psi_D(s)}{1+K_{P1}H1(s)}, \ \Psi_{N1}'(s) = \frac{\Psi_{N1}(s)}{1+K_{P1}H1(s)} \tag{17}$$

$$\Psi_{D2}'(s) = \frac{K_{P2}H2(s)\Psi_D(s)}{1+K_{P2}H2(s)}, \ \Psi_{N2}'(s) = \frac{\Psi_{N2}(s)}{1+K_{P2}H2(s)} \tag{18}$$

If the two CDRs are identical, i.e., H1(s) = H2(s), and  $K_{P1} = K_{P2}$ , then the  $\psi_D$  terms cancel out in (15), leaving

$$e_3 = K_{P3}(\psi'_{N1} - \psi'_{N2}) \tag{19}$$

PD3 therefore provides a measure of the filtered CDR jitter  $\psi'_{N1}$  and  $\psi'_{N2}$ . Now correlating PD3 from (19) with the PD1 output given by (11) and making the same assumptions about jitter being uncorrelated gives

$$E[e_{13}] = E[e_1(n-k)e_3(n)] \tag{20}$$

$$= -K_{P1}K_{P3}E[\psi'_{N1}(n-k)\psi'_{N1}(n)] \tag{21}$$
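The step from (20) to (21) can be written out explicitly. The following is a sketch under the assumptions already stated (identical CDRs, and zero-mean $\psi_D$, $\psi_{N1}$ and $\psi_{N2}$ that are mutually uncorrelated), using the time-domain form of the linear model, $e_1 = K_{P1}(\psi_D - \psi_{CK1})$ with $\psi_{CK1}$ from (16), and $e_3$ from (19):

$$\begin{aligned}
E[e_1(n-k)e_3(n)] &= K_{P1}K_{P3}\,E\big[\big(\psi_D(n-k) - \psi'_{D1}(n-k) - \psi'_{N1}(n-k)\big)\big(\psi'_{N1}(n) - \psi'_{N2}(n)\big)\big] \\
&= K_{P1}K_{P3}\Big(E\big[(\psi_D - \psi'_{D1})(n-k)\,\psi'_{N1}(n)\big] - E\big[(\psi_D - \psi'_{D1})(n-k)\,\psi'_{N2}(n)\big] \\
&\qquad - E\big[\psi'_{N1}(n-k)\,\psi'_{N1}(n)\big] + E\big[\psi'_{N1}(n-k)\,\psi'_{N2}(n)\big]\Big) \\
&= -K_{P1}K_{P3}\,E\big[\psi'_{N1}(n-k)\,\psi'_{N1}(n)\big]
\end{aligned}$$

The first two expectations vanish because $\psi_D - \psi'_{D1}$ depends only on the data jitter, which is uncorrelated with the CDR jitter sources, and the last vanishes because $\psi'_{N1}$ and $\psi'_{N2}$ are filtered versions of uncorrelated processes. Only the autocorrelation of $\psi'_{N1}$ remains.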

The correlation signal  $E[e_{13}]$  contains the high-pass filtered CDR jitter  $\psi'_{N1}$ . Assuming that  $\psi_{N1}$  is wide-sense stationary, the Fourier transform of  $E[e_{13}]$  is then

$$\mathcal{F}\{E[e_{13}]\} = \frac{-K_{P1}K_{P3}S_{\psi_{N1}}(f)}{|1 + K_{P1}H1(f)|^2} \tag{22}$$

Correlating PD1 and PD3 therefore allows us to measure the out-of-band portion of  $\psi_{N1}$ , which represents the portion of the recovered clock jitter ( $\psi_{CK1}$ ) contributed by CDR1's circuits. This measurement is decoupled from the data jitter, allowing an assessment of CDR1's intrinsic jitter performance. To minimize the high-pass filtering effect on clock jitter measurement, a low CDR loop bandwidth should be used. Correlating PD3 with PD2 yields the analogous result for  $\psi_{N2}$ .

The proposed approach therefore allows us to estimate the RMS values of the high-pass filtered versions of the data jitter $\psi_D$ and the CDR clock jitter $\psi_{N1}$ without any clean reference clock. Taking the Fourier transform of the correlation signals also gives us the estimated PSD.

Fig. 7. Test chip block diagram.

Fig. 8. CDR1 block diagram.

### V. IMPLEMENTATION

#### A. Test Chip Implementation

As shown in Fig. 7, a test chip was fabricated consisting of a continuous-time linear equalizer (CTLE) driving two 10 Gb/s half-rate CDRs, DMUXes and a digital core. PD3 is added to allow estimation of the CDR's recovered clock jitter. A variable delay block deskews CK2 compared to CK1, ensuring correct operation of PD3.

#### B. CDR Implementation

Fig. 8 shows the CDR1 architecture, consisting of a half-rate bang-bang PD, charge pump, loop filter and a 4-stage ring VCO operating at 5 GHz. The variable-phase edge-monitor clock ($CK\_EDGE$) is generated by a 5-bit CML phase interpolator (PI\_E), which interpolates between two phases of the VCO with a resolution of 25 ps/31 $\cong$ 0.8 ps. Two PI blocks (PI\_I and PI\_Q) with fixed interpolation ratios buffer $CK\_I$ and $CK\_Q$ so that the clock phases are nominally aligned. Fig. 9 shows the two types of PI blocks: one with 5-bit control and the other with both inputs equally weighted.

Differential-to-single-ended (D2S) converters convert the CML clocks to CMOS levels for use in the half-rate PD shown in Fig. 10. The PD outputs two half-rate UP/DN signals with rail-to-rail swing to dual charge pumps that drive the loop filter. This relaxes the design requirements of the charge pump and avoids the high-speed muxes required in [17]. The PD uses sense-amp based latches due to their narrower sampling aperture [18]. Double-tail latches based on those used in [19] and shown in Fig. 11 are used. Compared to the design in [19], an additional NMOS keeper cell is used in the second stage to maintain pull-down current when the first stage outputs go low, and a reset switch is added to reduce hysteresis. As shown in Fig. 10, to maintain timing margin, additional latches resample and align all outputs to a single clock phase before all the outputs are resampled with conventional CMOS flip-flops. The PD outputs are then DMUXed by 8 and sent to the digital core, which is clocked at 625 MHz.

Fig. 9. Phase interpolator with (a) 5-bit resolution and (b) fixed interpolation ratio.

Fig. 10. Half-rate PD.

Fig. 11. High-speed latch. Changes from design in [19] are highlighted.

As shown in Fig. 8, PD3 is a second bang-bang PD whose edge clock phase $CK\_Q\_PD3$ is also driven by CDR1's VCO through a phase interpolator. Note that although only the edge sample is needed for PD3 to detect the phase of CK2, a complete PD was used to ease debugging of the test chip. In CDR2, instead of driving PD3, the VCO drives the variable delay block (see Fig. 7) used to feed CK2 into PD3. All of the additional measurement-related circuits can be disabled by disabling the D2S circuits, thereby gating the clock for all front-end latch circuits.

Fig. 12. Overview of digital core.

#### C. Phase Offset Compensation

Effects such as charge pump mismatch and comparator offset could cause the CDR to lock with residual phase offset with respect to the data. The PD output could therefore have a nonzero mean, introducing error into the PD correlation. Although this can be managed with careful design and offset compensation, in this work, residual offset was compensated by using the edge-monitor samples, rather than the PD edge samples for PD correlation. Phase offset was compensated by digitally adjusting the edge-monitor phase using either the on-chip DLL function (described below) or manually, by examining the CDF of the edge-monitor output. Duty-cycle-distortion (DCD) in the half-rate architecture could also cause even/odd mismatch in the PD. To compensate, RMS jitter was measured separately for even and odd samples, with the edge-monitor phase optimally adjusted in each case. The even and odd results were then averaged.

#### D. Digital Core

The digital core is shown in Fig. 12 and consists of FIFOs, digital PD blocks, a filter mask block, correlation counters and a programmable DLL counter. Instead of sending the PD outputs to the digital core directly, the raw data and edge samples are sent, requiring two bits per UI including the recovered data, instead of three if the PD outputs were sent in addition to the recovered data. FIFO stages allow data to be phase-shifted between PDs to generate the autocorrelation function. The filter mask block can filter the PD data based on even and odd samples, as well as several data patterns that can be used to analyze the effect of intersymbol interference (ISI) on jitter. For example, by measuring only the 010 data pattern, the effect of the first post-cursor ISI on jitter can be suppressed in the measurement.
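The reconstruction of PD outputs from raw data and edge samples, the pattern mask, and the FIFO delay can be sketched in software. This is a hypothetical model: it assumes the standard Alexander decision rule with an arbitrary sign convention (+1 for early, -1 for late), ignores the even/odd split of the half-rate architecture, and applies the mask to the transition that ends the three-bit pattern. The function names are illustrative and do not correspond to blocks in Fig. 12.

```python
def bang_bang_pd(data, edges):
    """Alexander-style decision; edges[n] is sampled between data[n] and data[n+1]."""
    out = []
    for n in range(len(data) - 1):
        if data[n] == data[n + 1]:
            out.append(0)        # no transition, so no phase information
        elif edges[n] == data[n]:
            out.append(+1)       # edge sample taken before the transition: early
        else:
            out.append(-1)       # edge sample taken after the transition: late
    return out

def apply_pattern_mask(data, pd_out, pattern=(0, 1, 0)):
    """Keep a PD decision only when (data[n-1], data[n], data[n+1]) matches pattern."""
    masked = [0] * len(pd_out)
    for n in range(1, len(pd_out)):
        if (data[n - 1], data[n], data[n + 1]) == pattern:
            masked[n] = pd_out[n]
    return masked

def delay(samples, k):
    """FIFO-style delay by k samples, zero-filled at the start."""
    return [0] * k + samples[:len(samples) - k]

# Assumed example streams: seven data bits and the six edge samples between them.
data = [0, 1, 0, 0, 1, 1, 0]
edges = [0, 0, 0, 1, 1, 0]

pd_out = bang_bang_pd(data, edges)
masked = apply_pattern_mask(data, pd_out)
delayed = delay(pd_out, 2)
print(pd_out, masked, delayed)
```

Sending only data and edge samples keeps the interface at two bits per UI, and the same streams can be reprocessed with different masks or delays without changing the analog front end.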

Fig. 13. Die photo and power breakdown.

Fig. 14. (a) Half-rate recovered PRBS7 data eye and (b) clock jitter (pink is jitter spectrum measured by scope).

The filtered PD outputs are then sent to digital correlation counters. In this block, a 17-bit edge counter counts the total number of data transitions, while the correlation counters count how many of the PD outputs are correlated. The ratio of the correlated to the total number of transitions gives the PD correlation. Additional histogram counters count the number of DN transitions, allowing the relative jitter histogram to be measured as the edge-monitor phase is swept. In this chip, one edge counter and six additional counters were used to simultaneously process and correlate data from all three PDs. To reduce power and area, fewer counters could be implemented and reused for different measurements.
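The counter-based correlation can be modeled with a small class. This is a behavioral sketch only: the class name, the saturating behavior of the edge counter, and the mapping from the agreeing fraction $p$ to a signed correlation $2p - 1$ (valid for PD outputs of $\pm 1$) are assumptions made for illustration, not details of the fabricated digital core.

```python
class CorrelationCounter:
    """Counts transitions and agreeing PD decisions for one pair of PDs."""

    def __init__(self, edge_counter_bits=17):
        self.max_edges = (1 << edge_counter_bits) - 1   # counter full-scale value
        self.edges = 0      # total data transitions observed
        self.agree = 0      # transitions where both PDs gave the same decision

    def update(self, pd_a, pd_b):
        """pd_a and pd_b are in {-1, 0, +1}; 0 means no data transition."""
        if pd_a == 0 or pd_b == 0 or self.edges >= self.max_edges:
            return
        self.edges += 1
        if pd_a == pd_b:
            self.agree += 1

    def agree_ratio(self):
        """Fraction of transitions on which the two PD outputs agree."""
        return self.agree / self.edges

    def correlation(self):
        """Signed correlation E[e_a e_b] = 2p - 1, assuming +/-1 PD outputs."""
        return 2.0 * self.agree_ratio() - 1.0

counter = CorrelationCounter()
for pd_a, pd_b in [(1, 1), (1, -1), (-1, -1), (0, 1), (1, 1)]:
    counter.update(pd_a, pd_b)
print(counter.edges, counter.agree, counter.agree_ratio(), counter.correlation())

# A deliberately small counter shows the assumed saturation behavior.
small = CorrelationCounter(edge_counter_bits=2)
for _ in range(10):
    small.update(1, 1)
print(small.edges)
```

Stopping all counts when the edge counter reaches full scale keeps the ratio well defined, and the fixed measurement length sets the averaging window of the correlation estimate.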

The DLL counter with programmable division ratio is reconfigurable to accept the input of any of the PDs. In conjunction with the edge-monitor PI blocks, the counter could be used to lock the edge-monitors to the edge of the data eye. When driving the variable delay block (see Fig. 7), the DLL could also be used to deskew CK2 with respect to CK1.

### VI. MEASUREMENT RESULTS

#### A. CDR Functionality

Fig. 13 shows the die photo of the chip fabricated in Fujitsu's 65 nm CMOS technology. CDR1 consumes 62 mW and occupies 0.084 mm<sup>2</sup> while CDR2 consumes 57 mW due to fewer circuits. The edge-monitor blocks add 11% measured power and 9% area overhead to CDR1. The DMUXes occupy a total of 0.013 mm<sup>2</sup> and consume 7 mW. The total area overhead of all jitter-related analog circuitry, including DMUXes, is approximately 18%. The digital core occupies 0.106 mm<sup>2</sup> and consumes 31 mW.

To demonstrate the CDR's functionality and typical performance, Fig. 14(a) shows a recovered half-rate PRBS7 data eye. The real-time scope is able to pattern-lock to the PRBS7 pattern. Fig. 14(b) shows the jitter histogram of the recovered clock, showing typical RMS jitter of 1.8 ps. Fig. 15 shows the CDR's jitter tolerance for 10 Gb/s PRBS31 data at a bit error rate (BER) of $10^{-12}$. High-frequency jitter tolerance is $0.19\,UI_{PP}$.

Fig. 15. Measured jitter tolerance.

Fig. 16. Test setup.

Fig. 17. Measured CDF of PD1's relative jitter $\psi_D - \psi_{CK1}$ with no RJ or SJ added.

#### B. Jitter Measurement Test Setup

The test setup is shown in Fig. 16. To validate the proposed concept, PRBS31 data from a Centellax TG1B1-A BERT was used to drive the CDRs. The BERT was clocked by a TG1C1-A clock synthesizer with internal SJ injection. Random jitter (RJ) was applied by driving the synthesizer's external modulation input with a NoiseCOM noise generator. The bandwidth of this input was 20 MHz to 100 MHz, allowing RJ in this frequency range to be injected.

#### C. PD Gain Measurement

The PD gain is measured as part of each jitter measurement. Fig. 17 is an example of a CDF of PD1's output, measured by sweeping the edge-monitor phase. As described in Section III-C, the PD gain is calculated from the slope of the CDF in its linear region and combined with the PD correlation in calculating RMS jitter.



Fig. 18. Measured RMS data jitter with 20-100 MHz injected RJ.



Fig. 19. Measured RMS data jitter with SJ injected at 100 MHz.

#### D. Data Jitter Measurement Results

Fig. 18 shows the measured RMS data jitter as RJ is injected into the data. The plot compares the RMS jitter as estimated by on-chip measurement against the jitter measured by an Agilent DSAX91604A 80 GS/s (16 GHz bandwidth) real-time oscilloscope, which has a 150 fs jitter measurement noise floor. Fig. 19 shows measurement results when SJ is injected into the data at 100 MHz. In both cases, the estimated jitter differs from the real-time scope's measurement by no more than 0.6 ps over the entire range of injected jitter amplitudes. Using this approach, jitter levels well below the CDR's recovered clock jitter of 1.8 ps RMS can be estimated. The results shown in Figs. 18 and 19 differ slightly from those reported in [7], as we previously used a scaling factor of 2 for both the RJ and SJ cases. These scaling factors are now updated to 1.34 for RJ and 1.97 for SJ, as discussed in Section III-C.

Some of the discrepancy between the estimated and scope-measured results is likely attributable to coupling of the data jitter, possibly through the power supplies of the test chip. When injecting SJ into the data, measurements showed that the injected SJ was coupling to the CDR output clocks, causing spurs in their spectra. Since the coupling affected both CDR outputs, it would cause correlated jitter between the two CDR outputs, leading to an offset in the estimated jitter as described in Section III-A.



Fig. 20. (a) Estimated autocorrelation of data jitter, $R_{\psi_D}(k)$, without jitter injection. (b) Even/odd samples of $R_{\psi_D}(k)$ (1 UI = 100 ps).



Fig. 21. (a) Estimated autocorrelation of data jitter, $R_{\psi_D}(k)$, with 0.05 UI<sub>PP</sub> SJ at 100 MHz. (b) Even/odd samples of $R_{\psi_D}(k)$ (1 UI = 100 ps).

The data jitter's PSD is estimated from the FFT of the measured PD autocorrelation. Fig. 20(a) shows the measured data jitter autocorrelation $R_{\psi_D}(k)$ with no additional jitter added to the data. DCD in the half-rate CDR causes variation between the even and odd values of $R_{\psi_D}(k)$. Fig. 20(b) plots the even and odd samples of $R_{\psi_D}(k)$ separately. In this figure, the delta function centered at k = 0 indicates that the data's random jitter is nearly white, since the autocorrelation function of white noise is a delta function. Fig. 21 shows the measured data jitter autocorrelation when 0.05 UI<sub>PP</sub> SJ is injected at 100 MHz, corresponding to a period of 100 UI. The SJ at 100 MHz is clearly visible in the autocorrelation function as a sinusoid. Fig. 22 compares the FFT of $R_{\psi_D}(k)$ to the jitter spectrum measured by the scope with SJ injected. Both show large spurs at 100 MHz, demonstrating that individual SJ components can be identified with this approach.
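To make the relationship between the autocorrelation and the spectrum concrete, the following sketch builds an idealized autocorrelation (a delta at k = 0 for white RJ plus a cosine for a single SJ tone) and takes its DFT. It is a model under stated assumptions, not the on-chip processing: the lag spacing of 1 UI = 100 ps is taken from the figure captions, while the 1 ps RMS RJ level, the 200-lag record length, the one-sided record starting at k = 0, and all function names are illustrative choices.

```python
import cmath
import math

UI_PS = 100.0          # 1 UI = 100 ps at 10 Gb/s
FS_HZ = 1e12 / UI_PS   # one autocorrelation lag per UI -> 10 GHz lag rate


def sj_rms_ps(sj_ui_pp):
    """RMS value in ps of a sinusoid with the given peak-to-peak amplitude in UI."""
    amplitude_ps = 0.5 * sj_ui_pp * UI_PS
    return amplitude_ps / math.sqrt(2.0)


def model_autocorrelation(n_lags, rj_rms_ps, sj_ui_pp, sj_freq_hz):
    """Idealized R(k) in ps^2: white RJ (delta at k = 0) plus one SJ tone."""
    sj_var = sj_rms_ps(sj_ui_pp) ** 2
    r = []
    for k in range(n_lags):
        tone = sj_var * math.cos(2.0 * math.pi * sj_freq_hz * k / FS_HZ)
        delta = rj_rms_ps ** 2 if k == 0 else 0.0
        r.append(delta + tone)
    return r


def psd_from_autocorrelation(r):
    """DFT magnitude of R(k); bin m corresponds to m * FS_HZ / len(r)."""
    n = len(r)
    return [
        abs(sum(r[k] * cmath.exp(-2j * math.pi * m * k / n) for k in range(n)))
        for m in range(n)
    ]


if __name__ == "__main__":
    r = model_autocorrelation(200, rj_rms_ps=1.0, sj_ui_pp=0.05, sj_freq_hz=100e6)
    psd = psd_from_autocorrelation(r)
    peak_bin = max(range(1, len(psd) // 2), key=lambda m: psd[m])
    print(f"SJ RMS = {sj_rms_ps(0.05):.3f} ps")
    print(f"spur at {peak_bin * FS_HZ / len(r) / 1e6:.0f} MHz")
```

In this model, the white RJ contributes a flat floor across all bins, while the 100 UI period of the cosine places the spur at 100 MHz. The value R(0) equals the sum of the RJ and SJ variances, which is why the RMS jitter can be read from the zero-lag term.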



Fig. 22. PSD of data jitter with SJ at 100 MHz as measured by (a) scope and (b) on-chip measurement using FFT of $R_{\psi_D}(k)$.



Fig. 23. Measured RMS clock jitter with SJ injected into VCO at 47 MHz.

## E. CDR Clock Jitter Measurement Results

Correlating the outputs of PD1 with PD3 allows estimation of the CDR's output jitter. As shown in (22), the correlation signal is a high-pass filtered version of the CDR clock's output jitter. Unlike the data jitter, which has a high bandwidth due to modulation by the random data pattern, the jitter of the CDR clock has a lower bandwidth. The high-pass characteristic of the correlation signal attenuates the low-frequency content of this jitter, so only the high-frequency portion of the CDR clock's output jitter is measured. Despite this, the measurement is useful for diagnostics as it can still reveal changes in the CDR's jitter performance. To test this, SJ was injected at 47 MHz (close to the CDR's jitter transfer corner frequency) into the CDR's VCO by coupling an external clock source into the VCO's bias control circuit. As shown in Fig. 23, the jitter estimated from PD correlation closely tracks the CDR's output jitter as measured by the scope but has an offset of about 0.8 ps due to the high-pass filtering effect of the CDR.

Fig. 24 shows the estimated clock jitter autocorrelation and PSD with and without jitter injected at 1 GHz. First, unlike the data jitter autocorrelation, instead of a delta function, a much wider pulse is centered at k = 0. This indicates that the clock jitter has a limited bandwidth with a time constant related to the spread of the pulse in UI. Second, once injected, the 1 GHz SJ is visible in Fig. 24(a), superimposed on the original autocorrelation function. Fig. 25 compares the estimated jitter PSD to the jitter spectrum measured by the scope. Despite the high-pass filtering effect, the plotted PSD not only shows the injected 1 GHz spur, but also the low-pass nature of the clock's jitter spectrum and the CDR's loop bandwidth.

Fig. 24. Estimated CDR clock jitter autocorrelation $R_{\psi_{CK1}}(k)$ (a) with and (b) without SJ injected at 1 GHz (1 UI = 100 ps).

Fig. 25. PSD of CDR clock jitter with SJ at 1 GHz as measured by (a) scope and (b) on-chip measurement using FFT of $R_{\psi_{CK1}}(k)$.
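The link between the width of the autocorrelation pulse and the jitter bandwidth can be sketched under a simplifying assumption. If the clock jitter is modeled as white noise shaped by a first-order low-pass response, its autocorrelation decays as $e^{-|k|/\tau}$, with $\tau$ in UI, and the corner frequency is $1/(2\pi\tau T_{UI})$. The first-order model, the 1/e width criterion, and the function names below are assumptions made for illustration. The text above states only that the time constant is related to the pulse spread, and the high-pass effect of the measurement is ignored here.

```python
import math

UI_S = 100e-12  # 1 UI = 100 ps


def time_constant_ui(r):
    """Lag (linearly interpolated, in UI) at which R(k) first falls to R(0)/e."""
    if r[0] <= 0.0:
        raise ValueError("R(0) must be positive")
    target = r[0] / math.e
    for k in range(1, len(r)):
        if r[k] <= target:
            # R(k-1) is still above the target, so the denominator is positive.
            frac = (r[k - 1] - target) / (r[k - 1] - r[k])
            return (k - 1) + frac
    raise ValueError("autocorrelation never decays to R(0)/e")


def corner_frequency_hz(tau_ui):
    """Corner frequency of the assumed first-order low-pass model."""
    return 1.0 / (2.0 * math.pi * tau_ui * UI_S)


if __name__ == "__main__":
    # Hypothetical autocorrelation with a 30 UI time constant.
    r = [math.exp(-k / 30.0) for k in range(200)]
    tau = time_constant_ui(r)
    print(f"tau = {tau:.2f} UI, corner = {corner_frequency_hz(tau) / 1e6:.1f} MHz")
```

A wider pulse gives a larger time constant and therefore a lower corner frequency, which is consistent with the clock jitter having a narrower bandwidth than the nearly white data jitter.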

Table II compares this work to previous works on jitter measurement.

## VII. CONCLUSION

We have proposed and implemented a jitter measurement scheme using PD correlation. By correlating the PD outputs from two CDRs locked to the same data, the RMS clock and data jitter can be measured with sub-picosecond accuracy. Compared to prior techniques, this approach achieves comparable accuracy at the highest data rate, is applicable to both clock and data jitter measurement, and does not rely on any clean external clock source. Using autocorrelation, the jitter's PSD can also be estimated. This approach is applicable to multilane CDRs where CDRs could be reconfigured in a diagnostic mode and allows for monitoring and optimization of the CDR's jitter performance.

TABLE II

COMPARISON TO PREVIOUS JITTER MEASUREMENT CIRCUITS

| Reference | Input signal | Ext. ref. clock | Output signal | Frequency / data rate | Measurement error     |
|-----------|--------------|-----------------|---------------|-----------------------|-----------------------|
| [2]       | Clock        | Yes             | Digital       | 250 MHz               | < 1 ps                |
| [3]       | Clock        | No              | Digital       | 3.36 GHz              | 350–700 fs*           |
| [4]       | Clock        | Yes             | Digital       | 1 GHz                 | 101 fs*               |
| [5]       | PRBS         | Yes             | Analog        | 2.5 Gb/s              | 1.56 ps<sup>+</sup>   |
| This Work | PRBS         | No              | Digital       | 10 Gb/s               | ≤ 0.6 ps              |
| This Work | Clock        | No              | Digital       | 5 GHz                 | ≤ 1 ps                |

\*Jitter only reported for one case

<sup>+</sup> Value is standard deviation of measurement error

### ACKNOWLEDGMENT

The authors would like to thank CMC Microsystems for providing test equipment and CAD tools and NSERC for partial funding.

#### REFERENCES

- [1] M. Ishida *et al.*, "A programmable on-chip picosecond jitter-measurement circuit without a reference-clock input," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, 2005, pp. 512–614.
- [2] K. Nose, M. Kajita, and M. Mizuno, "A 1-ps resolution jitter-measurement macro using interpolated jitter oversampling," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2911–2920, Dec. 2006.
- [3] K. Niitsu, M. Sakurai, N. Harigai, T. Yamaguchi, and H. Kobayashi, "CMOS circuits to measure timing jitter using a self-referenced clock and a cascaded time difference amplifier with duty-cycle compensation," *IEEE J. Solid-State Circuits*, vol. 47, no. 11, pp. 2701–2710, Nov. 2012.
- [4] T. Hashimoto, H. Yamazaki, A. Muramatsu, T. Sato, and A. Inoue, "Time-to-digital converter with vernier delay mismatch compensation for high resolution on-die clock jitter measurement," in *IEEE Symp. VLSI Circuits*, 2008, pp. 166–167.
- [5] M. Ishida, K. Ichiyama, T. Yamaguchi, M. Soma, M. Suda, and T. Okayasu, "On-chip circuit for measuring data jitter in the time or frequency domain," in *IEEE Radio Frequency Integrated Circuits Symp.*, 2007, pp. 347–350.
- [6] J. Schaub, F. Gebara, T. Nguyen, I. Vo, J. Pena, and D. Acharyya, "On-chip jitter and oscilloscope circuits using an asynchronous sample clock," in *European Solid-State Circuits Conf.*, 2008, pp. 126–129.
- [7] J. Liang, M. S. Jalali, A. Sheikholeslami, M. Kibune, and H. Tamura, "On-chip measurement of data jitter with sub-picosecond accuracy for 10 Gb/s multilane CDRs," in *IEEE Symp. VLSI Circuits*, 2014, pp. C114–C115.
- [8] D. Lee, "Analysis of jitter in phase-locked loops," *IEEE Trans. Circuits Syst. II: Analog Digit. Signal Process.*, vol. 49, no. 11, pp. 704–711, Nov. 2002.
- [9] B. Analui, A. Rylyakov, S. Rylov, M. Meghelli, and A. Hajimiri, "A 10-Gb/s two-dimensional eye-opening monitor in 0.13-μm standard CMOS," *IEEE J. Solid-State Circuits*, vol. 40, no. 12, pp. 2689–2699, Dec. 2005.
- [10] K. Jenkins, K. Shepard, and Z. Xu, "On-chip circuit for measuring period jitter and skew of clock distribution networks," in *IEEE Custom Integrated Circuits Conf.*, 2007, pp. 157–160.
- [11] A. Demir, A. Mehrotra, and J. Roychowdhury, "Phase noise in oscillators: A unifying theory and numerical methods for characterization," *IEEE Trans. Circuits Syst. I: Fund. Theory Applicat.*, vol. 47, no. 5, pp. 655–674, May 2000.

- [12] A. Leon-Garcia, *Probability, Statistics, and Random Processes for Electrical Engineering*, 3rd ed. Upper Saddle River, NJ, USA: Pearson Prentice Hall, 2008.
- [13] P. Stoica and R. L. Moses, *Introduction to Spectral Analysis*. Upper Saddle River, NJ, USA: Prentice Hall, 1997.
- [14] E. Alon, V. Stojanovic, and M. Horowitz, "Circuits and techniques for high-resolution measurement of on-chip power supply noise," *IEEE J. Solid-State Circuits*, vol. 40, no. 4, pp. 820–828, Apr. 2005.
- [15] A. Napolitano, *Generalizations of Cyclostationary Signal Processing*. Chichester, West Sussex, U.K.: Wiley, 2012.
- [16] B.-J. Lee, M.-S. Hwang, S.-H. Lee, and D.-K. Jeong, "A 2.5–10-Gb/s CMOS transceiver with alternating edge-sampling phase detection for loop characteristic stabilization," *IEEE J. Solid-State Circuits*, vol. 38, no. 11, pp. 1821–1829, Nov. 2003.
- [17] A. Rezayee and K. Martin, "A 9–16 Gb/s clock and data recovery circuit with three-state phase detector and dual-path loop architecture," in *European Solid-State Circuits Conf.*, 2003, pp. 683–686.
- [18] T. Toifl et al., "A 22-Gb/s PAM-4 receiver in 90-nm CMOS SOI technology," *IEEE J. Solid-State Circuits*, vol. 41, no. 4, pp. 954–965, Apr. 2006.
- [19] G. Ono et al., "A 10:4 MUX and 4:10 DEMUX gearbox LSI for 100-Gigabit ethernet link," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 3101–3112, Dec. 2011.



**Joshua Liang** received the B.A.Sc. degree in engineering science and the M.A.Sc. degree in electrical engineering from the University of Toronto, Canada, in 2007 and 2009, respectively. From 2009 to 2011 he was an Analog Designer with Zarlink Semiconductor (now Microsemi) where he worked on circuits for low-jitter clock synthesis. Since 2012 he has been working toward the Ph.D. degree in electrical engineering at the University of Toronto in the area of circuit design for high-speed wireline and optical communications.



**Mohammad Sadegh Jalali** (S'13) received the Bachelor degree (with honors) in electrical engineering from the University of Tehran, Iran, the Master degree from the University of British Columbia, Canada, and the Ph.D. degree from the University of Toronto, Canada, in 2008, 2010, and 2014, respectively.

In 2014 he joined Semtech-Snowbush IP, and has been engaged in the development of multistandard SerDes IP.



**Ali Sheikholeslami** (S'98–M'99–SM'02) received the B.Sc. degree from Shiraz University, Iran, in 1990 and the M.A.Sc. and Ph.D. degrees from the University of Toronto, Canada, in 1994 and 1999, respectively, all in electrical engineering.

In 1999, he joined the Department of Electrical and Computer Engineering at the University of Toronto where he is currently a Professor. He was on research sabbatical with Fujitsu Labs in 2005–2006, and with Analog Devices in 2012–2013. His research interests are in analog and digital integrated circuits, high-speed signaling, and VLSI memory design. He has coauthored over 50 journal and conference articles and 8 patents.

He served on the Memory, Technology Directions, and Wireline Subcommittees of the ISSCC in 2001–2004, 2002–2005, and 2007–2013, respectively. He is currently an Associate Editor for the Solid-State Circuits Magazine and the Educational Events Chair for ISSCC. He was an Associate Editor for the IEEE TCAS-I for 2010–2012, and the program chair for the 2004 IEEE ISMVL. He is a registered professional engineer in Ontario, Canada.

Dr. Sheikholeslami has received numerous teaching awards including the 2005–2006 Early Career Teaching Award and the 2010 Faculty Teaching Award, both from the Faculty of Applied Science and Engineering at the University of Toronto.



**Masaya Kibune** was born in Kanagawa, Japan, in 1973. He received the B.S. and M.S. degrees in applied physics from Tokyo University, Japan, in 1996 and 1998, respectively.

In 1998, he joined Fujitsu Laboratories, Ltd., Kanagawa, Japan. He has been engaged in research and design of CMOS high-speed IO.

Mr. Kibune was a TPC member of ASSCC from 2012 to 2013.



**Hirotaka Tamura** (M'02–SM'10–F'13) received the B.S., M.S., and Ph.D. degrees in electronic engineering from Tokyo University, Tokyo, Japan, in 1977, 1979, and 1982, respectively.

He joined Fujitsu Laboratories in 1982. After being involved in the development of different exploratory devices such as Josephson junction devices and high-temperature superconductor devices, he moved into the field of CMOS high-speed signaling in 1996. His first contribution to this area was the design of a receiver front-end for DRAM-to-processor communications. He then became involved in the development of a multichannel high-speed I/O for server interconnects. Since then he has been working in the area of architecture- and transistor-level design for CMOS high-speed signaling circuits.