# Time Interleaved C-2C SAR ADC with Background Timing Skew Calibration in 65nm CMOS

Luke Wang, Qiwei Wang, and Anthony Chan Carusone

Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, CA E-mail: luke.wang@isl.utoronto.ca, jeffrey.wang@isl.utoronto.ca, tony.chan.carusone@isl.utoronto.ca

*Abstract*—This paper presents a 5GS/s 8-bit 40-way time-interleaved SAR ADC fabricated in 65nm CMOS. Two-level hierarchical interleaving is employed, resulting in 4 sub-ADCs each operating at 1.25GS/s at the topmost level with front-end track and hold samplers. The sub-ADCs use capacitive C-2C DACs to minimize the input capacitance and area. A novel background timing skew calibration method is used which requires no redundant signal paths. After calibration, the ADC achieves an SNDR of 33.3dB at Nyquist and consumes 138.6mW from a 1V supply with a 5GS/s sampling rate, yielding an FOM of 738fJ/conv-step. The individual sub-ADC achieves an SNDR of 37.9dB at Nyquist and consumes 34.2mW, yielding an FOM of 428fJ/conv-step.

# I. INTRODUCTION

Both wireless and wireline receivers may benefit from the improved testability and technology scaling offered by a front-end analog-to-digital converter (ADC) followed by digital signal processing (DSP) blocks configurable for multiple communication standards. For multi-Gb/s data rates, time-interleaved (TI) ADC designs offer the potential for excellent power efficiency [1]. For massively time-interleaved ADCs, however, the input capacitance may become a problem, which in general limits the number of channels *N*. In this work, instead of using custom-engineered (<1fF) unit capacitors [2], a C-2C DAC architecture is adopted for the individual time-interleaved successive approximation register (SAR) ADCs to keep the input capacitance reasonable when interleaving. Digital correction is used to mitigate the nonlinearity that results from parasitics in the C-2C ladder [3].

Mismatch between sub-ADCs, including gain, offset, bandwidth, and timing skew mismatch, introduces time-varying errors, with the last two impairments dominating when the input frequency is high at GHz sampling rates. In this work, bootstrapped CMOS sample-and-hold circuits provide 4.8GHz analog bandwidth, mitigating the impact of bandwidth mismatch. Previous work on timing skew mismatch correction either adds reference ADC channel(s) [2][4] or suffers from slow convergence [5]. In this work, a fast background least-mean-square (LMS) calibration technique is proposed which corrects the timing skew mismatch using only the outputs of the existing ADC channels.

# II. SUB-ADC DESIGN

# A. Architecture and Implementation

Fig. 1: 1.25GS/s 8-bit C-2C SAR sub-ADC block diagram

The time-interleaved successive approximation register (SAR) ADC remains the most power efficient architecture for high speed (>5GS/s) and high resolution (>6-bit) applications [6], since its power consumption is predominantly due to digital circuitry, which scales well with process technology. In addition, the unit capacitor size in the charge-redistribution DACs also scales with process technology, reducing circuit area and clocking power. The area and input capacitance of massively TI SAR ADCs can become prohibitive if each sub-ADC uses a binary-weighted capacitive DAC. To address this, past work has made use of tiny (<1fF) unit capacitors [2], which require detailed electromagnetic analysis and/or prototype characterization. By contrast, in this work, C = 40fF unit capacitors included in the standard design kit are used throughout, and a C-2C architecture is used to limit the size of the DAC and hence the input capacitance. The same total capacitance, 2C, is presented to the input sampling switch during each step of the conversion. The drawback of C-2C DACs is nonlinearity (a change in radix) due to parasitic capacitances; however, this may be alleviated by calibration [3].
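The effect of a radix change on the output code can be illustrated with a small sketch. It assumes a single uniform radix for all ladder stages and a hypothetical value of 1.9, neither of which is taken from the paper; the function names are likewise illustrative. The point is only that, once the effective bit weights are known, the digital output can be rebuilt as a weighted sum of the raw bits rather than as a plain binary number.

```python
from typing import List, Sequence


def bit_weights(radix: float, n_bits: int = 8) -> List[float]:
    """Return MSB-first bit weights for a uniform radix, normalized to sum to 1.

    A radix of exactly 2 gives ordinary binary weights; a C-2C ladder with
    parasitics is modeled here (as an assumption) by a radix slightly below 2.
    """
    raw = [radix ** i for i in range(n_bits - 1, -1, -1)]
    total = sum(raw)
    return [w / total for w in raw]


def reconstruct(bits: Sequence[int], weights: Sequence[float]) -> float:
    """Rebuild the normalized output value from raw SAR bits (MSB first)."""
    return sum(b * w for b, w in zip(bits, weights))


ideal = bit_weights(2.0)
ladder = bit_weights(1.9)  # hypothetical radix, for illustration only

msb_only = [1, 0, 0, 0, 0, 0, 0, 0]
rest_only = [0, 1, 1, 1, 1, 1, 1, 1]

# With binary weights the MSB outweighs all lower bits combined; with a
# radix below 2 it does not, so reading the raw bits as a binary number
# would misplace the major code transition.
print(reconstruct(msb_only, ideal) > reconstruct(rest_only, ideal))
print(reconstruct(msb_only, ladder) > reconstruct(rest_only, ladder))
```

In practice the weights would come from a calibration measurement rather than from an assumed radix.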

Fig. 2: 8-bit 1.25GS/s sub-ADC performance up to 4GHz

Fig. 3: Die photo with sub-ADC components indicated

Each sub-ADC was designed to operate at 1.25GS/s for an aggregate 5GS/s 8-bit TI ADC. In order to relax timing constraints, the 8-bit sub-ADC consists of 10 elementary synchronous SAR ADCs whose outputs are combined with a 10-to-1 multiplexer as shown in Fig. 1. Each sub-ADC has a front-end sampler which consists of a pair of bootstrapped switches and unity gain buffers implemented with PMOS source followers. The sampling capacitor has a value of 100fF. Each elementary synchronous SAR ADC consists of a pair of bootstrapped input switches, comparators, C-2C DACs and a digital controller. The secondary switches in the elementary SAR ADCs do not introduce additional timing skew error because the sampling instant is completely defined by the front-end sampler. The front-end sampler including interconnect has a simulated bandwidth of 4.8GHz after RCC extraction, sufficient for a Nyquist signal at 2.5GHz, corresponding to the aggregate sampling rate. The comparator is made of a pre-amplifier with a gain of 10dB and bandwidth of 3.6GHz and a double-tail latch [7]. Each elementary SAR ADC takes 10 cycles to complete an 8-bit conversion: 1 cycle/bit, 1 cycle for sampling and 1 cycle for latching the result. Since there are 10 elementary SAR ADCs in each sub-ADC, a single 1.25GHz differential clock signal is sufficient. The input range is $1V_{pp}$ differential, $V_{refn} = 400$mV, $V_{cm} = 650$mV, and $V_{refp} = 900$mV.
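The conversion timing described above reduces to a few lines of arithmetic. The sketch below only restates numbers from the text (8 bits, one sampling cycle, one latching cycle, a 1.25GHz clock, a 1V peak-to-peak differential range); the variable names are illustrative, and the LSB size assumes an ideal 8-bit quantizer spanning the full input range.

```python
N_BITS = 8
F_CLK_HZ = 1.25e9            # sub-ADC clock frequency
INPUT_RANGE_VPP = 1.0        # differential input range

# One cycle per bit, plus one for sampling and one for latching the result.
cycles_per_conversion = N_BITS + 1 + 1

# Each elementary SAR ADC delivers one result every cycles_per_conversion cycles.
elementary_rate_hz = F_CLK_HZ / cycles_per_conversion

# Number of elementary ADCs needed so that one new sample is taken every clock cycle.
elementary_adcs_needed = cycles_per_conversion

# Ideal LSB size for the stated input range (an assumption of ideal quantization).
lsb_mv = INPUT_RANGE_VPP / (2 ** N_BITS) * 1e3

print(cycles_per_conversion, elementary_rate_hz, elementary_adcs_needed, lsb_mv)
```

This is why ten elementary converters running from a single 1.25GHz clock suffice for a 1.25GS/s sub-ADC, with each one operating at 125MS/s.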

### B. Measurement Results

The ADC was implemented in 65nm CMOS technology. The individual 1.25GS/s sub-ADC consumes 34.2mW from a 1V supply. Even though the sub-ADC does not experience errors due to timing skew, the 10 elementary SAR ADCs can still experience gain, offset and radix mismatch. The DNL is 1.9/-1.0 LSB and the INL is 2.2/-1.7 LSB after performing off-chip calibration as in [3] to remove these errors. The sub-ADC achieves an SNDR of 39.4dB at low frequency and 37.9dB at Nyquist frequency. A plot of SNDR/SFDR up to an input frequency of 4GHz is shown in Fig. 2. The FoM is 360fJ/conv-step and 428fJ/conv-step at low and Nyquist frequency respectively. The die photo is shown in Fig. 3 with the sub-ADC components indicated. It occupies an area of 400µm × 600µm. A performance summary and comparison with other C-2C DAC based SAR ADCs is shown in Table I for the 1.25GS/s sub-ADC. This work achieves a good SNDR while maintaining a comparable FoM.
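The quoted figures of merit can be cross-checked with a short calculation. The sketch assumes the conventional Walden definition, FoM = P / (2^ENOB · f_s) with ENOB = (SNDR − 1.76) / 6.02; the paper does not spell out its formula, so this is an assumption, and the function names are illustrative.

```python
def enob(sndr_db: float) -> float:
    """Effective number of bits from SNDR, using the usual 6.02*N + 1.76 dB rule."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_fj(power_w: float, sndr_db: float, f_s_hz: float) -> float:
    """Walden figure of merit in fJ per conversion step (assumed definition)."""
    return power_w / (2.0 ** enob(sndr_db) * f_s_hz) * 1e15


# Sub-ADC: 34.2mW, 1.25GS/s, SNDR of 39.4dB (low frequency) and 37.9dB (Nyquist).
print(round(walden_fom_fj(34.2e-3, 39.4, 1.25e9)))
print(round(walden_fom_fj(34.2e-3, 37.9, 1.25e9)))
```

Under this assumption the results land within a few fJ/conv-step of the 360 and 428 values reported above; the small residual is consistent with rounding of the published SNDR.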

# III. TIMING SKEW OVERVIEW

## A. Impact of Timing Skew

TABLE I. PERFORMANCE SUMMARY C-2C SAR ADCs > 500MS/s

| Ref.      | Res. [bits] | Sample Rate [GS/s] | Power [mW] | SNDR [dB] | FOM [fJ/conv-step] |
|-----------|-------------|--------------------|------------|-----------|--------------------|
| [3]       | 6           | 0.6                | 5.3        | 34        | 220                |
| [8]       | 6           | 1                  | 6.27       | 31.5      | 210                |
| [9]       | 7           | 2.5                | 50         | 34        | 480                |
| This Work | 8           | 1.25               | 34.2       | 37.9      | 428                |

If timing skew exists, for instance caused by variations in the layout of the clock paths to the sub-ADCs, then spurious tones appear at frequencies $\pm f_{in} + (k/N)f_s$, where k ranges from 1 to N-1. The magnitude of the tones increases as the input frequency increases (to first order, the error is proportional to the signal derivative), leading to significant SFDR/SNDR degradation at high frequency for broadband TI ADCs. It is possible to derive an analytical bound on skew in order to achieve a certain resolution at a specific data rate. The bound for an *N*-channel ADC achieving *B*-bit resolution at input frequency *f* is given in [4] as

$$\sigma_{\Delta t}^{2} \leq \left(\frac{N}{N-1}\right) \left(\frac{2}{3(2^{2B})(2\pi f)^{2}}\right) \quad (1)$$

Note that for medium to high resolution (>6-bit) and high speed (>2.5GHz) ADCs, a sub-picosecond skew standard deviation is required. For instance, for an ADC resolution of 6 bits, the standard deviation must be less than 0.9ps for a 2.5GHz input signal. In general it is impossible to meet this bound by careful design alone; therefore, calibration is employed.
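Equation (1) is easy to evaluate numerically. The sketch below is a direct transcription of the bound; the choice of N = 4 is an assumption made for illustration (the example in the text does not state the channel count), and the function name is hypothetical.

```python
import math


def max_skew_sigma_s(n_channels: int, n_bits: int, f_in_hz: float) -> float:
    """Upper bound on the skew standard deviation from equation (1), in seconds."""
    channel_factor = n_channels / (n_channels - 1)
    variance = channel_factor * 2.0 / (
        3.0 * 2.0 ** (2 * n_bits) * (2.0 * math.pi * f_in_hz) ** 2
    )
    return math.sqrt(variance)


# 6-bit resolution at a 2.5GHz input, assuming 4 interleaved channels.
sigma_6b = max_skew_sigma_s(4, 6, 2.5e9)
# The same converter at 8-bit resolution needs a 4x tighter skew.
sigma_8b = max_skew_sigma_s(4, 8, 2.5e9)

print(f"{sigma_6b * 1e12:.2f} ps, {sigma_8b * 1e12:.2f} ps")
```

With these assumptions the 6-bit bound comes out slightly below 1ps, in line with the sub-picosecond requirement stated above, and each additional bit of resolution halves the allowed skew.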

## B. Mitigation of Timing Skew

In general, background calibration is preferred, since the ADC can sustain normal operation during the calibration phase. Mixed-signal calibration methods are also popular, where detection of the skew is performed digitally by observing the ADC outputs, and correction is applied via an analog control, such as tuning the delay of buffers in the clock path. This avoids the use of long FIR filters for interpolation of small timing skews in the digital domain. The detection generally works to minimize or maximize an appropriate cost function. For instance, the work in [4] maximizes the cross-correlation between each of the sub-ADC outputs, in turn, and an additional subsampled reference ADC channel. A similar approach was used in [2], except that two additional reference channels were used to generate an estimate of the derivative. The input derivative and the error in each sub-ADC output with respect to the reference channel can together be used to infer the timing skews. In [5] the sub-ADC clock skews are adjusted until the probability of a zero crossing between each pair of neighbouring samplers is equal. This technique did not need an additional ADC channel, but convergence was slow, requiring 2<sup>24</sup> samples to settle. This work proposes a fast background approach with no additional reference channels.

# IV. PROPOSED BACKGROUND CALIBRATION ALGORITHM

Fig. 4: (a) Sampling sequence for a 2 channel TI ADC (b) Block diagram of proposed calibration method

Consider a converter with 2 time-interleaved channels and samples shown in Fig. 4a. The clock phases are $\phi_0$ and $\phi_1$, and $\phi_1$ is skewed by a value of $\Delta t$. Define two error terms: $e_1$, which is the difference in amplitude between the current sample and the previous sample, $e_1=y_1[k-1]-y_0[k-1]$, and $e_2$, which is the difference in amplitude between the current sample and the next sample, $e_2=y_1[k-1]-y_0[k]$. When the skew is zero, the means of $|e_1|$ and $|e_2|$ are equal. A block diagram is shown in Fig. 4b, where the mean of $|e_1| - |e_2|$ is used to tune an analog delay $\Delta t_x$ in the clock path of $\phi_1$.

This calibration scheme can be generalized to converters with more than 2 channels by calibrating the phases against each other. Consider a converter with N channels, where N is a power of 2, and the input signal x(t) is sampled nominally at $f_s = 1/T_s$. Define $e_1$ and $e_2$ as differences between samples of x(t) taken at $(iN+k)T_s$, $(iN+k-j)T_s$ and $(iN+k+j)T_s$, where k and j vary with each calibration step; $i = 0, 1, 2, \ldots$, $k \in [1,N-1]$, $j \in [1,N/2]$. Again, with equally-spaced samples and stationary input statistics, $e_1$ and $e_2$ should have the same mean magnitude. The proposed calibration technique minimizes the *M*-sample mean *g*.

$$e_{1}(i) = x[(iN+k)T_{s}] - x[(iN+k-j)T_{s}]$$

$$e_{2}(i) = x[(iN+k)T_{s}] - x[(iN+k+j)T_{s}]$$

$$g = \left|\frac{1}{M}\sum_{i}(|e_{1}| - |e_{2}|)\right| \quad (2)$$
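As an illustrative special case (not a derivation taken from the paper), consider a sinusoidal input $x(t) = A\sin(\omega t)$ and suppose that the sample under calibration is taken $\Delta t$ late while its two neighbours are ideally timed. With $t = (iN+k)T_s$ the nominal sampling instant, the sum-to-product identity gives

$$e_{1} = A\sin\omega(t+\Delta t) - A\sin\omega(t-jT_{s}) = 2A\cos\left(\omega t + \frac{\omega(\Delta t - jT_{s})}{2}\right)\sin\left(\frac{\omega(jT_{s}+\Delta t)}{2}\right)$$

$$e_{2} = A\sin\omega(t+\Delta t) - A\sin\omega(t+jT_{s}) = -2A\cos\left(\omega t + \frac{\omega(\Delta t + jT_{s})}{2}\right)\sin\left(\frac{\omega(jT_{s}-\Delta t)}{2}\right)$$

Assuming the phase $\omega t$ is uniformly distributed over the averaged samples, the mean of $|\cos(\cdot)|$ is $2/\pi$, so that

$$\overline{|e_{1}|} - \overline{|e_{2}|} = \frac{4A}{\pi}\left(\left|\sin\frac{\omega(jT_{s}+\Delta t)}{2}\right| - \left|\sin\frac{\omega(jT_{s}-\Delta t)}{2}\right|\right) \approx \frac{4A}{\pi}\,\omega\,\Delta t\,\cos\left(\frac{\omega jT_{s}}{2}\right)$$

where the approximation holds for small $\Delta t$ and for both sine arguments lying in $(0, \pi)$. Under these assumptions the averaged difference vanishes at $\Delta t = 0$ and grows in proportion to the skew, which is the property that the cost function *g* in (2) relies on.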

Fig. 5: Calibration sequence for 4 channel TI ADC (a) tuning phase $\phi_2$, (b) tuning phase $\phi_1$, (c) tuning phase $\phi_3$

Fig. 6: 5GS/s time interleaved ADC top-level block diagram

The calibration sequence for a 4 channel ADC is shown in Fig. 5. Phase $\phi_0$ is used as the reference phase. First, phase $\phi_2$ (*k*=2) is tuned to the correct position using two neighbouring samples taken using $\phi_0$ (*j*=2). This is done using a magnitude-sign LMS iterative update on the delay code of $\phi_2$ to minimize the cost function *g*. Second, phase $\phi_1$ (*k*=1) is tuned to the correct position using neighbouring samples from $\phi_0$ and $\phi_2$ (*j*=1). Finally, phase $\phi_3$ (*k*=3) is tuned to the correct position using neighbouring samples from $\phi_2$ and $\phi_0$ (*j*=1). The total number of steps required is therefore *N*-1. Extending the calibration to converters with more than 2 channels does impose additional signal constraints. For 4 channels, the input frequency must be less than $0.4f_s$ to compensate for a maximum skew of $0.5T_s$. The method also works for broadband inputs following the same constraint. Background timing skew calibration without a reference ADC was also performed in [10], but that work focused on only two time-interleaved channels and required multiplying the outputs of the two channels. This work has the advantage of using only simple addition/subtraction and absolute value operations, i.e. it uses amplitude information directly to generate skew information.
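The three-step sequence can be tried out in a small behavioural simulation. The sketch below is not the authors' implementation: it assumes ideal (unquantized) samples of a single low-frequency tone, uses a plain sign update of one delay step per iteration as a simplification of the magnitude-sign LMS update named in the text, and picks hypothetical initial skews. The step size, the 2500 samples per iteration and the 30 iterations per phase are borrowed from figures quoted elsewhere in the paper; all function and variable names are illustrative.

```python
import math
from typing import List

TS_PS = 200.0            # sample period at 5GS/s, in ps
N = 4                    # interleaved channels
LSB_PS = 0.44            # delay step, from the 440fs resolution in the paper
M = 2500                 # frames averaged per iteration
ITERS = 30               # iterations per phase
F_IN = 0.1037 / TS_PS    # illustrative tone (518.5MHz), in cycles per ps


def sample(n: int, err_ps: float) -> float:
    """Ideal sine sample for sample index n with a timing error of err_ps."""
    return math.sin(2.0 * math.pi * F_IN * (n * TS_PS + err_ps))


def mean_diff(err_ps: List[float], k: int, j: int, start: int) -> float:
    """M-frame mean of |e1| - |e2| for phase k against neighbours k-j and k+j."""
    acc = 0.0
    for i in range(start, start + M):
        n = i * N + k
        cur = sample(n, err_ps[k])
        prev = sample(n - j, err_ps[(k - j) % N])
        nxt = sample(n + j, err_ps[(k + j) % N])
        acc += abs(cur - prev) - abs(cur - nxt)
    return acc / M


def calibrate(skew_ps: List[float]) -> List[int]:
    """Tune phases 2, 1, 3 in turn; phase 0 is the untouched reference."""
    codes = [0] * N
    frame = 0
    for k, j in [(2, 2), (1, 1), (3, 1)]:
        for _ in range(ITERS):
            err = [s + c * LSB_PS for s, c in zip(skew_ps, codes)]
            d = mean_diff(err, k, j, frame)
            frame += M
            # A late sample sits farther from its previous neighbour, so a
            # positive mean calls for less delay (valid for this low tone).
            if d > 0:
                codes[k] -= 1
            elif d < 0:
                codes[k] += 1
            codes[k] = max(-64, min(63, codes[k]))  # assumed 7-bit code range
    return codes


initial_skew_ps = [0.0, -6.0, 8.0, 7.0]   # hypothetical skews, phase 0 as reference
codes = calibrate(initial_skew_ps)
residual_ps = [s + c * LSB_PS for s, c in zip(initial_skew_ps, codes)]
print(codes, [round(r, 2) for r in residual_ps])
```

In this idealized setting each phase walks toward the midpoint of its two neighbours and then toggles by one delay step around it, so any residual error left on $\phi_2$ is partly inherited by $\phi_1$ and $\phi_3$. Noise, quantization and broadband inputs are not modelled here.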

#### V. 5GS/s TIME-INTERLEAVED ADC

#### A. Architecture and Design

A high-level block diagram of the TI 8-bit SAR ADC is shown in Fig. 6. It consists of two levels of time-interleaving. At the lower level, 10 elementary differential SAR ADCs operate at 125MS/s. At the higher level, there are 4 sub-ADCs sampling at 1.25GS/s for an aggregate rate of 5GS/s. The 8-bit outputs of the sub-ADCs are combined, retimed and downsampled by 81x for off-chip processing. The skew calibration is implemented off-chip.

Fig. 7: Clock distribution for 5GS/s time-interleaved ADC

Fig. 8: Clock tuning using delay-line

For clock distribution, a differential 5GHz current mode logic (CML) clock is distributed from off-chip via an on-chip transmission line to the middle of the chip, where a divider produces 4 differential phases at 1.25GHz for the samplers of the 4 sub-ADCs. The phases are then buffered using CML buffers and distributed by an H-bridge as shown in Fig. 7. Similarly, the differential input is distributed via an on-chip transmission line from the right (not shown), terminated near the center of the chip, and an H-bridge to the 4 sub-ADCs. A total of 8 sub-ADCs were integrated onto the die, but only 4 were used for testing, with no additional reference ADCs for calibration. In order to correct the timing skew of each of the 4 channels, 7-bit (4-bit thermometer coded, 3-bit binary coded) controllable delays were implemented in the clock paths as shown in Fig. 8, after the CML clocks are converted into CMOS levels for the bootstrap switch. The delay codes are updated by the skew calibration algorithm running off-chip. The delay line has a resolution of 440fs and a range of $\pm 28$ps.
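The relationship between the 7-bit delay code, the 440fs step and the ±28ps range can be sketched as follows. Two details are assumptions made for illustration and are not stated in the paper: that the mid-scale code corresponds to zero added delay, and that the four most significant bits are the thermometer-coded ones. The function names are hypothetical.

```python
from typing import List, Tuple

STEP_PS = 0.44                   # delay-line resolution (440fs)
N_BITS = 7
MID_CODE = 1 << (N_BITS - 1)     # 64, assumed to be the zero-delay code


def code_to_delay_ps(code: int) -> float:
    """Map a 7-bit delay code to a signed delay in ps around mid-scale."""
    if not 0 <= code < (1 << N_BITS):
        raise ValueError("delay code must fit in 7 bits")
    return (code - MID_CODE) * STEP_PS


def split_code(code: int) -> Tuple[List[int], int]:
    """Split a code into 15 thermometer lines (upper 4 bits) and a 3-bit binary value."""
    coarse = code >> 3
    fine = code & 0b111
    thermometer = [1 if i < coarse else 0 for i in range(15)]
    return thermometer, fine


print(code_to_delay_ps(0), code_to_delay_ps(127))
print(split_code(43))
```

With 128 codes at 0.44ps each, the span is about 56ps, which matches the quoted ±28ps range.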

#### B. Measurement Results

The 5GS/s ADC consumes 138.6mW from a 1V supply with a 2.5GHz (Nyquist) input. The skew calibration is performed after gain, offset, and radix calibration. The delay code for each channel converges within 30 iterations, using 2500 samples per iteration, as shown in Fig. 9. Note that channel 2 is calibrated first, followed by channels 1 and 3, resulting in 90 iterations in total. Fig. 10 shows the spectrum of a 4GHz input signal before and after skew calibration. Note that the SNDR improves by 20dB. Fig. 11 shows the performance of the ADC for input frequencies up to 4GHz. Notice that at higher frequencies the SNDR is severely impacted by timing skew, and the effect of gain/offset/radix calibration is minimal. The ADC achieves an SNDR of 33.25dB at Nyquist. The figure of merit (FoM) is 401fJ/conv-step and 738fJ/conv-step at low and Nyquist frequency respectively.
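Because the output is downsampled by 81x before off-chip processing, the measured tones appear at aliased frequencies. The short sketch below shows how the apparent frequency of the test tone used for Fig. 10 can be worked out; the function name is illustrative, and the calculation simply folds the input into the first Nyquist zone of the decimated rate.

```python
def aliased_frequency_hz(f_in_hz: float, f_s_hz: float, decimation: int) -> float:
    """Apparent frequency of a tone after sampling at f_s and keeping every Nth sample."""
    f_eff = f_s_hz / decimation          # effective sample rate after decimation
    remainder = f_in_hz % f_eff
    return min(remainder, f_eff - remainder)


alias_hz = aliased_frequency_hz(3.998e9, 5e9, 81)
print(f"{alias_hz / 1e6:.2f} MHz")
```

For a 3.998GHz input this gives roughly 14.35MHz, matching the value stated in the Fig. 10 caption.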

#### ACKNOWLEDGMENT

The authors would like to thank Victor Kozlov for the design and layout of the one-shot/thermometer decoder.

#### REFERENCES

- [1] L. Sumanen, M. Waltari, and K. Halonen, "A 10-bit 200-MS/s CMOS Parallel Pipeline A/D Converter," JSSC, 2001.



Fig. 9: Timing skew calibration showing delay code convergence using the method of Section IV



Fig. 10: Spectrum before (gray) and after skew calibration (black): 2500pt FFT,  $f_{in} = 3.998$ GHz, aliased to 14.35MHz after down-sampling 81x, circular markers denote distortion tones due to skew, fundamental tone denoted by diamond marker



Fig. 11: 5GS/s ADC performance up to 4GHz

- [2] D. Stepanovic and B. Nikolic, "A 2.8GS/s 44.6mW Time-Interleaved ADC Achieving 50.9dB SNDR and 3dB Effective Resolution Bandwidth of 1.5GHz in 65nm CMOS," VLSI, 2012.
- [3] S. Chen, et al., "A 6-bit 600-MS/s 5.3-mW Asynchronous ADC in 0.13um CMOS," JSSC, 2006.
- [4] M. Chammas and B. Murmann, "A 12-GS/s 81-mW 5-bit Time-Interleaved Flash ADC with Background Timing Skew Calibration," JSSC, 2011.
- [5] C. Huang, et al., "A CMOS 6-Bit 16-GS/s Time-Interleaved ADC Using Digital Background Calibration Techniques," JSSC, 2011.
- [6] B. Murmann, "Limits on ADC Power Dissipation," in Analog Circuit Design, M. Steyaert, A.H.M. Roermund, J.H. van Huijsing (eds.), Springer, 2006.
- [7] D. Schinkel, "A Double-Tail Latch-Type Voltage Sense Amplifier with 18ps Setup+Hold Time," ISSCC, 2007.
- [8] J. Yang, T. L. Naing, and R. W. Brodersen, "A 1 GS/s 6 Bit 6.7 mW Successive Approximation ADC Using Asynchronous Processing," JSSC, 2010.
- [9] E. Alpman, H. Lakdawala, L. R. Carley, and K. Soumyanath, "A 1.1V 50mW 2.5GS/s 7b Time-Interleaved C-2C SAR ADC in 45nm LP Digital CMOS," ISSCC, 2009.
- [10] B. Razavi, "Design Considerations for Interleaved ADCs," JSSC, 2013.