# Multi-Phase Bang-Bang Digital Phase Lock Loop with Accelerated Frequency Acquisition

Amer Samarah, Anthony Chan Carusone

Edward S. Rogers Sr. Department of Electrical and Computer Engineering

University of Toronto, Toronto, CANADA

*Abstract*—A highly digital phase lock loop with a multi-phase bang-bang phase detector is proposed to speed up lock time and to increase the pull in range. To reduce power consumption, the high speed counter and re-timing circuit in the feedback loop are disabled after frequency lock is achieved.

## I. INTRODUCTION

Low-phase-noise phase-locked loops (PLLs) are widely used in clock generation, frequency synthesis, and data conversion. Digital phase-locked loops (DPLLs) have been steadily replacing analog PLLs because continued CMOS scaling reduces supply voltages, degrades analog performance (lower gain and output impedance), and increases process variability. DPLLs, by contrast, rely on digital signal processing and therefore benefit from the faster and denser logic gates of newer process technologies. They are also less sensitive to mismatch and process variations, and can operate from a low supply voltage to save energy, which is very important for portable devices.

Most DPLLs employ a time-to-digital converter (TDC) to quantize the phase difference between the reference and the output or divided clock signal. The TDC needs careful design and calibration to alleviate non-linearities that can limit the jitter performance of the DPLL. Furthermore, the TDC consumes a considerable amount of power compared to the other components of the DPLL [1].

Bang-bang DPLL structures eliminate the need for complex TDC design and achieve low jitter and low power consumption [2]. However, bang-bang DPLLs suffer from slewing, as seen in clock and data recovery (CDR) applications [3]. The binary nature of a bang-bang phase detector limits the information it provides to the loop filter, so in the case of a large initial frequency and/or phase error it can take a considerable amount of time to correct the error; i.e., the lock time is long.

The proposed architecture is based on a multi-phase bang-bang detector (MPBBD), which eliminates the need for a TDC circuit and the associated background calibration. The MPBBD accelerates phase acquisition during initial locking. Moreover, an auxiliary frequency lock loop (FLL) is enabled upon reset of the DPLL and guarantees correct frequency operation without degrading the jitter performance during lock. Hence, the proposed detector acts as a phase and frequency detector with an automatic gear shifting mechanism, such that the bandwidth of the DPLL is adjusted automatically based on the magnitude of the phase/frequency error.



Fig. 1. Architecture of the proposed DPLL. The FLL and high speed counter as well as the synchronization structure are disabled by a lock detector once frequency lock occurs. The MPBBD locks the phase of the output clock (phase A) to the reference clock.



Fig. 2. Transfer function of the MPBBD and its LUT (thick solid blue) vs. BBPD (thin dashed red) given four phases of ring DCO sampled by a reference clock.

In [4], a multi-phase oscillator is used to synthesize simple fractional channels (such as 1/2, 1/4, 1/8) by making use of the implicit TDC formed by the oscillator and an MPBBD. In the proposed architecture, by contrast, the MPBBD is used for automatic gear shifting to accelerate phase and frequency locking compared to the case of a binary bang-bang phase detector (BBPD). Furthermore, the MPBBD gives the DPLL the ability to track large frequency disturbances after being in lock without involving the FLL. A binary BBPD, on the other hand, will slew and may take a long time to recover, if it ever does, under a large frequency disturbance. In [5], a multi-phase detector is employed where the clock phases are generated locally to allow in-loop modulation. Our proposed MPBBD has a wider lock range and is simpler to implement.

Fig. 3. Timing diagram of the multi-phase DCO sampled by a reference clock at some point. Based on the sequence of MPBBD outputs (01111000), the LUT provides an indication of the phase error magnitude between phase A and the reference clock, as shown in the circle at the bottom right.

The FLL, which includes a synchronization structure and a high-speed counter, is disabled to save power once frequency lock is achieved.

In a classical PLL, whether analog or digital, the feedback divider's phase noise appears at the output amplified by a factor of $N^2$. Using a sub-sampling phase detector [6], on the other hand, avoids the $N^2$ amplification of input-referred noise and eliminates the divider noise entirely. The reference clock phase noise is still multiplied by $N^2$ when transferred to the output.
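To give a sense of scale for the $N^2$ factor, the short sketch below converts it to decibels. The value $N = 60$ follows from the 1.20 GHz output and 20 MHz reference reported later in this paper; the function name `noise_gain_db` is hypothetical and introduced only for illustration.

```python
import math


def noise_gain_db(n: float) -> float:
    """Power gain N^2 expressed in dB: 10*log10(N^2) = 20*log10(N)."""
    return 20.0 * math.log10(n)


# Multiplication factor for a 1.20 GHz output from a 20 MHz reference.
N = 1.20e9 / 20e6

print(f"N = {N:.0f}, N^2 amplification = {noise_gain_db(N):.1f} dB")
```

For $N = 60$ this evaluates to roughly 35.6 dB, which is the penalty applied to any noise source that is referred to the output through the multiplication factor.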

The presented MPBBD is similar to the sub-sampling detector in that the oscillator output is sub-sampled by the reference clock and no divider is used during phase lock. Accordingly, the phase noise of the proposed DPLL is independent of the frequency control word (FCW), i.e., the multiplication factor. The main sources of in-band noise are the reference clock noise and the noise on the power supplies.

The ring oscillator noise is high-pass filtered by the PLL loop. Accordingly, the loop bandwidth must be as high as possible to minimize the oscillator noise contribution. The reference noise will then be the dominant source of noise within the loop bandwidth and could limit the performance and stability of the PLL if a very high bandwidth is used. Injection locking can also be used to reduce the noise contribution of the ring oscillator [4].

This paper is organized as follows. Section II provides an overview of the proposed DPLL architecture and describes the multi-phase bang-bang detector, its timing, and its transfer function. Simulation results are presented in Section III, and measurement results are shown in Section IV.

## II. ARCHITECTURE

Fig. 4. The programmable delay unit used to form a four-stage DCO. Each unit has a 7-bit coarse capacitor bank and an 8-bit fine capacitor bank, the latter implemented as a combination of a 4-bit binary bank and 15 thermometer-coded capacitors.

The block diagram of the proposed DPLL is shown in Fig. 1. The frequency lock loop (FLL) is composed of a re-timing circuit, a counter that outputs the number of output clock cycles within each reference cycle, an accumulator for the phase of the synthesized channel based on a given frequency control word (FCW), and a digital subtractor. The FLL controls a 7-bit coarse capacitor bank to bring the four-stage digitally controlled ring oscillator (DCO) as close as possible to the required output frequency. The FLL is then disabled after frequency lock is achieved. Accordingly, the proposed DPLL architecture saves the power of the high-speed feedback counter as well as the power of the re-timing circuit during normal (locked) operation.
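The counter-and-subtractor idea behind the FLL can be sketched as follows. This is a simplified illustration, not the implemented circuit: it compares the per-cycle count directly with the FCW instead of accumulating phase, steps the coarse code by one LSB per reference cycle, and assumes that a larger coarse code adds capacitance and therefore lowers the DCO frequency. The function name and its parameters are hypothetical.

```python
def fll_update(edge_count: int, fcw: int, coarse_code: int, code_bits: int = 7) -> int:
    """Return the next coarse code after one reference cycle (sketch).

    edge_count: DCO cycles counted during the last reference cycle.
    fcw:        frequency control word (target count per reference cycle).
    Assumption: a larger code adds capacitance and lowers the DCO frequency.
    """
    error = fcw - edge_count  # subtractor output; > 0 means the DCO is too slow
    if error > 0:
        coarse_code -= 1      # remove capacitance to speed the DCO up
    elif error < 0:
        coarse_code += 1      # add capacitance to slow the DCO down
    # Saturate to the range of the coarse bank (7 bits in this design).
    return max(0, min((1 << code_bits) - 1, coarse_code))


# Example: target of 60 output cycles per reference cycle, DCO slightly slow.
print(fll_update(edge_count=58, fcw=60, coarse_code=64))
```

Once the count matches the FCW the code stops moving, which is the point at which the counter and re-timing circuit can be switched off.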

The DCO is followed by four flip-flops that sample the output phases with the reference clock, so no divided-down output clock signal is needed. The 4-bit output of the multi-phase bang-bang detector (MPBBD) is mapped to gain values using a lookup table (LUT) that coarsely quantizes the phase error, as shown in Fig. 2. The digital loop filter outputs an 8-bit control word to achieve phase lock between the output clock (phase A of the DCO) and the reference clock.

A lock detector circuit continuously checks the output patterns of the MPBBD and the FLL to determine whether the DPLL is frequency and phase locked. During steady state, the FLL is disabled. Compared to a regular binary BBPD, the MPBBD is able to track a larger phase or frequency error without reactivating the FLL and without slewing for a long time. However, if the MPBBD output slews for a long time due to a very large frequency error, the FLL is enabled again until lock is achieved.
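One possible way to turn this behaviour into a decision rule is sketched below. The paper does not give the detector's actual criteria, so the window lengths, the state names, and the function `classify_lock` are all assumptions, and the sketch looks only at the LUT-mapped MPBBD output (the real detector also observes the FLL).

```python
from typing import Sequence


def classify_lock(lut_outputs: Sequence[int],
                  lock_window: int = 16,
                  slew_window: int = 64) -> str:
    """Classify the loop state from recent LUT outputs (hypothetical rule).

    'locked'    : the last lock_window samples toggle between +1 and -1.
    'slewing'   : the last slew_window samples all share the same sign,
                  which would be the cue to re-enable the FLL.
    'acquiring' : anything else.
    """
    recent = list(lut_outputs[-lock_window:])
    if len(recent) == lock_window and set(recent) == {1, -1}:
        return "locked"
    tail = list(lut_outputs[-slew_window:])
    if len(tail) == slew_window and (all(v > 0 for v in tail) or all(v < 0 for v in tail)):
        return "slewing"
    return "acquiring"


print(classify_lock([1, -1] * 8))    # steady-state dithering
print(classify_lock([4] * 64))       # saturated output for a long time
print(classify_lock([4, 3, 2, 1]))   # error shrinking, not yet settled
```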

### A. DCO

The DCO is a four-stage differential ring oscillator, as shown in Fig. 3, where each stage has a 7-bit coarse capacitor bank and an 8-bit fine capacitor bank. The 4 MSBs of the fine bank are binary encoded while the 4 LSBs are thermometer encoded to reduce switching activity during locking. The tuning capacitors are implemented as switched active MOS devices, as shown in Fig. 4. The layout of the DCO is highly regular, so automated layout using, for example, a Tcl script is possible if a fully integrated design flow is sought.
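The split of the 8-bit fine code described above can be illustrated with a short sketch. The function name and the bit ordering of the thermometer output are assumptions made for illustration; only the 4-bit binary / 15-element thermometer partition comes from the text.

```python
from typing import List, Tuple


def encode_fine_code(code: int) -> Tuple[int, List[int]]:
    """Split an 8-bit fine code into a 4-bit binary part and 15 unary controls.

    The 4 MSBs are kept as a binary value (0..15); the 4 LSBs are expanded
    into a 15-element thermometer code, so a +/-1 change in the LSB field
    toggles exactly one unit capacitor.
    """
    if not 0 <= code <= 0xFF:
        raise ValueError("fine code must fit in 8 bits")
    binary_part = code >> 4          # 4 MSBs, binary weighted
    therm_count = code & 0xF         # 4 LSBs, number of unit caps switched on
    thermometer = [1] * therm_count + [0] * (15 - therm_count)
    return binary_part, thermometer


msb, therm = encode_fine_code(0xA3)
print(msb, therm)
```

Stepping the code from `0x0E` to `0x0F` changes a single thermometer element, which is the reduced switching activity the encoding is meant to provide near lock.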

Fig. 5. Absolute phase error of the DPLL output clock with respect to an ideal clock with (a) a binary BBPD (thin dashed blue) and (b) the MPBBD (thick solid red). The initial frequency error is 300 MHz. Data below 6.5 $\mu$s is clipped because reset was applied at that moment, after loading the loop configuration. The FLL takes 15.5 $\mu$s (310 reference cycles), while the PLL takes 30 $\mu$s (600 cycles) when the BBPD (a) is used and 5 $\mu$s (100 cycles) when the MPBBD (b) is employed.

The DCO output is a rail-to-rail clock signal where different frequencies are achieved by adding or removing MOS capacitors. Accordingly, the power consumption, which is proportional to $f \cdot C \cdot V^2$, is nearly constant regardless of the DCO frequency. The power consumption could be reduced by using a current-steering programmable DCO.
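The near-constant power can be motivated by a first-order argument, offered here as a rough sketch rather than the authors' analysis. Assume each of the $N$ stages drives a load capacitance $C$ through an equivalent switching resistance $R_{eq}$, giving a stage delay $t_d \approx 0.69\,R_{eq}C$ (the symbols $R_{eq}$ and $t_d$ are introduced only for this illustration):

$$
f \approx \frac{1}{2 N t_d} \approx \frac{1}{1.38\,N R_{eq} C},
\qquad
P \approx N f C V^2 \approx \frac{V^2}{1.38\,R_{eq}}
$$

Under these assumptions $C$ cancels out of the power expression: adding capacitance lowers $f$ in proportion, so the product $f \cdot C$ stays roughly fixed across the tuning range.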

The mismatch between the DCO phases is not large compared to one quarter of a DCO cycle. Hence, phase mismatch or duty-cycle distortion could affect the speed of locking under a large frequency disturbance, but it would not affect the steady-state jitter performance, where the loop effectively operates as it would with a BBPD.

### B. MPBBD

The eight output phases of the DCO (four differential phases) are sampled by the reference clock. This results in a 4-bit output stream that carries information about the sign of the phase error as well as its magnitude, as shown in Fig. 2 and Fig. 3. The raw output of the MPBBD is semi-thermometer encoded and passed through an LUT to generate a binary representation of the phase error magnitude and sign: -4, -3, -2, -1, +1, +2, +3 or +4, as shown in Fig. 3. The loop filter is programmable such that the gains of both the proportional and integral paths can be altered to achieve a specific performance.

For example, if the output of the MPBBD is [ACBD] = 0110, then the reference clock is leading the output clock by more than 90 degrees but less than 135 degrees. Accordingly, the corresponding LUT output is -3, providing a larger phase correction signal to the DPLL loop filter.
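The quantization performed by the MPBBD and its LUT can be mimicked behaviorally, as in the sketch below. It works directly on a phase error in degrees rather than on the sampled [ACBD] bits (the full bit-pattern table is not reproduced here), and it assumes uniform 45-degree sectors and a sign convention in which the error is negative when the reference leads the output clock. The function names are hypothetical.

```python
def mpbbd_lut(phase_error_deg: float) -> int:
    """Map a phase error to the LUT output in {-4..-1, +1..+4} (sketch).

    Assumptions: uniform 45-degree sectors; negative error means the
    reference clock leads the output clock. There is no zero output,
    as in any bang-bang detector.
    """
    # Wrap the error into [-180, 180).
    e = (phase_error_deg + 180.0) % 360.0 - 180.0
    magnitude = min(int(abs(e) // 45) + 1, 4)
    return magnitude if e >= 0 else -magnitude


def bbpd(phase_error_deg: float) -> int:
    """Binary bang-bang detector under the same sign convention."""
    e = (phase_error_deg + 180.0) % 360.0 - 180.0
    return 1 if e >= 0 else -1


# Reference leading by 100 degrees (between 90 and 135 degrees).
print(mpbbd_lut(-100.0), bbpd(-100.0))
```

With the reference leading by 100 degrees the sketch returns -3, matching the example in the text, whereas the binary detector reports only the sign.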

During steady-state operation, assuming the phase error remains within $\pm 45$ degrees, the MPBBD alternates between +1 and -1 with the same dynamics as a PLL with a simple binary BBPD. However, during initial locking, when the DCO frequency is not locked, the MPBBD output will span the full range from -4 to +4. Similarly, when the input clock is frequency modulated, the MPBBD remains active.

### C. Loop Filter

Fig. 6. The mapped output of the bang-bang detector (from the LUT) during frequency and phase lock. The binary BBPD slews when the phase error is high and takes a long time to recover. The MPBBD, on the other hand, automatically gears its gain according to the phase error magnitude until lock is achieved.

The loop filter has a 16-bit output where the 8 MSBs directly drive the DCO fine capacitor bank (a combination of a 4-bit binary-weighted bank and a 15-element unary thermometer-coded bank). The 8 LSBs are fractional bits that represent a frequency step smaller than one DCO LSB step. To realize a finer DCO frequency resolution, the 8 LSBs at the loop filter output are fed to a first-order noise-shaping delta-sigma modulator (DSM) that drives one LSB capacitor at a rate higher than the reference frequency.
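The integer/fractional split and the first-order DSM can be sketched with an accumulator whose carry-out toggles the extra LSB capacitor. This is a generic first-order error-feedback modulator under assumed names (`split_loop_filter_word`, `first_order_dsm`); the DSM clock rate and the exact implementation in the chip are not specified in the text.

```python
from typing import List, Tuple


def split_loop_filter_word(word: int) -> Tuple[int, int]:
    """Split a 16-bit loop filter output into (8 MSBs, 8 LSBs)."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("loop filter word must fit in 16 bits")
    return word >> 8, word & 0xFF


def first_order_dsm(frac: int, n_cycles: int, bits: int = 8) -> List[int]:
    """First-order DSM: accumulate frac and emit the carry-out each cycle.

    The average of the 1-bit output approaches frac / 2**bits, so the
    dithered capacitor contributes a fractional LSB on average.
    """
    modulus = 1 << bits
    acc = 0
    out = []
    for _ in range(n_cycles):
        acc += frac
        if acc >= modulus:   # carry-out: switch the dither capacitor on
            acc -= modulus
            out.append(1)
        else:
            out.append(0)
    return out


integer_code, fraction = split_loop_filter_word(0x1240)
stream = first_order_dsm(fraction, 256)
print(integer_code, fraction, sum(stream) / len(stream))
```

For the example word `0x1240` the fine bank receives code 18 and the dither capacitor is on for one quarter of the DSM cycles, corresponding to the fractional value 64/256.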

## III. SIMULATION RESULTS

The DPLL settling time is inversely proportional to the loop bandwidth. When an MPBBD is employed, the DPLL loop bandwidth is increased by a factor of up to four during the initial frequency and phase locking operation (in comparison to a binary BBPD offering the same jitter performance in lock). Accordingly, we can expect locking to be faster by up to approximately a factor of four. The temporary increase in bandwidth may be smaller than four depending on the frequency and phase error as well as on the jitter of the reference and output clocks.
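The gear-shifting effect can be illustrated with a toy model. The sketch below is not the behavioral simulation used in this paper: it assumes a proportional-only, phase-domain loop with no frequency error, uniform 45-degree detector sectors, and a hypothetical correction of `step_deg` degrees per unit of detector output per reference cycle.

```python
def detector(error_deg: float, multi_phase: bool) -> int:
    """Detector output: +/-1 for a binary BBPD, +/-1..4 for the MPBBD."""
    magnitude = min(int(abs(error_deg) // 45) + 1, 4) if multi_phase else 1
    return magnitude if error_deg >= 0 else -magnitude


def cycles_to_lock(initial_error_deg: float, step_deg: float,
                   multi_phase: bool, max_cycles: int = 10_000) -> int:
    """Count reference cycles until |error| <= step_deg (toy lock criterion)."""
    error = initial_error_deg
    for cycle in range(max_cycles):
        if abs(error) <= step_deg:
            return cycle
        # Proportional correction scaled by the detector output.
        error -= step_deg * detector(error, multi_phase)
    return max_cycles


bbpd_cycles = cycles_to_lock(170.0, 1.0, multi_phase=False)
mpbbd_cycles = cycles_to_lock(170.0, 1.0, multi_phase=True)
print(bbpd_cycles, mpbbd_cycles)
```

In this model the MPBBD locks in fewer cycles, but the speed-up stays below four because the detector gain drops back toward one as the error shrinks, in line with the caveat above.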

Fig. 5 shows a behavioral simulation of the absolute phase error of the output clock converging to a very small value (ideally zero) after phase lock. The speed of convergence (which is an indication of the settling time) depends on the type of phase detector used and on the initial frequency error. If a binary BBPD is used and the initial frequency error is around 300 MHz (34% locking range), the phase lock operation takes around 30 $\mu$s. In that case, the BBPD slews for a long time before finding the proper code to drive the DCO, as shown in Fig. 6(a). With the MPBBD, on the other hand, only 5 $\mu$s is needed to achieve phase lock given the same frequency error and the same initial conditions.

Note that a binary bang-bang DPLL is directly analogous to a delta-sigma modulator with a binary quantizer [7], where well-established non-linear system theory can offer useful insights into locking behaviour and speed as well as the occurrence (and elimination) of spurs in the output spectrum. Similarly, an MPBBD-based DPLL is analogous to a DSM employing multi-bit quantization, which accounts for its improved stability, locking, and tracking behavior.

Fig. 7. Phase noise simulation of the DPLL employing the MPBBD. The reference clock is 20 MHz while the output is 1.20 GHz.

## IV. MEASUREMENT RESULTS

A prototype chip was designed and fabricated in the STM 28 nm LP CMOS process. Fig. 9 shows a die photograph. The active area is less than 0.008 $\text{mm}^2$ including the decoupling caps and output buffers. The reference clock is an off-chip, high-quality 20 MHz crystal oscillator from Wenzel Associates.

The measured coarse DCO step is around 2.5 MHz/step while the fine DCO step is 13 kHz/step on average. Fig. 8 shows the phase noise spectrum of the 1.2 GHz PLL output measured with an Agilent spectrum analyzer. The in-band phase noise is -98 dBc/Hz at 200 kHz offset and the out-of-band phase noise is -123 dBc/Hz at 20 MHz offset. Switching the FLL on or off has a negligible effect on the spectrum. The simulated in-band phase noise is around -102 dBc/Hz, as shown in Fig. 7.
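As a rough consistency check, the measured step sizes can be combined with the bank widths given in Section II. The sketch below assumes perfectly uniform steps, which the quoted averages do not guarantee, and it assumes that the 8 fractional bits handled by the DSM each subdivide one fine step evenly; all constant names are hypothetical.

```python
COARSE_STEP_HZ = 2.5e6   # measured average coarse step
FINE_STEP_HZ = 13e3      # measured average fine step
COARSE_BITS = 7
FINE_BITS = 8
FRACTIONAL_BITS = 8      # loop filter LSBs fed to the DSM

# Tuning range spanned by each bank, assuming uniform steps.
coarse_range_hz = (2**COARSE_BITS - 1) * COARSE_STEP_HZ
fine_range_hz = (2**FINE_BITS - 1) * FINE_STEP_HZ

# Average frequency resolution once DSM dithering is included (assumption).
fractional_step_hz = FINE_STEP_HZ / 2**FRACTIONAL_BITS

print(f"coarse range  : {coarse_range_hz / 1e6:.1f} MHz")
print(f"fine range    : {fine_range_hz / 1e6:.3f} MHz")
print(f"fine covers one coarse step: {fine_range_hz > COARSE_STEP_HZ}")
print(f"fractional LSB: {fractional_step_hz:.1f} Hz")
```

Under these assumptions the fine bank spans more than one coarse step, so the phase loop can absorb the residual error left by the FLL.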

The CMOS stages in the DCO have inherently low power supply noise rejection and must therefore generally be operated from a regulated supply. No voltage regulator was integrated into the present design, resulting in higher-than-expected phase noise.

The DPLL locks to the reference over the range 880 MHz to 1.20 GHz using a 1.1 V power supply. The in-band phase noise was almost the same over the whole locking range. The power consumption of the DPLL was 502 $\mu$W (1.1 V $\times$ 456 $\mu$A) excluding the DCO. Disabling the FLL after locking saved around 85 $\mu$W of power in lock. The power savings would be larger if the DCO (and, hence, the frequency counter) were running at a higher frequency. The DCO consumes 2.9 to 3.1 mW depending on the frequency of operation. Using a 0.7 V supply, the DPLL works at 440 MHz while the DPLL (excluding the DCO) consumes only 64 $\mu$W (0.7 V $\times$ 91 $\mu$A).
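The reported power figures can be cross-checked with simple arithmetic. The sketch below only recomputes the products quoted above and relates the FLL saving to the 502 uW figure; treating the saving as a fraction of that figure is an interpretation made here, not a number from the measurements, and the function name is hypothetical.

```python
def power_uw(supply_v: float, current_ua: float) -> float:
    """Power in microwatts from a supply voltage (V) and current (uA)."""
    return supply_v * current_ua


p_nominal_uw = power_uw(1.1, 456)   # DPLL excluding DCO at 1.1 V
p_low_uw = power_uw(0.7, 91)        # DPLL excluding DCO at 0.7 V

# FLL saving relative to the quoted 502 uW (interpretation, see lead-in).
fll_saving_fraction = 85 / 502

print(f"1.1 V: {p_nominal_uw:.1f} uW")
print(f"0.7 V: {p_low_uw:.1f} uW")
print(f"FLL saving: {100 * fll_saving_fraction:.0f}% of 502 uW")
```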

## V. CONCLUSION

In summary, BBPD-based DPLLs are becoming more widely used than TDC-based DPLLs due to their simplicity and low power consumption. However, binary BBPDs suffer from slewing and slow locking if the phase error is large.



Fig. 8. DPLL output phase-noise spectrum at 1.20 GHz captured by an Agilent E4448A spectrum analyzer. The in-band noise is -98 dBc/Hz while the loop bandwidth is around 1.7 MHz.



Fig. 9. Die photograph of the DPLL in ST Microelectronics 28 nm LP CMOS technology (active area is less than 0.008 $\text{mm}^2$).

An improved multi-phase bang-bang detector (MPBBD) is proposed to achieve a fast lock time and to avoid slewing, effectively realizing automatic gear shifting. The modification to the DPLL loop compared to a binary BBPD is minimal, requiring only three additional flip-flops and a small LUT. A further reduction in power consumption is achieved by disabling the high-speed logic used for frequency counting after frequency lock is achieved.

#### ACKNOWLEDGEMENT

The authors would like to acknowledge CMC Microsystems for providing CAD tools and fabrication access through STM.

#### REFERENCES

- [1] M. Lee and A. Abidi, "A 9 b, 1.25 ps Resolution Coarse-Fine Time-to-Digital Converter in 90 nm CMOS that Amplifies a Time Residue," *Solid-State Circuits, IEEE Journal of*, vol. 43, no. 4, pp. 769–777, 2008.
- [2] G. Marucci *et al.*, "Analysis and Design of Low-Jitter Digital Bang-Bang Phase-Locked Loops," *Circuits and Systems I: Regular Papers, IEEE Transactions on*, vol. 61, no. 1, pp. 26–36, 2014.
- [3] M. Brownlee *et al.*, "A 3.2Gb/s Oversampling CDR with Improved Jitter Tolerance," in *Custom Integrated Circuits Conference, 2007. CICC '07. IEEE*, 2007, pp. 353–356.
- [4] P. Park *et al.*, "An All-Digital Clock Generator using a Fractionally Injection-Locked Oscillator in 65nm CMOS," in *Solid-State Circuits Conference Digest of Technical Papers (ISSCC), 2012 IEEE International*, 2012, pp. 336–337.
- [5] R. Nonis *et al.*, "DigPLL-Lite: A Low-Complexity, Low-Jitter Fractional-N Digital PLL Architecture," *Solid-State Circuits, IEEE Journal of*, vol. 48, no. 12, pp. 3134–3145, 2013.
- [6] X. Gao *et al.*, "A Low Noise Sub-Sampling PLL in Which Divider Noise is Eliminated and PD/CP Noise is Not Multiplied by $N^2$," *Solid-State Circuits, IEEE Journal of*, vol. 44, no. 12, pp. 3253–3263, 2009.
- [7] D. Liu *et al.*, "A Frequency-Based Model for Limit Cycle and Spur Predictions in Bang-Bang All Digital PLL," *Circuits and Systems I: Regular Papers, IEEE Transactions on*, vol. 59, no. 6, pp. 1205–1214, 2012.