# Partial Analog Equalization and ADC Requirements in Wired Communications

by

Amir Hadji-Abdolhamid

A thesis submitted in conformity with the requirements for the degree of Doctor of Philosophy Graduate Department of Electrical and Computer Engineering University of Toronto



© Copyright by Amir Hadji-Abdolhamid, 2004


Amir Hadji-Abdolhamid Department of Electrical and Computer Engineering University of Toronto Degree of Doctor of Philosophy, 2004

## ABSTRACT

High-speed, high-resolution analog-to-digital converters (ADCs) are one of the major bottlenecks in digital communication systems. Every extra bit of resolution in a high-speed flash ADC roughly doubles the silicon area and power consumption of the chip and further complicates the ADC design.

This thesis investigates the ADC requirements for wired communication applications and presents an efficient partial analog equalization (PAE) approach to reduce the front-end ADC resolution requirement. The thesis makes two major contributions. First, an analytical study quantifies the benefit of partial equalization in terms of ADC bit requirements. Second, a high-speed PAE/ADC combination is implemented on a single 1.8-V CMOS chip, and an improvement of 2-3 bits is verified experimentally. Moreover, the optimization of the PAE coefficients and the similarity of a 2-tap PAE to an analog first-order decorrelator are investigated. The analytical discussion covers the benefit of PAE in baseband systems with both feedforward and decision feedback equalizers. Similar benefits of PAE in a passband modulation system are discussed in an appendix as a direction for future research.

The target application for this thesis is 622-Mb/s serial digital video data transmission over a 300-m coaxial cable. The proposed PAE, along with a 6-bit 400-MHz flash ADC, was designed and fabricated in a 0.18-µm CMOS process. The fabricated chip consumes 106 mW of power and achieves 34-dB SNDR at a 250-MHz sampling clock. For 400-Mb/s data transmission over a 240-m coaxial channel, experimental results showed an error-performance improvement equivalent to that of an 8-bit-ADC system.

#### **Table of Contents**

| Section | Title |
|---------|-------|
| Chapter 1 | Introduction |
| 1.1 | Motivation and Introduction |
| 1.2 | Thesis Outline |
| Chapter 2 | Background |
| 2.1 | Introduction |
| 2.2 | Wired Data Communication Overview |
| 2.2.1 | Twisted Pairs |
| 2.2.2 | Coaxial Cables |
| 2.3 | Digital Communication Systems |
| 2.4 | Equalization |
| 2.5 | Adaptive Filtering Overview |
| 2.5.1 | Adaptation of Cascaded FIR Filters |
| 2.6 | Coaxial Cable Modeling |
| 2.7 | Summary |
| Chapter 3 | ADC Requirements and Partial Equalization |
| 3.1 | Introduction |
| 3.1.1 | Analog to Digital Conversion in Digital Communication Receivers |
| 3.2 | Quantization Noise |
| 3.3 | ADC Input Characterization |
| 3.4 | ADC Resolution Requirement |
| 3.4.1 | Example and Comparison to Simulation Results |
| 3.5 | ADC Bit Requirement Reduction Techniques and Partial Analog Equalization |
| 3.6 | Partial Equalizer Design |
| 3.6.1 | Optimizing the PAE Independently for Maximal Equalization |
| 3.6.2 | Splitting an Optimum Equalizer into Analog and Digital Filters |
| 3.6.3 | Global Optimization Using Genetic and/or Gradient Search |
| 3.6.4 | Simulation Results and Comparisons |
| 3.7 | Two-tap PAE: An Efficient Choice |
| 3.7.1 | Using Decorrelation Concept for a 2-tap PAE Design |
| 3.7.2 | Using the PAE Inverse in the Digital Domain |
| 3.7.3 | Comparison with Predictive Coding Systems (ADPCM) |
| 3.7.4 | PAE Usage in FFE and FFE/DFE Architectures and Comparisons |
| 3.8 | Simulation Results |
| 3.8.1 | Quantization Noise Reduction Demonstration |
| 3.8.2 | Using 2-tap PAE in Coaxial Channel Application |
| 3.9 | Summary |
| Chapter 4 | Circuit Implementation |
| 4.1 | Introduction |
| 4.2 | Partial Analog Equalizer Design |
| 4.2.1 | Delay Line Generation Techniques |
| 4.2.2 | PAE Topology Choices |
| 4.2.3 | Sample-and-Hold Design |
| 4.2.4 | Non-overlapping Triple Phase Clock Generation and Digital Controls |
| 4.2.5 | The Design of Transconductors |
| 4.2.6 | Current Amplifier and I/V Converter and Voltage Buffer |
| 4.2.7 | Overall Performance and Simulation Results |
| 4.3 | ADC Flash Architecture |
| 4.3.1 | Comparator Design |
| 4.3.2 | Autozeroing Techniques |
| 4.3.3 | ADC Reference Voltages |
| 4.3.4 | Digital Back-end |
| 4.3.5 | Layout |
| 4.4 | Bias Circuit |
| 4.5 | Summary |
| Chapter 5 | Experimental Results |
| 5.1 | Introduction |
| 5.2 | Test Set-up and Functional Testing |
| 5.3 | Dynamic Test and SNDR Measurement |
| 5.4 | Code Density Measurement and INL/DNL |
| 5.5 | Experiment in a 400-Mb/s 240-m Coaxial Cable System |
| 5.5.1 | Test Set-up |
| 5.5.2 | Special Considerations |
| 5.6 | Channel Emulator Using Arbitrary Waveform Generator |
| 5.7 | Summary |
| Chapter 6 | Summary and Future Research |
| 6.1 | Summary and Conclusions |
| 6.2 | Suggestions for Future Work |
| Appendix A | Transconductor Circuit Analysis |
| A.1 | Transconductor Feedback Loop Analysis and Lead Compensation |
| A.1.1 | No Compensation |
| A.1.2 | Lead Compensation |
| A.2 | Transconductance Gain Expression and the Effect of the Feedback Loop |
| Appendix B | Global Gradient Search for PAE Optimization |
| Appendix C | Reduction of ADC Requirements in CAP/QAM Receivers |
| C.1 | Introduction |
| C.2 | CAP/QAM Modulation |
| C.3 | Conventional CAP Receiver |
| C.3.1 | Front-end ADC Requirements |
| C.4 | The Proposed Architecture |
| C.5 | Simulation Results |
| C.6 | Summary |
|  | References |

# List of Abbreviations

| Abbreviation | Meaning                               |
|-------|----------------------------------------------|
| ADC   | Analog-to-digital converter                  |
| BER   | Bit error rate                               |
| CAP   | Carrierless amplitude and phase (modulation) |
| CLK   | Clock                                        |
| DFE   | Decision feedback equalizer                  |
| FFE   | Feedforward equalizer                        |
| FFT   | Fast Fourier transform                       |
| FIR   | Finite impulse response (filter)             |
| IIR   | Infinite impulse response                    |
| ISI   | Intersymbol interference                     |
| KCL   | Kirchhoff's current law                      |
| KVL   | Kirchhoff's voltage law                      |
| MSE   | Mean square error                            |
| MUX   | Multiplexer                                  |
| NRZ   | Non-return to zero                           |
| PAE   | Partial analog equalizer                     |
| PAM   | Pulse amplitude modulation                   |
| PSD   | Power spectral density                       |
| QAM   | Quadrature amplitude modulation              |
| RMS   | Root mean square                             |
| RRMSE | Relative root mean square error              |
| S/H   | Sample and hold                              |
| SER   | Symbol error rate                            |
| SNDR  | Signal-to-noise and distortion ratio         |
| SNR   | Signal-to-noise ratio                        |
| VGA   | Variable gain amplifier                      |



# Introduction

## **1.1 Motivation and Introduction**

Data communication plays a major role in the modern world, from industry to everyday life. The growth of this role in recent years has created an increasing demand for higher data rates over longer paths at the lowest possible cost across different applications. These communication paths vary from large distances, such as those in wide or local area networks (WAN/LAN), to just a few inches, as in high-speed links between two chips on a printed circuit board.

The data transfer rates achievable over wireless and wired channels are limited by the imperfections of the channels (such as cable attenuation, dispersion in fiber optics, crosstalk and fading) and by the practical limitations of transceiver building blocks (such as linearity, noise and clock jitter). The achievable signal-to-noise ratio within the available bandwidth determines the theoretical upper limit of the data rate according to Shannon's theorem [1].
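As a rough illustration of this upper limit, the sketch below evaluates the Shannon capacity formula for an additive white Gaussian noise channel with a flat signal-to-noise ratio. The function name and the numerical values are assumptions chosen for illustration only; they are not measurements from this thesis.

```python
import math

def shannon_capacity(bandwidth_hz, snr_db):
    """Upper bound on the error-free data rate (bit/s) of an AWGN channel."""
    snr_linear = 10 ** (snr_db / 10)          # convert dB to a power ratio
    return bandwidth_hz * math.log2(1 + snr_linear)

# Hypothetical numbers, chosen only to show how the bound is evaluated.
capacity = shannon_capacity(bandwidth_hz=200e6, snr_db=30)
print(f"Capacity bound: {capacity / 1e6:.0f} Mb/s")
```

A real cable has a frequency-dependent loss, so a careful bound would integrate the capacity over frequency rather than use a single SNR value.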

The means of achieving higher data rates over band-limited channels include multi-level signalling and digital processing, including digital equalization, with the aid of an appropriate front-end analog-to-digital converter (ADC). The design of high-speed, high-resolution ADCs is a major challenge in receiver design. Every extra bit of resolution in a high-speed flash ADC roughly doubles the silicon area and power consumption of the chip and further complicates the design at high speed with regard to the effective number of bits [2][3][4].
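The doubling follows from the structure of a full-flash converter, which needs one comparator per decision threshold. A minimal sketch, in which the function name is hypothetical and only the comparator count is modelled (not area or power directly):

```python
def flash_comparator_count(bits):
    """Comparators in a full-flash ADC: one per decision threshold."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    return 2 ** bits - 1

# Each extra bit roughly doubles the number of comparators.
for bits in (6, 7, 8):
    print(bits, "bits ->", flash_comparator_count(bits), "comparators")
```

Going from 6 to 8 bits therefore takes the comparator bank from 63 to 255 elements, which is the cost that partial analog equalization aims to avoid.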


Communication channels with large attenuation cause large inter-symbol interference (ISI). Equalizers minimize the ISI error by compensating for the channel attenuation. With digital equalization, as ISI increases, more resolution is needed in the front-end ADC that precedes the equalizer. Using fully analog equalizers to eliminate the need for an accurate ADC and digital equalization is an approach used in many wired communication applications [5][6][7][8][9]. However, there are many practical difficulties in the design of high-speed, high-order adaptive analog filters, such as dc offset [10], mismatch effects [11][12] and nonlinear distortion. These difficulties are aggravated in short-channel CMOS technologies, at higher speeds and with longer cable channels. Alternatively, adaptive digital receivers with a front-end ADC offer many advantages, such as reliability and flexibility [7]. For example, digital decision-feedback equalizers (DFE) with no channel noise enhancement, digital pulse shaping and digital clock recovery are feasible in a system with an ADC. Moreover, shrinking CMOS technology mainly benefits digital circuits. Therefore, low-order partial analog equalizers, rather than fully analog ones, are excellent candidates for relaxing the ADC and digital circuit requirements.

Few partial analog equalizers (PAEs) have been reported [13][14][15]. However, the trade-off between the order of the partial preprocessing and its actual effect on ADC performance has not been elaborated. In this thesis, we discuss the effect of the number of ADC bits, or quantization noise, in a digital communication system, and introduce analog preprocessing as a means to reduce the effect of quantization noise. In low-noise channels such as coaxial cables, and in high-speed systems where low-resolution ADCs are used, the quantization error is of greater importance. By studying the ADC requirements and the effect of partial equalization, an efficient two-tap partial analog equalizer is shown to significantly relax the ADC requirements. This is particularly true for communication systems with channel losses greater than about 20 dB at half the baud rate.

The two-tap partial analog equalization approach can be viewed as a first-order decorrelator which, by reducing the correlation between the ADC input samples, reduces the crest factor of the ADC input [16]. Furthermore, the quantization noise in the reconstructed signal after the ADC acquires a lowpass shape. This lowpass-shaped noise is not enhanced much by the existing equalizer after the ADC. To consider all practical aspects of this efficient PAE approach, a 400-MHz 6-bit ADC with a flash architecture, along with an adaptive 2-tap PAE, is implemented in 0.18-µm CMOS technology [17]. Experimental results demonstrate that the combination of partial analog equalization with a 6-bit ADC achieves better performance than an 8-bit ADC system for our target application. The extra power and area consumed by the PAE are 12% and 17%, respectively. A variety of circuit design issues that arise in the design of the proposed analog preprocessor and the high-speed analog-to-digital converter are also addressed in this thesis.
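The decorrelation view can be sketched numerically. The example below is a minimal illustration, not the thesis design: it assumes a hypothetical one-pole lowpass "channel" driven by random 4-level symbols, picks the single tap as the normalized lag-1 autocorrelation of the received samples (a standard first-order linear predictor, assumed here), and compares the crest factor before and after the two-tap filter. All function names and parameter values are illustrative.

```python
import random

def lag1_coefficient(x):
    """Normalized lag-1 autocorrelation r[1]/r[0] of a zero-mean sequence."""
    r0 = sum(v * v for v in x)
    r1 = sum(x[k] * x[k - 1] for k in range(1, len(x)))
    return r1 / r0

def two_tap_filter(x, a):
    """y[k] = x[k] - a*x[k-1], taking the sample before x[0] as zero."""
    return [x[0]] + [x[k] - a * x[k - 1] for k in range(1, len(x))]

def crest_factor(x):
    """Peak magnitude divided by RMS value."""
    rms = (sum(v * v for v in x) / len(x)) ** 0.5
    return max(abs(v) for v in x) / rms

def toy_channel_output(n, pole=0.9, seed=1):
    """4-PAM symbols through a hypothetical one-pole lowpass channel."""
    rng = random.Random(seed)
    samples, state = [], 0.0
    for _ in range(n):
        state = pole * state + rng.choice((-3, -1, 1, 3))
        samples.append(state)
    return samples

x = toy_channel_output(5000)
a = lag1_coefficient(x)        # tap chosen from the measured correlation
y = two_tap_filter(x, a)
print(f"tap a = {a:.3f}")
print(f"crest factor before: {crest_factor(x):.2f}, after: {crest_factor(y):.2f}")
print(f"lag-1 correlation after: {lag1_coefficient(y):.3f}")
```

Because the crest factor is scale-invariant, a lower value after the filter means that, once the signal is amplified to fill the ADC input range, more of that range is spent on the signal's RMS content rather than on rare peaks.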

The target application in this thesis is 622-Mb/s data transmission over a 300-m coaxial cable with a 4-level non-return-to-zero pulse amplitude modulation (NRZ PAM) scheme. Coaxial cables are a common physical medium in wired communications, used in digital video transmission and as the backbone of many networking applications. A 300-m coaxial cable channel causes a considerable amount of ISI and has a magnitude loss of about 32 dB at 155.5 MHz [18]. The 622-Mb/s rate is compatible with the STS-12/OC-12 standard, originally developed for fiber optic channels. Although the main application addressed here is serial digital video over coaxial cables, the concepts and designs in this thesis can be extended to other wired and serial digital applications with large channel loss and ISI, including read channels, high-speed links on PCBs and flat panel cables.
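The 155.5-MHz figure is consistent with half the symbol rate of the target system, which can be checked with a short calculation. The helper name below is hypothetical, and reading the quoted frequency as half the baud rate is an interpretation rather than a statement from the cited reference.

```python
import math

def symbol_rate(bit_rate, levels):
    """Symbols per second for a PAM scheme with the given number of levels."""
    bits_per_symbol = math.log2(levels)
    return bit_rate / bits_per_symbol

baud = symbol_rate(622e6, 4)   # 4-level PAM carries 2 bits per symbol
half_baud = baud / 2           # frequency at which the cable loss is quoted
print(f"Symbol rate: {baud / 1e6} Mbaud")
print(f"Half the symbol rate: {half_baud / 1e6} MHz")
```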

## **1.2 Thesis Outline**

In Chapter 2, the applications and theoretical background for this work are reviewed. This includes an overview of wired communications, coaxial cable channels, and serial digital video applications and standards. The chapter also briefly covers the block diagram of a digital communication system, a comparison of the NRZ line code with others, the various kinds of adaptive equalization, adaptive filtering, and coaxial cable modeling.

Chapter 3 starts with a review of the effect of ADC resolution on the performance of a baseband communication receiver. This is done primarily by studying quantization noise and characterizing the ADC input signal. Next, a formula is developed that determines the required ADC resolution in a given communication system with a desired error rate. Using these results, techniques for reducing the ADC bit requirement are discussed. System simulations in this part suggest that partial analog equalization is an efficient approach. Treating the partial equalizer as the result of splitting a digital equalizer into separate analog and digital filters, several ways of optimizing this split to reduce the ADC requirement for a given error rate are discussed and compared. System-level simulation results show that a 2-tap partial analog equalizer (PAE) is an efficient choice, since higher-order PAEs give only a slight further reduction in the ADC requirement. The 2-tap PAE technique is compared with a first-order feedforward decorrelator, and it is shown that the correlation of the ADC input samples can be used to determine the single zero of the 2-tap PAE. A performance comparison of a 2-tap PAE in FFE and FFE/DFE systems is presented. System-level simulation results conclude Chapter 3.

Chapter 4 discusses the issues involved in the circuit implementation of the proposed front-end for our target application. This front-end consists of a 2-tap PAE followed by a 6-bit 400-MHz ADC. First, several candidate topologies for the PAE design are compared from a circuit design point of view. The chosen topology uses two sets of triple-interleaved sample-and-holds (S/H) followed by two variable-gain transconductors, a current-to-voltage converter, and a voltage buffer for driving the ADC input. For each block, the background circuit issues are reviewed and the design decisions and contributions are justified. In addition to the contributions in the main architecture and circuit blocks, there are design-technique contributions in the peripheral circuit blocks. These include a non-overlapping triple-phase clock generator, the use of a master clock for the interleaved S/H, and adaptive lead compensation in the transconductors. Theoretical details of the latter are discussed in Appendix A.

A flash architecture was chosen for the 6-bit 400-MHz ADC design. The ADC specifications were chosen to be slightly higher than what was needed for the target application. A brief background on different ADC architectures, and a comparison between them, is given at the beginning of the ADC section in Chapter 3. Comparator design, autozeroing techniques and offset cancellation, the digital back-end and control circuits are the main circuit issues discussed in the second half of Chapter 4. The relevant simulation results are presented in each part. Finally, layout issues, many of which are particularly critical for the ADC implementation, conclude this chapter.

Chapter 5 presents the experimental results for the fabricated chip. These results include functional tests, signal-to-noise-and-distortion ratio analysis, integral and differential nonlinearity characteristics, and a performance evaluation of the PAE/ADC front-end in an experimental 240-m coaxial cable test setup. This communication-system test setup included an off-chip adaptive digital equalizer. For other communication channel specifications, another experiment with an arbitrary function generator was performed. At the end of this chapter, the characteristics of the chip are summarized and the chip is compared to other state-of-the-art ADCs. According to the experimental results, the combined PAE/ADC consumes 106 mW of power with 34-dB SNDR at a 250-MHz sampling frequency. Using the proposed PAE, the ADC performance was shown to be improved by 2-3 bits at a cost of 12% extra area and 17% extra power.

Chapter 6 presents concluding remarks along with suggestions for continuation of this work and future research.

Appendix A presents the theoretical analysis of the transconductor circuits in the PAE with regard to their feedback loop and adaptive compensation.

Appendix B presents the global gradient search algorithm for PAE coefficient optimization in detail. This optimization considers the ADC resolution requirement, ISI and channel noise reduction simultaneously.

Appendix C presents preliminary simulation results for one of the future directions mentioned in Chapter 6: the ADC resolution requirement for a carrierless amplitude and phase modulation/quadrature amplitude modulation (CAP/QAM) communication system, as opposed to a baseband system. First, the CAP/QAM system is briefly introduced. Then a new architecture using a partial analog equalization and demodulation technique is presented. According to simulation results, for an application of 1.2 Gb/s over a 200-m coaxial cable, the proposed topology with two 300-MHz 6-bit ADCs, along with two 3- to 5-tap analog FIR partial equalizer filters, gives the same error performance as a conventional system with a 900-MHz 8-bit front-end ADC. The implementation and circuit design issues of the proposed architecture are a suggested topic for future research.



# **Background**

## 2.1 Introduction

This chapter reviews the applications and theoretical background on which the discussions in the following chapters build.

## 2.2 Wired Data Communication Overview

Wired channels are a major physical medium in the telecommunications industry. In wired communications, there are trade-offs between the speed, length, and price and/or quality of the cables used. Coaxial cables and twisted pairs, shown in Fig. 2.1, are the most commonly used copper wired channels for relatively short distances. Optical fiber cables are used for longer distances, up to tens of kilometres.



Figure 2.1: (a) Coaxial cable. (b) Unshielded twisted pair (UTP).

### 2.2.1 Twisted Pairs

Unshielded twisted pair (UTP) cables are widely used in local area network applications. UTP cables are categorized by quality, from CAT1 to CAT5/5e and the recently introduced CAT6. For example, 1000Base-T (IEEE 802.3) uses four pairs of CAT5 UTP with 250 Mb/s on each pair. The maximum UTP cable length for Ethernet applications is 100 m [19].

Transmitting data over existing telephone lines on top of the plain old telephone service (POTS), at frequencies above 20 kHz, is known as xDSL (digital subscriber line) [20]. In xDSL standards, upstream and downstream data rates can be symmetric or asymmetric. Different versions of the xDSL standard exist; their speeds and maximum cable lengths are summarized in Table 2.1. VDSL is the highest-rate DSL technology, running at speeds of up to 52 Mbps over 1 kft (300 m) of twisted pair in a frequency band of 200 kHz to 30 MHz.

Table 2.1: xDSL standard types

| DSL Type       | Symmetric / Asymmetric | Loop Range (kft) | Downstream (Mbps) | Upstream (Mbps) |
|----------------|------------------------|------------------|-------------------|-----------------|
| IDSL           | symmetric              | 18               | 0.128             | 0.128           |
| SDSL           | symmetric              | 10               | 1.544             | 1.544           |
| HDSL (2 pairs) | symmetric              | 12               | 1.544             | 1.544           |
| ADSL G.lite    | asymmetric             | 18               | 1.5               | 0.256           |
| ADSL           | asymmetric             | 12               | 6                 | 0.640           |
| VDSL           | asymmetric             | 3 / 1            | 26 / 52           | 3 / 6           |
| VDSL           | symmetric              | 3 / 1            | 13 / 26           | 13 / 26         |

### 2.2.2 Coaxial Cables

Coaxial cables consist of one solid wire in the center of a mesh jacket, as shown in Fig. 2.1. Different types of coaxial cables, known as RGn<sup>1</sup>, exist and are categorized based on their impedance and quality. For example, 50-$\Omega$ RG8 cable is used in backbone networking at a rate of 10 Mb/s for up to 500 m. Table 2.2 shows a summary of commonly used coaxial cable types, with some example applications.

<sup>1.</sup> RG comes from the word "Radio Group" referring to the traditional usage of coaxial cables in military applications [22].

Table 2.2: Coaxial cable types and example applications

| Application / Standard                       | Coaxial Cable Type | Impedance |
|----------------------------------------------|--------------------|-----------|
| Thicknet Ethernet (e.g. 500-m cable 10Base5) | RG8 or RG11        | 50 Ohm    |
| Thinnet Ethernet (e.g. 185-m 10Base2)        | RG58               | 50 Ohm    |
| Digital Video, CATV                          | RG59               | 75 Ohm    |

#### Serial Digital Video Applications

One of the major applications of coaxial cables is serial digital video transmission. For example, in studios, video data from video cameras is transferred to editing equipment in serial digital format. Generally, the different components of a video signal are digitized with about 10 bits of resolution, then combined and multiplexed into one serial digital data stream.

Table 2.3 shows the commonly used digital video standards for transmission over coaxial cables, as defined by the Society of Motion Picture and Television Engineers (SMPTE). The maximum length of Belden 8281 cable, a 75-$\Omega$ RG-59U type, for each standard is also shown in Table 2.3 [21]. These maximum lengths are based on the maximum allowed loss at half the clock frequency suggested by the standard.

Table 2.3: Different SMPTE standards

| Standard                       | SMPTE 259M     | SMPTE 259M      | SMPTE 259M            | SMPTE 344M            | SMPTE 292M |
|--------------------------------|----------------|-----------------|-----------------------|-----------------------|------------|
| Data rate                      | 143 Mb/s       | 270 Mb/s        | 360 Mb/s              | 540 Mb/s              | 1.5 Gb/s   |
| Application                    | Composite NTSC | Component video | Component wide screen | Component wide screen | HDTV       |
| Maximum Belden 8281 length (m) | 436            | 305             | 262                   | 213                   | 79         |

## 2.3 Digital Communication Systems

Fig. 2.2 shows a digital communication system block diagram consisting of a transmitter and a digital receiver. In this system, bit streams are encoded into a sequence of symbols $A_k$ and transmitted to the channel through a transmit filter $g(t)$. The modulation of the symbols $A_k$ by the pulse shape $g(t)$ is known as pulse amplitude modulation (PAM). If the pulse $g(t)$ is a baseband function, the modulation is called baseband PAM. Phase shift keying (PSK) and quadrature amplitude modulation (QAM) are other popular modulation schemes [7].



Figure 2.2: Block diagram of a digital communications system.

Bandwidth efficiency can be improved by using multilevel techniques and spectral shaping of the transmit filter [23]. Moreover, multilevel signalling reduces the clock speed of the system, which is an important practical point in a receiver front-end design for a specific data rate. However, this benefit comes at the cost of extra complexity and higher signal-to-noise requirements. In multilevel signalling, every n-bit block is mapped onto one symbol with a level among $2^n$ levels. 4-level PAM is a popular choice because, by adding two extra levels to binary signalling, the symbol rate and bandwidth are halved.
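The mapping of n-bit blocks onto $2^n$ levels can be illustrated with a minimal sketch. The natural binary ordering of the levels and the function names below are assumptions made for illustration only; the text does not specify the bit-to-level mapping (a Gray mapping is equally plausible).

```python
def pam_levels(n_bits):
    """Return the 2**n_bits equally spaced levels -(M-1), ..., -1, +1, ..., +(M-1)."""
    m = 2 ** n_bits
    return [2 * i - (m - 1) for i in range(m)]


def bits_to_pam(bits, n_bits=2):
    """Map each n-bit block (MSB first) onto one PAM level.

    Assumption: natural binary indexing of the levels (hypothetical choice).
    """
    if len(bits) % n_bits:
        raise ValueError("bit stream length must be a multiple of n_bits")
    levels = pam_levels(n_bits)
    symbols = []
    for start in range(0, len(bits), n_bits):
        index = 0
        for b in bits[start:start + n_bits]:
            index = (index << 1) | b
        symbols.append(levels[index])
    return symbols


def symbol_rate(bit_rate, n_bits):
    """Symbol rate for a given bit rate when each symbol carries n_bits bits."""
    return bit_rate / n_bits


print(bits_to_pam([0, 0, 0, 1, 1, 0, 1, 1]))  # [-3, -1, 1, 3]
print(symbol_rate(622e6, 2))                   # 311000000.0
```

With two bits per symbol, a 622-Mb/s stream is carried at 311 Msymbol/s, which is the halving of the symbol rate referred to above.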

The shape of the transmit filter is also known as a line code. Some basic line codes are compared in Fig. 2.3. Bandwidth efficiency, zero dc component, implementation complexity and having enough transitions for analog clock recovery are all important concerns in choosing appropriate line codes. Bandwidth efficiency is a measure of the bandwidth required for transmitting a certain number of symbols per second, and its theoretical maximum is 2 (Symbol/sec)/Hz. For example, in Fig. 2.3, if we consider the first frequency-domain notch as the required bandwidth, it is seen that the Biphase line code has the lowest bandwidth efficiency, i.e. 0.5 (Symbol/sec)/Hz. However, its advantage is that it guarantees one transition per cycle and has no dc component. The Nyquist pulse gives the maximum bandwidth efficiency, but it requires a high-order pulse shaping filter or an accurate D/A in the transmitter, which is costly at high speeds.

Figure 2.3: A comparison of basic line codes.

The NRZ (non-return-to-zero) pulse is commonly used because of its simplicity and reasonable bandwidth efficiency. Practically, in the presence of an adaptive digital equalizer, a lowpass filter after the NRZ shaping filter can control the bandwidth usage in the channel. If bandwidth is not a major issue in an application, this lowpass filter or part of it can be placed after the channel at the input of the receiver in order to limit the out-of-band noise. Excessive limitation of the bandwidth increases sensitivity to the front-end sampler jitter. Therefore, a reasonable choice for the lowpass filter bandwidth is $0.6F_s$, where $F_s$ is the symbol rate. A bandwidth of $0.6F_s$ has 20% excess bandwidth compared to the minimum required Nyquist bandwidth, which is $0.5F_s$. In the case of low-noise channels, a larger excess bandwidth is preferred. Another issue with an NRZ line code is that in ac-coupled systems its dc component is removed, and thus a low-frequency drift occurs in the signal. This effect, which is called baseline wander, is usually corrected by extra circuitry. The digital baseline wander correction used in this thesis is discussed in the experimental results chapter.

## 2.4 Equalization

The convolution of the symbol sequence $A_k$ with the equivalent channel impulse response, composed of the transmit, channel and receive filters, produces inter-symbol interference (ISI). The equivalent discrete-time channel impulse response for a 300-m coaxial cable sampled at the symbol rate is shown in Fig. 2.4. The greater the cable attenuation, the longer the channel impulse response and the more ISI components exist. The part of the channel impulse response that affects the following symbols is called post-cursor ISI and the part that affects previous symbols is called pre-cursor ISI, as shown in Fig. 2.4. In reality, the channel impulse response is causal. However, if the symbols are assumed to be delayed by the time location of the maximum of the channel impulse response, the above definition is sensible.
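As a small illustration of this definition, the sketch below splits a symbol-rate sampled impulse response around its largest sample. The sample values are hypothetical and are not the cable response of Fig. 2.4.

```python
def split_isi(c):
    """Split a sampled impulse response into (pre-cursor, main cursor, post-cursor).

    The main cursor is taken as the sample with the largest magnitude, which
    corresponds to delaying the symbols by the location of the response maximum.
    """
    main_index = max(range(len(c)), key=lambda k: abs(c[k]))
    return c[:main_index], c[main_index], c[main_index + 1:]


# Hypothetical symbol-rate samples (illustrative values only).
c = [0.05, 0.2, 1.0, 0.6, 0.3, 0.1]
pre, main, post = split_isi(c)
print("pre-cursor ISI :", pre)
print("main cursor    :", main)
print("post-cursor ISI:", post)
```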

Fig. 2.5 shows the ISI effect of the above channel on a 4-level PAM signal. The signal samples at the output of the channel are highly correlated and have a non-uniform amplitude distribution, as opposed to the transmitted symbols. Fig. 2.5 (c, d) shows the samples of the input and output of the equivalent channel at the symbol rate. As seen, the four levels of the transmitted symbols are distorted by ISI. An equalizer can recover the original levels of the data symbols. A linear zero-forcing equalizer is essentially a filter equal to the inverse of the channel transfer function. Therefore, attenuation by the channel leads to signal enhancement by the equalizer, particularly at high frequencies. This enhancement is harmful in the presence of additive noise such as channel noise and ADC quantization noise.

Figure 2.4: (a) Channel impulse response. (b) Discrete time model for a channel.

Figure 2.5: ISI effect on a 4-level PAM signal.

Since the channel characteristic can change over time, the equalizer filters are usually made adaptive. Practically, equalizers are adapted such that the residual error before the slicer is minimized, rather than forcing them to have the inverse response of the channel. In this way, the amount of noise enhancement can be optimized as well.

Fig. 2.6 shows several equalization architectures. Parts (a) and (b) show feedforward (FFE) and decision feedback (DFE) digital equalizer types. In a DFE, since the feedback filter utilizes the slicer output, it does not enhance additive noise. However, if an error occurs at the slicer output, the error propagates for a long time. Moreover, the feedback filter in a DFE can only cancel the post-cursor ISI.

Figure 2.6: Different equalization architectures.

Another categorization of digital equalizers is based on their sampling rate. Baud-rate digital equalizers run at the symbol rate and fractionally spaced digital equalizers run at higher rates. An important advantage of fractionally spaced equalizers is that they can perform as a combination of a matched filter and an equalizer, which is favorable over noisy channels. However, at high baud rates, baud-rate equalizers may be preferred because of the difficulty of building ADCs with large sampling rates and the excessive order and power consumption of the digital equalizers, as is the case in this work.

Part (c) shows an adaptive full-analog equalizer. In this case, the symbol data levels are recovered after the analog equalizer. Thus, a sampler and a slicer can detect the transmitted symbols. Building a high-speed adaptive analog equalizer with a large order in a CMOS process is not an easy task. Moreover, it does not have the flexibility and robustness of digital equalizers and it cannot benefit from the lower noise enhancement of architectures such as the DFE. However, depending on the application, when it is possible, it offers advantages such as eliminating the need for an ADC and lower power consumption. Several bipolar and BiCMOS analog equalizers have been reported for video applications [24][25].

Part (d) shows a partial analog equalizer along with an ADC and digital equalizers. This is the approach presented in this thesis, which will be discussed in Chapter 3. In summary, this approach suggests a simpler analog pre-equalization that relaxes the complexity requirements of the ADC and the digital equalizers.

## 2.5 Adaptive Filtering Overview

An adaptive filter is a self-adjusting time-varying filter where the tuning is achieved by minimizing the power of an error signal, as shown in Fig. 2.7. The error signal is the difference between the filter output and the desired output. The desired output can be obtained initially from a training sequence. In equalization applications, the difference between the input and output of the slicer can be used as the error signal. At the beginning of adaptation, however, the slicer output is often incorrect, so the adaptation takes longer to converge. Using a training sequence at the beginning and then replacing the desired signal with the slicer output is another choice, and is the one used in this thesis.

Figure 2.7: An adaptive filter block diagram.

The adaptation of the filters is performed by adjusting their coefficients or their gain and pole-zero locations. Gradient search with the method of steepest descent is a commonly used method in adaptive filtering [26]. In this method, the coefficients are adjusted at each step in the direction opposite to the gradient of the mean square error performance surface, as shown below:

$$h_i(n+1) = h_i(n) - \mu \left( \frac{\partial E[e^2(n)]}{\partial h_i} \right), \qquad (2.1)$$

where  $\mu$  is the step size parameter controlling the rate of convergence,  $h_i$ s are the coefficients of the filter to be adapted and  $E[e^2(n)]$  is the mean-squared error signal (MSE). Due to difficulties in obtaining a partial derivative of the MSE signal, the instantaneous squared error is used to approximate MSE. Therefore, (2.1) can be rewritten as

$$h_i(n+1) = h_i(n) - 2\mu e(n) \left(\frac{\partial e(n)}{\partial h_i}\right). \qquad (2.2)$$

In case of finite impulse response (FIR) or transversal filters, we have

$$e(n) = d(n) - \sum_{i} h_{i} x(n-i), \qquad (2.3)$$

where d(n) is the desired symbol sequence. The gradient term in (2.2) then simplifies to -x(n-i), and we can write:

$$h_i(n+1) = h_i(n) + 2\mu e(n) x(n-i). \qquad (2.4)$$

### 2.5.1 Adaptation of Cascaded FIR Filters

The above adaptation algorithm is called the least mean square (LMS) algorithm, which is used in this thesis to adapt the FIR equalizer filters. Note that the gradient signals x(n-i) are delayed versions of the filter input and are therefore inherently available in an FIR filter structure.
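The update in (2.4) can be sketched in a few lines. The two-tap channel, the step size, the number of taps and the use of the transmitted symbols as a training sequence are all hypothetical choices made for illustration; they are not the settings used in this thesis.

```python
import random


def lms_equalizer(x, d, num_taps, mu):
    """Adapt FIR coefficients h with the LMS update (2.4).

    x: filter input samples, d: desired (training) symbols.
    Returns the final coefficients and the squared-error history.
    """
    h = [0.0] * num_taps
    delay_line = [0.0] * num_taps            # x(n), x(n-1), ..., x(n-num_taps+1)
    sq_err = []
    for x_n, d_n in zip(x, d):
        delay_line = [x_n] + delay_line[:-1]
        y_n = sum(h_i * x_i for h_i, x_i in zip(h, delay_line))
        e_n = d_n - y_n                      # error as defined in (2.3)
        # h_i(n+1) = h_i(n) + 2*mu*e(n)*x(n-i); the gradient signals are the
        # delay-line contents, so no extra filtering is needed.
        h = [h_i + 2 * mu * e_n * x_i for h_i, x_i in zip(h, delay_line)]
        sq_err.append(e_n * e_n)
    return h, sq_err


random.seed(0)
levels = [-3, -1, 1, 3]                      # 4-level PAM symbols
a = [random.choice(levels) for _ in range(5000)]
channel = [1.0, 0.4]                         # hypothetical channel with one post-cursor tap
x = [sum(c_k * a[n - k] for k, c_k in enumerate(channel) if n - k >= 0)
     for n in range(len(a))]

h, sq_err = lms_equalizer(x, a, num_taps=8, mu=0.002)
print("first taps:", [round(v, 3) for v in h[:3]])
print("mean squared error, last 500 samples:", sum(sq_err[-500:]) / 500)
```

For this noiseless toy channel the coefficients approach the truncated inverse of the channel, and the squared error decays as the adaptation proceeds.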

In the case of cascaded FIR filters, such as H(z) and G(z) in Fig. 2.8, the LMS adaptation of the second filter is straightforward. For the first filter, the gradient signals have to be obtained separately. The output error of the two cascaded filters in Fig. 2.8 can be written as

$$e(n) = d(n) - \sum_{i} \sum_{j} g_{i} h_{j} x(n-i-j). \qquad (2.5)$$

Therefore, the error gradient with respect to the first filter coefficients $h_j$ can be written as

$$\frac{\partial e(n)}{\partial h_j} = -\sum_i g_i x(n-i-j) = -r(n-j), \qquad (2.6)$$

where r(n) is the input signal filtered by G(z). Fig. 2.8 demonstrates the block diagram of LMS algorithm implementation for two cascaded filters.



Figure 2.8: LMS adaptation for two cascaded filters.
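The following sketch illustrates the idea of Fig. 2.8 for the first filter only: the input is passed through a copy of G(z) to produce the gradient signal r(n), and the coefficients of H(z) are updated with r(n-j) in place of x(n-j). Keeping G(z) fixed, the channel and filter values, and the step size are simplifying assumptions for illustration; they are not taken from the thesis.

```python
import random


def fir_step(coeffs, delay_line, sample):
    """Push one sample into an FIR delay line; return (output, new delay line)."""
    delay_line = [sample] + delay_line[:-1]
    return sum(c * v for c, v in zip(coeffs, delay_line)), delay_line


def adapt_first_filter(x, d, g, num_taps, mu):
    """LMS adaptation of H(z) when it is followed by a fixed G(z)."""
    h = [0.0] * num_taps
    x_line = [0.0] * num_taps        # state of H(z)
    u_line = [0.0] * len(g)          # state of G(z) in the signal path
    xg_line = [0.0] * len(g)         # state of the copy of G(z) producing r(n)
    r_line = [0.0] * num_taps        # r(n), r(n-1), ..., the gradient signals
    sq_err = []
    for x_n, d_n in zip(x, d):
        u_n, x_line = fir_step(h, x_line, x_n)      # output of the first filter
        y_n, u_line = fir_step(g, u_line, u_n)      # output of the cascade
        r_n, xg_line = fir_step(g, xg_line, x_n)    # input filtered by G(z)
        r_line = [r_n] + r_line[:-1]
        e_n = d_n - y_n
        # Gradient of e(n) with respect to h_j is -r(n-j), so the update adds
        # 2*mu*e(n)*r(n-j).
        h = [h_j + 2 * mu * e_n * r_j for h_j, r_j in zip(h, r_line)]
        sq_err.append(e_n * e_n)
    return h, sq_err


random.seed(1)
a = [random.choice([-3, -1, 1, 3]) for _ in range(20000)]
channel = [1.0, 0.4]                 # hypothetical channel
g = [1.0, 0.3]                       # hypothetical fixed second filter
x = [sum(c_k * a[n - k] for k, c_k in enumerate(channel) if n - k >= 0)
     for n in range(len(a))]

h, sq_err = adapt_first_filter(x, a, g, num_taps=10, mu=0.001)
print("mean squared error, last 500 samples:", sum(sq_err[-500:]) / 500)
```

Because the coefficients change slowly for a small step size, using r(n-j) as the gradient signal is a reasonable approximation even though H(z) is time-varying during adaptation.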

## 2.6 Coaxial Cable Modeling

An approximate transfer function of the cable can be derived from the transmission line lumped-parameter model of the cable, as shown in Fig. 2.9 [5]. In this model, the primary constants R, L, G and C are the series resistance, series inductance, shunt conductance and shunt capacitance per unit length, respectively. These primary constants depend on many factors such as the geometry and the material used in the insulation.

Figure 2.9: Lumped-parameter model for a short section of cable.

In a properly terminated transmission line the reverse travelling signal is zero so that the transfer function characteristic is given by

$$C(d, \omega) = e^{-d \cdot \gamma(\omega)}, \qquad (2.7)$$

where d is the cable length and $\gamma(\omega)$, known as the propagation function, is defined as [27]

$$\gamma(\omega) = \sqrt{(R + j\omega L)(G + j\omega C)} = \alpha(\omega) + j\beta(\omega). \qquad (2.8)$$

The real part of $\gamma(\omega)$ determines the cable attenuation and its imaginary part determines the phase of the cable. In (2.8), the resistance R is a complex value proportional to the square root of the frequency. This is due to the skin effect (the tendency of the current to flow near the surface of the conductor, which increases the resistance). The parameter G is a measure of the dielectric loss, which is negligible in data-grade cables. With some rearrangement and approximations, such as neglecting G relative to $\omega C$, assuming $R \ll \omega L$ at higher frequencies, and discarding constant delay terms, (2.7) results in the transfer function

$$C(d,s) = e^{-d \cdot k \sqrt{s}}, \qquad (2.9)$$

where  $k = \sqrt{C/L}$ .

By using the Belden 8281 cable data sheet, k was curve-fitted to $393.4 \times 10^{-9}$ $\left(\frac{s}{rad}\right)^{1/2} m^{-1}$. The exponential cable response in (2.7) was also modeled by an eighth-order transfer function using the MATLAB *invfreq* algorithm [28] in order to ease its usage in the simulations during the current study. The two transfer functions agree to within 0.5 dB in the range of 1 MHz - 1 GHz. Fig. 2.10 shows the magnitude response of a 300-m coaxial cable according to the Belden 8281 data sheets and the cable model in (2.9).

*Figure 2.10:* The magnitude response of a 300-m coaxial cable.
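The scaling behaviour implied by (2.9) can be checked with a short sketch. Evaluating the model at $s = j2\pi f$ with the principal square root is an assumption made here; how the fitted k was referenced to s or to $\omega$ is not restated in this section, so the sketch is intended to show how the loss in dB scales with length and frequency rather than to reproduce absolute data-sheet values.

```python
import cmath
import math

K_FIT = 393.4e-9   # curve-fitted k quoted in the text, (s/rad)^(1/2) per metre


def cable_response(length_m, freq_hz, k=K_FIT):
    """Evaluate C(d, s) = exp(-d*k*sqrt(s)) of (2.9) at s = j*2*pi*f (assumed)."""
    s = 1j * 2 * math.pi * freq_hz
    return cmath.exp(-length_m * k * cmath.sqrt(s))


def loss_db(length_m, freq_hz, k=K_FIT):
    """Magnitude loss of the modeled cable in dB (positive number)."""
    return -20 * math.log10(abs(cable_response(length_m, freq_hz, k)))


f = 155.5e6
# Loss in dB is proportional to length and to the square root of frequency.
print(loss_db(600, f) / loss_db(300, f))       # doubling the length
print(loss_db(300, 4 * f) / loss_db(300, f))   # quadrupling the frequency
```

Both ratios come out as two, which reflects the $\sqrt{f}$ dependence of the skin-effect loss and the linear dependence on cable length.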

## 2.7 Summary

In this chapter various wired communication channels and applications were reviewed. This included the coaxial cable channel used throughout this thesis as the target application medium. The modeling and characteristics of this channel were reviewed as well. A general overview of digital communication systems was presented. This included bandwidth requirement, NRZ line codes, equalization and adaptive filtering concepts. 

# ADC Requirements and Partial Equalization

## 3.1 Introduction

In this chapter, the effect of the number of ADC bits, or quantization noise, in digital communication systems and their error performance is discussed. Also, efficient analog preprocessing approaches intended to reduce the bit requirement by the front-end ADC in digital communication receivers are investigated.

The target application used throughout this chapter is a communication system with 4-level PAM data transmission at 622 Mb/s over a 300-m coaxial cable. This channel has a 32-dB loss at half of the baud-rate frequency, i.e. 155.5 MHz [18]. This large amount of loss produces large inter-symbol interference (ISI) components. It is shown that a partial analog equalizer (PAE) can significantly reduce the front-end ADC resolution requirement by partially reducing the ISI. The optimization of the partial equalization task is discussed as well. Finally, it is shown that for our target application, a system with an efficient 2-tap PAE and a 6-bit ADC performs as well as an 8-bit ADC system without the PAE.

### 3.1.1 Analog to Digital Conversion in Digital Communication Receivers

The quantizer at the front-end of a digital communication system introduces inevitable quantization noise. This affects the performance of the receiver in terms of bit error rate or mean square error (MSE). One of the main differences between quantization noise and channel noise is that the quantization noise bandwidth is determined by the ADC sampling rate, whereas the channel noise bandwidth is also limited by the bandwidth of the front-end analog filter. Generally, the channel noise is assumed to be white Gaussian, whereas the quantization noise is assumed to be white uniform noise. Since the quantization noise is related to the input signal, it is not necessarily an independent random noise [29]. However, when the signal is not highly oversampled, such as in baud-rate samplers in communication receivers, the above assumption is acceptable<sup>1</sup>.

Fig. 3.1 shows a digital communication system with a front-end ADC followed by a baud-rate digital linear equalizer (DLE) and a symbol detector slicer on the receiver side. To demonstrate the effect of the resolution of the front-end ADC in our target application, the signal samples before the slicer (after the linear equalizer) for different numbers of ADC bits are shown in Fig. 3.2. The order of the DLE filter was chosen to be large so as not to be a limitation in this experiment. As indicated in the figure, ADCs with resolutions of less than 8 bits are not sufficient for an appropriate eye opening.

To quantify this comparison, Fig. 3.2(d) shows the relative root mean square error (RRMSE) of the recovered symbols. The RRMSE is defined as the residual RMS error before the slicer, relative to the distance between the PAM symbol levels. A symbol error rate (SER) of about $10^{-7}$ occurs when $RRMSE \approx 0.1$; this holds in particular because the residual error has a Gaussian distribution due to the DLE. The simulations for our target application verify this assumption. As seen from Fig. 3.2(d), an acceptable RRMSE value of about 10% (corresponding to $SER \approx 10^{-7}$) requires more than 8 bits of resolution for the front-end ADC.
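The relation between RRMSE and SER can be checked numerically under the stated Gaussian assumption. The sketch below uses the two-sided tail approximation $SER \approx 2Q(a/\sigma_{err})$, where a is half the level spacing, so that $a/\sigma_{err} = 0.5/RRMSE$; the function names are illustrative.

```python
import math


def q_function(x):
    """Tail area of the standard Gaussian density."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def ser_from_rrmse(rrmse):
    """Approximate SER for a Gaussian residual error.

    rrmse is the RMS error divided by the full distance between adjacent
    PAM levels, so half the level spacing over sigma_err equals 0.5/rrmse.
    """
    return 2 * q_function(0.5 / rrmse)


for rrmse in (0.2, 0.1, 0.08):
    print(f"RRMSE = {rrmse:.2f} -> SER ~ {ser_from_rrmse(rrmse):.2e}")
```

For an RRMSE of 0.1 the approximation gives an SER on the order of $10^{-7}$, and the SER falls steeply as the RRMSE is reduced further.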



Figure 3.1: A digital receiver using baud-rate linear feedforward equalizer.

<sup>1.</sup> Particularly in communication systems, the channel noise acts as dither and makes the quantization error have a more random nature and white spectrum.

Chapter 3: ADC Requirements and Partial Equalization



*Figure 3.2:* (a), (b), (c) Signal samples before the symbol detector for different numbers of ADC bits. (d) Relative root mean square error (RRMSE) versus number of ADC bits for the system in Fig. 3.1.

## 3.2 Quantization Noise

In a uniform quantizer, assuming the additive quantization error is uniform within each quantization interval  $\Delta$ , the variance of the quantization error can be written as [3]

$$\sigma_{qz}^{2} = E[Q^{2}] = \int_{-\Delta/2}^{\Delta/2} p_{Q}(q) q^{2} dq = \Delta^{2}/12, \qquad (3.1)$$

where $p_Q(q) = 1/\Delta$ is the probability density function (pdf) of the quantization error q and $E[\cdot]$ is the expectation function. The quantization step $\Delta$ is a function of the input signal peak value $x_{peak}$ and the number of quantization bits R as

$$\Delta = 2x_{peak} \cdot 2^{-R}. \tag{3.2}$$
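Expressions (3.1) and (3.2) can be checked with a short Monte Carlo sketch. The mid-rise quantizer structure, the uniformly distributed test input and the sample count are assumptions chosen for illustration.

```python
import random


def quantize(x, x_peak, bits):
    """Mid-rise uniform quantizer with 2**bits levels over [-x_peak, x_peak]."""
    delta = 2 * x_peak * 2 ** (-bits)            # quantization step from (3.2)
    n_levels = 2 ** bits
    index = int((x + x_peak) // delta)
    index = min(max(index, 0), n_levels - 1)     # clip to the valid code range
    return -x_peak + (index + 0.5) * delta


random.seed(1)
x_peak, bits = 1.0, 6
delta = 2 * x_peak * 2 ** (-bits)
samples = [random.uniform(-x_peak, x_peak) for _ in range(100000)]

measured = sum((quantize(x, x_peak, bits) - x) ** 2 for x in samples) / len(samples)
predicted = delta ** 2 / 12                      # error variance from (3.1)

print("measured error variance :", measured)
print("predicted (delta^2 / 12):", predicted)
```

With an input that spans the full quantizer range, the measured error variance agrees closely with $\Delta^2/12$.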


In general, the quantization error introduced by an R-bit ADC to an input signal with variance  $\sigma_x^2$  can be defined as [30]

$$\sigma_{qz}^2 = \varepsilon_q^2 2^{-2R} \sigma_x^2, \qquad (3.3)$$

where  $\varepsilon_q^2$  is the quantizer performance factor. Generally,  $\varepsilon_q$  depends on the quantizer structure such as quantizer uniformity and the distribution of the input signal [30][31]. For a uniform quantizer with a random input signal, by combining (3.1) & (3.2) and comparing that to (3.3), we can see that  $\varepsilon_q$  is proportional to the input crest factor  $x_{peak}/\sigma_x$  as

$$\varepsilon_q^2 = \frac{1}{3} (x_{peak} / \sigma_x)^2. \tag{3.4}$$

To estimate  $x_{peak}$  for quantization purposes, we can define an acceptable clipping probability  $P_{clip}$  such that

$$P[X > x_{peak}] = P_{clip}. \tag{3.5}$$

Table 3.1 compares the quantizer error for different distributions and clipping probabilities. As seen, for a fixed signal power  $\sigma_x^2$ , the quantization error can be 12 times (10.8 dB) better, if the quantizer input signal amplitude distribution is uniform rather than Gaussian.


| Input signal<br>distribution               | $P_{clip}$ | $x_{peak}/\sigma_x$           | $\varepsilon_q^2$ | $\sigma_{qz}^2/\sigma_x^2 = \varepsilon_q^2 2^{-2R}$ |
|--------------------------------------------|------------|-------------------------------|-------------------|------------------------------------------------------|
| Gaussian                                   | $10^{-9}$  | 6                             | 12                | $12(2)^{-2R}$                                        |
| Gaussian                                   | $10^{-7}$  | 5.2                           | 9                 | $9(2)^{-2R}$                                         |
| Uniform (Continuous)                       | 0          | 1.7                           | 1                 | $(2)^{-2R}$                                          |
| Uniform<sup>a</sup><br>(Discrete 4-level)  | 0          | 1.34                          | -                 | -                                                    |
| Uniform<sup>b</sup><br>(Discrete N-level)  | 0          | $\sqrt{3\frac{(N-1)}{(N+1)}}$ | -                 | -                                                    |

Table 3.1: A comparison of quantization error for different input signal distributions

a. For the signal with discrete amplitude distribution,  $\varepsilon_q^2$  is highly dependent on the quantization decision levels. Ideal placements of those levels can result in a zero quantization error or  $\varepsilon_q^2 = 0$ .

b. For an N-level PAM, the levels are of the form $-(N-1)a, -(N-3)a, \ldots, (N-3)a, (N-1)a$. Thus, $\sigma_x^2 = (N^2 - 1)a^2/3$ and $x_{peak} = (N-1)a$. Therefore, $x_{peak}/\sigma_x = \sqrt{3\frac{(N-1)}{(N+1)}}$.
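The entries of Table 3.1 follow from (3.4), (3.5) and footnote b, and can be reproduced with a short sketch. The function names are illustrative; the Gaussian case uses the one-sided clipping probability exactly as written in (3.5).

```python
import math
from statistics import NormalDist


def gaussian_crest_factor(p_clip):
    """x_peak/sigma_x such that P[X > x_peak] = p_clip for a zero-mean Gaussian."""
    return -NormalDist().inv_cdf(p_clip)


def pam_crest_factor(n_levels):
    """x_peak/sigma_x for equiprobable N-level PAM (footnote b of Table 3.1)."""
    return math.sqrt(3 * (n_levels - 1) / (n_levels + 1))


def quantizer_factor(crest):
    """Quantizer performance factor from (3.4): one third of the squared crest factor."""
    return crest ** 2 / 3


for p_clip in (1e-9, 1e-7):
    crest = gaussian_crest_factor(p_clip)
    print(f"Gaussian, P_clip = {p_clip:g}: crest = {crest:.2f}, "
          f"eps_q^2 = {quantizer_factor(crest):.1f}")

uniform_crest = math.sqrt(3)     # continuous uniform distribution
print(f"Uniform (continuous): crest = {uniform_crest:.2f}, "
      f"eps_q^2 = {quantizer_factor(uniform_crest):.1f}")
print(f"4-level PAM: crest = {pam_crest_factor(4):.2f}")
```

The ratio between the Gaussian ($P_{clip} = 10^{-9}$) and continuous uniform values of $\varepsilon_q^2$ is the factor of 12 quoted above.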

## 3.3 ADC Input Characterization

In a baseband digital communication system in which the front-end ADC samples and quantizes the input signal at baud rate, the ADC input signal can be written as

$$x(n) = \sum_{k} c_{k} a(n-k), \qquad (3.6)$$

where  $c_k$  are the samples of the equivalent channel impulse response (including both transceiver front-end and back-end analog filters) and a(n) are the transmitted data symbols at the channel input. For instance, in a 4-level PAM scheme, a(n) can take four different values, each with equal probability. By assuming that the transmitted data symbols at different time instants, i.e. a(n-k), are independent random variables, we can characterize the ADC input from (3.6). For example, based on (3.6),  $\sigma_x^2$ , the variance of x(n), can be written as [32]

$$\sigma_x^2 = \sigma_a^2 \left( \sum_k c_k^2 \right). \tag{3.7}$$

To obtain the ADC input pdf  $p_X(x)$  from (3.6), we recall that the pdf of the summation of two independent random variables is equal to the convolution of the pdf of each variable. Therefore, by defining  $u_k = c_k a(n-k)$ , the pdf of the ADC input can be estimated as

$$p_X(x) = \dots * p_{U_{-1}}(c_{-1}a(n+1)) * p_{U_0}(c_0a(n)) * p_{U_1}(c_1a(n-1)) * \dots \qquad (3.8)$$

where \* is the convolution operator. Fig. 3.3 shows the channel response  $c_k$  with normalized energy for our target application. Assuming the transmitted data has no redundant information, the pdf of the data signal at the input of the transmit filter is uniform for different amplitude levels. For example, for a 4-level PAM scheme, the pdf value is 1/4 at each PAM level and zero for the rest of the amplitude range.

Each of the pdf terms in (3.8) has the pdf of the original data, scaled along the amplitude axis by the channel impulse sample $c_k$, as shown in Fig. 3.4. The result of the successive convolution of these pdf terms is also shown in Fig. 3.4. As indicated in the figure, when the number of nonzero coefficients $c_k$ in (3.8) becomes larger, the pdf of the ADC input signal $p_X(x)$ changes from uniform toward a Gaussian shape. This is consistent with the generalized form of the central limit theorem in probability theory [32][33].
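The successive convolution in (3.8) can be sketched for a short discrete channel. The six channel samples below are hypothetical (they are not the response of Fig. 3.3); the sketch also checks the variance relation (3.7).

```python
def convolve_pmf(p, q):
    """Distribution of the sum of two independent discrete random variables."""
    out = {}
    for x, px in p.items():
        for y, py in q.items():
            key = round(x + y, 12)           # merge values that differ only by rounding
            out[key] = out.get(key, 0.0) + px * py
    return out


def adc_input_pmf(c, levels):
    """Amplitude distribution of x(n) = sum_k c_k a(n-k) for equiprobable symbols."""
    pmf = {0.0: 1.0}
    for c_k in c:
        term = {}
        for a in levels:                     # distribution of u_k = c_k * a(n-k)
            key = round(c_k * a, 12)
            term[key] = term.get(key, 0.0) + 1.0 / len(levels)
        pmf = convolve_pmf(pmf, term)
    return pmf


levels = [-3, -1, 1, 3]                      # 4-level PAM with a = 1
c = [0.1, 0.8, 0.5, 0.25, 0.15, 0.1]         # hypothetical channel samples
pmf = adc_input_pmf(c, levels)

mean = sum(x * p for x, p in pmf.items())
variance = sum(x * x * p for x, p in pmf.items()) - mean ** 2
sigma_a_sq = sum(a * a for a in levels) / len(levels)
predicted_variance = sigma_a_sq * sum(v * v for v in c)   # relation (3.7)
x_peak = max(pmf)                                          # largest possible amplitude

print("variance from pmf   :", variance)
print("variance from (3.7) :", predicted_variance)
print("peak amplitude      :", x_peak)
```

Plotting `pmf` for channels with more nonzero samples shows the bell-shaped envelope described above, while a channel with only one or two significant samples retains a few distinct, nearly uniform clusters.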



Figure 3.3: The impulse response of a 300-m coaxial cable sampled at 311 MHz.

As mentioned previously, changing the uniform distribution to a Gaussian one increases the crest factor by a factor of about 4.5 (see Table 3.1). Since the coefficients $c_k$, except $c_0$, represent the ISI components occurring within the channel, if the equivalent channel impulse response is changed before the ADC such that it has fewer nonzero $c_k$, we can benefit from a lower crest factor. As a result, a lower additive quantization error occurs. Fig. 3.5 shows the channel response after being partially equalized, and its output amplitude distribution from (3.8). When the number of large ISI components is reduced, the distribution is more uniform-like. Here, the channel response has one major ISI component and the distribution of the channel output has several peaks. The locations of these peaks can be obtained from the convolution sequence in (3.8). It is worth noting that these distribution peaks may be exploited in nonuniform quantization in order to achieve a lower average quantization error [31].

*Figure 3.4:* Amplitude distribution of the received signal after the channel shown in Fig. 3.3 for a 4-level PAM NRZ signal at the input of the front-end ADC (see (3.8)); as seen, for a large number of nonzero ISI components in the channel impulse response ( $c_{k, k \neq 0}$ ), this distribution can be estimated by a Gaussian shape.

Generally, if most of the energy of the channel impulse response is accumulated in L coefficients such that the rest of the coefficients are negligible, based on (3.8) we can write

$$x_{peak} = a_{peak} \sum_{L} |c_k|, \qquad (3.9)$$

where $x_{peak}$ is the peak amplitude of the ADC input. This estimate is particularly useful for assessing how much an analog preprocessing stage improves the channel response in terms of the crest factor and the resulting quantization error performance. To evaluate the change of the crest factor within the channel, (3.9) and (3.7) can be combined as

$$\frac{x_{peak}}{\sigma_x} = \left(\frac{a_{peak}}{\sigma_a}\right) \times \frac{\sum |c_k|}{\sqrt{\sum |c_k|^2}}. \qquad (3.10)$$

Generally, while  $\sigma_x^2 / \sigma_a^2$  is fixed or simply  $\sum |c_k|^2$  is normalized to one (see (3.7)), a better analog preprocessing in terms of crest factor improvement, should produce a lower  $\sum |c_k|$ . This can be used as an optimization criterion for better analog preprocessing.
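Expression (3.10) lends itself to a quick comparison of candidate channel responses. The two responses below are hypothetical: one with a long exponentially decaying ISI tail and one with a single small ISI component, standing in for a channel before and after some analog preprocessing.

```python
import math


def crest_factor(c, a_peak_over_sigma_a):
    """Peak-to-RMS ratio of the ADC input from (3.10)."""
    abs_sum = sum(abs(v) for v in c)
    rms = math.sqrt(sum(v * v for v in c))
    return a_peak_over_sigma_a * abs_sum / rms


PAM4_RATIO = 3 / math.sqrt(5)                  # a_peak/sigma_a for 4-level PAM

long_isi = [0.9 ** k for k in range(20)]       # hypothetical response with a long tail
short_isi = [1.0, 0.3]                         # hypothetical response with one ISI term

print("no ISI    :", round(crest_factor([1.0], PAM4_RATIO), 3))
print("short ISI :", round(crest_factor(short_isi, PAM4_RATIO), 3))
print("long ISI  :", round(crest_factor(long_isi, PAM4_RATIO), 3))
```

Since the ratio $\sum |c_k| / \sqrt{\sum |c_k|^2}$ is invariant to gain scaling, the comparison does not require the responses to be normalized beforehand.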



*Figure 3.5:* Estimation of the signal peak when the signal is output from a combined channel/partial equalization filter with fewer ISI components.

## 3.4 ADC Resolution Requirement

Fig. 3.6 shows a baseband digital communication system with a digital receiver. To maintain an acceptable system bit error rate, the front-end ADC has to meet a certain resolution requirement. To determine this requirement, we consider the variance of the total remaining error before the slicer in Fig. 3.6 as

$$\sigma_{err}^{2} = [(\sigma_{qz}^{2} + \sigma_{n}^{2})K_{DLE}^{2}] + \sigma_{ISI}^{2}, \qquad (3.11)$$

where  $\sigma_{qz}^2$  is the quantization noise variance introduced by the ADC,  $\sigma_n^2$  is the variance of the channel additive noise,  $K_{DLE}^2$  is the error enhancement factor caused by the digital linear equalizer (DLE), and  $\sigma_{ISI}^2$  is the remaining ISI error variance.

The total noise error around each symbol level has approximately a Gaussian shape because of the summations of independent noise components occurring in the expression (3.11) and also within the digital equalizer filtering operation. Therefore, the symbol error rate (SER) can be estimated as

$$SER = p(error > a) = 2Q(a/\sigma_{err}) \qquad (3.12)$$

where *a* is half of the distance between the PAM levels and $Q(\cdot)$ is the tail area of the Gaussian density. Expression (3.12) provides a constraint on the variance of the total error (noise) before the slicer in (3.11). By recalling the expression (3.3) for the quantization error introduced by an R-bit ADC, i.e. $\sigma_{qz}^2 = \varepsilon_q^2 2^{-2R} \sigma_x^2$, combining it with (3.11), and rearranging, we can write

$$2^{-2R} < \frac{L^2}{K_{DLE}^2 \cdot \varepsilon_q^2},\tag{3.13}$$

Here *L* is related to the desired SER and to the ratio of the enhanced quantization noise $\sigma_{qz}'^{2} = \sigma_{qz}^2 K_{DLE}^2$ to the total error $\sigma_{err}^2$, or specifically

$$L = \left(\frac{a}{\sigma_x Q^{-1}(SER/2)}\right) \cdot \frac{\sigma_{qz}'}{\sigma_{err}}. \qquad (3.14)$$



*Figure 3.6:* Block diagram of a digital communication system receiver with 4-level PAM symbols at the input.

The inequality (3.13) can be written in dB scale as

$$R > (-L_{(dB)}^2 + K_{DLE(dB)}^2 + \varepsilon_{q(dB)}^2) / 6.02. \qquad (3.15)$$

Either (3.13) or (3.15) can be used to determine the minimum effective number of bits *R* required for the front-end ADC. *L* is a system-design parameter that depends on the desired SER and the order of the digital equalizer. Thus, the main approach to reducing the bit requirement *R* is to reduce the factors  $\varepsilon_q^2$  (the crest factor of the ADC input signal) and  $K_{DLE}^2$  (the amount of the noise enhancement by the digital equalizer).

To reduce the ADC resolution requirement, an optimum analog preprocessing circuit must minimize both $\varepsilon_q^2$ and $K_{DLE}^2$ simultaneously. This can be achieved by reshaping the equivalent channel impulse response and its components $c_k$ before the ADC. Recalling (3.4), $\varepsilon_q^2$ is 1/3 of the squared crest factor of the ADC input signal, which can be obtained from (3.10). According to (3.10), if the channel impulse energy $\sum_k |c_k|^2$ is normalized to one, $\varepsilon_q^2$ is minimized by minimizing $\sum_k |c_k|$.

The quantization (white) noise enhancement by a feedforward equalizer (FFE) with coefficients  $h_i$  is given by

$$K_{DLE}^{2} = \sum_{i} \left| h_{i} \right|^{2}. \qquad (3.16)$$

The above expression is a valid criterion when the channel impulse energy  $\sum_{k} |c_k|^2$  is normalized to one. If the DLE is a linear zero forcing equalizer in the frequency domain,  $H(e^{j\omega})$  is ideally the inverse of the channel frequency response  $|C(e^{j\omega})|$ . Thus, we can write:

$$\sum_{i} |h_{i}|^{2} = \frac{1}{2\pi} \int_{-\pi}^{\pi} |H(e^{j\omega})|^{2} d\omega = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{|C(e^{j\omega})|^{2}} d\omega. \qquad (3.17)$$

The discrete Fourier form of (3.17) is $(1/N)\sum_k 1/|C(k)|^2$, where $C(k)$ is the N-point DFT of $c_k$, which can also be used to evaluate $K_{DLE}^2$.
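The discrete form of (3.17) can be sketched as follows. The two-tap channel is hypothetical; for a response $[1, 0.5]$ the ideal zero-forcing equalizer has coefficients $(-0.5)^i$, so the result can be compared with a closed-form value after the energy normalization mentioned above.

```python
import cmath
import math


def zf_noise_enhancement(c, n_points=256):
    """Estimate K_DLE^2 as (1/N) * sum_k 1/|C(k)|^2 for an energy-normalized channel.

    Assumes the channel has no spectral null on the sampled frequency grid;
    otherwise the zero-forcing inverse does not exist and a division by zero occurs.
    """
    energy = sum(v * v for v in c)
    c_norm = [v / math.sqrt(energy) for v in c]      # normalize sum |c_k|^2 to one
    total = 0.0
    for k in range(n_points):
        w = 2 * math.pi * k / n_points
        c_w = sum(v * cmath.exp(-1j * w * n) for n, v in enumerate(c_norm))
        total += 1.0 / abs(c_w) ** 2
    return total / n_points


channel = [1.0, 0.5]                                 # hypothetical channel
k_dle_sq = zf_noise_enhancement(channel)
# Closed form for this example: energy * sum_i 0.25**i = 1.25 * 4/3.
print("K_DLE^2 (DFT form)   :", k_dle_sq)
print("K_DLE^2 (closed form):", 1.25 * 4 / 3)
```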


### 3.4.1 Example and Comparison to Simulation Results

As an example, expression (3.13) can be utilized to calculate the required ADC resolution for the system in Fig. 3.6 when applied to our target application (622-Mb/s 4-level PAM over 300-m coaxial cable). First, note that for the 4-level NRZ PAM with a level spacing of 2a we can write

$$\sigma_x^2 = E[|X_k|^2] = \frac{1}{4}[(-3a)^2 + (-a)^2 + (a)^2 + (3a)^2] = 5a^2. \qquad (3.18)$$

Thus, $a/\sigma_x$ in (3.14) will be $1/\sqrt{5}$. The assumption of $SER = 10^{-9}$ and $\sigma_{qz}'^{2}/\sigma_{err}^{2} = 1/3$, as design specifications, results in $L^2 = 1/560$ or -27.5 dB. The channel response defines an equalizer boost of $K_{DLE}^2 = 17.5$ or 12.4 dB. A clipping probability of $10^{-7}$ gives the quantization factor as $\varepsilon_q^2 = 9.4$ or 9.7 dB. Substituting these parameter values into (3.15) results in a minimum resolution of R = 8.3 bits.
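The arithmetic of this example can be retraced with a short sketch of (3.14) and (3.15). The function names are illustrative, and the values of $K_{DLE}^2$ and $\varepsilon_q^2$ are simply taken from the text; small differences from the rounded figures quoted above can be expected.

```python
import math
from statistics import NormalDist


def db(x):
    """Power ratio in decibels."""
    return 10 * math.log10(x)


def l_squared(ser, noise_share, a_over_sigma_x):
    """L^2 from (3.14); noise_share is the ratio of the enhanced quantization
    noise variance to the total error variance."""
    q_inv = -NormalDist().inv_cdf(ser / 2)       # Q^{-1}(SER/2)
    return (a_over_sigma_x / q_inv) ** 2 * noise_share


def min_adc_bits(l_sq, k_dle_sq, eps_q_sq):
    """Lower bound on R from (3.15), using 20*log10(2) dB per bit."""
    return (-db(l_sq) + db(k_dle_sq) + db(eps_q_sq)) / (20 * math.log10(2))


l_sq = l_squared(ser=1e-9, noise_share=1 / 3, a_over_sigma_x=1 / math.sqrt(5))
bits = min_adc_bits(l_sq, k_dle_sq=17.5, eps_q_sq=9.4)

print(f"L^2       = 1/{1 / l_sq:.0f} ({db(l_sq):.1f} dB)")
print(f"K_DLE^2   = {db(17.5):.1f} dB")
print(f"eps_q^2   = {db(9.4):.1f} dB")
print(f"minimum R = {bits:.2f} bits")
```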

Fig. 3.7 depicts a comparison of error performance versus ADC resolution according to both the analytical expression (3.13)<sup>1</sup> and system-level simulation results. The minor deviation at lower resolutions is due to the fact that the crest factor in the analytical expression was based on the assumption of an ADC input with a Gaussian amplitude distribution, whereas the simulations were based on the actual channel output.

*Figure 3.7:* A comparison of error performance according to the analytical expression in (3.13) and system-level simulation.

<sup>1.</sup> Note that in (3.13) for a chosen R, we obtain the parameter L which is related to the SER. Also, SER is related to  $\sigma^2_{err}$  according to (3.12).

## 3.5 ADC Bit Requirement Reduction Techniques and Partial Analog Equalization

From expression (3.13), we see that $\varepsilon_q^2$ and $K_{DLE}^2$ are the only factors which can be reduced in order to lower the resolution requirement for the front-end ADC. Using a decision feedback equalizer (DFE) reduces $K_{DLE}^2$, but the order of the feedback filter is limited because the precursor ISI cannot be cancelled by the feedback equalizer and because of error propagation and clock recovery difficulties. Therefore, the need for an FFE still exists and a considerable amount of quantization noise enhancement will remain.

One approach to reducing the quantizer factor  $\varepsilon_q^2$  is to use a nonuniform quantization such that the quantizer error is decreased [30] [31]. However, this approach is not efficient since it does not affect  $K_{DLE}^2$  and furthermore, results in a non-standard ADC architecture.

Ideally, a full analog equalizer can reduce the ADC bit requirement to as low as the slicer order; for example, two bits for a 4-level PAM scheme. Nevertheless, there are many practical difficulties in the design of adaptive analog filters with large orders, such as accurate calibration and adaptation of filter parameters, distortion accumulation, DC offset, mismatch and compatibility with complicated pulse shaping or modulation schemes. Moreover, there is always a preference to have some digital receiver capabilities such as adaptive digital filtering, decision feedback equalization and digital clock recovery.

Partial analog equalization (PAE) can be defined as splitting the equalization between the analog and digital sides [17]. An example of splitting a 10-tap FIR equalizer into a 4-tap analog filter and a 7-tap digital filter, with the same total number of zeros, is shown in Fig. 3.8. Simulation results show that partial equalization reduces both $K_{DLE}^2$ and $\varepsilon_q^2$ simultaneously. The reason behind the $\varepsilon_q^2$ reduction, as mentioned previously, is that the amplitude distribution of the signal at the output of the channel depends greatly on the amount of ISI. After partial equalization, the amplitude distribution of the signal at the ADC input becomes closer to uniform than to Gaussian, due to a considerable reduction in the ISI components.



(c) Zero locations of equalizers in architecture (a) and (b)



For example, Fig. 3.9 shows the effect of a 4-tap analog FIR partial equalizer on the channel impulse response and the channel output, for our target application. Note that cancelling the ISI components by the PAE greatly changes the shape of the signal amplitude distribution, thereby reducing  $\varepsilon_q^2$  by a factor of about 4.5 and according to (3.13), reducing the ADC resolution requirement by more than one bit.

Further bit requirement reduction is due to the reduction of noise enhancement by the DLE. Table 3.2 summarizes the improvement in parameters  $\epsilon_q^2$  and  $K_{DLE}^2$  for this example. According to (3.13), those parameters result in a bit reduction of 2.8 bits. To verify this result, the above PAE has been placed in the system of Fig. 3.8 and the recovered symbols before the slicer are compared to a similar system without PAE. Two different 5-bit and 8-bit ADCs have been used in this simulation. As seen in Fig. 3.10(a) and Fig. 3.10(b), there is a considerable performance degradation when the number of bits is



*Figure 3.9:* The effect of a 4-tap PAE on the ADC input amplitude distribution and 300m coaxial channel impulse response (with normalized energy). (a) Before PAE. (b) After PAE. (a, b) (i) ADC input samples. (a, b) (ii) The corresponding equivalent channel impulse responses.

reduced from 8 to 5 for the case of full-digital equalization with no PAE. In contrast, with analog pre-equalization the performance degradation is fairly tolerable.

Comparing Fig. 3.10 (b, c) shows that the ADC resolution requirement is reduced by about 3 bits. Fig. 3.10 also shows the benefit of using a partial equalizer when the number of ADC bits is low, i.e. 5 bits here. Otherwise, quantization contributes negligibly to the total error and the PAE is of little benefit.

|                          | $K_{DLE}^2$ | $\epsilon_q^2$ | RMSE with<br>8-bit ADC | RMSE with<br>5-bit ADC | Minimum $R$<br>for SER = $10^{-9}$ |
|--------------------------|-------------|----------------|------------------------|------------------------|------------------------------------|
| Architecture without PAE | 17.5        | 9.4            | 0.09                   | 0.52                   | 8.3 bits                           |
| Architecture with PAE    | 1.65        | 2.1            | 0.08                   | 0.09                   | 5.5 bits                           |
| Comparison               | 1 / 10.9    | 1 / 4.5        | 1 / 1.13               | 1 / 5.78               | -2.8 bits                          |

Table 3.2: Comparison of architectures with and without PAE
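As a rough cross-check of the figures in Table 3.2, the bit saving can be recomputed from the two factor ratios. The sketch below assumes that the enhanced quantization noise power scales as $K_{DLE}^2 \varepsilon_q^2 2^{-2R}$, which is our reading of (3.13) and is not a restatement of it; the function name `bits_saved` is introduced here for illustration only.

```python
import math

def bits_saved(k2_before, k2_after, eps2_before, eps2_after):
    """Bits saved at constant noise, assuming noise power ~ K^2 * eps^2 * 2^(-2R)."""
    power_ratio = (k2_before / k2_after) * (eps2_before / eps2_after)
    # Each bit of resolution changes the quantization noise power by a factor of 4.
    return 0.5 * math.log2(power_ratio)

# Values taken from Table 3.2 (without PAE -> with PAE).
total = bits_saved(17.5, 1.65, 9.4, 2.1)
from_eps_only = bits_saved(1.0, 1.0, 9.4, 2.1)
print(f"eps_q^2 alone: {from_eps_only:.2f} bits, both factors: {total:.2f} bits")
```

Under this assumption the $\varepsilon_q^2$ reduction alone accounts for slightly more than one bit, and the two factors together for roughly the 2.8 bits listed in the table.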





*Figure 3.10:* Performance comparison of the architectures with and without PAE for different numbers of ADC bits. (a, b) 10-tap DLE, without partial equalizer. (c, d) Using partial equalizer (4-tap PAE, 7-tap DLE).

# 3.6 Partial Equalizer Design

The optimum partial equalizer primarily depends on the structure chosen for the receiver. The variety of structures comes from different topologies, filter types, the

orders of the digital and partial analog equalizers and also the number of bits available for the ADC. For example, if a large number of bits is available, there is no need for analog preprocessing to reduce the quantization noise. However, in this case analog pre-equalization can still be considered as a way to reduce the complexity of the digital equalizer and therefore, possibly, to save power and area. In this part, we limit the receiver structure to an FIR DLE and an analog FIR filter for the partial equalizer.

In PAE design, the ultimate goal is to minimize the total remaining error before the slicer. Generally, the designs of the partial analog equalizer and the digital equalizer are not independent. The PAE determines the distribution of the quantizer input signal and thus affects the amount of quantization error for a given bit resolution. The PAE also influences the shape of the DLE, which must complete the equalization task, and thereby the amount of channel and quantization noise enhancement. Therefore, the optimization of the PAE and DLE for a given total order and ADC bit resolution can be a complex task. Assuming an analog FIR filter for the PAE and a digital FIR filter for the DLE, several approaches to finding a close-to-optimum PAE/DLE design are proposed and investigated.

#### **3.6.1** Optimizing the PAE Independently for Maximal Equalization

In this approach, as shown in Fig. 3.11, the partial analog equalizer is adapted for the maximum possible equalization, independently of its digital counterpart. This can be done by removing the digital equalizer (or setting its coefficients to zero) at the beginning, running an LMS algorithm for the PAE, fixing the PAE coefficients and then running the adaptation algorithm for the digital equalizer. Since the ISI before the ADC is minimized, the crest factor of the ADC input and the noise enhancement factor of the DLE are reduced. Thus, according to expression (3.13), a lower ADC bit resolution will be



Figure 3.11: Optimizing a partial equalizer independently for maximum equalization.

required. Although this approach is practically simple and straightforward, it has some imperfections. One concern is that, for given orders of the PAE and DLE, after optimizing the PAE we may not obtain the best total ISI reduction by adapting a DLE of limited order. Generally, adapting two cascaded filters independently does not give the optimal solution, unless we adapt them simultaneously (see Chapter 2) [34]. Other methods discussed later in this section resolve this concern at the cost of increased complexity. Another problem with this method is the choice of the delay value. If the delay value in Fig. 3.11 is not appropriately set, the solution can be relatively far from optimum. As a remedy, if the total order of the filters is low, the optimization during initialization can be repeated for different delays and the best delay chosen. On the digital side, if a larger-order digital filter can be applied initially, then the location of its nonzero coefficients determines the appropriate delay.
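The two-stage procedure of this method can be sketched as follows. This is a minimal illustration and not the simulation setup of this chapter: it assumes a hypothetical 3-tap ISI channel, 2-level symbols, a 2-tap PAE and a 5-tap DLE, a fixed delay of zero, and no ADC between the two filters. The helper names `lms_train`, `fir` and `mse` are introduced here for illustration.

```python
import random

def lms_train(x, d, ntaps, mu):
    """Adapt an FIR filter with LMS so that its output tracks d; return the taps."""
    w = [0.0] * ntaps
    buf = [0.0] * ntaps
    for xn, dn in zip(x, d):
        buf = [xn] + buf[:-1]                        # shift the delay line
        y = sum(wi * bi for wi, bi in zip(w, buf))   # filter output
        e = dn - y                                   # error against the training symbol
        w = [wi + mu * e * bi for wi, bi in zip(w, buf)]
    return w

def fir(x, w):
    """Filter x with fixed FIR taps w."""
    buf = [0.0] * len(w)
    out = []
    for xn in x:
        buf = [xn] + buf[:-1]
        out.append(sum(wi * bi for wi, bi in zip(w, buf)))
    return out

def mse(y, d, skip=50):
    """Mean squared error between y and d, ignoring the start-up samples."""
    err = [(a - b) ** 2 for a, b in zip(y[skip:], d[skip:])]
    return sum(err) / len(err)

rng = random.Random(0)
symbols = [rng.choice([-1.0, 1.0]) for _ in range(5000)]
channel = [1.0, 0.5, 0.2]            # hypothetical ISI channel, not the coaxial cable
rx = fir(symbols, channel)

# Stage 1: adapt the PAE alone (DLE removed), then freeze its coefficients.
pae = lms_train(rx, symbols, ntaps=2, mu=0.01)
pae_out = fir(rx, pae)

# Stage 2: adapt the DLE on the output of the frozen PAE.
dle = lms_train(pae_out, symbols, ntaps=5, mu=0.01)
dle_out = fir(pae_out, dle)

mse_raw = mse(rx, symbols)
mse_pae = mse(pae_out, symbols)
mse_cascade = mse(dle_out, symbols)
print(mse_raw, mse_pae, mse_cascade)
```

In this toy setting the frozen PAE already removes most of the ISI and the DLE removes most of the remainder. The sketch does not address the delay selection issue discussed above.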

#### 3.6.2 Splitting an Optimum Equalizer into Analog and Digital Filters

In this method, a larger-order equalizer with an appropriate total number of zeros is first designed using an adaptation technique such as the LMS algorithm. It is then split into two filters by grouping its zeros so that they conform to the desired orders of the PAE and DLE. In this way, we ensure that the final remaining ISI error is minimized for the desired total filter order. A criterion for grouping the zeros between the PAE and DLE is required. Since this criterion is based only on bit-requirement reduction, the results in part 3.4 can be utilized. Another way to split the zeros is to choose the PAE zeros such that they are close to the zeros of a suboptimal PAE found independently using the previous method.

The zero splitting approach is more reliable, since it follows a regular equalization scheme with the lowest channel noise enhancement. However, it still does not necessarily give the optimum solution with regard to ADC resolution requirement reduction or, in other words, with regard to minimizing the factors $\varepsilon_q^2$ and $K_{DLE}^2$. It should be noted that, for some given orders of the PAE and DLE, the zero splitting method might not be feasible, because conjugate zeros cannot be split between separate real filters.
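The mechanics of zero splitting can be illustrated with a small example. The zeros below are hypothetical, and the grouping (the real zero and one conjugate pair assigned to the PAE) is chosen only to show that conjugate pairs must stay together for both filters to have real coefficients; it does not implement the grouping criterion of part 3.4.

```python
def poly_from_zeros(zeros, gain=1.0):
    """Coefficients of gain * prod(1 - z_k z^-1), in increasing powers of z^-1."""
    coeffs = [complex(gain)]
    for z in zeros:
        nxt = coeffs + [0j]
        for i, c in enumerate(coeffs):
            nxt[i + 1] -= z * c          # multiply by (1 - z * z^-1)
        coeffs = nxt
    return coeffs

def convolve(a, b):
    """Coefficients of the cascade of two FIR filters."""
    out = [0j] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

# Hypothetical zeros of a single 6-tap equalizer: one real zero, two conjugate pairs.
zeros = [0.8, 0.5 + 0.4j, 0.5 - 0.4j, -0.3 + 0.6j, -0.3 - 0.6j]

# Illustrative grouping: conjugate pairs are never separated.
pae_zeros = zeros[:3]     # 3 zeros -> 4-tap analog filter
dle_zeros = zeros[3:]     # 2 zeros -> 3-tap digital filter

pae = poly_from_zeros(pae_zeros)
dle = poly_from_zeros(dle_zeros)
full = poly_from_zeros(zeros)

# The cascade reproduces the original equalizer, and both parts are real filters.
max_diff = max(abs(a - b) for a, b in zip(full, convolve(pae, dle)))
max_imag = max(abs(c.imag) for c in pae + dle)
print([round(c.real, 4) for c in pae], [round(c.real, 4) for c in dle])
```

If a desired PAE order would require taking only one zero of a conjugate pair, `max_imag` would be nonzero, which is the infeasible case mentioned above.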

#### 3.6.3 Global Optimization Using Genetic and/or Gradient Search

A general global optimization technique such as a *Genetic Algorithm* [35][36] or *Global Gradient Search* [26] can be used to find the zero locations of the PAE and DLE filters. As demonstrated in Fig. 3.12, using the results in part 3.4 a closed formula for the total error can be created. It includes the remaining ISI, the enhanced quantization error and the channel noise as functions of the design parameters. This formula is used as the criterion for finding the optimum filter coefficients and the delay in the equivalent total channel response. Note that the amount of enhanced quantization noise depends not only on the number of bits (R), but also on the shapes of the PAE and DLE, through their effects on $\varepsilon_q^2$ and $K_{DLE}^2$, respectively. According to the simulation results, the *Genetic Search* algorithm is useful for quickly obtaining an overall estimate of the optimum delay and filter zero locations. After this coarse adjustment of the zero locations of the PAE and DLE, we can either use the result as a guide in the previous method to split the zeros properly or use it as a starting point for



Figure 3.12: Global optimization of PAE/DLE coefficients.

another general optimization algorithm. This could be a *Gradient Search* algorithm, which may suffer from local minima. Fig. 3.13 depicts the block diagram of the *Genetic Search* algorithm used in this investigation.



Figure 3.13: Genetic search algorithm block diagram used for PAE/ DLE architecture optimization.

As mentioned, the *Global Gradient Search* algorithm can be considered as an alternative to, or a continuation of, the *Genetic Search* for the optimum filters. At the price of added complexity, this method helps to find a solution closer to the optimum for a partial equalization design, compared to the previous methods. Due to the non-linearity and complexity of the error function, the algorithm might get stuck in a local minimum. In this case, obtaining the initial starting points from either of the previous methods can be helpful. The same total error criterion shown in Fig. 3.12 can be used to generate the gradient components here. Note that, as opposed to the case of the *Genetic* algorithm, the delay term has to be defined before running this algorithm. Since the total error depends on the cascaded PAE/DLE coefficients both directly (through their effect on the filter outputs) and indirectly (through their effect on $K_{DLE}^2$ and $\varepsilon_q^2$), the definition of the gradient term is relatively complex. The details of this algorithm are described in Appendix B.

Essentially, the above global optimization methods are useful if we are concerned about imperfections in the previous methods. They can also be used as a verification tool in the design stage or as an aid for splitting the zeros in the zero splitting method.

#### **3.6.4 Simulation Results and Comparisons**

Table 3.3 compares the performance of the system shown in Fig. 3.8 with a 5-bit ADC, when the PAE and DLE are adapted with the previously discussed methods. The components that contribute to the total remaining error according to (3.11) and (3.13) are also quantified in each row. In the first row, the system consists of a 10-tap DLE with no analog preprocessing. In this case, the total error is very large because of the size of the crest factor and the quantization noise enhancement. The remaining rows relate to architectures in which a 4-tap PAE and a 7-tap DLE are used. Note that the total number of zeros of the DLE and PAE together is the same for all rows. In the second row, the PAE is optimized independently of the DLE for the highest attainable equalization, and the DLE is then adapted in the same way (Method 3.6.1). Compared to the first row, it shows a considerable reduction of the enhanced quantization error ${\sigma'}_{qz}^2$. Nevertheless, the remaining ISI error variance is increased by about 1.5 dB because of less-optimum equalization, even though the total order of equalization (the number of zeros of PAE+DLE) is the same.

| Architecture and<br>Design Method | R (bits) | $K_{DLE}$ | Crest<br>Factor | $\epsilon_q^2$ | $\sigma^2_{qz}$<br>(dB) | $\sigma^2_{ISI}$<br>(dB) | $\sigma^2_{err}$<br>(dB) |
|-----------------------------------|----------|-----------|-----------------|----------------|-------------------------|--------------------------|--------------------------|
| Without PAE, single 10-tap DLE | 5 | 17.5 | 6 | 12 | -6.9 | -24.6 | -6.8 |
| <i>Method 3.6.1:</i><br>4-tap PAE, 7-tap DLE, independently<br>adapted for maximum equalization | 5 | 1.40 | 1.41 | 0.66 | -29.37 | -23.2 | -22.2 |
| <i>Method 3.6.2:</i><br>4-tap PAE, 7-tap DLE, PAE obtained<br>from splitting method | 5 | 1.75 | 1.52 | 0.77 | -26.9 | -24.6 | -22.5 |
| <i>Method 3.6.3:</i><br>4-tap PAE, 7-tap DLE, adapted by both<br><i>Global Genetic</i> and <i>Gradient Search</i> | 5 | 1.67 | 1.46 | 0.71 | -27.5 | -24.6 | -22.9 |

Table 3.3: Comparison of PAE optimization methods for a 5-bit ADC system

The third row utilizes the method of splitting the zeros of a single equalizer (method 3.6.2). The ISI cancellation is as good as the first row at the cost of greater quantization noise, compared to the second row. The last row shows the results of the design using *Global Gradient Search* algorithm combined with *Genetic Search* algorithm (method 3.6.3). The results are now more optimized according to both the remaining ISI and the quantization noise. The difference between the final results of different methods compared to the

overall improvement, versus the case without PAE (first row), is not considerable. However, for other architectures with different ADC resolutions and filter orders, the results can be more variable. In conclusion, the first suboptimal method, i.e. optimizing the PAE for maximum independent equalization, could be used. However, in the system-level design, we must ensure that the results are not far from a possible optimum result, and this can be verified through method *3.6.3*.

Fig. 3.14 shows the location of the zeros of the 4-tap PAE and 7-tap DLE for the above examples. If we consider the results of the last row design as the optimal location of the zeros, this figure shows how close to optimum the other methods are.

## 3.7 Two-tap PAE: An Efficient Choice

Figure 3.14: Zero locations of examples discussed in Table 3.3.

Fig. 3.15 shows a comparison of the error performance improvement obtained by PAEs of different orders, for different ADC resolutions, in our target application, i.e. 622 Mb/s 4-level PAM data transmission over a 300-m coaxial cable with -27 dB additive channel noise. The criterion used here is the relative root mean square error (*RRMSE*). As previously mentioned, RRMSE is the residual RMS error before the slicer, relative to the distance between the PAM modulation levels. *RRMSE* $\approx$ 0.1 results in a bit error rate of about 10<sup>-7</sup> for a Gaussian residual error. The DLE used in this simulation had 11 taps with 7-bit coefficients. It was observed that further increasing the DLE order or its coefficient resolution has a negligible effect on the error performance in the current system.
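The relation between RRMSE and error rate quoted above can be checked with a short calculation. The sketch assumes a zero-mean Gaussian residual error, decision thresholds midway between adjacent PAM levels, and Gray mapping (so that a symbol error causes roughly one bit error); the function names are introduced here for illustration.

```python
import math

def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def pam_symbol_error_rate(rrmse, levels=4):
    """SER of M-PAM when the RMS error equals rrmse times the level spacing."""
    # A decision error occurs when the error exceeds half the level spacing.
    threshold_in_sigmas = 0.5 / rrmse
    # Inner levels have two neighbours, outer levels one: average factor 2(M-1)/M.
    return 2.0 * (levels - 1) / levels * q_function(threshold_in_sigmas)

ser = pam_symbol_error_rate(0.1)
ber = ser / math.log2(4)      # assumes Gray mapping
print(f"SER = {ser:.2e}, BER = {ber:.2e}")
```

With RRMSE = 0.1 the threshold lies five standard deviations away, which gives an error rate of the order of $10^{-7}$, consistent with the statement above.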

In Fig. 3.15(a) it is notable that, for a 5-bit ADC, most of the improvement is achieved by the lowest-order partial analog equalizer, with only a slight extra improvement from each additional PAE order. Fig. 3.15(b) demonstrates the same for different ADC resolutions. As we can see, using a PAE is advantageous when a low-resolution ADC is used, because for a large number of bits the total error is dominated by the remaining ISI and channel-noise errors rather than by quantization error. An important observation in Fig. 3.15(b) is that, for $RRMSE \approx 0.1$, a 2-tap PAE with a 6-bit ADC performs as well as a 9-bit ADC without PAE. This is equivalent to a 3-bit improvement in ADC performance, obtained by adding a low-cost, low-order analog pre-processor.

The reason for attaining a major improvement using a low order PAE is that the PAE does not have to do fine equalization and to resemble the channel inverse precisely. Based on (3.13), PAE is mainly responsible for reducing the crest factor and the amount of noise boost by DLE. Thus, a rough analog pre-processor such as a 2-tap analog FIR can do this to a great extent. Fig. 3.16(a, b) compares the channel characteristics of a 300-m coaxial



*Figure 3.15:* (a) RRMSE improvement for Fig. 3.8 architecture with 5-bit ADC for different PAE orders. (b) A comparison of RRMSE improvement for different PAE orders and different ADC resolutions.

cable and its first order discrete time approximation. The inverse of this first-order approximation results in an efficient partial equalizer in the form of

$$G(z) = K(1 - az^{-1}).$$
(3.19)

The impulse response after the channel and partial equalizer is shown in Fig. 3.16(c). Although some ISI components still exist, a major portion of them is cancelled. Thus, following the discussion in part 3.4, a considerably lower resolution ADC fulfills the quantization requirement for the system.



*Figure 3.16:* The channel characteristics of a 300-m coaxial cable and its first order discrete time approximation. (a) Frequency Responses. (b) Impulse responses. (c) Equalized impulse response after a 2-tap PAE.

Regarding the resolution of the PAE coefficients, their requirements are fairly relaxed because the DLE filter is adapted afterwards. Moreover, the importance of the PAE is most evident for the worst-case ISI, which occurs for the longest channel. For shorter channels a rough setting of the PAE coefficients is sufficient because of the lower ADC resolution requirement. Fig. 3.17 shows the effect of coefficient variation on the error performance for a 2-tap PAE when the channel length is set to its maximum, i.e. 300 m. As we can see, a change of up to 10% in the PAE coefficient (due to quantization or circuit implementation error) does not considerably affect the advantage of having the PAE. Given that ADC resolution requirements are more critical for longer channels, the PAE coefficient setting (or quantization) can be performed non-uniformly by assigning more PAE coefficient settings to longer channels. As will be mentioned in Chapter 4, with a 2-tap PAE for our target application, 4 different settings for the coefficient *a* are chosen such that, when the channel length is close to its longest value, the PAE coefficient settings are not more than 10% apart.



Figure 3.17: The effect of coefficient variation of a 2-tap PAE on RRMSE.

#### **3.7.1** Using Decorrelation Concept for a 2-tap PAE Design

Because of the simplicity of the first-order approximation mentioned above, we may reconsider the approaches to designing the PAE for this simpler case. One new approach is to consider the 2-tap PAE as a *first order decorrelator* [16]. In essence, the independent, uniformly distributed symbols at the input of the channel become highly correlated at the output of the channel because of ISI. Removing the correlation of the symbols at the output of the channel is the same as flattening the power spectrum of the incoming signal [30][33]. A baud-rate equalizer does the same as part of the Nyquist condition [7], and essentially, the output symbols of an ideal equalizer are independent. However, a decorrelator is not necessarily an optimum equalizer, since decorrelation concerns only the magnitude of the signal spectrum, whereas the Nyquist condition relates to both the phase and the magnitude of the signal. Nevertheless, by assuming that the output of the channel is a filtered random process, a 2-tap decorrelator is sufficiently close to a *partial coarse* equalizer, playing the role of a whitening filter. Having an FIR baud-rate equalizer is equivalent to assuming that the channel is modeled by an all-pole discrete filter and, consequently, that the output of the channel is an auto-regressive (AR) random signal, as shown in Fig. 3.18. Note that, if necessary, by cascading some extra delay cells we can include precursor ISI in the model too. However, in the case of 2-tap equalization, the majority of the cancelled ISI components are post-cursor. This is particularly true in a baud-rate system without an analog matched filter. In this case, the approximate model of the channel is as shown in Fig. 3.18(a) and we can write its output as

$$y(n) = x(n) + ay(n-1).$$
 (3.20)

Thus a 2-tap partial equalizer (or decorrelator here) would be the inverse of the channel as

$$\frac{1}{C(z)} = 1 - az^{-1}.$$
 (3.21)

By assuming the simple model in (3.20) the correlation between neighbouring output samples can be calculated as

$$R_{yy}(1) = E[y(n)y(n-1)] = a\sigma_{y}^{2}$$
(3.22)



Figure 3.18: The channel modeled by an: (a) AR(1) process and (b) AR(N) process.

and a recursive calculation gives:

$$R_{yy}(k) = E[y(n)y(n-k)] = a^{|k|} \sigma^2_y.$$
(3.23)

Therefore, the coefficient *a* in the decorrelator (3.21) can be well estimated from (3.22) as an autocorrelation factor<sup>1</sup>:

$$a = \rho = E\left(\left(\sum_{i} y(i)y(i-1)\right) / \left(\sum_{i} |y(i)|^{2}\right)\right)$$
(3.24)

The above is another approach to finding the zero, i.e. the single coefficient, of a 2-tap PAE. It does not need a training sequence, and the coefficient can be estimated directly from the channel output samples. The simulation results for our example application (622 Mb/s over 300-m coaxial cable) show that the result of this approach is sufficiently close to the optimum solution.
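The blind estimate in (3.24) is easy to try on synthetic data. The following sketch generates a channel output from the AR(1) model in (3.20) with a hypothetical coefficient of 0.8 and 4-level PAM symbols, estimates the coefficient from the output samples only, and then applies the resulting 2-tap decorrelator of (3.21). The expectation in (3.24) is replaced by a single long-record average, which is an assumption of this sketch.

```python
import random

def ar1_channel(symbols, a):
    """Channel model of (3.20): y(n) = x(n) + a * y(n-1)."""
    y, prev = [], 0.0
    for x in symbols:
        prev = x + a * prev
        y.append(prev)
    return y

def lag1_correlation(y):
    """Sample version of (3.24): sum y(i) y(i-1) / sum y(i)^2."""
    num = sum(y[i] * y[i - 1] for i in range(1, len(y)))
    den = sum(v * v for v in y)
    return num / den

rng = random.Random(1)
symbols = [rng.choice([-3.0, -1.0, 1.0, 3.0]) for _ in range(20000)]
a_true = 0.8                       # hypothetical channel coefficient
y = ar1_channel(symbols, a_true)

# Blind estimate of the decorrelator coefficient (no training sequence used).
a_hat = lag1_correlation(y)

# 2-tap decorrelator 1 - a_hat * z^-1 applied to the channel output.
decorrelated = [y[0]] + [y[i] - a_hat * y[i - 1] for i in range(1, len(y))]
rho_after = lag1_correlation(decorrelated)
print(f"a_hat = {a_hat:.3f}, lag-1 correlation after decorrelator = {rho_after:.3f}")
```

For a channel that follows the AR(1) model exactly, the estimate approaches the true coefficient and the neighbouring-sample correlation after the decorrelator is close to zero. For a real cable the agreement depends on how realistic the AR(1) model is.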

#### **3.7.2** Using the PAE Inverse in the Digital Domain

When the analog preprocessor or PAE is designed with the goal of reducing the ADC bit requirement, a low-order rough pre-equalizer provides a major improvement, but the ISI error and channel noise enhancement may exceed their optimum values because of the limited order of the digital equalizers. To prevent this, we can add an extra post digital filter, equal to the inverse of the PAE, before the DLE. For the 2-tap case, the ADC block is basically replaced by the subsystem shown in Fig. 3.19. In this way, the digital communication system function is the same as in the case without an analog preprocessor, and the ISI error and channel noise enhancement remain relatively unchanged. With regard to the

<sup>1.</sup> An ideal estimation must satisfy both (3.22) and (3.23) and this depends on how realistic the AR(1) model in (3.20) is.



Figure 3.19: ADC quantization performance improvement using an analog decorrelator

ADC and quantization noise performance, the system is quite different. In Fig. 3.19 the ADC input has a lower crest factor; thus, less additive quantization noise is introduced by the ADC. Also, the quantization noise is strongly attenuated by the post digital filter, particularly at high frequencies. It should be mentioned that this architecture (Fig. 3.19) is mainly applicable when the PAE is a 2-tap analog decorrelator. This is because the magnitude of its zero is equal to the correlation factor in (3.24), which is less than one. Therefore, the PAE zero is inside the unit circle and the PAE inverse is stable and feasible to implement.
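The effect of the pre-filter / ADC / post-filter arrangement can be sketched on the AR(1) channel model. All values below are hypothetical (channel coefficient 0.8, PAE coefficient 0.75, 5-bit ADC), the ADC full scale is assumed to track the peak of its input as an idealized gain control, and channel noise is omitted, so the numbers are illustrative and are not the results of this chapter.

```python
import math
import random

def quantize(v, bits, full_scale):
    """Uniform mid-rise quantizer with clipping at +/- full_scale."""
    step = 2.0 * full_scale / (2 ** bits)
    half = 2 ** (bits - 1)
    k = max(-half, min(half - 1, math.floor(v / step)))
    return (k + 0.5) * step

def adc(signal, bits):
    """ADC whose full scale equals the signal peak (idealized gain control)."""
    full_scale = max(abs(v) for v in signal)
    return [quantize(v, bits, full_scale) for v in signal]

def mean_square(values):
    return sum(v * v for v in values) / len(values)

rng = random.Random(2)
symbols = [rng.choice([-3.0, -1.0, 1.0, 3.0]) for _ in range(20000)]
a_channel, alpha, bits = 0.8, 0.75, 5     # hypothetical values

# AR(1) channel model as in (3.20).
y, prev = [], 0.0
for x in symbols:
    prev = x + a_channel * prev
    y.append(prev)

# Reference: quantize the channel output directly.
noise_direct = mean_square([q - v for q, v in zip(adc(y, bits), y)])

# Analog pre-filter 1 - alpha z^-1, ADC, then digital post-filter 1/(1 - alpha z^-1).
pre = [y[0]] + [y[n] - alpha * y[n - 1] for n in range(1, len(y))]
rec, prev = [], 0.0
for q in adc(pre, bits):
    prev = q + alpha * prev
    rec.append(prev)
noise_prepost = mean_square([r - v for r, v in zip(rec, y)])

crest_direct = max(abs(v) for v in y) / math.sqrt(mean_square(y))
crest_pre = max(abs(v) for v in pre) / math.sqrt(mean_square(pre))
print(crest_direct, crest_pre, noise_direct, noise_prepost)
```

Because the post filter exactly inverts the pre-filter, the reconstructed signal differs from the channel output only by the filtered quantization noise, which makes the two noise powers directly comparable.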

The mismatch between the values of $\alpha$ on the analog and digital sides in Fig. 3.19 can be a concern. Table 3.4 shows a comparison of the receiver RRMSE for 0 to 10 percent of $\alpha$ mismatch. For low mismatch (up to 2%), the RRMSE remains low. For a larger mismatch between the values of $\alpha$, the RRMSE can be considerable. However, the promising point is that adaptation of the digital equalizer can compensate for this mismatch effect to a great extent (last column of the table). It should be noted that in the current design, the

| Equalizer         | Fixed | Fixed | Fixed | Fixed | Fixed | Adapted after<br>adding mismatch |
|-------------------|-------|-------|-------|-------|-------|----------------------------------|
| Mismatch          | 0%    | 1%    | 2%    | 5%    | 10%   | 10%                              |
| Relative<br>RRMSE | 0.085 | 0.085 | 0.091 | 0.120 | 0.210 | 0.087                            |

Table 3.4: The effect of mismatch of  $\alpha$  values before and after ADC on the relative RRMSE

value of $\alpha$ on the analog side is the ratio of the gains of two transconductance amplifiers, and it is mainly determined by the ratio of their degeneration resistors. These resistors are carefully laid out beside each other in an inter-segmented form. Therefore, given the mismatch properties of CMOS technology [37], the variation of the analog $\alpha$ is not expected to be more than 10%.

#### **3.7.3** Comparison with predictive coding systems (ADPCM)

A 2-tap PAE/decorrelator is similar to the predictive differential coding systems [33] in which the input samples are predicted by the previous samples with the aid of correlation information among them. Generally, in predictive coding systems, the prediction is done in a feedback loop (Fig. 3.20(b)) rather than in a feedforward loop (Fig. 3.20(a)). In this case, the quantizer can also be moved into the feedback loop (Fig. 3.20(c)). As a result, the quantization noise will be divided by the loop gain as well.





*Figure 3.20:* Comparison of (a) feedforward and (b) feedback predictive systems. (c) The Quantizer (ADC) is placed inside the feedback loop (ADPCM). (d) Post digital filter (reconstructing the original signal).

If the quantizer is placed inside the prediction feedback loop, after signal reconstruction by the post digital filter (Fig. 3.20(d)), the quantization noise remains unchanged. However, it should be noted that for highly correlated signals the quantization noise is much smaller because just the signal  $x_d(n) = x(n) - \hat{x}(n)$  with variance  $\sigma_{x_d}^2 \ll \sigma_x^2$  is being quantized by the ADC. As a result, the quantization noise is reduced by a factor of  $\sigma_{x_d}^2/\sigma_x^2$ , known as the prediction gain.
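The behaviour of the feedback predictive loop with the quantizer inside it (Fig. 3.20(c) and (d)) can be sketched as follows. The example assumes the AR(1) signal model with a hypothetical coefficient of 0.8, a first-order predictor with the same coefficient, and a 5-bit quantizer whose full scale of 4 is chosen by hand so that the difference signal never overloads it.

```python
import math
import random

def quantize(v, bits, full_scale):
    """Uniform mid-rise quantizer with clipping at +/- full_scale."""
    step = 2.0 * full_scale / (2 ** bits)
    half = 2 ** (bits - 1)
    k = max(-half, min(half - 1, math.floor(v / step)))
    return (k + 0.5) * step

rng = random.Random(3)
symbols = [rng.choice([-3.0, -1.0, 1.0, 3.0]) for _ in range(20000)]
a = 0.8                                   # hypothetical correlation coefficient

# Highly correlated input signal (AR(1) model).
x_in, prev = [], 0.0
for s in symbols:
    prev = s + a * prev
    x_in.append(prev)

bits, full_scale = 5, 4.0                 # assumed quantizer range for the difference
step = 2.0 * full_scale / (2 ** bits)

recon, diffs, recon_prev = [], [], 0.0
for v in x_in:
    pred = a * recon_prev                 # prediction from the reconstructed sample
    d = v - pred                          # difference signal x_d(n)
    dq = quantize(d, bits, full_scale)    # quantizer inside the loop
    recon_prev = pred + dq                # reconstruction, as in the post digital filter
    recon.append(recon_prev)
    diffs.append(d)

max_err = max(abs(r - v) for r, v in zip(recon, x_in))
variance_ratio = (sum(d * d for d in diffs) / len(diffs)) / (
    sum(v * v for v in x_in) / len(x_in))
print(max_err, step / 2, variance_ratio)
```

The reconstruction error never exceeds half a quantizer step, i.e. it equals the quantization error of the difference signal, while the quantizer range only has to cover the much smaller difference signal. The printed variance ratio corresponds to the factor $\sigma_{x_d}^2/\sigma_x^2$ in the text.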

The analog implementation of the feedback predictive coding (Fig. 3.20(d)) is not always feasible at high speeds, due to the large loop delay caused by ADC and the loop filter. This delay should be smaller than the symbol period. However, the possible ways of combating the loop delay problem, such as using parallel fast circuits, can be considered as an open research topic [38][39].

In feedback predictive systems, the prediction filter uses the quantized samples to predict the next sample. In the feedforward method, the non-quantized samples are used instead. Therefore, in the process of post digital filtering, the variance of the quantization noise will be increased again, but with a *different shape* in the frequency domain. Since the post digital filter is lowpass, the quantization noise will be colored and will be highly reduced at high frequencies. This causes the digital equalizer, which boosts the higher frequency components, to have a reduced enhancement effect on the colored quantization noise. Interestingly, the concept of quantization noise shaping is also used in delta-sigma modulator ADCs, but in a different way [2][3][29].

Although in feedforward decorrelation (Fig. 3.20(a)) the reduction of the quantization error effect is due to the reduction of the crest factor and the DLE noise enhancement factor, the power of the colored quantization noise may also be reduced. To evaluate this, we compare the variance of the final quantization noise after the post digital filter, ${\sigma'}_{q_d}^2$, with the original quantization noise $\sigma_q^2$ of the system with no analog prediction or preprocessing. Recalling

$$\sigma_q^2 = \varepsilon_q^2 2^{-2R} \sigma_x^2, \tag{3.25}$$

and similarly

$$\sigma_{q_d}^2 = \varepsilon_{q_d}^2 2^{-2R} \sigma_{x_d}^2, \tag{3.26}$$

and ignoring the change in the quantizer factor, i.e. taking $\varepsilon_{q_d}^2 \approx \varepsilon_q^2$, we have

$$\frac{\sigma_{q_d}^2}{\sigma_q^2} = \frac{\sigma_{x_d}^2}{\sigma_x^2}. \tag{3.27}$$

The quantization noise after the digital post filter $1/G(e^{j\omega})$ is

$${\sigma'}_{q_d}^2 = \frac{1}{2\pi} \int_{2\pi} \frac{\sigma_{q_d}^2}{|G(e^{j\omega})|^2} d\omega. \tag{3.28}$$

Thus, the overall change in the quantization noise is

$$\frac{{\sigma'}_{q_d}^2}{\sigma_q^2} = \frac{\sigma_{x_d}^2}{\sigma_x^2} \cdot \frac{1}{2\pi} \int_{2\pi} \frac{d\omega}{|G(e^{j\omega})|^2} = \frac{\frac{1}{2\pi} \int_{2\pi} \frac{d\omega}{|G(e^{j\omega})|^2} \cdot \int_{2\pi} S_{xx}(e^{j\omega}) |G(e^{j\omega})|^2 d\omega}{\int_{2\pi} S_{xx}(e^{j\omega}) d\omega}. \tag{3.29}$$

The above value is equal to one when $|G(e^{j\omega})|^2 = S_{xx}(e^{j\omega})^{-1}$. For some other choices of $|G(e^{j\omega})|^2$, as was found in our simulation results, the factor can be less than 1. This implies that, in such cases, the quantization noise is reduced further.
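The factor in (3.29) can be evaluated numerically for a simple case. The sketch below assumes an AR(1) input spectrum $S_{xx}(e^{j\omega}) = 1/|1 - ae^{-j\omega}|^2$, i.e. the model of (3.20) driven by white unit-variance symbols, and a 2-tap pre-filter $G(z) = 1 - \alpha z^{-1}$. The values $a = 0.8$ and $\alpha = 0.5$ are hypothetical, and the integrals are approximated by sums over a uniform frequency grid.

```python
import cmath
import math

def noise_change_factor(a, alpha, n=4096):
    """Evaluate the right-hand side of (3.29) on a uniform grid over one period."""
    sum_inv_g2 = 0.0    # approximates the integral of 1/|G|^2
    sum_s_g2 = 0.0      # approximates the integral of S_xx |G|^2
    sum_s = 0.0         # approximates the integral of S_xx
    for k in range(n):
        e = cmath.exp(-1j * 2.0 * math.pi * k / n)
        sxx = 1.0 / abs(1.0 - a * e) ** 2
        g2 = abs(1.0 - alpha * e) ** 2
        sum_inv_g2 += 1.0 / g2
        sum_s_g2 += sxx * g2
        sum_s += sxx
    # The grid spacing cancels between the two S_xx integrals.
    return (sum_inv_g2 / n) * sum_s_g2 / sum_s

matched = noise_change_factor(0.8, 0.8)   # |G|^2 proportional to 1/S_xx
weaker = noise_change_factor(0.8, 0.5)    # a milder pre-filter
print(f"matched: {matched:.4f}, weaker pre-filter: {weaker:.4f}")
```

For the matched case the factor evaluates to one, as stated above. For this particular model a milder pre-filter gives a factor below one; for the AR(1) spectrum the sums reduce to the closed form $(1 + \alpha^2 - 2a\alpha)/(1 - \alpha^2)$, which is a property of this toy model and is not taken from the simulations of this chapter.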

#### 3.7.4 PAE Usage in FFE and FFE/DFE Architectures and Comparisons

Decision feedback equalizers (DFE) also reduce channel and quantization noise enhancement. However, they can only eliminate post-cursor ISI, and their practical use is limited by error propagation, initialization and clock recovery difficulties. Although the advantage of using a PAE is greatest in an FFE system, a PAE can be considerably beneficial in an FFE/DFE receiver as well. To investigate this, two receivers for our target application, FFE and FFE/DFE, are shown in Fig. 3.21. Using the inverse of the PAE simplifies the equalization performance analysis, as the signal remains unchanged by the process of quantization with pre/post filtering. In this case, the equalization task can be freely divided between the FFE and DFE filters, independently of the pre/post filters, for the best ISI and channel noise reduction performance. With PAE pre-filtering, some equalization is done partially, but temporarily, before the ADC and is neutralized afterwards by the PAE inverse. The pre-filtering reduces the crest factor and the post-filtering colors the quantization noise to a lowpass shape, so that it is not significantly enhanced by the FFE.




*Figure 3.21:* Receiver architectures including ADC with pre-filter (analog) and post-filter (digital). (a) FFE. (b) FFE/DFE.

Fig. 3.22 shows the frequency responses of the post digital filter, the feedforward equalizers and their combinations for both the FFE/DFE and FFE architectures in our target application. The numbers on the curves show the change in white quantization noise power after each filter, with the signal power at the ADC input and at the slicer output normalized to one. It can be seen that the feedforward filter in the FFE/DFE architecture has 3.6 dB less noise enhancement than in the FFE-only system.

The post digital filter $1/G(z)$ has a sharp lowpass shape and gives a 2 dB quantization noise reduction, due to the prediction gain from equation (3.29). In both the FFE/DFE and FFE architectures, the post digital filter and the feedforward equalizer cancel each other's boosting part when combined. As a result, there is a quantization noise power reduction of 10.6 dB and 7.5 dB for the FFE and FFE/DFE architectures, respectively. As we can see, temporary post-cursor equalization of the input signal gives a significant reduction in quantization noise after the FFE, while the advantage of the DFE in the overall system, i.e. less channel noise enhancement, is not affected.

Fig. 3.23 illustrates this in terms of the power spectrum of the quantization noise as it travels through the FFE and FFE/DFE architectures of Fig. 3.21, with and without pre/post filtering, at different points: after the ADC, after the digital post filter, and after the feedforward equalizer filter (before the slicer). The total quantization noise variance, i.e. the integral of each PSD curve, is given in the figure. The PSD curves are obtained through Welch's averaged periodogram method with Gaussian windowing [28][40].

*Figure 3.22:* Frequency response of the post digital filter and the feedforward equalizers and their combinations for both (a) FFE/DFE and (b) FFE architectures in Fig. 3.21.

| Curve                                      | Marker        |
|--------------------------------------------|---------------|
| FFE filter                                 |               |
| PAE inverse or post digital filter         | $- \bullet -$ |
| Combination of FFE and post digital filter |               |

As seen in Fig. 3.23(a1,a2), having the analog pre-filter initially reduces the quantization noise by 7.8 dB due to crest factor reduction. Moreover, the quantization noise power is reduced in Fig. 3.23(a3,b3) because of the shaping of the noise by the post digital filter before its enhancement by FFE in both the FFE/DFE and FFE schemes. It is evident that we achieve more performance improvement in FFE-only systems compared to FFE/DFE systems. Nevertheless, the improvement in the DFE system is still remarkable and depends on the amount of non-flatness of the FFE spectrum.



*Figure 3.23:* Power spectral density of the quantization noise in: (a) FFE/DFE (b) FFE architectures (shown in Fig. 3.21) at different points: (1) After ADC, (2) After post digital filter (3) After feedforward equalizer (before slicer), with the variety shown below.



## **3.8 Simulation Results**

#### **3.8.1** Quantization Noise Reduction Demonstration

Fig. 3.24 demonstrates the quantization noise power reduction obtained with low-order 2-tap pre/post filtering. A strong low-frequency tone combined with a weaker high-frequency tone represents a signal with a lowpass shape at the receiver input. The PAE/decorrelator pre-filter (with no gain) reduces the amplitude of the signal at lower frequencies. A gain amplifier then restores the original amplitude, but with a stronger high-frequency tone. The signal is then fed into a 5-bit ADC followed by the post digital filter. The result, shown in Fig. 3.24(f), is the reconstructed and quantized form of the original input and can be compared with the regular 5-bit quantized form of the input in Fig. 3.24(b). The former preserves high-frequency details better because the post filter attenuates the quantization error at high frequencies. Fig. 3.24(c) and (g) show the final results after the FFE filter (from the FFE/DFE system). As we can see, there is significantly less noise power in the pre/post filtering architecture than with direct quantization.

*Figure 3.24:* Demonstration of quantization noise power reduction when a low-order pre/post filtering processor is employed.

#### 3.8.2 Using 2-tap PAE in Coaxial Channel Application

Fig. 3.25 shows the relative RMSE (RRMSE) versus the number of ADC bits for two receivers, with and without PAE filters, in a 4-PAM 300-m coaxial application at symbol rates of 300 MS/s and 150 MS/s. For  $RRMSE \cong 0.10$  ( $SER \cong 10^{-7}$ ), the receiver with PAE needs 2.5 bits less resolution than the receiver without PAE in the 300-MS/s case. The same comparison at 150 MS/s shows a resolution reduction of 1.5 bits. The PAE is less advantageous at 150 MS/s because the ISI is lower: the correlation factor between the input samples drops from 0.85 to 0.7. It can also be observed that, at a large number of bits, the error reduction is not evident because the RRMSE is saturated by the residual ISI and additive noise.
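As a rough illustration of why the correlation factor matters, the following sketch computes the power reduction of an ideal first-order decorrelator $1 - \rho z^{-1}$ applied to a stationary signal whose lag-one correlation coefficient is $\rho$. The function names are illustrative. This covers only the variance term; it is not the bit-requirement formula of this chapter, which also accounts for the crest factor and the equalizer noise enhancement, so the numbers are smaller than the 2.5 and 1.5 bits quoted above.

```python
import math

def decorrelator_power_ratio(rho):
    """Output/input power of x[n] - rho*x[n-1] for a stationary signal
    whose lag-one correlation coefficient is rho."""
    return 1.0 - rho ** 2

def equivalent_bits(rho):
    """Bits corresponding to that power reduction, at about 6.02 dB per bit."""
    gain_db = -10.0 * math.log10(decorrelator_power_ratio(rho))
    return gain_db / (20.0 * math.log10(2.0))

# Correlation factors quoted in the text for 300 MS/s and 150 MS/s.
for rho in (0.85, 0.70):
    print(f"rho = {rho:.2f}: power ratio = {decorrelator_power_ratio(rho):.4f}, "
          f"about {equivalent_bits(rho):.2f} bits")
```

The trend matches the text: a higher correlation between consecutive samples leaves less power after decorrelation, so the pre-filter helps more at 300 MS/s than at 150 MS/s.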



*Figure 3.25:* Comparison of RRMSE versus number of ADC bits with and without decorrelator for 4-level PAM transmission over 300-m coaxial cable at different speeds. (a) 300 MS/s. (b) 150 MS/s.

 $-\circ$  Without decorrelator  $-\Box$  With decorrelator

## 3.9 Summary

In this chapter we reviewed the importance and the effect of the front-end analog-to-digital converter resolution in a baseband digital communication receiver and discussed approaches to reduce this resolution requirement through analog preprocessing. The quantization noise and its effect on the symbol error rate were discussed, as was the estimation of the signal crest factor before the ADC with and without analog preprocessing. A formula was proposed to estimate the number of ADC bits required for a given symbol error rate and receiver architecture. Based on this, the advantages and disadvantages of full analog equalization and partial analog equalization were investigated. Partial equalization was shown to be a better practical solution for reducing the ADC resolution requirement. Different approaches to optimizing the split of the equalization task between the partial analog equalizer (PAE) and the digital linear equalizer (DLE) were proposed and compared.

It was shown that a low order analog filtering block is enough to create significant improvement in the ADC resolution performance. Specifically, a two-tap partial equalizer was demonstrated, and discussed in detail, as a very efficient analog preprocessing choice to reduce the ADC requirements. Two-tap analog pre-filtering enables us to perform the inverse of the pre-filtering in the digital side. Thus, other efficient equalization architectures such as DFE with low order filters and less channel noise enhancement can be employed as well.

For the case of a 2-tap PAE, the single zero of the filter can be estimated through a decorrelation factor as well. The 2-tap analog PAE/decorrelator reduces the bit requirement of the analog-to-digital converter by up to 2.5-3 bits for a 622-Mb/s data rate over a 300-m coaxial cable with a 4-level PAM scheme. This improvement is due to the reduction of the signal crest factor before the ADC and of the quantization noise enhancement in the digital equalizer after the ADC. The implementation and verification of the PAE and ADC will be reviewed in the next chapters.

## 4.1 Introduction

According to the results in the previous chapter, a 2-tap PAE or decorrelator followed by a 6-bit ADC, as shown in Fig. 4.1, is an efficient front-end choice for our target cable application. This front-end performs better than one with an 8-bit ADC alone. Here, the implementation issues for this analog front-end design are discussed. In the next chapter, the experimental results of the prototype chip fabricated in a 0.18-µm CMOS technology will be presented.

## 4.2 Partial Analog Equalizer Design

A two-tap PAE is essentially an analog FIR filter of the form

$$H(z) = G(1 - \alpha z^{-1}), \qquad (4.1)$$



Figure 4.1: Analog front-end top-level block diagram.

where G is the gain adjustment factor that compensates for the signal amplitude attenuation caused by the  $(1 - \alpha z^{-1})$  decorrelation. This makes the signal amplitude conform to the ADC input dynamic range. Several analog FIR filters with different numbers of taps have been reported [8][6][9][41]. However, in the case of a 2-tap PAE, because only a single coefficient is adjusted, we can consider a more specific architecture and thus achieve higher speed and performance.

As seen in Fig. 4.1, the building blocks for the 2-tap PAE consist of a delay line, a coefficient  $\alpha$  multiplier, a subtractor, and finally, an  $\alpha$ -dependent gain adjuster.
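The behaviour of (4.1) and of its digital inverse can be sketched in discrete time as follows. This is a minimal behavioural model, not the circuit: the function names, the zero initial state, and the peak-matching rule used to pick G are assumptions made for illustration.

```python
def pae(x, alpha, gain):
    """Two-tap FIR of (4.1): y[n] = gain * (x[n] - alpha * x[n-1])."""
    y, prev = [], 0.0  # assumed zero initial state
    for sample in x:
        y.append(gain * (sample - alpha * prev))
        prev = sample
    return y

def pae_inverse(y, alpha, gain):
    """Digital inverse 1/H(z): x[n] = y[n] / gain + alpha * x[n-1]."""
    x, prev = [], 0.0
    for sample in y:
        prev = sample / gain + alpha * prev
        x.append(prev)
    return x

def peak_restoring_gain(x, alpha):
    """Hypothetical choice of G: scale the output peak back to the input peak."""
    unscaled = pae(x, alpha, 1.0)
    return max(abs(v) for v in x) / max(abs(v) for v in unscaled)

samples = [0.0, 0.4, 0.7, 0.9, 1.0, 0.9, 0.7, 0.4, 0.0, -0.4]
alpha = 0.85
G = peak_restoring_gain(samples, alpha)
restored = pae_inverse(pae(samples, alpha, G), alpha, G)
print(f"G = {G:.2f}")
```

Because the slowly varying input loses most of its amplitude in the subtraction, G comes out larger than one, and the inverse filter recovers the original samples.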

#### 4.2.1 Delay Line Generation Techniques

Fig. 4.2 depicts how interleaved sample-and-hold blocks can be used to generate the delay line. In Fig. 4.2(a), each S/H output is used more than once; therefore, a buffer is required to preserve the held signal. In addition, a switch matrix selects the proper S/H for each output. This architecture saves silicon area by reducing the number of S/H blocks. However, the buffer causes a signal drop during buffering and thus requires the S/H to operate with a larger signal swing, which is more difficult in a low-voltage technology. Furthermore, clock feedthrough from the switch matrix through the parasitics of the buffer can change the held signal before its second use, thereby introducing extra distortion. It should be noted that advanced closed-loop opamp buffers can reduce signal loss and distortion, but they limit the maximum speed compared to their open-loop counterparts.

Figure 4.2: Delay line generation by interleaved sample and hold blocks.

Fig. 4.2(b) shows another delay line generation technique, in which each sampled signal is used only once at the cost of adding more interleaved S/H blocks. In this case, since the held samples are not needed a second time, the addition or subtraction can be done via charge sharing by directly connecting the S/H outputs.

#### 4.2.2 PAE Topology Choices

#### **PAE/Decorrelator Architecture Using Charge Distribution**

The design in Fig. 4.3 is based on the charge distribution of sample-and-hold capacitors [16]. The minimum number of S/H blocks in this architecture is three. The first S/H runs at the main sampling rate and the two others at half the sampling rate. The ratio of the capacitor values in the secondary S/H pairs to the capacitor value in the first S/H determines the value of the coefficient  $\alpha$  in (4.1). Therefore, the secondary S/H pairs are selected among different choices depending on the value of  $\alpha$ . The control signals for this selection come from the digital domain. After the tracking time,  $\overline{\phi_4}$  enables the relevant hold switches, such that the proper S/H block shares its capacitor charge with the main S/H capacitor. After a settling time, the voltage at the subtracting node in Fig. 4.3 is a linear combination of the sampled voltages. The negative sign of the coefficient  $\alpha$  is obtained through an opposite connection of the differential outputs of the S/H blocks. The voltage at the subtracting node is given by

*Figure 4.3:* PAE block diagram using a charge sharing technique.

$$V_{out} = \frac{CV_{t_i} - \alpha CV_{t_{i-1}} + C_p V_p}{C + \alpha C + C_p}, \qquad (4.2)$$

where  $C_p$  is the parasitic capacitance at the subtracting node and  $V_p$  is its voltage before charge sharing. Although the  $C_p$  value is much smaller than the sample-and-hold capacitances, extra switches are used to force the subtracting nodes to a fixed common-mode voltage during the tracking time. This is done to remove any memory from the previous cycle and to prevent the buffer transistors from being turned off. Thus,  $C_p V_p$  in (4.2) causes a dc shift at the summing node, which is cancelled in differential mode. A differential common-drain buffer reduces  $C_p$  and drives the following variable gain amplifier (VGA). This VGA compensates for the signal loss during the subtraction in order to cover the full ADC input range.

The above architecture is power efficient and good for medium speed and resolution, but at the cost of a large silicon area, due to the need for extra capacitors. Furthermore, as the technology shifts to lower voltages and shorter channel lengths several other concerns exist as well. As seen from (4.2), because of charge sharing we have extra signal loss with a factor of

$$\frac{C}{C + \alpha C + C_p} \cong 0.5. \qquad (4.3)$$

This is in addition to the loss by the decorrelation, i.e.  $V_{in}(1-\alpha z^{-1})$ . Therefore, the need for extra gain in the VGA imposes speed and resolution limitations, particularly at lower voltages. In addition, since the charges in the capacitors are destroyed by charge sharing, the required S/H settling time increases. This can be critical for the speed of low-voltage switches in the sample-and-hold blocks if they do not benefit from a bootstrap technique [42]. It should be noted that bootstrapping is often avoided due to reliability and complexity issues. The extra time for charge sharing and the need for an S/H buffer to reduce the  $C_p$  effect are other concerns. These concerns are relatively common in other forms of switched-capacitor FIR implementation considered as candidates for the PAE architecture [43][41].
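To make the loss in (4.2) and (4.3) concrete, the sketch below evaluates both expressions. The function names and the example values ($\alpha = 0.85$ and a parasitic of 5% of C) are assumptions chosen for illustration and are not taken from the design.

```python
def charge_sharing_output(v_now, v_prev, alpha, c=1.0, c_p=0.0, v_p=0.0):
    """Subtracting-node voltage of (4.2); all capacitances share one arbitrary unit."""
    return (c * v_now - alpha * c * v_prev + c_p * v_p) / (c + alpha * c + c_p)

def charge_sharing_loss(alpha, c=1.0, c_p=0.0):
    """Extra attenuation factor of (4.3) caused by charge sharing."""
    return c / (c + alpha * c + c_p)

# Hypothetical values: alpha = 0.85 and C_p = 0.05 * C.
loss = charge_sharing_loss(alpha=0.85, c_p=0.05)
v_out = charge_sharing_output(v_now=0.30, v_prev=0.25, alpha=0.85, c_p=0.05)
print(f"loss factor = {loss:.3f}, V_out = {v_out * 1e3:.1f} mV")
```

With $V_p = 0$, the output is simply the decorrelated sample $V_{t_i} - \alpha V_{t_{i-1}}$ scaled by the loss factor, which is close to one half for $\alpha$ near one.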

### PAE Architecture Using Current Subtraction and Switch Matrix

Fig. 4.4 shows the 2-tap PAE/decorrelator architecture based on current-mode subtraction. The minimum number of sample-and-hold blocks is two, and their outputs are switched every other cycle by the switch matrix. The input samples and their delayed versions are converted to current by two transconductors and then subtracted from each other. The gain ratio of the two transconductors is equal to the PAE factor  $\alpha$ .



Figure 4.4: Current-mode implementation of the 2-tap PAE/decorrelator.

This architecture is more area efficient due to the low number of S/H blocks. Moreover, the S/H capacitor charges are not destroyed at the end of each clock cycle. The first-stage transconductors need better linearity (about 1-2 bits more than the ADC), because they convert the input samples to current before subtraction. The second-stage Gm-Cell is a variable gain amplifier (VGA) and thus relaxes the gain requirement for the first-stage Gm-Cells, which have to tolerate a larger input swing. In this topology, each S/H output is used more than once; thus, the need for a buffer and the other concerns mentioned for the switch matrix delay line apply here as well (see 4.2.1).

### Chosen Topology for the 2-tap PAE/Decorrelator

Fig. 4.5 depicts the top-level circuit architecture chosen for the 2-tap PAE followed by a 6-bit flash ADC. Two sets of triple-interleaved sample-and-hold blocks provide the baud-rate samples of the input signal at the time instants  $t_i$  and  $t_{i-1}$ . In this way, each S/H capacitor is used once and its charge is not destroyed before the next sampling time. Furthermore, interleaving relaxes the S/H design at low-voltage operation. Two separate transconductors with an adaptive gain ratio equal to the equalization factor (or decorrelation factor)  $\alpha$  convert the consecutive samples to current. In this topology, the VGA is combined with the subtractor stage. Thus, the current signals are first amplified and then subtracted from each other. This improves noise tolerance at the current subtracting nodes. Direct connection of the S/H to the Gm-Cells reduces signal loss and distortion and allows the use of n-channel track-and-hold switches. To reduce the distortion due to mismatch in the interleaved sample-and-holds, a single master clock technique is used, as will be described later.

Figure 4.5: The top-level circuit architecture for the 2-tap partial analog equalizer.

#### 4.2.3 Sample-and-Hold Design

Sample-and-hold (S/H) circuits play a crucial role in the design of data acquisition interfaces. Front-end sample-and-holds are fundamentally difficult to design because they must operate at the extreme edge of the performance envelope. They must simultaneously achieve good linearity, high speed, large voltage swings, high drive capability, and low power dissipation. In low-voltage and high-speed systems, analog sampling becomes more challenging because limited headroom further tightens the trade-offs among the performance parameters. MOS S/H circuits mainly suffer from channel charge injection of the sampling switches and clock feedthrough due to gate-overlap capacitances. Moreover, there are two additional sources of dynamic error: the variation of the switch on-resistance with the input level, and input-dependent sampling instants due to the finite transition time of the sampling clock. When these errors are input-dependent, they create non-linear distortion, which can be more damaging than a fixed offset error.

Although closed-loop S/H circuits [3][44] using opamps alleviate these errors, and particularly their signal dependency, their operating speeds are seriously degraded by the need to guarantee loop stability at the desired loop gain. Therefore, for our target application, which requires medium precision but a high sampling rate, an open-loop design is preferred. Among open-loop S/H circuits there are two conventional architectures, parallel and series, as depicted in Fig. 4.6(a, b) [44]. Series sampling has the advantage of isolated input and output common-mode voltages and less nonlinear charge injection when S2 turns off earlier than S1. However, it has the disadvantages of charge sharing with the nonlinear capacitance  $C_p$  and a longer hold settling time, because the voltage at the output node must settle to a new common-mode value. The S/H in Fig. 4.6(c) shows another architecture, in which S2 turns off slightly earlier than S1, such that the charge injection from S1 into  $C_H$  is prevented [16]. Note that since the charge injection by S2 is signal independent, it is not critical and can be cancelled through differential signaling. Despite the above-mentioned advantages of Fig. 4.6(b, c), the conventional parallel S/H in Fig. 4.6(a) is preferable because of its speed and simplicity. In particular, if the capacitor charge is not destroyed during the hold time, the tracking settling time is considerably reduced.

*Figure 4.6:* (a) Parallel sampling. (b) Series sampling. (c) Parallel sampling with a series switch to reduce the non-linear charge injection effect.

In this design, to meet the speed requirement, a parallel fully-differential S/H with triple-channel interleaving is used. Thus, as was seen in the PAE architecture in Fig. 4.5, there are six S/H blocks in the complete design, only two of which operate in sampling mode simultaneously. The circuit diagram of a single S/H block is shown in Fig. 4.7.

A low common-mode voltage of 450 mV enables the use of NMOS switches with reasonable sizes. The interleaved S/H architecture provides an extended hold time of two full clock cycles: one for the  $z^{-1}$  delay in the PAE and the other for the analog processing by the following subtractor and amplifier stages. In addition, about one clock cycle is assigned to the tracking time, thereby permitting the use of a holding capacitor ( $C_H$ ) as large as 1 pF. A large  $C_H$  reduces the non-linear charge injection effect of the sampling switches, since this charge injection error is inversely proportional to the holding capacitor value [3]. Moreover, it reduces the effect of charge sharing with the input capacitor of the following stage. This input capacitor is mainly the relatively linear gate capacitance of the PMOS input transistor of the next stage, and its initial charge belongs to the previous sample. As a result, the charge sharing between  $C_H$  and the next-stage input capacitor behaves as a discrete-time linear lowpass filter with a relatively large bandwidth, and it can be tolerated. To further reduce the charge injection error, dummy switches (M2, M5) with half the size of the tracking switches, driven by an inverted clock, are placed in series with the sampling switches [3]. Given the low order of the partial equalizer in the next stage and the chosen PAE topology, each S/H has to drive the input of only a single transconductor amplifier. Therefore, with appropriate biasing, it is possible to connect the transconductors of the next stage directly to the S/H blocks. This eliminates the need for an extra buffer, resulting in less signal loss and additive distortion.

Figure 4.7: Sample-and-hold circuit.

### **Mismatch Effects Due to Interleaving**

While interleaving is becoming an effective solution in high speed low voltage data acquisition systems [45], mismatch among different channels severely limits the precision of such systems. These mismatches can be categorized as different offset, gain and clock timing errors among interleaved blocks [11][46].

**Offset Mismatch Effects:** The offset mismatch is due to the different non-differential mismatch effects in the interleaved blocks. It causes fixed-pattern noise at the output that repeats at a rate of  $F_s / M$ , where M is the interleaving factor (M = 3 in this design). This additive noise causes noise peaks at  $f_{noise} = k \times F_s / M$ .

**Gain Mismatch Effects:** Gain mismatch can be caused by various mismatches, such as different charge sharing factors and different time constants. As in the offset mismatch case, the basic error pattern repeats at a rate of  $F_s/M$ , but the magnitude of the error is modulated by the input frequency  $f_{in}$ . Therefore, the noise spectrum peaks due to gain mismatch appear at  $f_{noise} = -f_{in} + k \times F_s/M$ .

**Clock Timing Error Effects:** There are two kinds of clock timing errors: clock skew (systematic error) and clock jitter (random error). Interleaved topologies suffer from extra clock skew effects due to the different skews of the input clocks of the interleaved blocks. Ideally, the sampling edge of each interleaved S/H clock occurs midway between those of the neighboring S/H clocks. In reality, due to mismatches in the frequency-divider components and the parallel clock paths, the sampling edges deviate from their ideal time instants. This timing mismatch causes noise at the S/H output. The largest error occurs where the input signal has the largest slope. For a sinusoidal input, the envelope of the error is largest at the zero crossings, and the error pattern repeats at a rate of  $F_s/M$ . Thus, as with the gain mismatch, the noise spectrum peaks are at  $f_{noise} = -f_{in} + k \times F_s/M$ .
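The spur locations given by these expressions can be tabulated with a short script. This is a sketch under stated assumptions: the helper names are illustrative, the tones are folded into the first Nyquist zone, and the example uses the interleaving factor of this design (M = 3) with a 400-MHz clock and a 102-MHz input tone.

```python
def fold(freq, fs):
    """Fold a frequency into the first Nyquist zone [0, fs/2]."""
    freq = abs(freq) % fs
    return fs - freq if freq > fs / 2 else freq

def offset_spurs(fs, m):
    """Offset-mismatch tones at k*fs/m for k = 0..m-1, folded to baseband."""
    return sorted({round(fold(k * fs / m, fs), 3) for k in range(m)})

def gain_or_skew_spurs(f_in, fs, m):
    """Gain- and timing-mismatch tones at -f_in + k*fs/m for k = 1..m-1.

    k = 0 is omitted because it coincides with the input tone itself."""
    return sorted({round(fold(-f_in + k * fs / m, fs), 3) for k in range(1, m)})

FS_MHZ, F_IN_MHZ, M = 400.0, 102.0, 3
print("offset spurs (MHz):", offset_spurs(FS_MHZ, M))
print("gain/skew spurs (MHz):", gain_or_skew_spurs(F_IN_MHZ, FS_MHZ, M))
```

For these values the offset tones fall at dc and at $F_s/3$, while the gain and timing tones appear on either side of $F_s/3$, offset from it by the input frequency.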

### **Mismatch Effects Reduction**

Several approaches can be taken to reduce the above mismatch effects. These include planning a careful layout of the parallel components to minimize their mismatches, changing the circuit architecture to reduce the number of parallel components and, finally, carrying out background calibration to combat the remaining mismatch effects [47][48][49]. In this design, the interleaved parts are limited to the sample-and-hold (not the entire PAE and ADC), and the resolution required of it is on the order of 8-9 bits. Therefore, the offset and gain mismatches are minimized by using a simple S/H with a minimum number of components and a careful layout. In this regard, the S/H capacitors are built from interdigitated metal layers M2-M5, mainly using lateral capacitances with smaller units, to give better matching properties [50].

The clock timing mismatches can be considerably reduced if a single clock drives all of the interleaved S/H blocks [53]. In this design, the proper edge of a single master clock is assigned to the different S/H blocks rather than using a clock divider. As a result, the mismatch effect is reduced to the mismatch between the clock-pass-control switches and the single switches in the S/H blocks. According to Monte-Carlo simulation results, if the sizes and threshold voltages of these switches vary within three times the technology mismatch standard deviation (approximately 10% at most), the clock skew obtained with this technique is below 5 ps. The worst-case voltage error for a sinusoidal input occurs at its zero crossing and can be found from

$$\frac{\Delta v}{\Delta t}\Big|_{max} = V_{pp} \cdot \pi f_{in} \cos(2\pi f_{in} t)\Big|_{t=0} = V_{pp} \cdot \pi f_{in}. \qquad (4.4)$$

For example, for a typical 600-mV peak-to-peak 100-MHz input and a clock skew of 5 ps, the error voltage is less than 0.94 mV, which is quite tolerable for 8-bit resolution.
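The arithmetic behind this example follows directly from (4.4) and can be checked with the short sketch below. The function names are illustrative, and the comparison against an 8-bit LSB assumes that the converter full scale equals the 600-mV input swing.

```python
import math

def max_slope(v_pp, f_in):
    """Maximum slope (V/s) of a sine with peak-to-peak amplitude v_pp, per (4.4)."""
    return v_pp * math.pi * f_in

def skew_error(v_pp, f_in, skew):
    """Worst-case sampling error (V) for a given clock skew (s)."""
    return max_slope(v_pp, f_in) * skew

error = skew_error(v_pp=0.6, f_in=100e6, skew=5e-12)
lsb_8bit = 0.6 / 2 ** 8  # assumption: full scale equals the 600-mV swing
print(f"error = {error * 1e3:.2f} mV, 8-bit LSB = {lsb_8bit * 1e3:.2f} mV")
```

Under that assumption, the worst-case skew error stays below half of an 8-bit LSB.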

#### **Performance Evaluation**

Fig. 4.8 shows the simulation results for the differential triple-interleaved S/H, with a 400-MHz clock and a 102-MHz input tone. The input tone is chosen such that it is not an integer divisor of the clock frequency, so that the sampling points cover the full amplitude range. Fig. 4.8(a) shows the S/H output in the time domain. As we can see, due to interleaving, almost a full clock cycle of sampled data is available for the analog processing by the following stages, and the worst-case settling time from cycle to cycle is less than 0.1 ns. Fig. 4.8(b) shows the S/H output spectrum. The output SNDR is 66 dB, or 10.7 bits. Note that because of sampling, the third and fifth harmonics are aliased into the in-band frequency range. For instance, the third harmonic at 306 MHz appears at 400 - 306 = 94 MHz. Fig. 4.8(c) shows the same output spectrum in the presence of up to 10% additive random mismatch between the component sizes in all interleaved and differential branches. Various mismatch noise tones can be observed in the output spectrum. The total SNDR in this worst case is about 55 dB, which meets the 8-bit resolution requirement. As expected, due to the master clock technique, the clock mismatch noise component is not the major distortion component, compared to the offset mismatch noise component appearing at  $F_s/3 =$ 133.3 MHz.
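The aliased harmonic positions and the SNDR-to-bits conversions quoted above can be reproduced with a few lines. This is only a sketch: the helper names are illustrative, and the conversion uses the standard relation ENOB = (SNDR - 1.76) / 6.02.

```python
def alias(freq, fs):
    """Frequency at which a tone appears after sampling at fs (first Nyquist zone)."""
    freq = freq % fs
    return fs - freq if freq > fs / 2 else freq

def enob(sndr_db):
    """Effective number of bits from SNDR in dB."""
    return (sndr_db - 1.76) / 6.02

FS_MHZ, F_IN_MHZ = 400.0, 102.0
for order in (3, 5):
    harmonic = order * F_IN_MHZ
    print(f"harmonic {order}: {harmonic:.0f} MHz appears at {alias(harmonic, FS_MHZ):.0f} MHz")
print(f"66 dB is {enob(66):.1f} bits, 55 dB is {enob(55):.1f} bits")
```

The 55-dB worst case corresponds to a little under 9 effective bits, which is consistent with the 8-bit requirement stated for the S/H.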



*Figure 4.8:* Simulation results for the differential triple interleaved S/H, with 400-MHz clock ( $F_s$ ) and 102-MHz input tone ( $f_{in}$ ). (a) Output in time domain. (b) Output spectrum without added mismatch. (c) Output spectrum in the presence of up to 10% additive random mismatch between the component sizes.

#### 4.2.4 Non-overlapping Triple Phase Clock Generation and Digital Controls

Fig. 4.9 shows a simplified single-ended version of the triple-interleaved S/H with the relevant clocking signals. The hold control switches select the proper S/H. It is crucial that their clocking signals do not overlap; otherwise, the sampled signal will be destroyed. Fig. 4.10 depicts the circuit used to generate the non-overlapping triple-phase clock from the main input clock. The three D flip-flops are initialized with the word 100. Then, by rotating that word in each clock cycle, they generate three different clocks with one third of the main clock frequency and a 120-degree phase shift. These clocks enter the following sub-circuit, comprised of three AND gates and three delay cells. This circuit creates non-overlapping intervals at the clock transitions, with a length that depends on the delay durations.
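The behaviour of this clock generator can be sketched with a small discrete-time model. The model is an assumption rather than the circuit of Fig. 4.10: it supposes that each AND gate combines a phase with a delayed copy of the same phase, so that rising edges are pushed back while falling edges stay in place. The function names and the time resolution are illustrative.

```python
def ring_counter_phases(n_cycles, steps_per_cycle=10):
    """Rotate the word 100 through three flip-flops, one shift per clock cycle.

    Each clock cycle is divided into steps_per_cycle time steps."""
    state = [1, 0, 0]
    phases = [[], [], []]
    for _ in range(n_cycles):
        for k in range(3):
            phases[k].extend([state[k]] * steps_per_cycle)
        state = [state[2], state[0], state[1]]  # shift the 1 to the next flip-flop
    return phases

def non_overlap(phase, delay_steps):
    """AND a phase with its delayed copy: rising edges move later, falling edges stay."""
    delayed = [0] * delay_steps + phase[:len(phase) - delay_steps]
    return [a & b for a, b in zip(phase, delayed)]

raw = ring_counter_phases(n_cycles=6)
clocks = [non_overlap(p, delay_steps=2) for p in raw]
active = [sum(levels) for levels in zip(*clocks)]  # phases high at each time step
print("max phases high at once:", max(active))
print("time steps with all phases low:", active.count(0))
```

In this model, at most one phase is high at any time, and the gap between consecutive phases equals the delay, which mirrors the dependence on the delay durations described above.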

The PAE circuit, as shown in Fig. 4.5, includes two sets of triple-interleaved S/H blocks. When the PAE factor  $\alpha$  is set to zero, one of the S/H blocks is turned off by disabling the buffers in Fig. 4.10. This disables the hold control signals. At the same time, the corresponding S/H output is set to  $V_{CM}$  by the extra switch shown in Fig. 4.9. In this case, the PAE consists of one Gm-Cell and acts as a gain amplifier.

Figure 4.9: Simplified schematic of the triple interleaved S/H.

Fig. 4.11 shows a circuit that provides different clocks with proper delays and duty cycles for S/H and the ADC. The master clock, as was shown in Fig. 4.7, is a clock with a large positive duty cycle. This is because its high-level duration is the tracking time, which is close to one full clock cycle. In addition, a multiplexer is used for optimizing the ADC clock delay, by selecting among four different delays via a 2-bit digital control signal.



Figure 4.10: Non-overlapping triple phase clock generator from the input clock.



*Figure 4.11:* The circuit that provides different clocks with proper delays and duty cycles for S/H and the ADC.

#### 4.2.5 The Design of Transconductors

According to the PAE topology in Fig. 4.5, two transconductance amplifiers (Gm-Cells) are needed to amplify the input samples and convert them to current. The Gm-Cell bandwidths should be large enough that the subtraction result settles at the input of the ADC latch comparators within a fraction of a clock cycle. The subtraction of the two currents must have a resolution of at least 6 bits. Note that when two correlated signals are subtracted from each other, the resulting SNR degradation depends on the correlation between their noise and distortion components. In the Gm-Cell design here, the main resolution limitation is due to non-linear distortion terms. However, since the non-linear terms belong to one input signal, they are relatively correlated before subtraction. Thus, they do not accumulate considerably during the subtraction.

Fig. 4.12 depicts a conventional topology for a differential transconductance amplifier (Gm-Cell) of which the transconductance gain is [3]:

$$G_m = \frac{i_o}{v_i} = \frac{1}{R_s + 2/g_m}, \qquad (4.5)$$

where  $g_m$  is the transconductance gain of the input transistors. A smaller  $R_s$  provides a larger  $G_m$ , but makes  $G_m$  more dependent on  $g_m$  (because in (4.5) the term  $2/g_m$  becomes more comparable to  $R_s$ ). Since  $g_m$  depends on the current of the input transistors, it varies with the input voltage, thereby causing significant non-linear distortion at the output. As a result, in this design, an improved architecture with an inherent linearization technique [54], as shown in Fig. 4.13, is used. The component sizes of this Gm-Cell design are shown in Table 4.1.

Figure 4.12: Conventional transconductance amplifier circuit architecture.



Figure 4.13: Linearized transconductor circuit architecture.

| Component  | Size<sup>a</sup> | Component          | Size            |
|------------|-------------------|--------------------|-----------------|
| M1-2       | 18 x 3.5 / 0.22   | R1-2               | 548.7 Ω         |
| M3-6       | 7 x 3 / 0.22      | R3-4               | 203.9 Ω         |
| M11-14     | 9 x 10.5 / 0.22   | R5-6               | 401.8 Ω         |
| M7-9       | 14 x 3 / 0.22     | R7                 | 2.3 kΩ          |
| M15-18     | 3 x 3 / 0.22      | R'1-2<sup>b</sup>  | 651.4 Ω         |
| R8-9       | 303 Ω             | R'3-4              | 282.8 Ω         |
| C1-2, C5-6 | 580.8 fF          | R'5-6              | 1.6 kΩ          |
| C3-4, C7-8 | 380.2 fF          | M20-21             | 10 x 3.5 / 0.18 |
| M23-24     | 3 x 2 / 0.18      | M22                | 5 x 3.5 / 0.18  |

Table 4.1: Gm-Cell component sizes

a.  $\mu$m/$\mu$m for transistors.

b. R' values relate to the second Gm-Cell in the PAE circuit.

In this architecture, the currents of the input transistors M1 and M2 are forced to be constant by a feedback loop comprised of transistors M3-M6. For instance, in the left-hand-side loop, if the current of transistor M1 increases, the voltage at the drain of M17 will increase (due to the finite output resistance of the current source I2, comprised of M15 and M17). Then M3 and M5, acting as a cascode amplifier, decrease the voltage at the source of M1, thereby reducing the current of M1 and essentially keeping it constant. Thus, the V<sub>GS</sub> values of the input transistors become almost constant, and the input small signal appears across the resistor between the M1 and M2 sources and produces a current equal to  $V_{in}/R_s$ . This current is mirrored to the output transistors through the current mirrors comprised of M3-M9. An advantage of this architecture is that an extra gain factor can be obtained by current amplification through the current mirrors. This avoids the need for a very small  $R_s$  when larger gains are required. In this design, an extra gain of 2 is attained by these current-mirror amplifiers.

In Appendix A, it is shown that the gain of this feedback loop can be roughly approximated by  $A = g_{m3} \cdot r_{ds1}$ . It is also shown that the transconductance gain of this circuit is approximately

$$G_m = \frac{i_o}{v_i} = \frac{K}{R_s + 2/(g_{m1}(A+1))}, \qquad (4.6)$$

where K is the amplification gain of the current mirrors M3-M9. As seen from (4.6), the effect of the  $g_m$  of the input transistors is reduced by a factor of  $A+1$ , where A is the loop gain.
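The benefit of the loop gain can be illustrated by comparing how strongly (4.5) and (4.6) react to a change in the input-transistor transconductance. The sketch below is only an illustration: the function names and the operating point ($R_s$ = 500 Ω, $g_m$ = 5 mS, A = 20) are hypothetical and are not taken from Table 4.1, while K = 2 follows the current-mirror gain mentioned earlier.

```python
def gm_conventional(r_s, g_m):
    """Transconductance of the conventional cell, per (4.5)."""
    return 1.0 / (r_s + 2.0 / g_m)

def gm_linearized(r_s, g_m1, loop_gain, k):
    """Transconductance of the linearized cell, per (4.6)."""
    return k / (r_s + 2.0 / (g_m1 * (loop_gain + 1.0)))

def relative_change(f, g_m, delta=0.10):
    """Fractional change of f(g_m) when g_m rises by delta (10% by default)."""
    return abs(f(g_m * (1.0 + delta)) - f(g_m)) / f(g_m)

# Hypothetical operating point, chosen only for illustration.
R_S, G_M, A, K = 500.0, 5e-3, 20.0, 2.0
conv = relative_change(lambda g: gm_conventional(R_S, g), G_M)
lin = relative_change(lambda g: gm_linearized(R_S, g, A, K), G_M)
print(f"conventional: {conv:.2%}, linearized: {lin:.2%}")
```

For this operating point the linearized cell is far less sensitive to $g_m$ variation, which is the mechanism by which the feedback loop suppresses signal-dependent distortion.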

Since the linearity of the transconductor amplifier used here is highly dependent on  $R_s$ , linear poly resistors have been used rather than triode transistors, whose  $R_{ON}$  varies with the passing signal. The gain is changed by switches M20-M22, which symmetrically short-circuit parts of the resistor. Another source of distortion is the finite output resistance of the current sources I1 and I2, particularly since these resistances are non-linear in short-channel technologies. Careful biasing and sizing of the transistors has enabled the use of cascode current sources, thus considerably increasing the linearity of the circuit. In addition, the current mirrors M3-M10 are of the cascode type. This further improves the linearity of the circuit, as it keeps the drain voltages of M3-M4 close to those of M7-M8. Otherwise, even for small signal swings there could be considerable linearity degradation. Moreover, the cascode current mirrors also help to prevent the transconductance gain reduction caused by the limited output resistance of the output transistors M7-M10.

Another advantage of this circuit is that it uses p-channel input transistors, which carry certain benefits. First, it enables the use of n-channel transistors for the switches in the front-end sample-and-hold stage. Using n-channel transistors in S/H circuits is crucial because, for the same on-resistance, the switches are smaller and create less parasitic capacitance and charge injection. The second advantage of p-channel input transistors is that the body effect can be reduced by connecting the bodies of the input transistors to their sources, as shown in Fig. 4.13. The body effect directly affects the linearity of the circuit by contributing a current component that itself depends on the voltage of the input transistors. However, the drawback of a source-bulk connection is the addition of extra parasitic bulk capacitance at the source of each input transistor. This moves the second pole of the transfer function towards lower frequencies, while the first pole is roughly determined by the equivalent RC at the drain of the input transistors. To improve the frequency response of the circuit, two major techniques are utilized. First, larger currents are used to increase the $g_m$ of the input transistors and $g_{m3}$ (see Appendix A). Second, lead compensation is employed when a large $R_s$ is chosen for lower gain values.

The combinations R8-C1-M23 and R9-C2-M24 in Fig. 4.13 act as lead compensation circuits to improve the stability of the circuit and the phase margin of the feedback loops. M23-M24 are the enable switches, and they are turned on when the transconductor is set to its lowest gain value. At lower gains, the large value of $R_s$ increases the loop unity-gain frequency $\omega_u$ and decreases the non-dominant pole frequencies. When the compensation circuit is enabled, the capacitors C1-C2 reduce $\omega_u$, while the lead resistors introduce an extra zero at a higher frequency. This cancels part of the phase lag caused by the non-dominant poles. A detailed analysis of the loop gain, frequency response and lead compensation of this Gm-Cell circuit can be found in Appendix A.1. Fig. 4.14 shows simulation results for the overshoot reduction obtained by enabling the compensation circuit, in both the step and the frequency responses of the transconductor.



*Figure 4.14:* Effect of lead compensation in transconductor circuit regarding overshoot reduction. (a) Time domain step response. (b) Frequency response.

#### 4.2.5.1 Gm-Cell Performance Characterization

To evaluate the performance of the Gm-Cells, their specific application in Fig. 4.5 must be considered. In conventional VGA applications, the input swing is at its minimum when the gain is set to maximum. Here, however, the outputs of Gm-Cells A and B are subtracted from each other, so that even at maximum gain the input can be at full swing, as can the output after subtraction. This complicates the design and verification of each single Gm-Cell. As a result, the performance of the Gm-Cells is verified both individually and together in the subtractor topology.

Table 4.2 shows the specification of a single transconductor over the gain variation range. The test bench for this individual performance test is shown in Fig. 4.15. In this test bench, the Gm-Cell outputs are pulled up to the power supply with a linear RC load. This load is set such that the output has a 600-mV swing and the capacitor value is roughly equal to the parasitic load capacitance of that node in the main circuit, including the next stage (300 fF here). Fig. 4.16 shows the FFT of the transconductor output current for the worst-case linearity, a THD of 58 dB, which occurs when the gain is at its highest value and the input is at full swing (600 mV pk-pk diff.) at 100 MHz.

Table 4.2: Transconductor Specification

| Specification                         | Value              |
|---------------------------------------|--------------------|
| Transconductance Gain                 | 2.9-0.9 mA/V       |
| Total Harmonic Distortion (THD)       | 58 - 66 dB         |
| Input Swing                           | 600 mV pk-pk diff. |
| 3-dB Bandwidth                        | 0.85 - 2.8 GHz     |
| Power (single Gm-Cell) / Power Supply | 6.8 mW / 1.8 V     |



Figure 4.15: Gm-Cell performance evaluation test circuit.



*Figure 4.16:* FFT of the transconductor output current for the worst case linearity which is 58 dB when the gain is at its highest value and input is at full swing (600 mV pk-pk-diff) at 100 MHz.

### 4.2.6 Current Amplifier and I/V Converter and Voltage Buffer

Fig. 4.17 shows the current amplifier stage, the I/V converter, and the differential source followers used as voltage buffers. In this circuit, the dc current accumulated during the current subtraction is removed, and the remaining signal current is then doubled and converted to voltage. Eliminating this dc current enables the use of linear resistors for the final I/V conversion and provides room for an extra current amplification gain of two. During the subtraction, the small-signal amplitude is reduced while the dc components and the even harmonics (as opposed to the odd harmonics) are added together.

Open-loop n-channel source followers with single-transistor current sources provide appropriate bandwidth and linearity as the ADC input buffer. The input load capacitance of the following ADC is approximately 3 pF. As shown in Fig. 4.17, a separate bias-generator circuit is used for this voltage buffer in order to prevent large distortions from coupling to the rest of the bias circuit. The voltage buffer sinks 4 mA from a 1.8-V supply, thereby consuming 7.2 mW of power. For a 100-MHz, 800-mV pk-pk differential input and a 3-pF capacitive load, the bandwidth of the buffer is 1 GHz and its THD is 74 dB, as shown in Fig. 4.18.



Figure 4.17: I/V converter, current amplifier, voltage buffer.



*Figure 4.18:* (a) FFT of the output of the ADC input buffer with 3 pF capacitance load and 800-mV peak-to-peak differential 100-MHz input tone. (b) Frequency response of the ADC buffer with 3 pF capacitance load.

### 4.2.7 Overall Performance and Simulation Results

To assess the performance of the analog processing blocks together, the test-benches shown in Fig. 4.19 were used. Table 4.3 summarizes the overall bandwidth, gain, settling time and linearity performance of the circuit for different gains, corresponding to four different PAE factors.

Fig. 4.19(a) is used to evaluate the individual gain and bandwidth of the path from each single Gm-Cell to the ADC input. This path includes one Gm-Cell, a current amplifier, an I/V converter and the voltage buffer. The first three rows of Table 4.3 show the simulation results for both transconductors at different gain settings. For the second row, the test bench of Fig. 4.19(a) is slightly altered such that the input voltage is fed to Gm-Cell<sub>B</sub> while the Gm-Cell<sub>A</sub> input is ac-grounded. The third row gives the PAE/decorrelation factor $\alpha$ for each selected gain combination, which is essentially the ratio G<sub>B</sub>/G<sub>A</sub> of the tabulated gains.



Figure 4.19: Test benches for PAE performance characterization.

As we can see, for a larger  $\alpha$ , larger gains are used; thereby, compensating for the signal amplitude degradation after subtraction.

To evaluate the distortion performance of the Gm-Cells together, the test circuit shown in Fig. 4.19(b) is used. The delay cell in this test bench is ideal and equal to 2.5 ns, one period of the 400-MHz clock. The total harmonic distortion at the output is shown in rows 4 and 5 of Table 4.3. The last row of the table shows the settling time for a full-swing step at the output of the test bench in Fig. 4.19(a). The results show that the ADC input settles within a fraction of the clock cycle, thereby leaving enough time for the comparator circuits.

In summary, the total power consumption of the analog preprocessing circuit, including the ADC buffer, is 27 mW. The final THD before the ADC is 47 dB with a 600 mV differential swing. The voltage amplification gain varies from 1.7 to 5.7, corresponding to a bandwidth of 970 MHz to 680 MHz.

| Parameter | Gain<br>Selection I | Gain<br>Selection II | Gain<br>Selection III | Gain<br>Selection IV |
|-----------|---------------------|----------------------|-----------------------|----------------------|
| $G_A$ / BW: Output voltage gain and bandwidth versus Gm-Cell (A) input (Fig. 4.19(a)) | 4.5 V/V,<br>675 MHz | 3.6 V/V,<br>750 MHz | 2.5 V/V,<br>970 MHz | 1.4 V/V,<br>972 MHz |
| $G_B$ / BW: Output voltage gain and bandwidth versus Gm-Cell (B) input | 3.9 V/V,<br>714 MHz | 2.9 V/V,<br>900 MHz | 1.24 V/V,<br>972 MHz | 0 |
| PAE factor: $\alpha \cong G_B/G_A$ | 0.87 | 0.8 | 0.5 | 0 |
| THD / Gain for 10-MHz input in PAE test set-up (Fig. 4.19(b)) | -54 dB,<br>1.1 V/V | -56 dB,<br>0.9 V/V | -58 dB,<br>1.2 V/V | -58 dB,<br>1.3 V/V |
| THD / Gain for 100-MHz input in PAE test set-up (Fig. 4.19(b)) | -47 dB,<br>6.9 V/V | -47 dB,<br>5.5 V/V | -47 dB,<br>3.1 V/V | -48 dB,<br>1.3 V/V |
| Compensation | None | None | None | Lead comp. |
| 1% settling time for step input (Fig. 4.19(a)) | 1.4 ns | 1.35 ns | 1.16 ns | 1.15 ns |

Table 4.3: Performance of the PAE analog processing blocks
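As a quick arithmetic check of the third row of Table 4.3, the short Python sketch below recomputes the PAE factor from the tabulated gains. The names `GAINS` and `pae_factor` are introduced here for illustration only. The ratio is taken as $G_B/G_A$, which is the ratio that reproduces the listed values of $\alpha$ to within rounding.

```python
# (G_A, G_B) in V/V for gain selections I-IV, copied from Table 4.3.
GAINS = {
    "I": (4.5, 3.9),
    "II": (3.6, 2.9),
    "III": (2.5, 1.24),
    "IV": (1.4, 0.0),
}

# PAE factors as listed in the third row of Table 4.3.
LISTED_ALPHA = {"I": 0.87, "II": 0.8, "III": 0.5, "IV": 0.0}


def pae_factor(g_a, g_b):
    """PAE/decorrelation factor, taken here as the gain ratio G_B / G_A."""
    return g_b / g_a


for name, (g_a, g_b) in GAINS.items():
    alpha = pae_factor(g_a, g_b)
    print(f"Selection {name}: alpha = {alpha:.3f} (listed {LISTED_ALPHA[name]})")
```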

# **4.3 ADC Flash Architecture**

A variety of architectures exist to implement the ADC in CMOS technology [4][3]. The target design specification for the ADC is 400-MHz sampling with 6-bit resolution for input frequencies up to 200 MHz. This is slightly higher than what is required for our target cable application. Pipeline architectures are popular candidates for high-resolution applications [42]. However, they suffer from large latency, and operating them at clock rates above 150 MHz is problematic, mainly due to the accuracy limitations of their inter-stage blocks. Since the digital processing blocks after the ADC control the front-end adaptive gain amplifier (AGC) and adapt the clock recovery system, low conversion latency is generally desired. ADCs with folding architectures are a good choice for high speed, low power, and low latency. However, they have bandwidth and performance limitations due to the circuitry that processes the folded signal. An interpolating architecture is another choice that reduces the power by reducing the number of preamplifiers. However, it increases the driving load of the amplifiers. Given the above considerations, and given that the required resolution is 6 bits, a full flash ADC architecture was chosen for the current application.

Fig. 4.20 shows the top-level architecture of the flash ADC used in this design, which is preceded by the PAE. The architecture is fully differential in order to reduce the effect of power supply noise in the presence of the digital circuitry. The ADC consists of 64 comparators with switched-capacitor offset cancellation. The control circuits for this prototype are designed such that three autozeroing modes of operation are possible: autozeroing all comparators together periodically, every several clock cycles; autozeroing within every single clock cycle; and the interleaved autozeroing (IAZ) technique. In the IAZ mode, the offset of each comparator is cancelled (autozeroed) in the background while the other 63 comparators are in the comparison state, and a column of multiplexers selects the proper outputs of the working comparators for the following digital back-end stage.

Figure 4.20: Flash architecture for the analog-to-digital converter.

### **4.3.1** Comparator Design

In ADCs, the offset of each comparator must be within a certain level of accuracy. Differential comparators without switched-capacitor offset cancellation can continuously compare the difference between $V_{in}$ and the reference voltages $V_{ref}$. However, the comparator elements must be optimized for minimum inherent offset, and the design would generally be difficult with minimum-size transistors. Averaging and digital calibration are the most effective techniques to improve accuracy, particularly for continuous-time comparators [55]. However, they increase the complexity, power consumption and area. Moreover, the dc common-mode level of the input signal is restricted, as it provides the bias of the input transistors of the comparator preamplifier. Another difficulty, as shown in Fig. 4.21, is the capacitive feedthrough from the input signal to the reference ladder through the $C_{gs}$ of the input transistors. This causes an effect called resistor-string bowing [4].

*Figure 4.21:* Input signal feedthrough to the reference ladder in continuous time comparators.

In this ADC implementation, a differential comparator with switched-capacitor offset cancellation (autozeroing) is utilized [4]. Fig. 4.22 shows the comparator architecture. It consists of a two-stage preamplifier and a regenerative latch, followed by an SR latch. The timing diagram in Fig. 4.22(b) shows the comparator's front-end operation. In autozero mode, when $\phi_{az}$ and $\phi_{az_a}$ are high, the two-stage preamplifier is configured in closed-loop mode and the input sides of the coupling capacitors are connected to the reference voltages. This charges the capacitors to the reference voltages and the preamplifier's input offset. $\phi_{az_a}$ is the same as $\phi_{az}$, except that it falls slightly earlier than $\phi_{az}$ in order to prevent the charge injection effect of S<sub>3</sub> and S<sub>4</sub>.

In comparison mode,  $\phi_{az}$  is low and  $\phi_{Vin}$  is high. Thus, the feedback loop is broken and the difference between the input and the reference voltages is amplified by the preamplifier. Note that the input offset is cancelled by the offset previously stored in the capacitors during autozero mode. The amount of this cancellation depends on the gain of the preamplifier. To find the final residual offset in more detail, we can consider the feedback circuit in offset cancellation mode and write

$$(-A)(V_{12} - V_{OSA}) = V_{12}, \tag{4.7}$$



*Figure 4.22:* (a) The autozeroing (switch-cap) comparator block diagram. (b) Timing diagram of comparator operation.

where *A* is the open-loop amplifier gain and $V_{OSA}$ is the amplifier input offset. Therefore, assuming $V_{ref+}=V_{ref-}$, the total voltage stored on the capacitors is

$$V_{C_0} = V_{12} = \left(\frac{A}{A+1}\right) V_{OSA}. \tag{4.8}$$

Thus, the residual input offset $V'_{OSA}$ is the original offset $V_{OSA}$ reduced by a factor of $A+1$:

$$V'_{OSA} = V_{OSA} - \left(\frac{A}{A+1}\right)V_{OSA} = \frac{V_{OSA}}{A+1}. \tag{4.9}$$

In addition to $V'_{OSA}$, the input-referred latch offset $V_{OSLCH}/A$ and the charge injection mismatch between S<sub>5</sub> and S<sub>6</sub> ($\Delta q$) contribute to the comparator input offset [4]. Thus, the total offset will be

$$V_{OS(residual)} = \frac{V_{OSA}}{A+1} + \frac{\Delta q}{C} + \frac{V_{OSLCH}}{A}, \tag{4.10}$$

where $C = C_1 = C_2$. Generally, due to charge injection mismatch, $kT/C$ noise, and leakage considerations, a large value for the input coupling capacitors is desired. However, the loading effect of their parasitics on the previous buffer stage and the time constant in the reset mode must be taken into account. The chosen value of C in this design is about 300 fF. An interdigitated layout structure using metal layers 2-4 reduces the parasitics such that the settling time of nodes 1 and 2 meets the target speed requirement.
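To make the relative weight of the three terms in (4.10) concrete, the following Python sketch evaluates the residual offset. The gain of 20 and the capacitance of 300 fF follow the text; the preamplifier offset, latch offset and charge-injection mismatch are hypothetical values assumed here purely for illustration, as is the function name `residual_offset`.

```python
def residual_offset(v_osa, v_os_latch, delta_q, c, gain):
    """Eq. (4.10): total input-referred offset after autozeroing, in volts."""
    return v_osa / (gain + 1.0) + delta_q / c + v_os_latch / gain


A = 20.0             # preamplifier gain quoted in the text
C = 300e-15          # coupling capacitor, farads (about 300 fF in the text)
V_OSA = 20e-3        # preamplifier offset, volts (assumed)
V_OS_LATCH = 50e-3   # latch offset, volts (assumed)
DELTA_Q = 0.3e-15    # charge-injection mismatch, coulombs (assumed)

total = residual_offset(V_OSA, V_OS_LATCH, DELTA_Q, C, A)
print(f"amplifier term : {V_OSA / (A + 1) * 1e3:.2f} mV")
print(f"charge term    : {DELTA_Q / C * 1e3:.2f} mV")
print(f"latch term     : {V_OS_LATCH / A * 1e3:.2f} mV")
print(f"residual offset: {total * 1e3:.2f} mV")
```

With these assumed numbers the latch and charge-injection terms dominate, which is consistent with the preference stated above for a larger coupling capacitor and sufficient preamplifier gain.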

#### 4.3.1.1 Circuit Description

Fig. 4.23 shows the circuit architecture of the comparator design. The corresponding component sizes are listed in Table 4.4. Fig. 4.23(a) shows the pre-amplifier comprised of two cascaded Nch-MOS differential input amplifiers, loaded by non-silicided poly resistors. The preamplifier provides a proper gain-bandwidth to reduce the effect of the offset and the kickback noise of the following regenerative latch.

The dc current of each preamplifier stage is 164 $\mu$A. Using resistive loads rather than diode-connected loads enhances the speed, owing to less parasitic load capacitance (5.3 fF for each load resistor) and better recovery from overdrive. The total bandwidth of the two-stage amplifier is 550 MHz with a total gain of about 26 dB, and the unity-gain bandwidth is 4.4 GHz. A useful optimization technique for multiple-stage amplifiers, according to their number of stages and gains, can be found in [56]. Two major poles are created, one at the output of each amplifier stage, at

$$p = \frac{1}{RC_{load}},\tag{4.11}$$

where R is the pull-up load resistor and $C_{load}$ is the load capacitance at the output of each amplifier. When the autozeroing loop is closed, the $C_{load}$ of the second amplifier is larger due to the extra parasitic loading of the loop. This lowers the second amplifier's pole frequency and helps the stability of the autozeroing feedback loop.

*Figure 4.23:* Comparator building blocks. (a) Two-stage pre-amplifier comparator. (b) Comparator and SR latches.
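The pole expression in (4.11) can be related to the overall bandwidth with a small numerical sketch. The Python code below assumes two identical, isolated first-order stages, which is a simplification, and an assumed load capacitance of 37 fF per stage, a value not given in the text. Only the 5-kΩ load resistance is taken from Table 4.4.

```python
import math


def pole_hz(r_load, c_load):
    """Eq. (4.11) expressed in hertz: f_p = 1 / (2*pi*R*C_load)."""
    return 1.0 / (2.0 * math.pi * r_load * c_load)


def cascade_bandwidth_hz(f_pole, stages):
    """-3 dB bandwidth of `stages` identical, isolated first-order stages."""
    return f_pole * math.sqrt(2.0 ** (1.0 / stages) - 1.0)


R_LOAD = 5e3      # ohms, R1-4 in Table 4.4
C_LOAD = 37e-15   # farads, assumed for illustration; not given in the text

f_p = pole_hz(R_LOAD, C_LOAD)
print(f"per-stage pole     : {f_p / 1e6:.0f} MHz")
print(f"two-stage bandwidth: {cascade_bandwidth_hz(f_p, 2) / 1e6:.0f} MHz")
```

Under these assumptions the two-stage bandwidth lands in the region of the 550 MHz quoted above; in the actual circuit the two stages see different loads, so the figure is indicative only.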

Table 4.4: Comparator circuits

| Transistors | W/L (μm/μm) | Element     | Sizes       |
|-------------|-------------|-------------|-------------|
| M1-4        | 5x2/0.220   | M23-24      | 2x1.2/0.200 |
| M5-6        | 3x2/0.200   | M27-28      | 4x1.2/0.180 |
| M7-8        | 3x2/0.180   | M29-30      | 2x1.2/0.180 |
| M9-12       | 3x1/0.180   | M31-32      | 4x2.4/0.180 |
| M13-16      | 6x1/0.220   | INV1-2(N,P) | 2x900/0.180 |
| M17-18      | 1x1/0.200   | R1-4        | 5 kΩ        |
| M19-20      | 2x1.2/0.200 | C1-2        | 250 fF      |
| M21-22      | 4x1.2/0.200 | MD5,MD6     | 3/0.200     |

Transistors M5 and M6 are the feedback switches, which turn on in autozero mode; they are Pch-MOS because of the relatively high bias voltage (1.4 V) across them. The dc common mode of the input signals and reference voltages is designed to be 450 mV. This is different from the common mode at the comparator preamplifier input, which is around 1.4 V, and is made possible by the coupling capacitors in this comparator architecture. The lower input common mode enables the use of NMOS front-end switches and ensures reliable switch operation without bootstrap clocking [42] or very large switch sizes. In the autozero mode, the switches M5 and M6 turn off prior to M9-12. Therefore, the charge injection of M9-12 does not affect the charge on the coupling capacitors. To reduce the charge injection and clock feedthrough of M5-6 onto the coupling capacitors, their sizes are minimized and dummy transistors MD5-6, with half the size of M5-6, are added to the circuit.

To attain enough resolution for a relatively low input swing (600-800 mV pk-pk diff.) and low $V_{LSB}$ (9-12 mV), a two-stage amplifier provides a gain of 20. Such a gain costs more power and limits the speed of the comparator preamplifier. However, augmenting the input swing by increasing the gain of the previous stage is more difficult. The reason is that the linearity of the amplifiers before the comparators must meet the resolution requirement of the ADC, whereas this is not the case for the preamplifiers, which only amplify the sign of the difference between the signal and reference voltages. According to the simulation results shown in Fig. 4.24, for the worst-case situation, when the input varies from full swing above the reference to 3 mV below the reference, it takes 1.6 ns for the preamplifier to provide 65 mV at the input of the regenerative latch. This ensures correct comparison results against worst-case overdrive, and dynamic and mismatch offset effects.



*Figure 4.24:* Comparator performance in the worst case, from extreme overdrive to the weakest conversion, with a 500-MHz clock and the input changing at 250 MHz.

The regenerative and SR latches are shown in Fig. 4.23(b). The core of both is a conventional back-to-back inverter architecture. The comparator latch output is reset to the power supply through M23-24 while the preamplifier is tracking the input signal. Also, in the tracking period M26-27 turn off, so that no static power is consumed. M17-18 inject enough differential current to force the latch into the proper state in the regeneration process. Finally, the SR latch preserves the CMOS level of the comparator output during the full clock cycle.

The latching time for the back-to-back inverters can be approximated by [3]

$$T_{ltch} \propto \frac{L^2}{V_{eff}} \ln\left(\frac{\Delta V_{logic}}{\Delta V_0}\right), \tag{4.12}$$

where $\Delta V_0$ is the initial input voltage difference, $\Delta V_{logic}$ is the final desired differential logic voltage level, and *L* is the transistor channel length. Therefore, to maximize speed, the channel length has to be minimized. However, due to matching considerations for the comparator latch transistors, the chosen channel lengths are slightly greater than the minimum, i.e. 200 nm. For the SR latch, since the inputs are already close to logic values, matching is not a major issue, and the transistor channel lengths are minimized.

### 4.3.2 Autozeroing Techniques

Allocating extra time for a resetting or autozeroing operation is one of the disadvantages of switched-capacitor comparators compared to continuous-time ones. As shown in Fig. 4.22(b), after each autozero cycle several comparison cycles can be performed, depending on the charge leakage of the coupling capacitors. The longest tolerable interval $T_H$ between two autozero cycles can be estimated as [57]:

$$T_H < \frac{C(0.5V_{LSB})}{I_{lkg}}, \tag{4.13}$$

where $I_{lkg}$ is the leakage current onto the coupling capacitor C and $0.5V_{LSB}$ is the permissible error voltage. The current $I_{lkg}$ flows mainly through the reverse-biased PN junctions of the autozeroing switches and is roughly 20 pA. Therefore, for C = 300 fF and $V_{LSB} = 9$ mV, the accuracy of the ADC is not affected for up to around 67 µs. Experimental results on the prototype chip showed that the ADC output is valid up to about 40 µs after each single offset cancellation.
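The bound in (4.13) can be checked with the numbers quoted above. The Python sketch below uses C = 300 fF, $V_{LSB}$ = 9 mV and $I_{lkg}$ = 20 pA from the text; the function name `max_hold_time` and the conversion to clock cycles at 400 MHz are added here only for illustration.

```python
def max_hold_time(c, v_lsb, i_leak, error_fraction=0.5):
    """Upper bound of eq. (4.13): T_H < C * (0.5 * V_LSB) / I_lkg, in seconds."""
    return c * error_fraction * v_lsb / i_leak


t_h = max_hold_time(c=300e-15, v_lsb=9e-3, i_leak=20e-12)
clock_hz = 400e6  # sampling clock used elsewhere in this chapter

print(f"T_H bound: {t_h * 1e6:.1f} us")
print(f"equivalent to about {t_h * clock_hz:.0f} clock cycles at 400 MHz")
```

The result is a bound of roughly 67 µs, in line with the estimate given in the text.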

Conventionally, an autozero cycle can be performed in every clock cycle if the clock period is long enough. In some applications, such as disk-drive read channels [59], a short period of time occurs periodically in which the output of the converter is not used. Those time slots can be utilized for autozeroing. Another solution is the interleaved autozero technique (IAZ), where the offset of each comparator is cancelled in the background while the other comparators are in comparison mode [57]. The clock control logic in the current work is designed such that it can exploit all three of the above autozeroing modes.

#### **Interleaved Autozeroing (IAZ)**

As seen in Fig. 4.20, by adding an extra comparator row and a set of multiplexers, a single comparator row can be calibrated in the background while the other comparators are in comparison mode. Fig. 4.25 demonstrates the up/down order of the comparator calibrations and the alternating use of the neighboring reference voltages. $T_{AZ} = 8\,T_{CLK}$ is the total calibration time interval for one comparator. The longest interval between two calibration operations occurs for the end comparators. This is 126 $T_{AZ}$, i.e. 1008 $T_{CLK}$ or 2520 ns for a 400-MHz clock frequency.
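The longest recalibration interval can be reproduced by enumerating the up/down calibration order. The Python sketch below assumes a strict triangular sweep over 64 rows in which each end row is visited once per period; this schedule, and the helper names `updown_order` and `longest_gap_slots`, are assumptions made for illustration rather than a description of the actual control logic.

```python
def updown_order(rows):
    """One period of an up/down sweep over rows 0..rows-1 (ends visited once)."""
    return list(range(rows)) + list(range(rows - 2, 0, -1))


def longest_gap_slots(rows):
    """Longest number of calibration slots between two visits of the same row."""
    order = updown_order(rows)
    period = len(order)
    worst = 0
    for row in range(rows):
        visits = [i for i, r in enumerate(order) if r == row]
        # Gaps between consecutive visits, wrapping around the periodic sweep.
        gaps = [((visits[(k + 1) % len(visits)] - v) % period) or period
                for k, v in enumerate(visits)]
        worst = max(worst, max(gaps))
    return worst


ROWS = 64          # comparator rows, including the extra one
SLOT_CLOCKS = 8    # T_AZ = 8 T_CLK
T_CLK_NS = 2.5     # period of the 400-MHz clock

slots = longest_gap_slots(ROWS)
print(f"{slots} slots = {slots * SLOT_CLOCKS} clocks "
      f"= {slots * SLOT_CLOCKS * T_CLK_NS:.0f} ns")
```

Under this assumed schedule the end rows wait 126 calibration slots, which at eight clock periods per slot and 2.5 ns per period corresponds to the 2520 ns quoted above.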

A detailed timing diagram of the control signals for the ADC design in Fig. 4.20 is depicted in Fig. 4.26. During the initialization of the ADC, all comparators are reset to the proper reference voltages. At the moment a comparator is transferred from autozeroing into comparison mode, the amplifier output is disturbed by clock feedthrough and possibly by ringing caused by the opening of the feedback switch. Therefore, as seen in Fig. 4.26, four clock periods are assigned to resetting the comparator and the other four are used for extra delays before and after resetting. Autozeroing starts one clock after $V_{in}$ is disconnected from the coupling capacitors. The input signal $V_{in}$ is reconnected one clock after autozeroing is terminated. Then, after two more clock cycles, the output of the comparator is selected by the multiplexer and becomes available to the following digital circuitry. These extra delays ensure the isolation between calibration and comparison modes and the settling of the amplifier outputs after autozeroing.

Figure 4.25: Interleave up/down autozero demonstration.



Figure 4.26: Timing diagram of the control signals for interleave autozero technique.

The hardware generating the control signals for IAZ and the other autozeroing modes is shown in Fig. 4.27. When the signals AZ\_mode1\_en and Az\_mode2\_en are disabled, the IAZ control signals are generated. As shown in Fig. 4.27(a), an up/down serial shift register, clocked at one eighth of the main clock frequency, generates 64 command signals AZ<sub>en</sub>(i). These determine which row has to undergo a calibration process. The circuits shown in Fig. 4.27(b) produce the final control signals with the proper delays, as shown in Fig. 4.26. When Az\_mode2\_en is high, IAZ signalling is disabled and the circuit in Fig. 4.27(b) generates two non-overlapping clocks for the autozeroing and comparison modes within each clock cycle. The last mode of operation occurs when AZ\_mode1\_en is high. In this mode, using the asynchronous set and reset of the serial shift register, all of the comparators are calibrated simultaneously. The same function is performed during the initialization of the circuit, when the reset signal is low.



Figure 4.27: The logic circuits generating control signals for autozeroing. (a) Up/down serial shift register. (b) Inside the logic block for control generation in each row.

### **4.3.3 ADC Reference Voltages**

A differential reference ladder composed of two separate silicided poly resistor strings with a value of 1.6 k$\Omega$ was used to generate the reference voltages. Two separate resistor ladders, as compared to a single ladder, make it possible to calibrate the systematic offset error of the PAE/ADC. For example, in Fig. 4.28, with $V_{max} = 650$ mV and $V_{min} = 250$ mV in the single ladder, the differential reference voltages vary from 400 mV to -400 mV. This demands a zero offset to cover the ADC full range. In the dual ladder, if for instance there is an offset of +50 mV, then by shifting $V_{maxp}$ and $V_{maxn}$ by 50 mV, the reference voltages will be in the range of 450 mV to -350 mV. This can compensate for the mentioned offset.
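The reference ranges in this example can be illustrated with a simple model of the two ladders. The Python sketch below forms each differential reference as a positive-ladder tap minus the mirrored negative-ladder tap. The tap count of 64 and the particular way the 50-mV shift is applied (moving the whole positive string up) are assumptions made here, since the exact node voltages of Fig. 4.28 are not spelled out in the text; the helper names are likewise hypothetical.

```python
def ladder_taps(v_min, v_max, steps):
    """Equally spaced tap voltages of one resistor ladder, bottom to top."""
    return [v_min + (v_max - v_min) * i / steps for i in range(steps + 1)]


def differential_references(pos_taps, neg_taps):
    """Differential reference i = positive tap i minus mirrored negative tap."""
    return [p - n for p, n in zip(pos_taps, reversed(neg_taps))]


STEPS = 64  # assumed number of ladder divisions for a 6-bit converter

# Single ladder: both sides share the same 250 mV .. 650 mV string.
single = ladder_taps(0.250, 0.650, STEPS)
refs_single = differential_references(single, single)

# Dual ladder: the positive string is moved up by 50 mV (one assumed way of
# applying the shift described in the text).
shifted = ladder_taps(0.300, 0.700, STEPS)
refs_dual = differential_references(shifted, single)

print(f"single ladder: {min(refs_single) * 1e3:.0f} mV to {max(refs_single) * 1e3:.0f} mV")
print(f"dual ladder  : {min(refs_dual) * 1e3:.0f} mV to {max(refs_dual) * 1e3:.0f} mV")
```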



Figure 4.28: Reference voltage generation. (a) Single ladder. (b) Dual ladder.

The reference-ladder resistance is required to be low in order to accelerate the comparator calibration process and to reduce the coupling of the input signal onto the reference ladder through the equivalent switched-capacitor resistance. This coupling is illustrated in Fig. 4.29; it becomes negligible when the comparators are autozeroed only once every several cycles. Based on the above considerations, together with power consumption and layout constraints, each resistor division was chosen as 25 $\Omega$.



*Figure 4.29:* Input signal feedthrough to the reference ladder in a switch-capacitor (autozeroing) comparator.

The common-mode voltage of the ADC input signal is designed to be close to the common mode of the reference voltages, i.e. 0.4 V. This ensures that the comparator input amplifier does not exceed its proper input common-mode bias range in comparison mode, because the gate voltage of the preamplifier input transistors (V<sub>G</sub> in Fig. 4.29) is shifted by $V_{in} - V_{ref}$ in comparison mode. The maximum possible shift of V<sub>G</sub> is $V_{in_{max}} - V_{ref_{min}}$. Thus, the common modes of $V_{in}$ and $V_{ref}$ have to be close enough that the amplifier bias point remains in its proper linear range and, for reliability reasons, V<sub>G</sub> does not exceed the power supply.

### 4.3.4 Digital Back-End

Due to the sample-and-hold prior to the PAE, the ADC input is held at the time of latching. Thus, mismatch between latch delays among the comparators and across different clock cycles does not produce significant distortion or sparkle errors [4]. However, to further reduce the probability of sparkle codes, a column of 4-input AND gates that detect 0001 transitions is used. Further bit-error reduction techniques, such as the ones presented in [57][58], could be used, although they were not utilized in this PAE/ADC prototype. Finally, the AND-gate output code words are fed to the binary decoder ROM, whose schematic is shown in Fig. 4.30, and the final outputs are fed into the digital I/O drivers.
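The role of the 4-input transition detectors can be sketched behaviourally. The Python model below is an assumption-laden illustration, not the actual gate-level design: it takes a detector to be asserted when a comparator output is 1 and the three outputs above it are 0, treats positions beyond the top of the column as 0, and models the ROM as a wired-OR of the selected codes. The function names are hypothetical.

```python
def transition_lines(thermo):
    """One-hot word lines from a thermometer code (index 0 = lowest comparator).

    Line i is asserted only when comparator i is 1 and the three comparators
    above it are 0, i.e. a 4-input AND with three inverted inputs. Positions
    beyond the top of the column are treated as 0 (an assumption).
    """
    padded = list(thermo) + [0, 0, 0]
    return [int(padded[i] == 1 and padded[i + 1] == 0
                and padded[i + 2] == 0 and padded[i + 3] == 0)
            for i in range(len(thermo))]


def rom_decode(lines):
    """Wired-OR ROM model: OR together the codes (i + 1) of all asserted lines."""
    code = 0
    for i, active in enumerate(lines):
        if active:
            code |= i + 1
    return code


clean = [1] * 8 + [0] * 55                   # ideal thermometer code, level 8
sparkle = [1] * 3 + [0] + [1] * 4 + [0] * 55  # same level with one wrong 0

print("clean  :", rom_decode(transition_lines(clean)))
print("sparkle:", rom_decode(transition_lines(sparkle)))
```

In this model a simple two-input detector would assert two word lines for the second pattern and corrupt the ROM output, whereas the wider detector asserts only the line at the true transition.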



Figure 4.30: Thermometer to binary ROM decoder.

### **4.3.5** Layout

Due to the high sampling speed and the complexity of the total circuit, the layout was performed carefully to minimize coupling and skew effects. The measures taken included careful separation of the analog, digital and I/O grounds; the use of metal shields around sensitive signals; the separation of the chip into digital and analog sides, including their pads and drivers; and the buffering and distribution of the clock in a tree shape. Every two comparators and their control circuit generators were placed in one row. Thus, the distance between the first and the last row was reduced by a factor of two. This improved matching and reduced clock and signal skew along the chip. Interdigitated capacitors using metals 2-6 were used for the comparator capacitors. These capacitors utilize lateral fringing capacitances and show better capacitance per area and matching properties than vertical metal-to-metal capacitors [50].

# 4.4 Bias Circuit

Fig. 4.31 depicts the bias circuit. It generates the reference voltages for different current sources in the circuit. The core of this circuit is the loop made from M1-M8, which is a combination of the well-known constant-gm circuit with wide swing cascode current mirrors [51][52]. If  $(W/L)_8 = 4(W/L)_7$ , to a first order approximation, it can be shown that

$$g_{m7} = \frac{1}{R_{bias}}. \tag{4.14}$$

Therefore, the transistors transconductances are independent of power supply voltage as well as process and temperature variations. The transconductances of other transistors



Figure 4.31: Bias circuit.

| Table 4.5: Bias circu | uit component sizes |
|-----------------------|---------------------|
|-----------------------|---------------------|

| Transistors | W/L (μm/μm)  | Components                            | Sizes              |
|-------------|--------------|---------------------------------------|--------------------|
| M1-4        | 1x3/0.220    | M15                                   | 1/5                |
| M5-7        | 1x10.5/0.220 | M16                                   | 3/0.220            |
| M8          | 4x10.5/0.220 | M17-18                                | 1x3/0.220          |
| M9-10       | 6x10.5/0.220 | M<sub>cp1,2</sub> - M<sub>cn1,2</sub> | 8x20/3.5           |
| M11         | 1x3/0.220    | R2                                    | 1 kΩ               |
| M12         | 1x10.5/0.220 | R1                                    | 400 Ω              |
| M13-14      | 6x3/0.220    | R<sub>bias</sub> (R1+R2)              | 1.4 kΩ (0.4 + 1)   |

driven by the same bias circuit are stabilized as well, since they depend mainly on the ratios of the corresponding transistor geometries.

Considering secondary effects such as body effect, finite output resistance and mobility degradation, equation (4.14) can deviate slightly from its ideal form [5]. In this design, by using the $R_{bias}$ at the Pch transistors and connecting the body and source of M8 together, the body effect error is considerably reduced. In addition, the effect of the limited output resistance is reduced by using wide swing cascodes. The performance of the bias circuit has been verified at different temperatures and process corners at a 1.8-V supply, within 10% variation. If the power supply drops by up to 10%, in the worst-case situation of a slow process corner (SS) and T = 100 °C, there is a possibility that cascode transistors M10 and M13 move slightly to the edge of the triode region. However, this does not harm the performance of the main bias circuit, as the main loop transistors remain in the active region. For lower power supply voltages, a modified version of this bias circuit using opamps can be used [60].

One difficulty with this bias circuit is the possibility of oscillation if the gain of the positive feedback loop exceeds unity at higher frequencies, due to the pad and pin capacitances at the source of M8. To avoid this, part of $R_{bias}$ (400 $\Omega$) is placed on-chip. The total $R_{bias}$ in this circuit is 1.4 k$\Omega$ and the current of the current-mirror loop is about 84 $\mu$A. This current value enables the use of a reasonable unit size for all of the current source transistors in the circuit, thereby providing better matching. Another consideration is the requirement of a start-up circuit to ensure that the positive feedback loop does not become stuck at zero current. This start-up circuit automatically turns off after the circuit currents have stabilized. Finally, to further stabilize the bias voltages and decouple high-frequency noise, 2-pF MOS capacitors are utilized. Note that the Pch biases are decoupled to Vcc and the Nch biases are decoupled to ground.
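As a quick numerical check of (4.14), the sketch below evaluates the first-order constant-gm relation for the component values quoted above. The general form $g_m = 2(1 - 1/\sqrt{K})/R_{bias}$, with $K = (W/L)_8/(W/L)_7$, is the usual square-law textbook result and is an assumption here, since only its $K = 4$ special case appears in the text. The function name is illustrative.

```python
import math

def constant_gm(r_bias_ohm, size_ratio):
    """First-order transconductance set by a constant-gm bias loop.

    Assumes ideal square-law devices and neglects body effect and finite
    output resistance. size_ratio = (W/L)_8 / (W/L)_7.
    """
    return 2.0 * (1.0 - 1.0 / math.sqrt(size_ratio)) / r_bias_ohm

# Values from the text: R_bias = 1.4 kOhm (0.4 kOhm on-chip + 1 kOhm), ratio 4.
gm7 = constant_gm(1400.0, 4.0)
print(f"gm7 = {gm7 * 1e6:.0f} uS")
```

With a size ratio of 4 the expression reduces to $1/R_{bias}$, which gives a transconductance of roughly 714 µS for the 1.4-k$\Omega$ resistor.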

# 4.5 Summary

In this chapter circuit design issues for the combination of 2-tap PAE and a 6-bit 400-MHz ADC were presented. All blocks on the signal path are fully differential. The chosen topology for PAE uses two sets of triple interleaved sample-and-holds (S/H) followed by two variable gain transconductors, a current-to-voltage converter and an ADC input buffer.

A single master clock for the interleaved sample-and-hold reduced the mismatch between clock edges. The harmonic distortion of the S/H in the presence of up to 10% added mismatch between interleaved blocks was -56 dB. Another challenging part of the PAE design was the analog subtractor, comprised of two variable gain transconductors. Due to the subtraction operation, even for larger gains, the input can be close to full swing. The harmonic distortion of the transconductors ranged from -58 to -66 dB with a bandwidth of 850 MHz to 2.9 GHz, respectively. Each Gm-Cell consumed 6.9 mW at a 1.8-V supply to accommodate the proper bandwidth and settling time. The PAE output THD before the ADC, including the ADC buffers, was 47-54 dB for 100-MHz and 10-MHz inputs, respectively.

A 6-bit 400-MHz ADC was designed using a flash architecture. Autozero offset cancellation was used for the comparators. To allow flexibility in testing the prototype chip, three autozeroing modes were implemented: interleaved, periodic every cycle, and periodic every several cycles. Each comparator consisted of a two-stage pre-amplifier followed by regenerative and SR latches. According to simulations, the comparators were capable of resolving 3 mV under worst-case clock speed and overdrive conditions. The 6-bit ADC output was provided by a digital back-end circuit consisting of bubble removal, a ROM decoder and, finally, open-drain PAD drivers. Open-drain digital PAD drivers enable the use of an off-chip pull-up load resistor chosen according to the load capacitance of the test equipment. Careful layout of the whole circuit, particularly the floor planning of the ADC rows, was a critical part of the design implementation and was reviewed in this chapter.

## Experimental Results

## 5.1 Introduction

In this chapter, we discuss the measurement results for the prototype chip discussed in Chapter 4. The chip was fabricated in a standard digital 0.18-µm CMOS technology with six metal layers. The microphotograph of the chip is shown in Fig. 5.1. The chip was inserted in a CFP-80-pin package and the 80-pin IC was soldered on a CFP80-TF test board as shown in Fig. 5.2.

The chip consists of a partial analog equalizer and a 6-bit flash analog-to-digital converter. In addition to the overall performance characterization, such as power, SNDR, DNL and INL, the performance of the chip in a 400-Mb/s 240-m coaxial cable system was evaluated and the advantage of having a PAE prior to the ADC (i.e. a 2-3 bit improvement) is demonstrated. Other channel models and losses, such as a 622-Mb/s 300-m cable, have been emulated by an arbitrary waveform generator and tested on the chip. A list of measured chip specifications can be found at the end of this chapter.

## 5.2 Test Set-up and Functional Testing

Fig. 5.3 shows the test set-up for testing the chip with input tones at different frequencies. The digital output PADs are pulled up by  $100-\Omega$  resistors to a 1-V supply since their



*Figure 5.1:* Chip microphotograph consisting of the partial analog equalizer and the 6-bit ADC in 0.18- $\mu$ m CMOS technology.



Figure 5.2: Printed circuit board (PCB) layout (PCB-TF80).



*Figure 5.3:* Test set-up for testing with sinusoid inputs at different frequencies.

corresponding on-chip drivers are of the open-drain type. The low pull-up resistance provides an appropriate time constant given the input capacitance of the logic analyzer probes (1-2 pF).

To achieve better on-board supply noise rejection, separate supplies and grounds are used for the analog, digital and PAD-driver circuits. These separate VDDs and GNDs are connected at the power supply outlets and they are separate on the PCB as well [61]. A set of parallel decoupling capacitors is used between the dc supplies and grounds to decouple the power supply noise at different frequency ranges. Most of the measurements have been based on the ADC digital output codes, captured by the TLA714 logic analyzer (LA). These outputs represent the performance of both the ADC and the front-end partial analog equalizer. However, to evaluate the PAE output directly, a pair of on-chip differential analog test switches on the signal path are enabled and drive the signal to the output PADs. Fig. 5.4 shows the PAE output with a 1-MHz input tone and an S/H clock of 6 MHz while the PAE factor is set to $\alpha = 0.8$. In this experiment, the dc bias of the PAE output was verified as well.



*Figure 5.4:* The PAE output captured by a 100-MHz Tektronix oscilloscope for a 1-MHz sinusoid input tone and a sample-and-hold clock of 6 MHz.

Fig. 5.5 shows the spectrum of the PAE output for a 10-MHz input tone while S/H blocks were running at 100 MHz. The observed THD for this measurement was 42 dB. The measured performance at this stage was limited, due to the large loading effect of the test equipment, the limitation of the high current analog PAD driver, and additive distortion by the analog test switches on the signal path. The mentioned test circuits were mainly designed for functional test purposes. More accurate measurements, particularly at higher speed, were performed by sampling the PAE outputs by the existing on-chip ADC and analyzing the ADC output codes.



*Figure 5.5:* The PAE output frequency response for a 10-MHz input tone and a 100-MHz sampling frequency.

Fig. 5.6 and Fig. 5.7 show the output codes and their equivalent reconstructed waveforms on the TLA714 logic analyzer for 10-MHz and 85-MHz input tones at a 250-MHz clock frequency, while the front-end PAE was active with a factor of $\alpha = 0.8$. Note that for $\alpha = 0.8$ (which corresponds to maximum partial equalization) the transconductor gains were set to their maximum and, as a result, the worst-case situation at higher-speed inputs occurs. In Fig. 5.7, since the input frequency is large compared to the sampling frequency, the quality of large-swing transitions in the ADC circuits can be observed by examining the beat envelope tone appearing on the reconstructed signal [4]. In this measurement, for a 250-MHz clock frequency and an 85-MHz input tone, three envelope beats with 120-degree phase shifts appeared, with a frequency of

$$f_{beat} = f_{in} - \frac{f_{clk}}{k} = 85\,\mathrm{MHz} - \frac{250\,\mathrm{MHz}}{3} \approx 1.66\,\mathrm{MHz}.$$
 (5.1)
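The arithmetic in (5.1) can be reproduced with a short sketch. The helper names and the search over integer $k$ are illustrative assumptions; the text only states the relation for $k = 3$. The same relation with $k = 2$ gives the 1-MHz beat that a 124-MHz tone produces at a 250-MHz clock.

```python
def beat_frequency(f_in_hz, f_clk_hz, k):
    """Envelope beat frequency per (5.1), taken as a magnitude: |f_in - f_clk / k|."""
    return abs(f_in_hz - f_clk_hz / k)

def nearest_subharmonic(f_in_hz, f_clk_hz, k_max=8):
    """Pick the integer k whose clock sub-harmonic f_clk / k is closest to f_in."""
    return min(range(1, k_max + 1),
               key=lambda k: beat_frequency(f_in_hz, f_clk_hz, k))

k = nearest_subharmonic(85e6, 250e6)
f_beat = beat_frequency(85e6, 250e6, k)
print(k, f"{f_beat / 1e6:.2f} MHz")
```

For the 85-MHz tone the closest sub-harmonic is $f_{clk}/3$, which matches the three interleaved envelopes observed in the measurement.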

The total harmonic distortion (THD) of these beat envelopes, obtained by FFT analysis, was 33 dB.

Among the comparator autozeroing modes, periodic autozeroing was utilized for all measurements. It was seen that the ADC output was valid for 40 $\mu$s after each autozero pulse. The interleaved autozeroing mode did not function properly in the tested chips, possibly due to the breakdown of some logic gates in the complex logic structure. Because the number of available test chips and boards was limited, they were reserved for the major characterization and performance tests using the well-functioning periodic autozeroing mode.



*Figure 5.6:* ADC output bit streams and their equivalent magnitude at a 250-MHz sampling clock for a 10-MHz input tone captured by a TLA714 Logic Analyzer.



*Figure 5.7:* ADC output bit streams and their equivalent magnitude at a 250-MHz sampling clock for an 85-MHz input tone captured by a TLA714 logic analyzer. A beat of 1.66 MHz (85 - 250/3) with 33 dB THD is seen.

### 5.3 Dynamic Test and SNDR Measurement [62]

**SNDR:** Signal-to-noise and distortion ratio (SNDR) is a well-known parameter for ADC dynamic performance evaluation. For sinusoidal input signals, SNDR is defined as the ratio of the rms signal to the rms noise, including the harmonic distortion components, or

$$SNDR = 20\log \frac{A_{SIGNAL}[rms]}{A_{NOISE + HD}[rms]}$$
(5.2)

where  $A_{SIGNAL}[rms]$  is the rms output signal level and  $A_{NOISE + HD}[rms]$  describes the rms sum of all spectral components below the Nyquist frequency, excluding dc.

**ENOB:** For a sinusoidal input signal, the effective number of bits (ENOB) is related to SNDR as

$$ENOB = \frac{(SNDR - 1.76)}{6.02}.$$
 (5.3)

**SFDR:** One definition of spurious-free dynamic range (SFDR) is the ratio of the rms amplitude of the fundamental signal component to the rms value of the largest distortion component [62]. SFDR indicates the practical output dynamic range of an ADC.
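Expression (5.3) is easy to evaluate directly. The sketch below applies it to the 34-dB peak SNDR reported for the prototype and, for comparison, inverts it for an ideal 6-bit converter. The function names are illustrative.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits for a sinusoidal input, per (5.3)."""
    return (sndr_db - 1.76) / 6.02

def sndr_from_enob(bits):
    """Inverse of (5.3): SNDR in dB of an ideal converter with `bits` bits."""
    return 6.02 * bits + 1.76

print(f"{enob_from_sndr(34.0):.1f} bits")
print(f"{sndr_from_enob(6):.2f} dB")
```

An SNDR of 34 dB corresponds to about 5.4 effective bits, whereas an ideal 6-bit quantizer would reach 37.88 dB.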

By applying different sinusoid input tones, and using the FFT of the signal reconstructed from the prototype PAE/ADC output codes, the dynamic performance of both the ADC and the front-end PAE was evaluated. Fig. 5.8(a) and Fig. 5.8(b) show the FFT plots of the ADC output for sampling frequencies of 250 and 400 MHz with input tones at 45 and 124 MHz, respectively. Fig. 5.9 shows the SNDR versus input frequency at sampling frequencies ranging from 100 MHz to 400 MHz. The peak SNDR at a 250-MHz clock and a 45-MHz input tone is about 34 dB and decreases slightly at higher input frequencies. The equivalent ENOB for a 34-dB SNDR according to expression (5.3) is 5.4 bits. In this measurement, PAE has been enabled with the maximum partial equalization factor, i.e. $G(1 - \alpha z^{-1}) = 5.7(1 - 0.8z^{-1})$. In this case, the high gain of the Gm-C amplifiers causes the worst-case linearity for the amplifier part. Due to the high-pass effect of PAE, the input signal at high frequencies has a low swing, and this relaxes the S/H operation. The combination of these two effects makes the variation of the SNDR versus input frequency relatively flat for the combined PAE/ADC architecture. By setting the equalization factor $\alpha$ to zero, the PAE part of the circuit acts as a sample-and-hold stage with a close-to-unity gain amplifier, i.e. $G(1 - \alpha z^{-1}) = 1.2(1 - 0z^{-1})$. In this case, while the gain and input full scale are fixed at different input frequencies, the SNDR drops evenly from 35 dB to 31.5 dB.
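The high-pass behaviour of the PAE setting quoted above can be illustrated by evaluating $|G(1 - \alpha z^{-1})|$ on the unit circle. This is a minimal sketch of the ideal 2-tap transfer function only; it ignores the bandwidth and nonlinearity of the actual S/H and transconductor circuits, and the function name is an assumption.

```python
import cmath
import math

def pae_gain(f_hz, fs_hz, gain, alpha):
    """Magnitude of G * (1 - alpha * z^-1) evaluated at z = exp(j*2*pi*f/fs)."""
    z_inv = cmath.exp(-2j * math.pi * f_hz / fs_hz)
    return gain * abs(1.0 - alpha * z_inv)

fs = 250e6
for f in (0.0, 45e6, 124e6):
    on = pae_gain(f, fs, 5.7, 0.8)    # maximum partial equalization
    off = pae_gain(f, fs, 1.2, 0.0)   # PAE disabled, S/H plus amplifier only
    print(f"f = {f / 1e6:5.0f} MHz  PAE on: {on:5.2f}  PAE off: {off:4.2f}")
```

With $\alpha = 0.8$ the ideal gain rises from $5.7 \times 0.2 = 1.14$ at dc to $5.7 \times 1.8 = 10.26$ at the Nyquist frequency, a factor of 9, so a high-frequency tone must be applied with a much smaller swing to stay within the same ADC full scale.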



*Figure 5.8:* FFT plot of the PAE / 6-bit ADC output. (a) 250-MHz sampling clock and 45-MHz input tone, SNDR=34 dB, SFDR=45 dB. (b) 400-MHz sampling clock and 124-MHz input tone, SNDR=30dB, SFDR=37 dB.



*Figure 5.9:* Measured SNDR vs. analog input frequency for different sampling clocks at 100 MHz, 250 MHz and 400 MHz.

## 5.4 Code Density Measurement and INL/DNL

The code density test is a common approach for characterizing ADC non-linearities. The output code density, or histogram, is the number of times each individual code has occurred [63]. The differential non-linearity (DNL) of an output code is related to the deviation of its corresponding input range from the average bin width. Therefore, the number of occurrences of an output code is related to its DNL. For example, for an ideal ADC with a full-scale ramp input and random sampling, an equal number of occurrences is expected for every output code. In practice, since generating an ideal linear ramp is difficult, a sine wave signal source is used. In this case, for an ideal ADC, the number of occurrences of an output code is proportional to the probability of a sample occurring in the related input range (bin). The DNL corresponding to the *i*th bin, in LSB units, can be defined as [63]

$$DNL_i = \frac{C_i / N_{total}}{P_i} - 1, \qquad (5.4)$$

where  $C_i$  is the number of counts in the *i*th bin,  $N_{total}$  is the total number of output samples and  $P_i$  is the ideal bin width or the ideal probability of sample occurrence in the *i*th amplitude range.

The probability density function p(V) for a function of the form  $A\sin(\omega t)$  is

$$p(V) = \frac{1}{\pi \sqrt{A^2 - V^2}}$$
(5.5)

Thus, the probability of input samples occurring in the range of  $(V_i, V_{i+1})$  is

$$p(V_i, V_{i+1}) = \frac{1}{\pi} \left[ \operatorname{asin}\left(\frac{V_{i+1}}{A}\right) - \operatorname{asin}\left(\frac{V_i}{A}\right) \right]$$
(5.6)

Fig. 5.10 shows the ideal code density from (5.6) and the measured relative code density for each output code among 8000 samples for the combination of PAE/ADC, running at a 250-MHz clock frequency with a 124-MHz input tone. The offset of the output codes was removed by adjusting the differential reference voltages. It should be mentioned that a 124-MHz input tone generates a 1-MHz low-frequency beat envelope. Thus, a large number of samples cover the full amplitude range. In this way, the worst-case high-frequency input and large-swing transitions are included in the DNL test. The measured DNL for each output code according to (5.4) is depicted in Fig. 5.11. The worst-case DNL is less than 0.5 LSB.

Integral nonlinearity (INL) is the maximum deviation of the input/output characteristic from a straight line. INL can be obtained directly from DNL accumulation according to the following expression [4]

$$INL_{j} = \sum_{k=0}^{j-1} DNL_{k}.$$
 (5.7)
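The code density procedure of (5.4) to (5.7) can be sketched as follows. This is a minimal illustration, assuming a sine wave that spans exactly the full scale and equal ideal bin widths. The synthetic histogram comes from an ideal quantizer, not from the measured data, and all function names are hypothetical.

```python
import math

def ideal_sine_bin_probs(n_codes, amplitude=1.0):
    """Ideal code probabilities per (5.6) for a sine spanning n_codes equal bins."""
    edges = [-amplitude + 2.0 * amplitude * i / n_codes for i in range(n_codes + 1)]
    return [
        (math.asin(edges[i + 1] / amplitude) - math.asin(edges[i] / amplitude)) / math.pi
        for i in range(n_codes)
    ]

def dnl_from_histogram(counts, probs):
    """DNL in LSB per (5.4): relative code density over ideal probability, minus 1."""
    n_total = sum(counts)
    return [(c / n_total) / p - 1.0 for c, p in zip(counts, probs)]

def inl_from_dnl(dnl):
    """INL per (5.7): INL_j is the running sum of DNL_k for k < j."""
    inl, acc = [], 0.0
    for d in dnl:
        inl.append(acc)
        acc += d
    return inl

# Synthetic check: histogram of an ideal 6-bit quantizer driven by a full-scale sine.
n_codes, n_samples = 64, 100_000
counts = [0] * n_codes
for k in range(n_samples):
    v = math.sin(2.0 * math.pi * (k + 0.5) / n_samples)
    code = min(int((v + 1.0) / 2.0 * n_codes), n_codes - 1)
    counts[code] += 1

dnl = dnl_from_histogram(counts, ideal_sine_bin_probs(n_codes))
inl = inl_from_dnl(dnl)
print(max(abs(d) for d in dnl), max(abs(x) for x in inl))
```

For the ideal quantizer both maxima are close to zero, as expected. With measured codes, the same functions would be applied to the captured histogram after offset removal.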

Fig. 5.12 shows the integral nonlinearity (INL) performance of the combined PAE/ADC. It should be mentioned that the nonlinearity of the PAE block was included in this measurement as well. Therefore, INL and DNL values below 0.5 LSB indicate a satisfactory result and good layout matching among the ADC rows.



*Figure 5.10:* Experimental and theoretical relative output code density for 250-MHz clock frequency and 124-MHz input tone (1-MHz envelope beat).



Figure 5.11: Differential nonlinearity (DNL) obtained from code density measurement.



Figure 5.12: Integral nonlinearity (INL) obtained from code density measurement.

## 5.5 Experiment in a 400-Mb/s 240-m Coaxial Cable System

#### 5.5.1 Test Set-up

In this experiment the prototype chip was tested in the data transmission system shown in Fig. 5.13. Two streams of binary data were generated by a pattern/data generator. By setting the amplitude of one channel to one half of that of the other, and combining them



*Figure 5.13:* Test set-up for 4-level PAM data transmission over 240-m Belden Coaxial cable.

through a zero-degree power combiner, a 4-level PAM signal with a reasonable eye opening was produced. The eye diagram of the combiner output is shown in Fig. 5.14. This eye opening suffices for this experiment, although it is somewhat degraded by the jitter and frequency response of the data generator and the combiner. In general, using line drivers [64] and a high-speed 4-level PAM generator [65] could improve the quality of the channel input.
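The way two binary streams with a 2:1 amplitude ratio form a 4-level PAM signal can be checked with a short sketch. The 2/3 and 1/3 weights are an assumed normalization that scales the peak amplitude to one; the actual generator and combiner levels are not specified here.

```python
from fractions import Fraction
from itertools import product

def combine_to_pam4(msb, lsb):
    """Sum two binary NRZ streams (+1/-1), the second at half the amplitude of the first.

    The 2/3 and 1/3 weights keep the 2:1 amplitude ratio and scale the peak to one.
    """
    return [Fraction(2, 3) * a + Fraction(1, 3) * b for a, b in zip(msb, lsb)]

# Enumerate all four bit combinations and collect the resulting levels.
pairs = list(product((-1, 1), repeat=2))
levels = sorted(set(combine_to_pam4([a for a, _ in pairs], [b for _, b in pairs])))
print([str(v) for v in levels])
```

The four bit combinations map to the equally spaced levels -1, -1/3, 1/3 and 1.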

A 75-$\Omega$ 240-m Belden coaxial cable was used as the transmission channel. The losses of this channel at frequencies of 50 MHz, 100 MHz and 150 MHz are -15 dB, -21 dB and -25 dB, respectively. At both channel ends an impedance converter, to convert from 50 $\Omega$ to 75 $\Omega$ and back, was employed for impedance matching. Fig. 5.15 shows a simple resistive impedance converter designed for this purpose. It should be mentioned that commercial versions of this converter are available as well.

As shown in Fig. 5.13, a lowpass filter with a bandwidth of about 0.75 times the baud rate rejects out-of-band noise after the channel. Finally, a 180-degree power splitter and two bias-Ts provide differential inputs with the proper bias (450 mV) to the chip input. The prototype chip samples the channel output and, after partial equalization and/or decorrelation, the samples are fed to the ADC. The chip output codes were captured by a logic analyzer. The TLA714 logic analyzer is able to use an external clock of up to 200 MHz. However, due to its moderate external clock precision, the use of its internal clock was preferred. The maximum internal sampling clock of the TLA714 is 500 MHz, with an extra option of 2 GS/s for a short period of time (i.e. 1 $\mu$s). The latter was employed for off-chip clock recovery of the system. The clock source of the original pattern generator was used as the clock source of the test chip. The phase delay of the transmitted data relative to the clock source was manually adjusted for the best clock matching. This was performed by transmitting an impulse pattern and selecting the clock phase that produced the strongest channel impulse response peak. Also, a specific data pattern was placed at the beginning of the bit streams, so that the bit locations could be recognized. The code words, captured by the logic analyzer, were transferred to the computer in order to perform the rest of the signal processing. This consisted of adaptive linear equalization and symbol detection as shown in Fig. 5.16.
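The off-chip adaptive linear equalization and symbol detection can be illustrated with a small sketch. The adaptation algorithm is not stated in this section, so a training-directed LMS update is assumed here. The two-tap channel, the step size and all names are hypothetical and do not model the measured cable.

```python
import random

LEVELS = (-1.0, -1.0 / 3.0, 1.0 / 3.0, 1.0)

def slicer(y):
    """4-level PAM symbol detector: nearest of the four nominal levels."""
    return min(LEVELS, key=lambda lv: abs(y - lv))

def lms_ffe(rx, ref, n_taps=7, mu=0.01):
    """Adapt an n_taps FIR feedforward equalizer with the LMS rule.

    rx  : received samples (one per symbol)
    ref : known transmitted symbols used as the training reference
    Returns the final tap weights and the per-symbol error sequence.
    """
    taps = [0.0] * n_taps
    taps[0] = 1.0                      # start as a pass-through filter
    delay_line = [0.0] * n_taps
    errors = []
    for x, d in zip(rx, ref):
        delay_line = [x] + delay_line[:-1]
        y = sum(w * s for w, s in zip(taps, delay_line))
        e = d - y
        taps = [w + mu * e * s for w, s in zip(taps, delay_line)]
        errors.append(e)
    return taps, errors

# Hypothetical two-tap channel with postcursor ISI (not the measured cable response).
rng = random.Random(0)
symbols = [rng.choice(LEVELS) for _ in range(20_000)]
rx = [s + 0.5 * p for s, p in zip(symbols, [0.0] + symbols[:-1])]

taps, errors = lms_ffe(rx, symbols)
tail = errors[-1000:]
rmse = (sum(e * e for e in tail) / len(tail)) ** 0.5
decisions_ok = all(slicer(d - e) == d for d, e in zip(symbols[-1000:], tail))
print(f"residual RMSE = {rmse:.4f}, all decisions correct: {decisions_ok}")
```

In this noise-free toy case the taps settle close to the truncated inverse of the channel. In the experiment, the equalizer input would instead be the amplitudes reconstructed from the captured 6-bit code words.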



Figure 5.14: Four-level eye diagram produced at the input of the channel (Fs=200 MHz).



*Figure 5.15:* A 50 $\Omega$  / 75 $\Omega$  resistive impedance converter.



*Figure 5.16:* Off-chip signal processing block diagram for rest of equalization and final symbol detection.

#### 5.5.1.1 Measurement Results

Fig. 5.17 depicts the measurement results for the system in Fig. 5.13 at a 200-MHz baud rate for the two cases of PAE on and off. All the signal amplitudes were normalized such that they had a standard deviation of $\sigma_x = \sqrt{5}/3$, the standard deviation of an NRZ 4-level PAM signal with a maximum amplitude of one. Fig. 5.17(i) shows the channel output waveform. Fig. 5.17(ii) shows the 64-level reconstructed waveform corresponding to the 6-bit code words captured from the chip output. When PAE is off, the chip is equivalent to an ADC with a front-end S/H and amplifier. Therefore, as seen in Fig. 5.17(b)(ii), the chip output is the quantized version of the channel output. When PAE is on, the chip output is the quantized version of the partially equalized channel output. Consistent with the simulation results presented in Chapter 3, it is seen that the ADC output has a more uniform distribution and a lower crest factor. Part (iii) of the figure shows the restored data after the off-chip adaptive linear equalizer. In case (a), when PAE is on, this equalizer consists of a 7-tap feedforward equalizer (FFE) with 6-bit coefficients. In case (b), the equalizer order was chosen so that it would not limit the quality of the eye opening. As seen in Fig. 5.17(a)(iii) and (b)(iii), when PAE is active, the transmitted data can be restored with a significantly better error performance (i.e. $RRMSE \approx 0.08$ versus $RRMSE \approx 0.18$). This improvement is equivalent to the advantage of using an 8-bit ADC without PAE. It should be mentioned that, although the PAE and the ADC were tested up to 400 MS/s, when placing the PAE/ADC in the test system of Fig. 5.13, the maximum sampling rate was 200 MS/s, due to clock recovery test limitations.
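For reference, the normalization constant $\sqrt{5}/3$ follows from the four levels of a 4-level PAM signal with unit peak amplitude, assuming equally likely symbols:

$$\sigma_x^2 = \frac{1}{4}\left[(-1)^2 + \left(-\tfrac{1}{3}\right)^2 + \left(\tfrac{1}{3}\right)^2 + 1^2\right] = \frac{5}{9}, \qquad \sigma_x = \frac{\sqrt{5}}{3} \approx 0.745.$$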



*Figure 5.17:* Measured signals in the system with 4-level NRZ PAM transmitted at a 200-MHz baud rate over a 240-m coaxial cable. (a) PAE is on. (b) PAE is off. (i) Channel output. (ii) ADC output. (iii) Restored data from the ADC output after the digital equalizer. (a)(iii) Eye is opened using a 7-tap 6-bit-coefficient FFE (RRMSE=0.08). (b)(iii) Best possible eye opening (RRMSE≅0.18). Here, the order of the digital FFE was chosen so as not to limit the performance.

#### 5.5.2 Special Considerations

**Baseline wander effect:** Since the transmitted signal is ac-coupled by several blocks along the transmission path, such as the splitter and the bias-Ts, the dc component of the signal was discarded. This caused a low-frequency drift in the signal amplitude, known as the baseline wander effect. Various analog and digital techniques can be employed to remove this low-frequency drift [5][66]. In the current experiment, this effect was removed by an extra simple digital feedback filter as shown in Fig. 5.16. This filter is a lowpass filter of the form $G(z) = Kz^{-1}/(1-z^{-1})$ that extracts the dc drift of the error signal and subtracts it from the input. Note that this filter is different from an error-feedback-equalizer filter [7], which can compensate for high-frequency loss. The output of the baseline removal filter could be subtracted from the FFE output as well. This would give a slightly better result for a simple G(z). However, in this case, extra complexity is needed in the adaptation algorithm of the digital FFE, because the adaptation states have to be modified to include the cascading effect of the baseline filter on the signal path. The effect of baseline wander removal is shown in Fig. 5.18. This figure shows the restored data in the experimental system of Fig. 5.13 at a 100-MHz baud rate with and without baseline wander removal.
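A minimal sketch of such a baseline removal loop is given below, assuming the error is taken between the corrected sample and its nearest 4-level PAM decision. In the actual system of Fig. 5.16 the error signal may be tapped at a different point. The gain value, the test offset and the function names are hypothetical.

```python
LEVELS = (-1.0, -1.0 / 3.0, 1.0 / 3.0, 1.0)

def slicer(y):
    """Nearest nominal 4-level PAM level."""
    return min(LEVELS, key=lambda lv: abs(y - lv))

def remove_baseline(samples, k=0.05):
    """Subtract a slowly tracked dc estimate from the input.

    The estimate is the slicer error passed through G(z) = K z^-1 / (1 - z^-1),
    i.e. a one-sample-delayed accumulator with gain K.
    """
    drift = 0.0                    # output of G(z)
    corrected = []
    for x in samples:
        y = x - drift              # subtract the current drift estimate from the input
        e = y - slicer(y)          # error relative to the detected symbol
        corrected.append(y)
        drift += k * e             # accumulator update, used on the next sample
    return corrected, drift

# Synthetic check: ideal symbols shifted by a constant 0.1 offset (hypothetical values).
pattern = [LEVELS[(3 * n * n + n) % 4] for n in range(400)]
corrected, drift = remove_baseline([s + 0.1 for s in pattern])
print(f"estimated drift = {drift:.4f}")
```

In this check the accumulator converges to the injected offset, so the corrected samples return to the nominal levels. A smaller K tracks more slowly but injects less decision noise into the signal path.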



*Figure 5.18:* 100-MHz 240-m system output. (a) Baseline wander exists. (b) Baseline wander removed.

**Decision feedback equalizer:** As mentioned in Chapter 3, using a DFE/FFE for the digital equalizer also reduces the quantization error enhancement caused by the FFE. Therefore, when PAE is off, we can obtain a better eye opening compared to an FFE-only system. However, an FFE is still required because of the precursor ISI and because the order of the feedback filter is limited by error propagation and digital clock recovery difficulties. Experimental results show that using PAE in an FFE/DFE system is also beneficial, due to crest factor reduction at the ADC input and reduction of the quantization error enhancement by the FFE part. This advantage is more obvious at a lower number of ADC bits.

To evaluate the PAE/ADC system with a lower number of bits, the least significant bits of the ADC output can be discarded. In this way, 5-bit and 4-bit code words were obtained. Figures 5.19, 5.20 and 5.21 show the measurement results for the restored data and their RRMSE for different numbers of bits for FFE-only and FFE/DFE systems. The DFE and FFE filters are 5-tap and 7-tap FIR filters, respectively. As seen in Fig. 5.19, using PAE is more advantageous in the FFE-only system. However, considerable improvement is achieved in the FFE/DFE system as well. For example, the performance of the PAE/4-bit ADC front-end is equivalent to that of a 6-bit ADC-only system in an FFE/DFE receiver and almost a 7-bit ADC-only system in an FFE-only digital receiver. Thus, according to the results in Fig. 5.19, adding the PAE reduces the ADC resolution requirement to 5-6 bits instead of 7-9 bits.
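Discarding least significant bits amounts to a right shift of each captured code. The sketch below shows this together with one possible mapping of codes back to a normalized amplitude. The mid-bin reconstruction and the function names are assumptions made for illustration.

```python
def truncate_code(code, n_bits_in=6, n_bits_out=4):
    """Keep the n_bits_out most significant bits of an n_bits_in-bit ADC code."""
    if not 0 <= code < (1 << n_bits_in):
        raise ValueError("code out of range")
    return code >> (n_bits_in - n_bits_out)

def code_to_amplitude(code, n_bits):
    """Map a code to the centre of its bin on a normalized [-1, 1) full scale."""
    return (code + 0.5) / (1 << n_bits) * 2.0 - 1.0

code6 = 0b101101
code4 = truncate_code(code6, 6, 4)
print(bin(code4), code_to_amplitude(code6, 6), code_to_amplitude(code4, 4))
```

For the example code, the 6-bit value 45 becomes the 4-bit value 11, and the reconstructed amplitude moves from 0.421875 to 0.4375, which shows the coarser quantization step.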



*Figure 5.19:* Relative root mean square error (RRMSE) of the restored data in the 400-Mb/s 240-m coaxial cable system with an off-chip digital FFE or an FFE/DFE for different numbers of ADC bits and the cases of PAE on and off.



*Figure 5.20:* Restored data points for a 400-Mb/s 240-m coaxial cable system with off-chip digital FFE/DFE for different number of ADC bits and the cases of PAE on and off.





*Figure 5.21:* Restored data points for a 400-Mb/s 240-m coaxial cable system with off-chip digital FFE for different number of ADC bits and cases of PAE on and off.

### 5.6 Channel Emulator Using Arbitrary Waveform Generator

To evaluate the performance of the chip for different channel models and losses, an arbitrary waveform generator (AWG) was used to emulate the channel output waveform directly. The test bench for this measurement is shown in Fig. 5.22. The advantages of this experiment include the ability to test different channel models, the avoidance of transmitter non-idealities, and the ability to normalize the channel loss to a lower frequency. The latter relaxes clock recovery and allows better functional analysis for larger channel losses.



*Figure 5.22:* Test set-up for 4-level PAM data transmission over 240-m Belden coaxial cable.

Since the previous cable test was limited to a 240-m cable and up to 400 Mb/s, the maximum tested channel loss was -21 dB. In a 622-Mb/s 300-m coaxial cable system, a 32-dB loss exists at Fs/2, where Fs is the baud rate. To investigate the functionality of the PAE/ADC in such a system, the channel output was emulated by the AWG at a 100-MHz sampling rate while the baud rate was normalized to 20 MHz. The 10-MHz reference clock of the AWG and the clock generator were shared for clock matching. Since the chip input is constant during 50-ns periods, the off-chip clock recovery is more relaxed. A specific data pattern was placed at the beginning of the bit streams, so that the bit locations could be recognized and the correctness of the bits measured.
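The timing of the emulated waveform can be sketched as follows, assuming the AWG simply holds each symbol-spaced channel output value for one symbol period. The sample values in the usage lines are placeholders, not measured or simulated channel data.

```python
def samples_per_symbol(awg_rate_hz, baud_rate_hz):
    """Number of AWG samples that make up one symbol period."""
    ratio = awg_rate_hz / baud_rate_hz
    if ratio != int(ratio):
        raise ValueError("AWG rate must be an integer multiple of the baud rate")
    return int(ratio)

def hold_waveform(symbol_spaced_values, n_hold):
    """Zero-order hold: repeat each symbol-spaced value n_hold times."""
    return [v for v in symbol_spaced_values for _ in range(n_hold)]

n_hold = samples_per_symbol(100e6, 20e6)
symbol_period_ns = 1e9 / 20e6
waveform = hold_waveform([0.2, -0.1, 0.4], n_hold)   # placeholder channel-output values
print(n_hold, symbol_period_ns, len(waveform))
```

At a 100-MHz AWG rate and a 20-MHz baud rate, each symbol occupies five AWG samples, which corresponds to the 50-ns interval during which the chip input is constant.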

Fig. 5.23 exhibits the measured data waveforms at the chip input, chip output (6-bit codes) and after the off-chip linear equalizer in both cases of PAE on and off. As seen, due to larger channel loss the effect of PAE is significant; i.e. RRMSE=0.07 versus



Figure 5.23: Measured signals in the system of Fig. 5.22, where the channel output of a 622-Mb/s 300-m coaxial cable is emulated by an AWG. (a) PAE is on. (b) PAE is off. (i) Channel output. (ii) ADC output. (iii) Restored data from the ADC output after the digital equalizer. (a)(iii) Eye is opened using a 7-tap 7-bit-coefficient FFE (RRMSE=0.07). (b)(iii) Best possible eye opening (RRMSE≅0.27). Here, the order of the digital FFE was chosen so as not to limit the performance.

RRMSE=0.27. The RRMSE of 0.07 is close to the performance of a 9-bit ADC front-end, which indicates an improvement of about 3 bits.

## 5.7 Summary

The fabricated chip, consisting of a partial analog equalizer and a 6-bit ADC, was characterized and its performance in the receiver front-end of a 400-Mb/s 240-m coaxial cable channel system was measured. Table 5.1 shows a summary of the measured IC characteristics along with those of some other recent state-of-the-art ADCs. To demonstrate the impact of the ADC effective resolution, simulations of a 622-Mb/s system with a 20-tap-FFE receiver and a symbol-error rate below 10<sup>-7</sup> were used to obtain the maximum achievable coaxial cable length for the corresponding ADC effective resolutions, which are also shown in this table. While the differences in technology should be noted in Table 5.1, it is seen that the combination of a PAE and a 6-bit ADC is an efficient approach in terms of area and power consumption when compared to a single 8-bit ADC for use in wired communication applications.

| Specification, Unit                                                                                                                                              | This work | [67]           | [57]    | [53]           |
|------------------------------------------------------------------------------------------------------------------------------------------------------------------|-----------|----------------|---------|----------------|
| CMOS Technology, (µm)                                                                                                                                            | 0.18      | 0.35           | 0.35    | 0.6            |
| Sampling Frequency, (MS/s)                                                                                                                                       | 400       | 200            | 400     | 150            |
| Power Supply, (V)                                                                                                                                                | 1.8       | 3              | 3       | 3.3            |
| ADC resolution, (bits)                                                                                                                                           | 6         | 8              | 6       | 8              |
| Equivalent bit resolution for long cable applications, (bits)                                                                                                    | > 8       | 6.8            | 5.5     | 6.5            |
| Maximum coaxial cable length for 622 Mb/s application for SER <10 $^{-7}$ according to the given effective resolution, assuming ADC runs at $f_s = 311$ MHz, (m) | 300       | 225            | 140     | 200            |
| Total Active Area, (mm <sup>2</sup> )                                                                                                                            | 0.83      | 3.3            | 1.2     | 1.2            |
| PAE / ADC Area (mm <sup>2</sup> /mm <sup>2</sup> )                                                                                                               | 0.74/0.09 |                |         |                |
| Power (PAE + ADC), (mW)                                                                                                                                          | 84 + 22   | 655            | 190     | 395            |
| SFDR / SNDR @250 MHz clk, 45 MHz input, (dB)                                                                                                                     | 45 / 34   | 59 / 42<br>@Fs |         | 50 / 40<br>@Fs |
| SFDR / SNDR @400 MHz clk, 124 MHz input, (dB)                                                                                                                    | 37 / 30.2 |                | 37 / 31 |                |
| ADC INL /DNL, (LSB)                                                                                                                                              | < 0.5     | 0.8            | < 0.5   | 1.2/0.6        |
| Input Swing, (V)                                                                                                                                                 | 0.6       | 1.3            | 1       | 1.6            |

Table 5.1: Measured IC characteristics and comparisons



# Summary and Future Research

## 6.1 Summary and Conclusions

In this thesis, the ADC bit requirements for wired communication applications were investigated, and an efficient partial analog equalization (PAE) approach to improve the performance of the front-end ADC was presented. The motivation for this work was that high-speed high-resolution analog-to-digital converters (ADCs) are one of the major challenges in the design of digital communication systems. This is due to their power and area consumption and the difficulty of implementing them in low-voltage standard CMOS technology. The contributions of this thesis included two major components. First, through an analytical study, the benefit of partial equalization in terms of ADC bit requirements was elaborated. Second, an implementation of a high-speed PAE / ADC combined on a single 1.8-V CMOS chip was demonstrated and the benefit of a 2-3 bit improvement was verified experimentally.

By studying the ADC input characteristics and the effect of the quantization noise on a digital communication system, a formula for the required resolution for a given system error rate was proposed. Different approaches to reduce the ADC resolution requirements were investigated. A 2-tap PAE was selected as an efficient choice for analog preprocessing in order to reduce the ADC bit requirements. It was shown that the similarity between the 2-tap PAE and a first-order feedforward decorrelator could be utilized to determine the single zero of the PAE. Placing an analog decorrelator before the ADC and its inverse, as a digital post filter, after the ADC enabled the use of a lower resolution ADC with the same error performance. The analog decorrelator, or 2-tap PAE, reduces the ADC input crest factor and the PAE inverse converts the quantization noise to a lowpass shape. Thus, the quantization noise is not enhanced by digital equalization. The combination of these two effects reduces the ADC resolution requirement considerably. The efficiency of the proposed approach depends on the amount of channel loss or inter-symbol interference (ISI). The greater the ISI, the more advantageous this approach is.
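As a rough numerical illustration of the crest-factor argument, the following Python sketch compares the worst-case ADC input amplitude with and without a 2-tap decorrelator $1 - \alpha z^{-1}$, and then checks that the digital inverse filter restores the original response. The first-order channel (tap ratio 0.8, truncated to 20 taps) and all function names are assumptions chosen only for illustration; they are not taken from the thesis.

```python
def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def peak_amplitude(impulse_response, x_peak=1.0):
    """Worst-case output amplitude for inputs bounded by x_peak."""
    return x_peak * sum(abs(t) for t in impulse_response)

def inverse_pae(y, alpha):
    """Digital post filter 1 / (1 - alpha z^-1), undoing the 2-tap PAE."""
    out, prev = [], 0.0
    for sample in y:
        prev = sample + alpha * prev
        out.append(prev)
    return out

alpha = 0.8                                   # hypothetical decorrelator coefficient
channel = [alpha ** k for k in range(20)]     # hypothetical first-order channel
pae = [1.0, -alpha]                           # 2-tap PAE: 1 - alpha z^-1
combined = convolve(channel, pae)             # channel followed by the PAE

peak_without_pae = peak_amplitude(channel)
peak_with_pae = peak_amplitude(combined)
restored = inverse_pae(combined, alpha)       # should reproduce the channel taps
```

For this idealized channel the worst-case amplitude seen by the ADC drops from roughly 4.9 to roughly 1.0, which is the kind of crest-factor reduction the summary refers to; a real cable response would give a smaller gain.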

A combination of a 2-tap PAE and a 6-bit 400-MHz ADC was implemented in standard digital 0.18-µm CMOS technology. The PAE consisted of two sets of triple-interleaved sample-and-holds (S/H), two variable gain transconductors, a current-to-voltage converter and an ADC input buffer. The circuit design contributions and the relevant simulation results were reviewed within the thesis. A flash architecture was utilized for the high-speed 6-bit ADC. Different autozeroing or offset cancellations and test modes enabled testing and characterizing the prototype chip experimentally. Apart from the main blocks' design, the peripheral design techniques contributions included a non-overlap triple clock generator, using master clock for interleaved S/H, adaptive lead compensations in transconductors, efficient interdigitated capacitors layout, careful floor planning of the ADC rows and symmetric clock distribution.

The functionality and performance of the chip were tested and presented in Chapter 5. An SNDR of 34 dB was achieved for a 250-MHz sampling frequency. In addition, the chip was placed in an experimental 400-Mb/s 240-m coaxial cable communication system. In this way, the advantage of having the PAE was demonstrated. Using the proposed PAE, the ADC performance was shown to be improved by more than 2 bits at a cost of only 12% extra area and 17% extra power.

## 6.2 Suggestions for Future Work

There are several directions to focus on for continuation of this work. Extending the application of this research to complex modulation receivers and full-duplex transceivers are two areas that can be studied. Moreover, for the current application, advanced circuit design techniques can be applied to extend the speed and efficiency of the design in both PAE and the ADC blocks.

Appendix C shows preliminary simulation results for using partial equalization to reduce the resolution and sampling speed requirements of the front-end ADC in a system with a carrierless amplitude and phase (CAP) modulation scheme. CAP is a modulation scheme [23][68] used in digital communication systems and can be considered to be a bandpass PAM (Pulse Amplitude Modulation) [7], in which the carrier frequency is near baseband. On the other hand, it can also be viewed as a variation of Quadrature Amplitude Modulation (QAM) without explicit modulation/demodulation blocks. CAP is used in wired applications because of its simplicity, bandwidth efficiency and zero dc component. Conventional CAP receivers consist of a relatively high-speed high-resolution front-end ADC along with two parallel fractionally-spaced, tapped-delay-line filters for channel equalization and I/Q separation. By using a low-order partial analog equalizer/demodulator, the ADC requirement can be reduced to two ADCs with one third of the speed and 1-2 bits less resolution for 16-CAP at 1.2 Gb/s over a 200-m coaxial cable. The implementation issues of such a new mixed topology and the related circuit design trade-offs could be considered as a future research topic.

As mentioned in Chapter 3, a 2-tap PAE can be considered to be a feedforward decorrelator. Another variation is a feedback decorrelator. Under this method, each ADC input sample is predicted from the previous quantized sample rather than a non-quantized analog sample (see Fig. 3.20). The advantage would be a further reduction in quantization noise (or, equivalently, in the ADC resolution requirement). The difficulty in this method is the speed limitation due to the delay of the feedback loop, which consists of the ADC, an $\alpha$ multiplier and a subtractor. Investigating different circuit techniques to cope with this loop delay is an interesting research topic.
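The feedback idea can be sketched behaviorally as a DPCM-style loop. The Python sketch below is an assumption about how such a loop might be modeled (Fig. 3.20 is not reproduced here): the prediction is $\alpha$ times the previous reconstructed sample, the ADC is an ideal uniform quantizer without clipping, and the loop delay that the text identifies as the real difficulty is ignored. All names and values are hypothetical.

```python
def quantize(value, step):
    """Ideal uniform quantizer with no clipping (behavioral ADC model)."""
    return step * round(value / step)

def feedback_decorrelator(x, alpha, step):
    """Quantize the difference between each sample and a prediction
    formed from the previous reconstructed (quantized) sample."""
    adc_inputs = []
    reconstructed = []
    prev = 0.0
    for sample in x:
        residual = sample - alpha * prev   # analog subtraction ahead of the ADC
        code = quantize(residual, step)    # ADC output
        prev = code + alpha * prev         # digital reconstruction of the sample
        adc_inputs.append(residual)
        reconstructed.append(prev)
    return adc_inputs, reconstructed

signal = [0.0, 0.5, 0.9, 1.0, 0.8, 0.4, -0.1, -0.6, -0.9, -1.0]
step = 1.0 / 16
adc_in, recon = feedback_decorrelator(signal, alpha=0.9, step=step)
max_error = max(abs(r - s) for r, s in zip(recon, signal))
```

In this model the reconstruction error at every sample equals the quantization error of that sample alone, so it stays within half a step and is not shaped or enhanced by the inverse filter.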

It was observed that the signal amplitude distribution after the PAE had some local peaks. Using a non-uniform quantization technique, the quantization intervals can be adjusted such that at those peaks less quantization error is added. In this way, a further reduction of average quantization noise can be achieved, which is equivalent to a lower number of bits requirement. However, a specific non-uniform quantization for an ADC may not be desirable in some applications.

To improve the speed and performance of the current design, several other circuit design techniques could be investigated. Using multiple gain stages in the PAE is a candidate for achieving a higher bandwidth. Using coding schemes, such as Gray coding in the ROM part, is recommended for further ADC error rate reduction. For ADC speed enhancement, a continuous-time comparator with digital calibration could be used. This is a good approach for low-resolution, high-speed applications such as backplane chip-to-chip applications [69].



## Transconductor Circuit Analysis

## A.1 Transconductor feedback loop analysis and lead compensation

The linearity of the transconductor blocks in the partial analog equalizer is improved by placing a feedback loop around the input transistors. To verify the stability performance of the transconductors, shown in Fig. A.1, the feedback loop is broken at the gate of transistors M3 and M4, and thus, the loop gain transfer function  $v_o/v_i$  can be evaluated. Note that the parasitic capacitors  $C_{P2}$  and  $C_{P2}'$  include the loading effect of the gates of M3 and M4 in addition to the total parasitic capacitance at M1 and M2 drains or node 2.



*Figure A.1:* Opening the feedback-loop to characterize the open loop circuit and stability analysis.

To calculate the loop gain transfer function  $v_o/v_i$ , the half circuit shown in Fig. A.2 (a) and its small signal equivalent in Fig. A.2 (b) are utilized. As a first step, the cascode transistors M3 and M5 are replaced by a Norton equivalent circuit consisting of a current source with the value of  $Kv_i$  and an output resistance  $r_{o5}$ , as shown in Fig. A.3. The capacitor  $C_{P3}$  represents the total parasitic capacitance at node 3. To find  $Kv_i$  in Fig. A.3, node 1 is shorted to ground and the short-circuit current is calculated as

$$I_{eq} = -v_3(g_{m5} + g_{ds5}), \qquad (A.1)$$

where  $v_3$  is the voltage at node 3 and can be calculated by KCL at node 3 as

$$v_3 = \frac{-g_{m3}v_i}{g_{ds3} + g_{ds5} + g_{m5} + C_{P3}s}.$$
 (A.2)

By replacing  $v_3$  from (A.2) into (A.1) and considering  $g_{ds3}, g_{ds5} \ll g_{m5}$ , we can write:

$$I_{eq} = K v_i = \left(\frac{g_{m3}(g_{m5} + g_{ds5})}{(g_{ds3} + g_{ds5} + g_{m5})}\right) \left(\frac{1}{1 - s/p_3}\right) v_i \cong g_{m3} \left(\frac{1}{1 - s/p_3}\right) v_i, \quad (A.3)$$

where

$$p_3 = -\frac{g_{ds3} + g_{ds5} + g_{m5}}{C_{P3}} \cong \frac{-g_{m5}}{C_{P3}}.$$
 (A.4)

It can also be shown that the impedance seen from node 3 in Fig. A.3 is equal to [70]:

$$r_{o5} = (r_{ds3}(1 + g_{m5}r_{ds5}) + r_{ds5}) \left(\frac{1 - s/z_4}{1 - s/p_4}\right) \cong g_{m5}r_{ds3}r_{ds5} \left(\frac{1 - s/z_4}{1 - s/p_4}\right),$$
(A.5)





*Figure A.2:* (a) The open loop half circuit of the linearized transconductor. (b) The small signal equivalent.



Figure A.3: Replacement of the cascode transistors by the Norton equivalent circuit.

where

$$p_4 = \frac{-1}{r_{ds3}C_{P3}},\tag{A.6}$$

and

$$z_4 = -\frac{r_{ds3}(1 + g_{m5}r_{ds5}) + r_{ds5}}{r_{ds5}r_{ds3}C_{P3}} \cong \frac{-g_{m5}}{C_{P3}}.$$
 (A.7)

As a second step in Fig. A.2 (b), considering  $r_{ds1}, r_{o5}, r_{I1} \gg 1/g_{m1}$ , the voltage  $v_1$  can be approximated as

$$v_1 = \frac{K v_i}{g_{m1} + C_{P1} s + 1/R_s}.$$
(A.8)

To find the relationship between  $v_o$  and  $v_1$ , the KCL at node 2 can be written as

$$v_o(g_{I2} + C_{P2}s + Y_c) + (v_o - v_1)g_{ds1} - g_{m1}v_1 = 0.\tag{A.9}$$

A rearrangement of (A.9) gives:

$$\frac{v_o}{v_1} = \frac{g_{m1} + g_{ds1}}{g_{I2} + C_{P2}s + Y_c + g_{ds1}} \cong \frac{g_{m1}}{C_{P2}s + Y_c + g_{ds1}}$$
(A.10)

where  $Y_c$  is the compensation network admittance added to node 2. A combination of (A.3), (A.8) and (A.10) results in

$$v_o = \frac{g_{m1}g_{m3}v_i}{(g_{m1} + C_{P1}s + 1/R_s)(C_{P2}s + Y_c + g_{ds1})(1 + (C_{P3}/g_{m5})s)}.\tag{A.11}$$

#### A.1.1 No Compensation

In cases where there is no compensation circuit, i.e.  $Y_c = 0$ , the open-loop transfer function becomes:

$$\frac{v_o}{v_i} = g_{m3} r_{ds1} \left( \frac{g_{m1}}{g_{m1} + 1/R_s} \right) \left( \frac{1}{1 - s/p_1} \right) \cdot \left( \frac{1}{1 - s/p_2} \right) \left( \frac{1}{1 - s/p_3} \right),$$
(A.12)

where

$$p_1 = -\frac{g_{ds1}}{C_{P2}} = \frac{-1}{r_{ds1}C_{P2}},\tag{A.13}$$

$$p_2 = -\frac{g_{m1} + 1/R_s}{C_{P1}},\tag{A.14}$$

$$p_3 = -\frac{g_{m5}}{C_{P3}}.\tag{A.15}$$

Note that  $p_1$  is the dominant pole of the loop transfer function because  $g_{ds1} \ll g_{m1}, g_{m5}$ . To find the unity gain frequency  $\omega_u$ , by assuming  $|p_1| \ll \omega_u \ll |p_2|, |p_3|$  we can write:

$$\left|\frac{v_o}{v_i}\right| = g_{m3} \left(\frac{g_{m1}}{g_{m1} + 1/R_s}\right) \frac{1}{C_{P2}\omega_u} = 1, \tag{A.16}$$

and thus:

$$\omega_u = g_{m3} \left( \frac{g_{m1}}{g_{m1} + 1/R_s} \right) \frac{1}{C_{P2}}.\tag{A.17}$$

Therefore, in the design, it is important to ensure that the other poles are far enough from  $\omega_u$  in (A.17) in order to have a proper phase margin. Since  $C_{P3}$  is much smaller than  $C_{P1}$  in Fig. A.1,  $p_2$  could represent the second pole. According to (A.14), for larger  $R_s$ ,  $p_2$  falls at a lower frequency. This could affect the phase margin when a lower transconductance gain is chosen (by increasing  $R_s$ ).
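The dominant-pole estimate of $\omega_u$ can be checked numerically against the full loop gain of (A.11) with $Y_c = 0$. The Python sketch below does this and also computes the phase margin; the small-signal values are hypothetical numbers picked only so that the assumed pole ordering holds, and they are not the design values of the fabricated chip.

```python
import cmath
import math

def loop_gain(omega, gm1, gm3, gm5, rds1, Rs, CP1, CP2, CP3):
    """Uncompensated loop gain v_o/v_i at s = j*omega, from (A.11) with Y_c = 0."""
    s = 1j * omega
    node1 = gm1 + CP1 * s + 1.0 / Rs
    node2 = CP2 * s + 1.0 / rds1
    cascode = 1.0 + (CP3 / gm5) * s
    return gm1 * gm3 / (node1 * node2 * cascode)

def unity_gain_frequency(params, lo=1e6, hi=1e13, iterations=200):
    """Bisection on a log scale for |loop gain| = 1 (magnitude falls monotonically)."""
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if abs(loop_gain(mid, **params)) > 1.0:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

# Hypothetical small-signal values, chosen only to exercise the formulas.
params = dict(gm1=5e-3, gm3=2e-3, gm5=5e-3, rds1=20e3, Rs=1e3,
              CP1=50e-15, CP2=200e-15, CP3=20e-15)

omega_u_exact = unity_gain_frequency(params)
# Dominant-pole estimate of the unity-gain frequency, taken as a positive magnitude.
omega_u_approx = (params["gm3"] * params["gm1"]
                  / (params["gm1"] + 1.0 / params["Rs"]) / params["CP2"])
phase_margin_deg = 180.0 + math.degrees(
    cmath.phase(loop_gain(omega_u_exact, **params)))
```

With these assumed values the estimate agrees with the numerical solution to within about one percent, because the non-dominant poles sit more than a decade above $\omega_u$. Increasing `Rs` in `params` is a quick way to see how the pole at node 1 moves down and eats into the phase margin.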

#### A.1.2 Lead Compensation

In this design, when the gain is set to its lowest value (i.e.,  $R_s$  is set to its largest value), a lead compensation circuit, as shown in Fig. A.1, is activated to improve the phase margin of the loop transfer function. Considering the equivalent value of  $Y_c$  from the compensation network, the second term in the denominator of (A.11) can be written as

$$\frac{1}{Y_2} = \frac{1}{C_{P2}s + Y_c + g_{ds1}} = \frac{1}{C_{P2}s + \frac{1}{R_c + 1/(C_cs)} + \frac{1}{r_{ds1}}}$$

$$= \frac{r_{ds1}(1 + R_c C_c s)}{(R_c C_c r_{ds1} C_{P2})s^2 + (R_c C_c + r_{ds1} C_{P2} + r_{ds1} C_c)s + 1}$$
(A.18)

If we write (A.18) in the form of

$$\frac{1}{Y_2} = \frac{r_{ds1}(1-s/z_1)}{(1-s/p_{1a})(1-s/p_{1b})}\tag{A.19}$$

and assume that its poles are far from each other (i.e.  $|p_{1b}| \gg |p_{1a}|$ ), then the poles and the zero can be estimated as

$$p_{1a} = -\frac{1}{R_c C_c + r_{ds1} C_{P2} + r_{ds1} C_c},\tag{A.20}$$

$$p_{1b} = -\left(\frac{1}{r_{ds1}C_{P2}} + \frac{1}{R_cC_{P2}} + \frac{1}{R_cC_c}\right),\tag{A.21}$$

and

$$z_1 = -\frac{1}{R_c C_c}.\tag{A.22}$$
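The separated-pole approximation can be checked against the exact roots of the quadratic denominator in (A.18). The Python sketch below uses the standard estimates $p_{1a} \approx -1/b$ and $p_{1b} \approx -b/a$ for a denominator $a s^2 + b s + 1$; the component values and the function name are hypothetical and serve only as an illustration.

```python
import math

def compensated_node2(Rc, Cc, rds1, CP2):
    """Poles and zero of 1/Y_2 in (A.18): exact roots and separated-pole estimates."""
    a = Rc * Cc * rds1 * CP2                  # s^2 coefficient of the denominator
    b = Rc * Cc + rds1 * CP2 + rds1 * Cc      # s coefficient of the denominator
    root = math.sqrt(b * b - 4.0 * a)         # real when the poles are well separated
    exact = ((-b + root) / (2.0 * a), (-b - root) / (2.0 * a))
    estimate = (-1.0 / b, -b / a)             # dominant and non-dominant estimates
    zero = -1.0 / (Rc * Cc)
    return exact, estimate, zero

# Hypothetical component values, for illustration only.
exact, estimate, zero = compensated_node2(Rc=1e3, Cc=100e-15,
                                          rds1=20e3, CP2=200e-15)
```

For these assumed values both estimates land within roughly one percent of the exact roots, and the zero falls between the two poles, which is where it can offset part of the phase lag of the non-dominant poles.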

Therefore,  $\frac{v_o}{v_i}$  in (A.12) can be rewritten as

$$\frac{v_o}{v_i} = g_{m3} r_{ds1} \left( \frac{g_{m1}}{g_{m1} + 1/R_s} \right) \frac{1 - s/z_1}{(1 - s/p_{1a})(1 - s/p_{1b})(1 - s/p_2)(1 - s/p_3)}.\tag{A.23}$$

By comparing  $p_{1a}$  and  $p_1$  in (A.20) and (A.13), we can see that the dominant pole is shifted to slightly lower frequencies. Meanwhile, although a new non-dominant pole  $p_{1b}$  is introduced at much higher frequencies, the zero  $z_1$  compensates for the phase margin reduction due to the non-dominant poles. From (A.23) and by assuming  $|p_{1a}| \ll \omega_u \ll |z_1|, |p_2|, |p_3|, |p_{1b}|$ , the open-loop unity gain frequency can be estimated by

$$\left|\frac{v_o}{v_i}\right| = g_{m3} r_{ds1} \left(\frac{g_{m1}}{g_{m1} + 1/R_s}\right) \frac{1}{(\omega_u / |p_{1a}|)} = 1.\tag{A.24}$$

Then,  $\omega_u$  is found as

$$\omega_{u} = g_{m3}r_{ds1} \left(\frac{g_{m1}}{g_{m1} + 1/R_{s}}\right) \frac{1}{R_{c}C_{c} + r_{ds1}C_{P2} + r_{ds1}C_{c}} 
\approx \frac{g_{m3}r_{ds1}}{R_{c}C_{c} + r_{ds1}C_{P2} + r_{ds1}C_{c}}$$
(A.25)

As expected,  $\omega_u$  has moved to lower frequencies compared to  $\omega_u$  in (A.17) before compensation. This further improves the phase margin at the cost of lower bandwidth. However, note that compensation is needed when we choose lower transconductance gains or larger  $R_s$ . In this case, the bandwidth is already increased because of the lower gain and larger  $R_s$  (see (A.17)). Thus, reducing the bandwidth by compensation should be tolerable as it might be still higher than when the gain is maximum and  $R_s$  is minimum.

Finally, a test circuit architecture similar to Fig. A.1 has been simulated to evaluate the effect of the compensation network. Fig. A.4 shows the magnitude and phase response of the transconductor loop gain, with and without the compensation circuit. As we can see, while  $\omega_u$  is reduced after compensation, the added zero cancels some of the phase lag caused by the non-dominant poles.



*Figure A.4:* Frequency response of the loop gain of the transconductor amplifiers with and without lead compensation. (a) Magnitude response. (b) Phase response (phase margin improvement: from 74 to 101 degrees).

## A.2 Transconductance Gain Expression and the Effect of the Feedback Loop

Fig. A.5 shows the half circuit of the transconductor amplifier of Fig. A.1 and its small-signal equivalent. Here, an expression for the transconductance gain  $i_o/v_i$ , considering the effect of the feedback loop around the input transistor, is derived. In the small-signal equivalent shown in Fig. A.5(b), the cascode transistors M3 and M5 are replaced by the simplified equivalent model shown in Fig. A.3.



*Figure A.5:* (a) The half circuit of the linearized transconductor. (b) The small signal equivalent.

The KCL at node 1 gives:

$$v_1\left(\frac{1}{r_{I1}} + \frac{1}{r_{o5}} + \frac{1}{R_s}\right) + g_{m1}\left(v_1 - \frac{v_i}{2}\right) + g_{ds1}(v_1 - v_2) + g_{m3}v_2 = 0, \qquad (A.26)$$

or

$$v_1 \left( \frac{1}{r_{I1}} + \frac{1}{r_{o5}} + \frac{1}{R_s} + g_{m1} + g_{ds1} \right) + v_2 (g_{m3} - g_{ds1}) = g_{m1} \frac{v_i}{2}.$$
 (A.27)

Using the approximation  $g_{ds1} \ll g_{m3}$  and  $\frac{1}{r_{I1}}, \frac{1}{r_{o5}}, g_{ds1} \ll g_{m1}$  we can simplify (A.27) as

$$v_1\left(\frac{1}{R_s} + g_{m1}\right) + v_2(g_{m3}) = g_{m1}\left(\frac{v_i}{2}\right).$$
 (A.28)

Also, the KCL at node 2 gives:

$$g_{ds1}(v_2 - v_1) - g_{m1}\left(v_1 - \frac{v_i}{2}\right) = 0\tag{A.29}$$

or

$$g_{ds1}v_2 = v_1(g_{ds1} + g_{m1}) - g_{m1}(v_i/2).$$
(A.30)

Using the approximation  $g_{ds1} \ll g_{m1}$ , (A.30) can be simplified as

$$g_{ds1}v_2 = v_1g_{m1} - g_{m1}(v_i/2).$$
(A.31)

Now, by combining (A.28) and (A.31) and cancelling  $v_2$ , we can write:

$$v_1\left(\frac{1}{R_s} + g_{m1}\right) + g_{m3}\left(\frac{v_1g_{m1} - g_{m1}(v_i/2)}{g_{ds1}}\right) = g_{m1}(v_i/2), \qquad (A.32)$$

and it can be simplified as

$$\frac{v_1}{v_i/2} = \frac{g_{m1} + \frac{g_{m1}g_{m3}}{g_{ds1}}}{\frac{1}{R_s} + g_{m1} + \frac{g_{m1}g_{m3}}{g_{ds1}}},$$
(A.33)

or

$$\frac{v_1}{v_i/2} = \frac{1}{1 + \frac{1}{R_s g_{m1}(A+1)}},$$
(A.34)

where  $A = \frac{g_{m3}}{g_{ds1}}$ . To find  $\frac{i_o}{v_i}$  we can write

$$\frac{i_o}{v_i} = \frac{v_1}{v_i/2} \times \frac{1}{2} \times \frac{i_o}{v_1},$$
(A.35)

in which  $\frac{i_o}{v_1} = \frac{1}{R_s}$  (see Fig. A.5). Therefore, the final transconductance value can be written as

$$\frac{i_o}{v_i} = \frac{1}{2R_s + \frac{2}{g_{m1}(1+A)}}.$$
(A.36)

As seen in (A.36), compared to a conventional transconductor, the effect of  $g_{m1}$  is reduced by a factor of  $1 + A$ , where  $A = \frac{g_{m3}}{g_{ds1}}$ . Note that  $2R_s$  is the total degeneration resistance between the source nodes of the input transistors and it determines the total gain.
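To get a feel for (A.36), the Python sketch below evaluates the transconductance for a few degeneration resistances, with and without the feedback loop (the latter obtained by setting $A = 0$, which is an interpretation of "conventional transconductor" made here for illustration). The device values are hypothetical.

```python
def transconductance(Rs, gm1, gm3, gds1):
    """Closed-loop transconductance i_o/v_i from (A.36)."""
    A = gm3 / gds1
    return 1.0 / (2.0 * Rs + 2.0 / (gm1 * (1.0 + A)))

def transconductance_no_feedback(Rs, gm1):
    """Same expression with A = 0, i.e. a plain source-degenerated pair."""
    return 1.0 / (2.0 * Rs + 2.0 / gm1)

# Hypothetical device values, for illustration only.
gm1, gm3, gds1 = 5e-3, 2e-3, 5e-5
for Rs in (250.0, 1e3, 4e3):
    ideal = 1.0 / (2.0 * Rs)                       # gain set by 2*R_s alone
    with_loop = transconductance(Rs, gm1, gm3, gds1)
    without_loop = transconductance_no_feedback(Rs, gm1)
    print(Rs, with_loop / ideal, without_loop / ideal)
```

The printed ratios show how close each gain is to the ideal $1/(2R_s)$; with the loop the gain depends far less on the signal-dependent $g_{m1}$, which is the linearity benefit the appendix describes.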



# Global Gradient Search for PAE Optimization

In this appendix, the algorithm for searching for the optimum filter coefficients by the *Gradient Search* method in a communication system with a partial analog equalizer (PAE) is presented. This communication system is shown in Fig. B.1.

For the coefficient of filter G(z) in Fig. B.1, using steepest descent gradient search, we can write [26]:

$$g_{k+1} = g_k - \mu \frac{\partial e^2}{\partial g_k}, \qquad (B.1)$$

where  $g_{k+1}$  is the updated value of  $g_k$  for the next time instant and  $\mu$  is the gain constant that regulates the speed and the stability of the adaptation. The parameter *e* is the error between the delayed input signal and the final recovered output signal as

$$e_{(n)} = x_{d(n)} - \hat{x}_{(n)}.\tag{B.2}$$

Therefore we can write:

$$\frac{\partial e^2}{\partial g_k} = 2e \frac{\partial e}{\partial g_k} = -2e \frac{\partial \hat{x}}{\partial g_k}.$$
 (B.3)



Figure B.1: A communication system with a PAE and a DLE in the receiver.

Thus, (B.1) can be rewritten as

$$g_{k+1} = g_k + 2\mu e \frac{\partial \hat{x}_n}{\partial g_k}.\tag{B.4}$$
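A heavily simplified instance of the update rule (B.4) is sketched below in Python. It adapts a single FIR filter only, omits the quantization term and the second filter, uses the instantaneous error (so it is effectively an LMS loop), and takes the undelayed transmitted symbol as the reference. The channel taps, step size and function name are assumptions made for illustration and do not reproduce the thesis simulation.

```python
import random

def adapt_equalizer(num_taps=3, mu=0.01, num_samples=5000, seed=0):
    """Stochastic steepest descent on e^2 for a single FIR filter g."""
    rng = random.Random(seed)
    channel = [1.0, 0.5]                 # hypothetical channel taps c
    g = [0.0] * num_taps
    x_hist = [0.0] * len(channel)        # recent transmitted symbols, newest first
    u_hist = [0.0] * num_taps            # recent filter inputs, newest first
    squared_errors = []
    for _ in range(num_samples):
        x = rng.choice((-1.0, 1.0))
        x_hist = [x] + x_hist[:-1]
        u = sum(c * xi for c, xi in zip(channel, x_hist))
        u_hist = [u] + u_hist[:-1]
        x_hat = sum(gk * uk for gk, uk in zip(g, u_hist))
        e = x - x_hat
        # d(x_hat)/d(g_k) is the filter input delayed by k samples.
        g = [gk + 2.0 * mu * e * uk for gk, uk in zip(g, u_hist)]
        squared_errors.append(e * e)
    return g, squared_errors

g_final, sq_err = adapt_equalizer()
mse_start = sum(sq_err[:100]) / 100      # average over the first 100 samples
mse_end = sum(sq_err[-500:]) / 500       # average over the last 500 samples
```

The gain constant `mu` plays the role of $\mu$ in (B.1): larger values converge faster but leave more residual error, and values that are too large make the loop unstable.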

In Fig. B.1, we can write the filtering operations as

$$\hat{x}_n = [(x * c + n) * g * h]_{(n)} + (q * h)_{(n)}$$
(B.5)

where \* is the convolution operator.  $q_{(n)}$  is the added quantization noise caused by the ADC and can be written as

$$q_{(n)} = \frac{y_{peak}}{2^{R}} q_{norm_{(n)}},$$
 (B.6)

where  $q_{norm_{(n)}}$  is a uniformly distributed white noise normalized within the interval [-1, 1]. From (3.9)  $y_{peak}$  is approximated as

$$y_{peak} = x_{peak} \sum_{i} |(c * g)_i|$$
(B.7)

where

$$\left| (c * g)_i \right| = \left| \sum_k c_{i-k} \cdot g_k \right|.$$
 (B.8)

Defining sgn[] as the sign operator and using the identity

$$\frac{\partial |f(u)|}{\partial u} = \operatorname{sgn}[f(u)] \cdot \frac{\partial f(u)}{\partial u}, \qquad f(u) \neq 0 \tag{B.9}$$

together with (B.8), we can write:

$$\frac{\partial |(c * g)_i|}{\partial g_k} = c_{i-k} \cdot \operatorname{sgn}((c * g)_i).\tag{B.10}$$

Using the above expression, the gradient of (B.7) with respect to  $g_k$  can be written as

$$\frac{\partial y_{peak}}{\partial g_k} = x_{peak} \sum_i c_{i-k} \cdot \operatorname{sgn}((c * g)_i)$$
(B.11)

The combination of the above expression and (B.6) can be used to obtain the gradient of the second term in (B.5) versus the PAE coefficients  $g_k$ .
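The analytic gradient in (B.11) is easy to sanity-check against a finite difference of (B.7). The Python sketch below does this for a short, hypothetical channel and PAE; the tap values and helper names are assumptions, and the check is only valid away from points where some $(c * g)_i$ is zero.

```python
def conv_at(c, g, i):
    """i-th sample of the convolution c * g."""
    return sum(c[i - k] * g[k] for k in range(len(g)) if 0 <= i - k < len(c))

def y_peak(c, g, x_peak=1.0):
    """Peak ADC input amplitude, as in (B.7)."""
    n = len(c) + len(g) - 1
    return x_peak * sum(abs(conv_at(c, g, i)) for i in range(n))

def sign(value):
    return (value > 0) - (value < 0)

def y_peak_gradient(c, g, x_peak=1.0):
    """Analytic gradient of y_peak with respect to each g_k, as in (B.11)."""
    n = len(c) + len(g) - 1
    grad = []
    for k in range(len(g)):
        total = 0.0
        for i in range(n):
            if 0 <= i - k < len(c):
                total += c[i - k] * sign(conv_at(c, g, i))
        grad.append(x_peak * total)
    return grad

c = [1.0, 0.8, 0.64, 0.512]   # hypothetical channel taps
g = [1.0, -0.5]               # hypothetical PAE taps
analytic = y_peak_gradient(c, g)

delta = 1e-6
numeric = []
for k in range(len(g)):
    bumped = list(g)
    bumped[k] += delta
    numeric.append((y_peak(c, bumped) - y_peak(c, g)) / delta)
```

Because $y_{peak}$ is piecewise linear in the coefficients, the two gradients agree to within rounding error as long as the perturbation does not flip the sign of any convolution sample.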

For the gradient of the first term in (B.5), we must consider the effect of cascading the DLE with the PAE. The gradient of cascaded filters was reviewed in Chapter 2. The gradient of the first term in (B.5) with respect to  $g_k$  can be written as

$$\frac{\partial}{\partial g_k} \left[ (x * c + n) * g * h \right]_{(n)} = \frac{\partial}{\partial g_k} \sum_m \left[ (x * c + n) * h \right]_{(n-m)} \cdot g_m = \left[ (x * c + n) * h \right]_{(n-k)}. \tag{B.12}$$

Using (B.6) and (B.12), the gradient of (B.5) with respect to  $g_k$  can be written as:

$$\frac{\partial}{\partial g_k} \hat{x}_n = \left[ (x * c + n) * h \right]_{(n-k)} + \frac{\partial y_{peak}}{\partial g_k} \left( \frac{q_{norm}}{2^R} * h \right)_{(n)}$$
(B.13)

where  $\frac{\partial y_{peak}}{\partial g_k}$  can be obtained from (B.11).

The gradient of the output with respect to  $h_k$  is the delayed version of the input of H(z):

$$\frac{\partial}{\partial h_k} \hat{x}_n = \left[ \left[ (x * c + n) * g \right] + q \right]_{(n-k)} \tag{B.14}$$



*Figure B.2:* The Global Gradient Search algorithm implementation for PAE and DLE coefficients optimization

Fig. B.2 demonstrates the simulation block diagram for implementing the Global Gradient Search algorithm using (B.11) and (B.14). Although this algorithm is relatively complex to implement in hardware, it is useful for system-level design and verification of the filter coefficients obtained from other suboptimal but practically-efficient methods.



# Reduction of ADC Requirements in CAP/QAM Receivers

#### C.1 Introduction

Carrierless amplitude and phase modulation (CAP) is a modulation scheme [23] in digital communication systems and can be considered a form of bandpass pulse amplitude modulation (PAM) [7], in which the carrier frequency is near baseband. On the other hand, it can also be viewed as a variation of quadrature amplitude modulation (QAM) with combined in-phase (I) and quadrature (Q) components but without explicit modulation/demodulation blocks. In contrast to uncoded PAM, CAP has zero dc power. This spectral characteristic makes it well suited to ac-coupled channels. It also allows the possibility of leaving a frequency band, near dc, which can be used for other signals (such as standard telephone signals).

Most CAP systems are designed and implemented in the digital domain by applying a relatively high-rate high-resolution analog-to-digital converter (ADC), thereby taking advantage of digital signal processing. Throughout this thesis, the advantage of using partial equalization in a PAM system was elaborated. As we shall see, using a partial analog equalizer/demodulator in CAP systems allows the use of less costly ADCs and less complex digital filtering for a desired error performance. As mentioned previously, partial analog equalization, compared to full analog equalization, has the advantage of a considerably lower order and feasible analog adaptive filtering. At the same time, the flexibility and reliability of a digital receiver are maintained at a lower cost compared to conventional fully digital receivers. The main approach in the presented topology involves splitting the I and Q equalizers into two low-order fractionally-spaced analog and baud-rate digital cascaded equalizers. In addition, a low-order complex decision feedback equalizer (DFE) contributes to a further reduction in the remaining inter-symbol interference (ISI) and I-Q interference errors. To show the efficiency of this technique, simulation results for a target application of 1.2-Gb/s data transmission over a 200-m coaxial cable using 16-CAP for serial digital video applications [71] are presented. It is shown that two 6-bit 300-MHz ADCs, with two low-order analog FIR equalizers and some low-order digital processing, give the same error performance as a system with an 8-bit 900-MHz ADC with two large-order I-Q filters.

#### C.2 CAP/QAM Modulation

Fig. C.1 (a) shows a simple QAM system where g(t) and f(t) are lowpass shaping filters such as root raised cosine pulses [7][72]. In near-baseband wired applications,  $f_c$  represents a carrier frequency which is equal to or slightly greater than the bandwidth of g(t), i.e.  $f_c \ge f_s/2$ , where  $f_s$  is the symbol rate. Note that the carrier signals in both the transmitter and receiver must be in phase.



*Figure C.1:* (a) QAM modulation scheme: transmitter and receiver. (b) CAP Modulation scheme: transmitter and receiver.

Fig. C.1 (b) shows an alternative passband modulation scheme, called carrierless AM/PM or CAP. This scheme consists of only I and Q filters. These filters, in contrast to the lowpass filters in QAM systems, are passband filters centered at  $f_c$  and they form Hilbert pairs, where  $f_c$  is larger than the largest frequency of the baseband envelope pulses g(t) and f(t) (see Fig. C.1) [73][74]. It can be shown that by adding complex rotators of the form  $exp(j\omega_c nT)$  and  $exp(-j\omega_c nT)$  to the input and output of a CAP transceiver, a QAM transceiver is obtained [73]. Therefore, it is possible to use a CAP receiver for a QAM transmitter. It should be noted that a CAP system is practical when the carrier frequency is near baseband, so that it can be realized by I-Q filters of reasonable order.

#### C.3 Conventional CAP Receiver

Fig. C.2 shows a conventional CAP receiver in which digital filters  $L_I(z)$  and  $L_Q(z)$  perform both the passband equalization and I-Q signal separation [73][74][23]. This task requires a front-end ADC with a sampling rate considerably larger than the symbol rate in order to prevent aliasing. The ADC rate depends on the symbol rate, excess bandwidth, carrier frequency and the order of the filters. Using raised cosine shaping filters with 20% excess bandwidth and a near-baseband carrier frequency, around 0.6f<sub>s</sub>, the practical value for the ADC sampling rate is about 3 times the symbol rate or 3f<sub>s</sub>. According to system-level simulation results for our target application, i.e. 1.2 Gb/s 16-CAP over 200-m coaxial cable with 300-MHz baud rate, for a symbol error rate of about 10<sup>-8</sup>, the minimum hardware requirements for a conventional receiver are an 8-bit 900-MHz front-end ADC along with a pair of 15-tap adaptive FIR digital filters. As a result, the conventional topology is more appropriate for medium speed applications.

*Figure C.2:* Conventional hardware-efficient CAP receiver with passband equalization.
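The symbol and sampling rates quoted for the conventional receiver follow from simple arithmetic, assuming four bits per symbol for 16-CAP and the factor-of-three oversampling stated in the text:

$$f_s = \frac{1.2\ \text{Gb/s}}{\log_2 16} = 300\ \text{MBd}, \qquad f_{ADC} \approx 3 f_s = 900\ \text{MHz}.$$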

#### **C.3.1** Front-end ADC Requirements

The equivalent I-Q path of a complex CAP system is shown in Fig. C.3. To maintain an acceptable system bit error rate, the ADC has to meet certain resolution and speed requirements. To determine the resolution requirement, similar to (3.11) for PAM systems in Chapter 3, we consider the variance of the total remaining error before the slicer in Fig. C.3, as

$$\sigma_{err(I,Q)}^2 = \left[ (\sigma_{qz}^2 + \sigma_n^2) K_{FFE}^2 \right] + \sigma_{ISI-IQ}^2$$
(C.1)

where  $\sigma_{qz}^2$  is the quantization noise variance introduced by the ADC,  $\sigma_n^2$  is the variance of the additive channel noise,  $K_{FFE}^2$  is the error enhancement factor caused by the feedforward equalizer (FFE), and  $\sigma_{ISI-IQ}^2$  is the remaining inter-symbol (ISI) and I-Q interference error variance.



Figure C.3: The equivalent I-Q path of a complex 16-CAP system.

Similar to (3.15), it can be shown that the required number of bits R for the front-end ADC can be obtained from

$$R > (L_{(dB)}^2 - K_{FFE(dB)}^2 - \varepsilon_{q(dB)}^2) / 6.02, \qquad (C.2)$$

where  $\varepsilon_q^2$  is proportional to the crest factor of the ADC input as  $\varepsilon_q^2 = (1/3)(x_{max}/\sigma_x)^2$ . *L* is related to the desired symbol error rate (SER) as well as the participation ratio of the enhanced quantization noise  ${\sigma'}_{qz}^{2}$  to the total error  $\sigma_{err}^2$  as mentioned in (3.14). For example, for  $SER = 10^{-9}$  and assuming a 33% contribution of quantization noise to the total error,  $L_{(dB)}^2 = -32$  dB. According to (C.2), the main approach to reducing the bit requirement *R* is to reduce the factors  $\varepsilon_q^2$  (the crest factor of the ADC input signal) and  $K_{FFE}^2$  (the amount of noise enhancement by the digital equalizer).

Since  $\varepsilon_q^2$  depends on the equivalent channel response before the ADC and  $K_{FFE}^2$  depends on the existing ISI and I-Q interferences after the ADC, by transferring part of equalization and I-Q separation from digital to the analog side, the values of  $\varepsilon_q^2$  and  $K_{FFE}^2$ can be reduced, simultaneously. As a result, the required ADC resolution *R* can be reduced as well.
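As a small numerical aid, the Python sketch below evaluates the crest-factor term $\varepsilon_q^2 = (1/3)(x_{max}/\sigma_x)^2$ for two reference waveforms and converts a dB change in the resolution budget into bits using 6.02 dB per bit. The helper names and test waveforms are assumptions made for illustration; the thesis evaluates this term on the actual ADC input signal.

```python
import math

def crest_term(samples):
    """epsilon_q^2 = (1/3) * (x_max / sigma_x)^2 for a zero-mean sample set."""
    x_max = max(abs(s) for s in samples)
    sigma = math.sqrt(sum(s * s for s in samples) / len(samples))
    return (x_max / sigma) ** 2 / 3.0

def to_db(power_ratio):
    """Express a power ratio in decibels."""
    return 10.0 * math.log10(power_ratio)

def db_to_bits(db):
    """Convert a dB change in the resolution budget into ADC bits."""
    return db / 6.02

n = 10000
uniform = [-1.0 + 2.0 * (i + 0.5) / n for i in range(n)]        # evenly filled range
sine = [math.sin(2.0 * math.pi * (i + 0.5) / n) for i in range(n)]

eps_uniform = crest_term(uniform)   # close to 1, i.e. about 0 dB
eps_sine = crest_term(sine)         # close to 2/3, i.e. about -1.76 dB
```

A signal that fills the ADC range evenly gives a term near 0 dB, while peaky signals with a large ratio of $x_{max}$ to $\sigma_x$ give a larger term; each 6.02 dB of change corresponds to one bit of resolution.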

#### C.4 The Proposed Architecture

Fig. C.4 depicts the proposed mixed analog CAP receiver architecture. In this architecture, parts of the digital I and Q passband equalizers are transferred to the analog side. In this topology, an approximation of the fractionally-spaced I and Q filters is split into two fractionally-spaced and baud-rate filters with lower orders, as demonstrated in Fig. C.5. For example, a 4-tap analog FIR filter [6][17] running at  $3f_s$  followed by a down-sampler and a 5-tap FIR filter running at  $f_s$  is equivalent to a 16-tap ( $3 \times (5-1) + 4 = 16$ ) FIR filter running at  $3f_s$ . In this way, after adaptation of the cascaded filters, several advantages are realized. *First*, the secondary digital I and Q filters will be low order while they run at 1/3 of the speed. *Secondly*, the ADCs before the digital filters will run at the symbol rate  $f_s$ , rather than at  $3f_s$  as in the conventional topology. *Thirdly*, partial equalization and I-Q separation by analog filters simultaneously reduce the crest factor of the ADC input signal and the quantization noise enhancement after the ADC. This reduces the bit requirement of the ADC by about 1-2 bits.

Figure C.4: The proposed mixed CAP receiver architecture.

*Figure C.5:* (a) Conventional CAP receiver I-Q path. (b) Splitting of approximated L(z) into two cascaded filters. (c) The proposed receiver I-Q path with two low-order analog and digital filters and a low-rate low-resolution ADC. (d) Simulation results for I-Q path filter impulse responses and their 17-tap equivalent filters.

Although in this architecture there is a need for two ADCs, the speed of each is about one third and their resolutions are 1-2 bits lower. Thus, regarding the power, area and design feasibility of just the ADC part of the hardware [2][30], this architecture would be a preferred choice at high speeds. It should be noted that since the output of the analog FIR filters are downsampled by 3, a polyphase implementation of them can be utilized [75]. In this way, each tap multiplication is performed with one third of the speed.
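The equivalence between the split filters and a single long filter at $3f_s$ can be checked with a few lines of Python. The sketch below builds $G(z) \cdot H(z^3)$ from a 4-tap and a 5-tap filter and confirms that filtering then downsampling gives the same output as the split arrangement. The tap values and the test input are arbitrary integers chosen for exact arithmetic; they are hypothetical and not the adapted coefficients of the thesis.

```python
def convolve(a, b):
    """Full linear convolution of two finite sequences."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def upsample(h, factor):
    """Insert factor-1 zeros between taps, i.e. H(z) -> H(z^factor)."""
    out = [0] * (factor * (len(h) - 1) + 1)
    for k, hk in enumerate(h):
        out[factor * k] = hk
    return out

g = [1, -2, 3, -1]        # hypothetical 4-tap filter running at 3*f_s
h = [2, 0, -1, 4, 1]      # hypothetical 5-tap filter running at f_s
equivalent = convolve(g, upsample(h, 3))    # G(z) * H(z^3)

x = [(7 * i) % 5 - 2 for i in range(30)]    # arbitrary integer test input

# Split ordering: filter with g, downsample by 3, then filter with h.
split_path = convolve(convolve(x, g)[::3], h)
# Single-filter ordering: long equivalent filter at 3*f_s, then downsample.
single_path = convolve(x, equivalent)[::3]
```

The equivalent filter has $3 \times (5-1) + 4 = 16$ taps and the two output sequences are identical, which is why the second filter and the ADC can run at $f_s$ instead of $3f_s$.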

To further relax the complexity of the analog and digital I/Q feedforward filters, cross-coupled low-order decision feedback equalizers (DFE) are employed as well. They partially assist the post-cursor equalization and I-Q separation without noise enhancement [7]. The DFEs are chosen to be low order (3-tap) to reduce the error propagation effect and improve the convergence speed of the adaptation algorithm, particularly during initialization.

#### **C.5** Simulation Results

The proposed CAP receiver, depicted in Fig. C.4, and the conventional one, shown in Fig. C.2, were simulated for performance comparisons. The general criterion used in this work is the relative root mean square error (RRMSE). This is defined as the maximum I-Q residual RMS error, before the slicers, relative to the distance between CAP constellation points. According to the simulation results,  $RRMSE \approx 0.08$  results in a symbol error rate of about  $10^{-8}$  for our target application.

To achieve RRMSE=0.08, the filters $L_{IQ}(z)$ in the conventional topology (Fig. C.2) are 15 taps and the filters $G_{IQ}(z)$ and $H_{IQ}(z)$ in the proposed one are 5 taps. All DFE filters are only 3 taps. The filters are adapted with the least mean square (LMS) algorithm [26], using the error signal between the input and output of the slicers. However, a training sequence at the beginning accelerates the initialization. The total number of digital filter taps in the proposed architecture is $22\ (=5 \times 2 + 4 \times 3)$, while in the conventional one, it is $30\ (=15 \times 2)$ for an equivalent performance. Note that the input signals of the DFE filters are the 2-bit digital outputs of the slicers. Therefore, they require much simpler multipliers compared to those in the FFEs.
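The adaptation procedure can be sketched in a few lines of Python with NumPy. This is a deliberately simplified, real-valued, single-axis illustration and not the simulated cross-coupled I/Q receiver: the two-tap channel, the step size `mu`, the training length `n_train`, and the 4-level alphabet are all assumptions chosen for the example. The error is taken between the slicer input and a reference, which is the known training symbol at first and the slicer output afterwards.

```python
import numpy as np

LEVELS = np.array([-3.0, -1.0, 1.0, 3.0])   # assumed 4-level alphabet per axis

def slicer(y):
    """Return the alphabet level closest to the soft value y."""
    return LEVELS[np.argmin(np.abs(LEVELS - y))]

def lms_equalize(rx, tx, n_taps=5, mu=0.005, n_train=500):
    """Adapt an n_taps FFE with LMS: training first, then decision-directed."""
    w = np.zeros(n_taps)
    buf = np.zeros(n_taps)                  # tapped delay line, newest first
    err = np.zeros(len(rx))
    for n, sample in enumerate(rx):
        buf = np.concatenate(([sample], buf[:-1]))
        y = w @ buf                         # slicer input
        ref = tx[n] if n < n_train else slicer(y)
        err[n] = ref - y                    # error across the slicer
        w += mu * err[n] * buf              # LMS coefficient update
    return w, err

# Hypothetical channel with mild post-cursor ISI and no noise.
rng = np.random.default_rng(0)
tx = rng.choice(LEVELS, size=5000)
rx = np.convolve(tx, [1.0, 0.3])[:len(tx)]
w, err = lms_equalize(rx, tx)
final_mse = np.mean(err[-500:] ** 2)
```

In this toy setting the taps settle close to the truncated inverse of the channel, and the training phase only serves to bring the decisions into the region where decision-directed updates are reliable.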

In Fig. C.5(c) the impulse responses of the 5-tap $G_{IQ}(z)$ and $H_{IQ}(z)$ and their equivalent 17-tap filters with the form of

$$L_{I,Q}(z) = G_{I,Q}(z) \cdot H_{I,Q}(z^{3})$$
(C.3)

are shown. Table C.1 shows the reduction of the ADC input crest factor and the reduction of quantization noise enhancement by the digital equalizers according to the obtained impulse responses and expressions (3.10) and (3.16). In total, a potential 12.5 dB reduction in the ADC resolution requirement is achieved, which is equivalent to about 2 bits. This resolution improvement is meaningful only when the remaining total error is not dominated by other errors such as remaining ISI or enhanced channel noise. With the chosen filter orders this is not the case and thus the effect of the ADC resolution is evident in the resulting error performance, as is shown in Fig. C.6.
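The form of (C.3) can be checked numerically. The sketch below, in Python with NumPy, builds the equivalent filter by inserting two zeros between the taps of $H$ (which gives $H(z^3)$) and convolving with $G$. The tap values are random placeholders, not the adapted coefficients of Fig. C.5, and the function name is an assumption. With two 5-tap filters, the result has 17 taps, and filtering with it at the high rate followed by downsampling by 3 gives the same output as the cascade $G$, downsample by 3, $H$.

```python
import numpy as np

def equivalent_filter(g, h, m=3):
    """Return the taps of G(z) * H(z^m) as a single high-rate FIR filter."""
    h_up = np.zeros(m * (len(h) - 1) + 1)
    h_up[::m] = h                            # H(z^m): m-1 zeros between taps
    return np.convolve(g, h_up)

# Placeholder 5-tap filters (hypothetical values, for illustration only).
rng = np.random.default_rng(2)
g = rng.standard_normal(5)
h = rng.standard_normal(5)
l_eq = equivalent_filter(g, h)

# Cascade: G at the high rate, downsample by 3, then H at the low rate.
x = rng.standard_normal(60)
y_cascade = np.convolve(np.convolve(x, g)[::3], h)
# Equivalent: single 17-tap filter at the high rate, then downsample by 3.
y_single = np.convolve(x, l_eq)[::3]
```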

|                                                               | Conventional<br>Architecture | Proposed<br>Architecture | Improvement     |
|---------------------------------------------------------------|------------------------------|--------------------------|-----------------|
| Crest Factor change before ADC or $\Delta \varepsilon_q^2$    | 12.8 dB                      | 6.4 dB                   | 6.4 dB          |
| Quantization noise<br>enhancement $\Delta K_{FFE}^2$          | 5.5 dB                       | -0.6 dB                  | 6.1 dB          |
| Required ADC bits: R<br>(for SER=10<sup>-9</sup>, see (C.2)) | 8.1 bits                     | 6 bits                   | 2.1 bits        |
| RRMSE, 5-bit ADC                                              | 0.18                         | 0.09                     | 6 dB            |
| RRMSE, 6-bit ADC                                              | 0.11                         | 0.07                     | 3.9 dB          |
| RRMSE, 7-bit ADC                                              | 0.08                         | 0.07                     | 1.2 dB          |
| RRMSE, 8-bit ADC                                              | 0.07                         | 0.07                     | 0 dB            |

Table C.1: Comparison of simulation results for the conventional and the proposed 16-CAP topologies
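The improvement column of Table C.1 follows from simple arithmetic, which the short Python sketch below reproduces. It assumes the usual rule of about 6.02 dB of quantization SNR per ADC bit and expresses RRMSE ratios as $20\log_{10}$ of the ratio; the function names are chosen for illustration.

```python
import math

DB_PER_BIT = 20 * math.log10(2)              # about 6.02 dB per ADC bit

def db_ratio(a, b):
    """Ratio of two amplitude-like quantities (such as RRMSE) in dB."""
    return 20 * math.log10(a / b)

def db_to_bits(db):
    """Equivalent change in ADC resolution for a given change in dB."""
    return db / DB_PER_BIT

crest_gain_db = 12.8 - 6.4                   # crest factor row
noise_gain_db = 5.5 - (-0.6)                 # quantization noise enhancement row
total_db = crest_gain_db + noise_gain_db     # about 12.5 dB in total
bits_saved = db_to_bits(total_db)            # about 2.1 bits

rrmse_gain_db = {
    5: db_ratio(0.18, 0.09),                 # 5-bit ADC row
    6: db_ratio(0.11, 0.07),                 # 6-bit ADC row
    7: db_ratio(0.08, 0.07),                 # 7-bit ADC row
}
```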

Fig. C.7 shows the recovered constellations before the slicers for both the conventional and the proposed topology. As we can see, the proposed system with two 300-MHz 6-bit ADCs gives the same performance as the conventional system with a 900-MHz 8-bit ADC.



*Figure C.6:* RRMSE of the recovered 16-CAP data for different ADC resolutions for (a) conventional architecture, and (b) proposed mixed architecture.



*Figure C.7:* Recovered 16-CAP data constellation before the slicer for different frontend ADC resolution in: (a) conventional architecture with a 900-MHz ADC, and (b) proposed architecture with two 300-MHz ADCs.

#### C.6 Summary

The approach of partial equalization and I-Q separation for a near-baseband I-Q (CAP) receiver in wired applications was investigated. Simulation results showed that, for an application of 1.2 Gb/s over a 200-m coaxial cable, the proposed topology with two 300-MHz 6-bit ADCs along with two 3-to-5-tap analog FIR partial equalizer filters gives the same error performance as a conventional system with a 900-MHz 8-bit front-end ADC. Moreover, the digital filtering complexity, measured by the total number of taps and multiplications as well as the clock rate, was considerably reduced. The implementation and circuit design issues of the proposed architecture are a suggested topic for future research.

### References

- [1] D. Slepian (editor), "Key papers in the development of information theory," McGraw-Hill *IEEE Press, New York* (1974).
- [2] R. H. Walden, "Analog-to-digital converter survey and analysis," *IEEE J. Selected Areas in Communications*, Vol. 17, pp. 539-550, April 1999.
- [3] D. A. Johns and K. W. Martin, "Analog integrated circuit design," *John Wiley and Sons Inc.*, 1997.
- [4] B. Razavi, "Data conversion system design," *IEEE Press*, 1995.
- [5] G. P. Hartman, "Continuous-time adaptive-analog coaxial cable equalizer in 0.5µm CMOS," MASc Thesis, *University of Toronto*, 1997.
- [6] B. S. Kiriaki, et al., "A 160 MHz analog equalizer for magnetic disk read channels," *IEEE J. of Solid-State Circuits*, vol. 32, pp. 1839-1850, Nov. 1997.
- [7] E. A. Lee and D. G. Messerschmitt, "Digital communication," *Kluwer Academic Publishers*, 1994.
- [8] X. Wang and R. R. Spencer, "A low-power 170-MHz discrete-time analog FIR filter," *IEEE J. of Solid-State Circuits*, vol. 33, pp. 417 -426, March 1998.
- [9] D. Xu, Y. Song, and G. T. Uehara, "A 200 MHz 9-tap analog equalizer for magnetic disk read channels in 0.6μm CMOS," in IEEE Solid-State Circuits (ISSCC) Dig. Tech. Papers, 1996, pp. 74-75.
- [10] A. Shoval, D. A. Johns and W. Snelgrove, "Comparison of dc offset effects in four LMS adaptive algorithms," *IEEE Tran. on Circuits and Systems II*, vol. 42, March 1995, pp. 176-185.
- [11] N. Kurosawa, et al., "Explicit analysis of channel mismatch effects in time-interleaved ADC systems," *IEEE Tran. on Circuits and Systems I*, vol. 48, pp. 261-271, March 2001.
- [12] C.J. Abel, C. Michael, M. Ismail, C.S. Teng and R. Lahri, "Characterization of transistor mismatch for statistical CAD of submicron CMOS analog circuits," *in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, vol.2, May 1993, pp. 1401 -1404.
- [13] P. M. Aziz and J. L. Sonntag, "Equalizer architecture tradeoffs for magnetic recording channels," *IEEE Tran. on Magnetics*, vol. 33, pp. 2728-2730, Sep. 1997.
- [14] J. Cheng and D.A. Johns, "A 100 MHz partial analog adaptive equalizer for use in wired data transmission," *in Proc. Eur. Solid-State Circuits Conf. (ESSCIRC)*, 1999, pp. 42-45.
- [15] J. Huang and R. Spencer, "Simulated performance of 1000base-T receiver with different analog front-end designs," in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS) 2001, pp. 617-620.
- [16] A. Hadji-Abdolhamid and D. A. Johns, "ADC resolution enhancement by an analog decorrelator," in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), vol. 3, pp. 435-438.

- [17] A. Hadji-Abdolhamid and D. A. Johns, "A 400-MHz 6-bit ADC with an analog partial equalizer," *in Proc. Eur. Solid-State Circuits Conf. (ESSCIRC)*, 2003.
- [18] Belden Inc., *Data sheet of product code* 8281, Copyright 1996.
- [19] IEEE, "IEEE Standards Catalog," *IEEE standard associations*, [Online], Available: *http://standards.ieee.org/*
- [20] W. Y. Chen, *DSL simulation techniques and standards*, Macmillan Technical Publishing, 1998.
- [21] Belden Inc, *Digital studio cable guide*, Belden technical bulletin TB65, 3rd edition, 2003.
- [22] S. H. Lampen, "Wire, Cable, and Fiber Optics for Video and Audio Engineers," *McGraw-Hill Professional*, 3rd edition, Aug. 1997.
- [23] J. J. Werner, "Tutorial on carrierless AM/PM-part I- fundamentals and digital CAP transmitter," Contribution to ANSI X3T9.5 TP/PMD Working Group, Minneapolis, June 23, 1993.
- [24] Gennum Corporation, "GENLINX GS9004A serial digital equalizer," *Gennum Corporation Data Book*, pp. 2.55-2.65, 1996.
- [25] National Semiconductor, "Comlinear CLC014 adaptive cable equalizer for high-speed data recovery," *Product Data Sheet*, pp. 1-12, Aug. 1996.
- [26] B. Widrow and S. Stearns, "Adaptive signal processing," *Prentice-Hall Signal processing series*, 1985.
- [27] P.C. Magnusson, G.C. Alexander and V.K. Tripathi, "Transmission lines and wave propagation", *CRC Press*, London, 3rd edition, 1992.
- [28] The Mathworks Inc, "Matlab software and User Guide version 6.1 and 5.0," Copyright 1995.
- [29] A. R. Norsworthy, R. Schreier, G. C. Temes, "Delta-sigma data converters," *IEEE Press*, 1997.
- [30] N.S. Jayant and P. Noll, "Digital coding of waveforms, principles and applications to speech and video," *Prentice Hall*, 1984.
- [31] R. Zavari, "A high speed CMOS A/D converter employing variable nonuniform quantization," *M.A.Sc Thesis*, University of Toronto, 1998.
- [32] A. Leon-Garcia, "Probability and random processes for electrical engineering", *Addison Wesley*, 1994.
- [33] A. Papoulis, "Probability, random variables and stochastic processes," *McGraw-Hill*, 1991.
- [34] O.J. Tobias and R. Seara, "Analytical model for the mean weight behavior of adaptive interpolated-FIR filters using the constrained filtered LMS algorithm," *IEEE Symposium on Adaptive Systems for Signal Processing, Communications, and Control (AS-SPCC)*, 2000, pp. 272-277.

- [35] T. Chan Carusone, "Genetic algorithm for transfer function design," Available: *http://www.eecg.toronto.edu/~tcc/gen\_algo.html.*
- [36] S. C. Ng, S. H. Leung, C. Y. Chung, A. Luk and W. H. Lau, "The genetic search approach - a new learning algorithm for adaptive IIR filtering," *IEEE Signal Proc. Mag.*, pp. 38-46, Nov. 1996.
- [37] M. J. Pelgrom, A. C. J. Duinmaijer, and A. P. G. Welbers, "Matching properties of MOS transistors," *IEEE J. Solid State Circuits*, vol. 24, no. 5, pp. 1433-1439, Oct. 1989.
- [38] M. Q. Lee, P. J. Hurst and K.C. Dyer, "An analog DFE for disk drives using a mixed-signal integrator," *IEEE J. of Solid-State Circuits*, vol. 34, pp. 592 -598, May 1999.
- [39] B. C. Gaudet, "Adaptive differential ADC architecture," United States Patent 6,229,469.
- [40] A. V. Oppenheim and R. W. Schafer, "Discrete time signal processing", Prentice Hall, 2nd edition, 1999.
- [41] C. Lacy, "An adaptive analog FIR filter in CMOS technology," *M.A.Sc. Thesis*, University of Toronto, 1999.
- [42] A.M. Abo and P.R. Gray, "A 1.5-V 10-bit, 14.3-MS/s CMOS pipeline analog-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 34, pp. 599-606, May 1999.
- [43] G. Fischer, "Analog FIR filters by switch-capacitor Techniques," *IEEE Tran. on Circuits and Systems*, vol. 37, pp. 808-814, July 1990.
- [44] B. Razavi, "Design of sample-and-hold amplifiers for high speed low voltage A/D converters," in IEEE Custom Integrated Circuits Conf. (CICC) Dig. Tech. Papers, 1997, pp. 59-66.
- [45] K. Poulton, et al., "A 4GSample/s 8b ADC in 0.35µm CMOS," in IEEE Solid-State Circuits (ISSCC) Dig. Tech. Papers, vol. 45, pp. 166-167, Feb. 2002.
- [46] N. Kurosawa, et al., "Explicit formula for channel mismatch effects in time-interleaved ADC systems," in Proc. of Inst. and Meas. Technology Conference, May 2000, vol. 2, pp. 763-768.
- [47] K.C. Dyer, D. Fu, S.H. Lewis and P.J Hurst, "An analog background calibration technique for time-interleaved analog-to-digital converters," *IEEE J. of Solid-State Circuits*, vol. 33, pp. 1912-1919, Dec. 1998.
- [48] D. Fu, K.C. Dyer, S.H. Lewis and P.J. Hurst, "A digital background calibration technique for time-interleaved analog-to-digital converters," *IEEE J. of Solid-State Circuits*, vol. 33, pp. 1904-1911, Dec. 1998.
- [49] J. Huawen and E.K.F. Lee, "A digital-background calibration technique for minimizing timing-error effects in time-interleaved ADCs," *IEEE Tran. on Circuits and Systems II*, vol. 47, pp. 603-613, July 2000.

- [50] H. Samavati and A. Hajimiri, *et al.*, "Fractal Capacitors," *IEEE J. of Solid-State Circuits*, vol. 33, pp. 2035-2041, July 1998.
- [51] J.M. Steinger, "Understanding wide-band MOS transistors," *IEEE Circuits and Devices*, vol. 6, pp. 26-31, May 1990.
- [52] N. S. Sooch, "MOS cascode current mirror," U. S. patent no. 4550284, Oct. 1985.
- [53] Y. Wang, B. Razavi, "An 8-bit 150 MHz CMOS A/D converter," IEEE J. of Solid-State Circuits, vol. 35, pp. 308-317, March 2000.
- [54] T.Kwan and K.W. Martin, "An adaptive analog continuous-time CMOS biquadratic filter," *IEEE J. of Solid-State Circuits*, vol. 26, pp. 859-867, June 1991.
- [55] K. Bult and A Buchwald, "An embedded 240-mW 10-b 50-MS/s CMOS ADC in 1mm<sup>2</sup>," *IEEE J. of Solid-State Circuits*, vol. 32, pp. 1887-1895, Dec. 1997.
- [56] B. Fotouhi, "Optimization of chopper amplifiers for speed and gain," *IEEE J. of Solid-State Circuits*, vol. 29, pp. 823-828, July 1994.
- [57] S. Tsukamoto, et al., "A CMOS 6-b, 400-MSample/s ADC with error correction," *IEEE J. Solid-State Circuits*, vol. 33, pp. 1939-1947, Dec. 1998.
- [58] C.L. Portmann and T. H. Y. Meng, "Power efficient metastability error reduction in CMOS flash A/D converters," *IEEE J. Solid-State Circuits*, vol. 31, pp. 1132-1140, Aug. 1996.
- [59] I. Mehr and D. Dalton, "A 500-MSample/s, 6-Bit Nyquist-Rate ADC for Disk-Drive Read-Channel Application," *IEEE J. of Solid-State Circuits*, vol. 34, pp. 912-920, July 1999.
- [60] A. Mclaern, "Low voltage bias circuit," M.A.Sc Thesis, University of Toronto, 2000.
- [61] Analog Devices, AD9708 8-bit 100MSPS TxDAC D/A Converter, Technical Data Sheet., [Online], Available: http://www.analog.com.
- [62] MAXIM Inc., "Defining and testing dynamic parameters in high-speed ADCs," *MAXIM Inc. product solutions and application notes*, [Online] Available: http://www.maxim-ic.com/appnotes.cfm/appnote\_number/641/ln/en.
- [63] J. Doernberg, H. Lee and D. Hodges, "Full-Speed Testing of A/D Converters," IEEE J. of Solid-State Circuits, vol. 19, pp. 820-827, Dec. 1984.
- [64] R. Mahadevan and D. Johns, "A differential 160-MHz self-terminating adaptive CMOS line driver," *IEEE J. of Solid-State Circuits*, vol. 35, pp. 1889-1894, Dec. 2000.
- [65] K. Farzan and D. Johns, "A CMOS 7-Gb/s power-efficient 4-PAM transmitter," *in Proc. Eur. Solid-State Circuits Conf. (ESSCIRC)*, 2002, pp. 235-238.
- [66] T. J. Gabara and W.C. Fischer, "Capacitive coupling and quantized feedback applied to conventional CMOS technology," *IEEE J. of Solid-State Circuits*, vol. 32, no. 3, pp. 419-427, Mar. 1997.

- [67] K. Uyttenhove, J. Vandenbussche, E. Lauwers, G. Gielen and M. S. J. Steyaert, "Design techniques and implementation of an 8-bit 200-MS/s interpolating/averaging CMOS A/D converter," *IEEE J. Solid-State Circuits*, vol. 38, pp. 483-493, March 2003.
- [68] J. J. Werner, "Tutorial on carrierless AM/PM-part II- performance of bandwidth-efficient line codes," Contribution to ANSI X3T9.5 TP/PMD Working Group, Minneapolis, Feb. 16, 1993.
- [69] W. F. Ellersick, "Data converter for high speed links," *Ph.D. Thesis*, Stanford University, August 2001.
- [70] A. Sedra and K. C. Smith, "Microelectronic circuits," *OXFORD press*, fourth edition, 2004.
- [71] Society of Motion Pictures and Television Engineers, "SMPTE standards list 2003," [Online], Available: http://www.smpte.org, copyright 2003.
- [72] A. Hadji-Abdolhamid and D. Johns, "A comparison of CAP and QAM architectures," *in Proc. IEEE Int. Symp. Circuits Syst. (ISCAS)*, vol. 4, May 1998, pp. 316, 316/1-316/3.
- [73] M. G. Lee, J Yang and J. J. Werner, "Blind equalization algorithm for dual mode CAP-QAM reception," *IEEE Trans. on Comm.*, vol. 49, pp 455-466, Mar. 2001.
- [74] K.H. Muller and J.J. Werner, "A hardware efficient passband structure for data transmission," *IEEE Trans. on Communications*, vol. 30, no. 3, pp. 538-541, Mar. 1982.
- [75] P.P. Vaidyanathan, "A tutorial on multirate digital filter banks," *Proc. IEEE Int. Symp. Circuits Syst. (ISCAS), 1988,* pp. 2241- 2248.