# Algorithmic Design Methodologies and Design Porting of Wireline Transceiver IC Building Blocks Between Technology Nodes

S.P. Voinigescu, T.O. Dickson, T. Chalvatzis, A. Hazneci, E. Laskin, R. Beerkens\*, and I. Khalid\*

Edward S. Rogers Dept. of ECE, University of Toronto, 10 King's College Rd., Toronto, ON, M5S 3G4, Canada \*) STMicroelectronics, 16 Fitzgerald Rd., Ottawa, Ontario, K2H 8R6, Canada

Abstract-This paper presents an analysis of sub-2.5-V topologies and design methodologies for SiGe BiCMOS and sub-90-nm CMOS building blocks to be used in the next generation of 40-100 Gb/s wireline transceivers. Examples of optimal designs for 40-80 Gb/s broadband low-noise input comparators, low-voltage high-speed MOS- and BiCMOS-CML logic gates, 30-100 GHz low-noise oscillators, and 40/80 GHz output drivers with wave shape control are provided.

## I. INTRODUCTION

As the data rates of broadband communication systems continue to increase, noise generated inside the circuit becomes a critical component in limiting the sensitivity of wide-band input stages for building blocks such as A/D converters, equalizers, and decision circuits. With each new technology generation, this situation is further exacerbated by the trend towards lower supply and breakdown voltages, yet another reason for dynamic range and link reach degradation in wireline data transmission. At the transmitter end, being able to provide maximum swing with adjustable wave shape to compensate for package, backplane and connector loss and reflections at data rates beyond 10 Gb/s has become a critical requirement in extending the reach of wireline communication systems. As supply voltages are lowered, the number of transistors that can be stacked in a circuit topology must also be reduced without compromising performance. To make up for the limited flexibility of low-voltage circuit topologies, optimal transistor sizing and biasing for low noise, adjustable output swing, and broadband switching will play an even more dominant role in high-performance circuit design. It becomes increasingly important to re-examine the suitability of commonly deployed low-noise broadband amplifier and VCO topologies for applications beyond 40 Gb/s. Finally, with the migration of large digital chips to prohibitively expensive 90-nm and 65-nm CMOS-only technologies, it is likely that, to remain economically viable, 40-Gb/s or 80-Gb/s transceivers will simply evolve into re-usable IP blocks on a large digital die. This scenario points to the importance of developing algorithmic design and IP porting methodologies for high-speed digital and broadband CMOS building blocks from one technology node to the next.

Fig. 1 shows that the measured maximum available power gain of single-transistor and cascode stages fabricated in state-of-the-art SiGe BiCMOS and 90-nm CMOS technologies rises above 8 dB at 65 GHz. Taking advantage of this outstanding transistor performance, we have recently demonstrated large levels of integration at 80 Gb/s in a PRBS generator with 2<sup>31</sup>-1 pattern length, implemented in 130-nm SiGe BiCMOS technology and operating from a 3.3-V supply [1]. In this work, we revisit CMOS, SiGe HBT, and SiGe BiCMOS high-speed and low-noise circuits in the context of deep submicron technologies and of operation from 2.5 V or lower supply voltages. Our goal is to prove that all the building blocks for a sub-3-W transceiver, featuring at least 30 dB dynamic range and operating at 40 Gb/s or 80 Gb/s, are realizable in state-of-the-art silicon technologies. Therefore, the focus of the paper is on optimizing the key building blocks that limit dynamic range: input comparators, VCOs and output drivers.

Fig. 1. Measured maximum available power gain for SiGe HBTs, 90-nm n-MOSFETs, 130-nm HBT-MOS (BiCMOS), and 90-nm MOSFET cascodes.

In Section II, analytical noise models are derived for CMOS and SiGe HBT broadband amplifiers. For the first time, an algorithmic low-noise design methodology for broadband preamplifiers is described and verified experimentally in a 130-nm SiGe BiCMOS process. Experimental results on 40-Gb/s preamplifiers in SiGe BiCMOS technology are discussed and compared with simulations of 90-nm and 65-nm CMOS-only TIAs that are currently in fabrication.

In Section III, we compare CMOS, MOS-CML and BiCMOS-CML logic gates. A simple methodology is proposed for the design of MOS- and BiCMOS-CML digital gates. It relies on the invariance of the peak  $f_T$  current density between foundries and technology nodes [2], on the self-resonant-frequency x inductance (*SRF\*L*) product of a given semiconductor process, and on minimizing voltage swing [3].

This methodology also allows designs to be ported and scaled easily from one foundry to another and between CMOS/BiCMOS generations. Next, an analysis of mm-wave CMOS and SiGe BiCMOS VCOs is carried out in *Section IV*. Finally, *Section V* looks ahead to the International Technology Roadmap for Semiconductors (ITRS) time horizon [4] in an attempt to overcome the problem of data and clock transmission over 5-cm-long on-chip interconnect. The first 2.5-V, 80-GHz driver with pre-emphasis control is described and characterized.

## II. LOW-NOISE BROADBAND INPUT COMPARATORS

At lower data rates, issues related to reflections from poor PCB traces and connectors dominate backplane or chip-to-chip transceiver performance. Beyond 10 Gb/s, circuit noise itself, integrated over increasingly larger bandwidths, becomes yet another limiting performance factor, raising the need for low-noise input stages. The noise of a two-port network is usually modeled in terms of two input-referred correlated noise sources $\langle v_n^2 \rangle$ and $\langle i_n^2 \rangle$. The correlation between these noise sources can be captured using an admittance formalism, in which case the noise in the network is completely described by the correlation admittance $Y_{COR}$, the noise conductance $G_n$, and the noise resistance $R_n$ [5], or by employing an impedance formalism, which gives rise to an equivalent set of noise parameters $z_{cor}$, $r_n$ and $g_n$. The minimum noise factor $F_{MIN}$ is obtained for a unique optimum source admittance $Y_{SOP} = z_{sop}^{-1}$. For clarity, noise parameters in the impedance and admittance formalisms are denoted throughout this work by lower-case and upper-case letters, respectively. The impedance formalism is convenient for analysis of noise in circuits with series feedback, while shunt feedback is more readily investigated using the admittance formalism.
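For reference, the admittance-formalism parameters are commonly tied to the noise factor through the standard two-port relations below. This is a sketch under stated assumptions rather than a result taken from this paper: $G_n$ is taken to be the uncorrelated part of the input noise current, and the symbols $Y_S = G_S + jB_S$ (source admittance), $G_{COR}$, $B_{COR}$ (real and imaginary parts of $Y_{COR}$), and $G_{SOP}$ (real part of $Y_{SOP}$) are introduced here only for illustration.

$$F = 1 + \frac{G_n + R_n\,|Y_S + Y_{COR}|^2}{G_S}$$

$$Y_{SOP} = \sqrt{\frac{G_n}{R_n} + G_{COR}^2} - jB_{COR}, \qquad F_{MIN} = 1 + 2R_n\left(G_{COR} + G_{SOP}\right)$$

Setting $G_S = 1/Z_0$ gives the $Z_0\left(R_N|\cdot|^2 + G_N\right)$ structure that appears in (5) and (7), and the expression for $F_{MIN}$ has the same form as (4) without the feedback term.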

A series-series feedback circuit, such as the resistively degenerated INV in Fig. 2(a), can be described by the sum of the two-port Z-parameter matrices of the forward amplifier and the feedback network, namely $Z = Z_A + Z_F$. Assuming that the forward amplifier is nearly unilateral and that its forward transmission dominates that of the overall network, it can be shown that the optimum source impedance of the feedback amplifier is expressed in terms of the noise parameters of the forward and feedback networks as

$$z_{SOP} = \sqrt{r_{SOPA}^{2} + \frac{r_{NF}}{g_{NA}} + 2r_{CORA} \Re(Z_{11F}) + \Re^{2}(Z_{11F}) + \frac{|z_{CORF} - Z_{11F}|^{2}g_{NF}}{g_{NA}}} + j[X_{SOP} - \Im(Z_{11F})]$$
(1)

Here, subscripts ending in A and F refer to the noise parameters for the forward amplifier and feedback network, respectively. The minimum noise factor of the overall amplifier is

$$F_{MIN} = 1 + 2g_{NA}[r_{CORA} + r_{SOP} + \Re(Z_{11F})]$$
(2)

Likewise, shunt-shunt feedback systems such as the TIAs in Figs. 2(c), 3(b) and 3(c) can be analyzed using the Y-parameters and admittance-formalism noise parameters of the forward amplifier and feedback network. Similar assumptions are made about the forward amplifier and feedback network as were made for the series-series case, such that $Y_{21} = Y_{21A}$ and $Y_{12} = Y_{12F}$. The optimum source admittance and minimum noise factor can then be derived as

$$Y_{SOP} = \sqrt{G_{SOPA}^{2} + \frac{G_{NF}}{R_{NA}} + 2G_{CORA} \Re(Y_{11F}) + \Re^{2}(Y_{11F}) + \frac{|Y_{CORF} - Y_{11F}|^{2}R_{NF}}{R_{NA}}} + j[B_{SOP} - \Im(Y_{11F})]$$
(3)

$$F_{MIN} = 1 + 2R_{NA}[G_{CORA} + G_{SOP} + \Re(Y_{11F})]$$
(4)

Equations (1)-(4) indicate that transimpedance feedback lowers $z_{SOP}$ and is therefore useful when $z_{SOPA}$ is higher than the generator impedance. Since the $z_{SOP}$ of a transistor decreases with increasing size, bias current, and operation frequency [6], it follows that, by using shunt feedback for noise impedance matching, the size and bias current of the input transistor will be smaller than in other topologies and thus lead to lower power dissipation and broader bandwidth.

It is important to note that the CMOS inverter of Fig. 3(c) can be analyzed as a composite transistor with twice the transconductance per bias current, 2/3 times the $f_T$, and 3/2 times the $(F_{MIN} - 1)$ of the n-MOSFET of identical gate length [3]. It becomes apparent that, if the $f_T$ is adequate for the application, the CMOS inverter will require half the size and bias current of an n-MOSFET fabricated in the same technology node to achieve a certain noise resistance and optimum noise impedance, with only a relatively small degradation of the noise figure [7]. This surprisingly little-known property of the CMOS inverter can significantly reduce the notoriously large power dissipation of noise-matched tuned and broadband MOS low-noise amplifiers, especially below 10 GHz [8].

Based on the preceding discussion, the INV, EF-INV, and TIA amplifiers of Figs. 2 and 3 are investigated to determine the best topology for high-bandwidth low-noise amplifiers. The noise factor of the INV in Figs. 2(a) and 3(a) as a function of the source impedance  $Z_0$  is given by

$$F_{Zo} = 1 + \frac{1}{1 + \left(\frac{\omega L_0}{Z_0}\right)^2} + Z_0 \left(R_N \left|Y_{COR} + \frac{2}{Z_0}\right|^2 + G_N\right)$$
(5)

Fig. 2. SiGe HBT-based a) INV, b) EF-INV, and c) TIA input comparators.

Noting that the noise parameters of the transistor $R_N$, $G_N$, and $Y_{COR}$ scale with the emitter length/gate width [6], one can determine an optimal emitter length/gate width $l_{EOPT}$/$W_{OPT}$ such that the right-hand side of (5) is minimized.

$$l_{EOPT}/W_{OPT} = \frac{1}{\omega} \frac{2}{Z_{0}} \sqrt{\frac{1}{\frac{G}{R} + G_{C}^{2} + B^{2}}}$$
(6)

R, G, G<sub>c</sub>, and B are technology-dependent constants that characterize the geometry dependence of the transistor noise parameters at a given bias [9]. While resistive degeneration improves INV bandwidth and linearity, the noise performance is compromised. Hence,  $R_E$  has been neglected in (5)-(6) and should be eliminated in a low-noise INV. Similarly, series feedback at the input of an EF(SF)-INV also increases  $z_{SOP}$  and  $F_{MIN}$ . Following a rather lengthy derivation, it can be shown that the EF-INV noise parameters  $z_{COR}$ ,  $r_n$ , and  $g_n$  are approximately those of the EF transistor, which can then be sized according to (6).

The noise factor of the TIAs in Figs. 2(c), 3(b) and 3(c) is determined by considering the series combination of the feedback resistor $R_F$ and feedback inductor $L_F$ as a parallel feedback network across the transistor amplifier:

$$F_{Zo} = 1 + R_{NA} Z_0 \left| Y_{CORA} + \frac{1}{Z_0} + \frac{1}{R_F} \frac{1 - j \omega_0}{1 + \omega_0^2} \right|^2 + Z_0 G_{NA} + \frac{Z_0}{R_F} \frac{1}{1 + \omega_0^2}$$
(7)

with $\omega_0 = \omega L_F/R_F$. As was the case with the INV amplifier, the optimal size for the input transistor Q1 of the TIA can be derived (eqn. 8) by minimizing the noise factor at the 3-dB bandwidth of the amplifier. It is interesting to note that if the feedback resistor $R_F$ equals $Z_0$, the TIA and INV stages have identical noise figures. Typically, $R_F$ is larger than $Z_0$, resulting in a lower noise figure, smaller transistor sizes, and hence smaller bias currents than those of the INV amplifier.



Fig. 3 n-MOSFET a) INV, b) TIA, and c) CMOS TIA input comparators.

$$l_{EOPT}/W_{OPT} = \frac{1}{\omega} \sqrt{\left(\frac{1}{Z_{0}} + \frac{1}{R_{F}} \frac{1}{1 + \omega_{0}^{2}}\right)^{2} + \left(\frac{1}{R_{F}} \frac{\omega_{0}}{1 + \omega_{0}^{2}}\right)^{2}} \sqrt{\frac{1}{\frac{G}{R} + G_{C}^{2} + B^{2}}}$$
(8)
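The sizing formulas (6) and (8) share one structure: the magnitude of the admittance seen by the transistor, divided by $\omega\sqrt{G/R + G_C^2 + B^2}$. The Python sketch below evaluates both and cross-checks the closed form against a brute-force minimization. It assumes the scaling $R_N = R/W$, $G_N = G\omega^2 W$, $Y_{COR} = \omega W (G_C + jB)$, which is not spelled out in this section, and every numerical constant is a hypothetical placeholder, not process data from the paper.

```python
import math

def optimal_size(omega, ys_mag, R, G, Gc, B):
    """Closed form shared by (6) and (8): |Y_S| / (omega * sqrt(G/R + Gc^2 + B^2))."""
    return ys_mag / (omega * math.sqrt(G / R + Gc**2 + B**2))

def inv_source_admittance(z0):
    """|Y_S| seen by the INV input transistor in (6): 2/Z0."""
    return 2.0 / z0

def tia_source_admittance(omega, z0, rf, lf):
    """|Y_S| seen by the TIA input transistor in (8): 1/Z0 plus 1/(R_F + j*omega*L_F)."""
    w0 = omega * lf / rf
    real = 1.0 / z0 + (1.0 / rf) / (1.0 + w0**2)
    imag = (1.0 / rf) * w0 / (1.0 + w0**2)
    return math.hypot(real, imag)

def size_dependent_noise(width, omega, ys, R, G, Gc, B):
    """Width-dependent part of the noise factor, R_N*|Y_COR + Y_S|^2 + G_N.

    ASSUMED scaling: R_N = R/W, G_N = G*omega^2*W, Y_COR = omega*W*(Gc + jB).
    """
    y_cor = omega * width * complex(Gc, B)
    return (R / width) * abs(y_cor + ys) ** 2 + G * omega**2 * width

# Hypothetical technology constants (width in um); NOT values from the paper.
OMEGA = 2 * math.pi * 36e9
R, G, GC, B = 200.0, 3.2e-27, 1.0e-15, -4.0e-15

w_inv = optimal_size(OMEGA, inv_source_admittance(50.0), R, G, GC, B)
w_tia = optimal_size(OMEGA, tia_source_admittance(OMEGA, 50.0, 260.0, 0.0), R, G, GC, B)

# Brute-force check: scan widths from 0.5x to 1.5x of the closed-form INV optimum.
candidates = [w_inv * (0.5 + 0.001 * i) for i in range(1001)]
w_numeric = min(
    candidates,
    key=lambda w: size_dependent_noise(w, OMEGA, 2.0 / 50.0, R, G, GC, B),
)

print(f"INV optimum: {w_inv:.2f} um, numeric: {w_numeric:.2f} um, TIA optimum: {w_tia:.2f} um")
```

With $R_F = Z_0$ and $L_F = 0$ the TIA admittance reduces to $2/Z_0$, so (8) collapses to (6); for $R_F > Z_0$ the computed TIA width comes out smaller than the INV width.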

The preceding analysis leads to a straightforward methodology for the design of low-noise INV stages. First, the optimal noise current density $J_{OPT}$ is determined at the appropriate frequency (typically 36 GHz for 43-Gb/s applications) as shown in Fig. 4(a). Technology constants R, G, G<sub>c</sub>, and B can then be found for this bias point. The transistor Q1 is then biased at $J_{OPT}$ and sized using (6), which is equivalent to noise-matching the real part of $z_{sop}$ to the 25-$\Omega$ impedance seen looking from the transistor towards the generator. The load resistor is then chosen to achieve the required gain. While this methodology results in a very low noise figure, comparable to that of the transistor, the large device size required (see Fig. 4(b)) limits the bandwidth. Adding resistive feedback can improve bandwidth, but increases the noise figure as demonstrated in Fig. 4(b).



Fig. 4. a) Noise figure and associated gain as a function of current density at 36 GHz in SiGe HBTs. b) SiGe-HBT broadband LNA sizing for 50-Ω noise matching.

Adding EFs to the input of a low-noise inverter improves bandwidth at the expense of noise. For lowest noise, the EF is biased and sized using the methodology for the low-noise INV. However, to minimize the noise contribution of the transistor in the inverter, its size must be increased such that its  $z_{sop}$  is close to the output impedance of the EF. This results in higher noise than the noise-optimized INV with only marginal bandwidth improvement. Contrary to common practice [e.g., 10], the use of EF input stages preceding INV or Cherry-Hooper amplifiers should be avoided for low-noise high-speed applications.

Concomitant noise and impedance matching in the TIA input can be achieved through device and loop optimization. First, the loop gain *T* is selected based on the linearity requirements for the amplifier. This sets the product of the bias current and collector resistance  $R_c$ , and hence the upper limit on the dynamic range. The feedback resistance is then appropriately chosen such that the input impedance is 50  $\Omega$  as given by  $Z_0 = R_{F}/(1+T)$ . The input transistor Q1, biased at  $J_{OPT}$ , is sized using (8) such that the optimum source impedance with feedback is close to 50  $\Omega$ . Finally, inductors are employed throughout the circuit to obtain broader bandwidth and to filter high-frequency noise.
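As a quick numerical check of the impedance-matching step, the relation $Z_0 = R_F/(1+T)$ can be inverted for the loop gain. The worked example below assumes that the 260-Ω feedback resistor listed for the TIA in Table 1 was chosen for a 50-Ω input impedance through this relation. The paper does not state the loop gain explicitly, so the resulting value is only an inference.

$$T = \frac{R_F}{Z_0} - 1 = \frac{260\ \Omega}{50\ \Omega} - 1 = 4.2$$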

Table 1 summarizes key design parameters for each SiGe HBT amplifier. Two EF-INV designs are investigated: the first optimized for low noise as described above, and the second designed by adding EF inputs to the noise-optimized INV. The former design has poor bandwidth and high power consumption, as expected, while the latter yields an unacceptable noise figure. Also included in Table 1 is a TIA showing superior noise performance up to 36 GHz as compared with the other broadband topologies.

CMOS designs in three technology nodes are summarized in Table 2 and illustrated in Fig. 5. In all cases a current density of $0.2\ \text{mA}/\mu\text{m}$ was employed, corresponding to the peak $f_{\text{MAX}}$ bias and close to the optimum noise bias. The simulated noise figure is comparable to that of the SiGe HBT TIAs with identical bias current, 4 mA. The 65-nm CMOS TIA has three times lower current than that of the n-MOSFET TIA. More interestingly, since the optimum noise current density is invariant, the size and bias current of the MOSFETs remain practically unchanged from one technology node to the next, while the noise figure and bandwidth are improved as 40-Gb/s designs are scaled from the 90-nm to the 65-nm node. The layout and simulated 80-Gb/s eye diagram of the 65-nm CMOS TIA are shown in Fig. 6.
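The constant-current-density rule can be illustrated with a few lines of Python. The sketch takes the 0.2 mA/µm density quoted above and the TIA bias currents listed in Table 2, and computes the implied gate widths. The function and dictionary names are made up for this example.

```python
# Current density quoted in the text for all CMOS TIA designs (mA per um of gate width).
PEAK_FMAX_DENSITY_MA_PER_UM = 0.2

# TIA drain currents in mA, as listed in Table 2.
TIA_BIAS_MA = {
    "130-nm CMOS": 6.0,
    "90-nm CMOS": 4.0,
    "65-nm CMOS": 4.0,
    "65-nm nMOS": 12.0,
}

def gate_width_um(i_ds_ma, density_ma_per_um=PEAK_FMAX_DENSITY_MA_PER_UM):
    """Gate width that places the device at the target current density."""
    return i_ds_ma / density_ma_per_um

widths = {node: gate_width_um(i_ds) for node, i_ds in TIA_BIAS_MA.items()}

for node, width in widths.items():
    print(f"{node}: {width:.0f} um")
```

The computed widths (30, 20, 20, and 60 µm) agree with the W entries of Table 2, and the 90-nm and 65-nm CMOS designs share the same size and current, as the text points out.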

Table 1: SiGe Broadband LNA Design Data

|                              | INV        | EF-INV (noise opt.)  | EF-INV (bandwidth opt.) | TIA        |
|------------------------------|------------|----------------------|-------------------------|------------|
| $l_{E}$, $w_{E}$ (µm)        | 4x6.7x0.2  | 4x6.7x0.2; 8x9.0x0.2 | 2x8.0x0.2; 4x6.7x0.2    | 2x8.0x0.2  |
| $I_C$ (mA per side)          | 10         | 10 (EF); 26 (INV)    | 4 (EF); 10 (INV)        | 4          |
| $R_F$                        | -          | -                    | -                       | 260 Ω      |
| $f_{3dB}$ (sim/meas)         | 14/11 GHz  | 16/- GHz             | 22/31 GHz               | 39/40 GHz  |
| Gain (diff)                  | 16.9 dB    | 17.2 dB              | 16.7 dB                 | 13.8 dB    |
| Sim. diff. NF @10/36 GHz     | 5.0/6.9 dB | 6.3/10.4 dB          | 9.0/10.7 dB             | 4.6/5.5 dB |
| Meas. NF @10 GHz             | 9.6 dB     | -                    | 12.8 dB                 | 10.3 dB    |

Table 2: Si MOSFET Broadband LNA Design Data

|                      | 130-nm CMOS  | 90-nm CMOS   | 65-nm CMOS   | 65-nm nMOS   |
|----------------------|--------------|--------------|--------------|--------------|
| W (µm)               | 30           | 20           | 20           | 60           |
| $R_F$ (Ω)            | 200          | 200          | 200          | 200          |
| $I_{DS}$ (TIA)       | 6 mA         | 4 mA         | 4 mA         | 12 mA        |
| $f_{3dB}$ (sim)      | 15.7 GHz     | 39.6 GHz     | 57.8 GHz     | 59.4 GHz     |
| Gain (sim)           | 9 dB @10 GHz | 8.4 dB       | 8.9 dB       | 7.5 dB       |
| NF (dB)              | 5 @10 GHz    | 5.3 @36 GHz  | 4.9 @36 GHz  | 5.0 @36 GHz  |
| $V_{DD}$ (V)         | 1.4          | 1.2          | 1.2          | 1            |

Fig. 5. CMOS TIA design scaling.

Fig. 6. 65-nm CMOS TIA layout and simulated 80-Gb/s output eye diagram.

Fig. 7. Measured noise figures of SiGe HBT and n-MOS inverter comparators.

To validate the theoretical analysis, differential versions of the broadband amplifier topologies presented in Figs. 2 and 3(a) were fabricated in a 130-nm SiGe BiCMOS process [11]. The noise figures for all amplifiers were measured up to 20 GHz and are reported in Fig. 7. All measurements are single-ended with the unused input terminated in 50 $\Omega$. This typically results in a 3-dB higher noise figure than in differential mode. The TIA and INV amplifiers exhibit the lowest noise, both around 10 dB at 10 GHz. However, simulations show and measurements confirm that the TIA has significantly better bandwidth and broadband input matching. The lower noise figure of the TIA results in higher sensitivity than that of the EF-INV even though the latter has larger gain. As demonstrated in Fig. 8(a), the EF-INV output eye diagram has a Q factor of 5.8 for a 20-mVpp single-ended input (10-mVpp per side). The TIA eye diagram of Fig. 8(b) has a Q factor of 7 for the same input while consuming 50 mW, 20 mW less than the EF-INV stage.

The TIA circuit was operational with a supply voltage as low as 1.9 V. The CMOS INV of Fig. 3(a) has 6-dB higher noise figure than that of the SiGe HBT INV. These results prove the direct link between noise figure and sensitivity and the importance of low-noise design in wireline applications.

Fig. 8. 40-Gb/s eye diagrams of SiGe a) EF-INV and b) TIA broadband differential amplifiers with a $2^{31}$-1, 20 mV<sub>p-p</sub> PRBS input.

## III. HIGH-SPEED LOGIC GATES

It has been recognized that the base resistance term is the major roadblock limiting the switching speed of SiGe HBT logic [12]. In MOS-CML, the gate resistance term can be rendered negligible through layout techniques by reducing the unit finger width. We have recently proposed a novel BiCMOS-ECL logic family that employs a cascode stage consisting of a MOSFET common-source device followed by a common-base HBT [13]. Such a structure takes advantage of the large intrinsic slew rate of the HBT and of the small gate resistance of the MOSFET, resulting in faster switching speed than either MOS or HBT CML families. At the same time, as a result of the low MOSFET threshold voltage and superior $f_T$ at low $V_{DS}$, it operates with lower (less than 2.5 V) supply voltages than SiGe HBT ECL.

The open-circuit time constant (OCTC) of a chain of CMOS, differential MOS-CML, cascode HBT-CML, or BiCMOS cascode [13] inverters (Fig. 9) with a stage-to-stage loading factor of k provides a useful metric of the ultimate digital speed of these technologies.

$$\tau_{CMOS} = \frac{3r_o}{2} \left[ C_{gd} + C_{db} + \left( k + \frac{R_g}{r_o} \right) \left[ C_{gs} + (1 + g_m r_o) C_{gd} \right] \right]$$
(9)

$$\tau_{MOSCML} = \frac{\Delta V}{I_T} \left[ C_{gd} + C_{db} + \left( k + \frac{R_g}{R_L} \right) \left[ C_{gs} + (1 + g_m R_L) C_{gd} \right] \right]$$
(10)

$$\tau_{HBTCML} \approx \frac{\Delta V}{I_T} \left[ C_{\mu} + C_{cs} + \left( k + \frac{R_b}{R_L} \right) \left[ C_{\pi} + (1 + g_m R_L) C_{\mu} \right] \right]$$
(11)

$$\tau_{BiCMOSCML} \approx \frac{\Delta V}{I_T} \left[ C_{\mu} + C_{cs} + \left( k + \frac{R_g}{R_L} \right) \left( C_{gs} + C_{gd} \right) \right]$$
(12)

 $I_T$  is the tail current,  $R_L$  is the load resistance, and  $\Delta V$  is the logic swing. For highest digital speed, the tail current of the MOS-CML inverter corresponds to the peak  $f_T$  bias (i.e. each transistor in the differential pair is biased at 0.15 mA/µm) irrespective of technology node.
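The CML time constants (10)-(12) translate directly into code, which makes it easy to compare families for a given set of device parasitics. In the sketch below the function names are illustrative and all capacitance, resistance, and transconductance values passed in the example are hypothetical placeholders, not extracted device data.

```python
def tau_mos_cml(delta_v, i_tail, c_gd, c_db, c_gs, r_g, g_m, k=1.0):
    """Open-circuit time constant of a MOS-CML inverter, eq. (10)."""
    r_load = delta_v / i_tail  # load resistance R_L = delta_V / I_T
    miller_input = c_gs + (1.0 + g_m * r_load) * c_gd
    return r_load * (c_gd + c_db + (k + r_g / r_load) * miller_input)

def tau_hbt_cml(delta_v, i_tail, c_mu, c_cs, c_pi, r_b, g_m, k=1.0):
    """Open-circuit time constant of an HBT-CML inverter, eq. (11)."""
    r_load = delta_v / i_tail
    miller_input = c_pi + (1.0 + g_m * r_load) * c_mu
    return r_load * (c_mu + c_cs + (k + r_b / r_load) * miller_input)

def tau_bicmos_cml(delta_v, i_tail, c_mu, c_cs, c_gs, c_gd, r_g, k=1.0):
    """Open-circuit time constant of a BiCMOS cascode CML inverter, eq. (12)."""
    r_load = delta_v / i_tail
    return r_load * (c_mu + c_cs + (k + r_g / r_load) * (c_gs + c_gd))

# Hypothetical example: 0.4-V swing, 2-mA tail current, fF-range parasitics.
tau_mos = tau_mos_cml(0.4, 2e-3, c_gd=1e-15, c_db=2e-15, c_gs=3e-15, r_g=20.0, g_m=0.01)
tau_bicmos = tau_bicmos_cml(0.4, 2e-3, c_mu=1e-15, c_cs=2e-15, c_gs=3e-15, c_gd=1e-15, r_g=20.0)

print(f"MOS-CML: {tau_mos * 1e12:.2f} ps, BiCMOS-CML: {tau_bicmos * 1e12:.2f} ps")
```

The BiCMOS expression lacks the $(1 + g_m R_L)$ Miller multiplication of the input capacitance, which is why, for comparable parasitics, it returns the smaller time constant.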

$$W = \frac{I_{T}}{0.3\ \mathrm{mA}/\mu\mathrm{m}}; \quad A_{E} = \frac{I_{T}}{1.5\, J_{peakfT}}$$
(13)

This allows full switching with a voltage swing of 450 mV<sub>p-p</sub> and 350 mV<sub>p-p</sub> in 130-nm and 90-nm CMOS, respectively. HBT-CML inverters have 250 mV<sub>p-p</sub> swing and are biased at a tail current 1.5 times the peak  $f_T$  current density. The latter increases with every new technology generation [12, 14] and may vary from foundry to foundry.

The basic design equations (14) without inductive peaking can be modified as (15) to account for inductive peaking and the SRF of the inductor, resulting in 60% bandwidth improvement with constant group delay (14).

$$R_{L} = \frac{\Delta V}{I_{T}}; \quad BW_{3dB} = \frac{1}{2\pi R_{L}C_{L}} = \frac{I_{T}}{2\pi C_{L}\Delta V}$$
(14)

$$L_{p} = \frac{C_{L}R_{L}^{2}}{3.1} = \frac{C_{L}}{3.1} \frac{\Delta V^{2}}{I_{T}^{2}}; \quad I_{Tmin} = \Delta V \sqrt{\frac{C_{L}}{3.1 L_{pmax}}}$$
(15)

Note that series-shunt peaking occurs almost by default due to the inductance of the interconnect leading to fanout stages. Hence, an even larger improvement in bandwidth is regularly achieved without the need for more area-consuming T-coil schemes [15]. These equations provide the underlying reasons why, for a given technology back-end, characterized by a fixed SRF\*L product, using bipolar devices with lower logic swing and lower output capacitance will result in smaller tail currents and lower power dissipation despite the 200-mV higher supply voltage requirement. Table 3 summarizes optimized full-rate latch designs (Fig. 10) implemented in various logic families and technology nodes. To further lower the supply voltage, the tail current source in Fig. 10 can be removed [16] or a narrowband transformer could be employed [17]. When scaling CML gates from 90 nm to 65 nm, the same current and transistor size can be preserved with improved switching speed. Alternatively, for the same speed, the transistor size, tail current, and power can be reduced.
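Equations (13)-(15) can be chained into a small sizing routine for a MOS-CML gate. The following Python sketch does this for a chosen swing, tail current, and load capacitance. The 10-fF load is a hypothetical value chosen for illustration, and the function names are not from the paper.

```python
import math

TAIL_DENSITY_MA_PER_UM = 0.3  # tail current per unit gate width, from (13)
PEAKING_RATIO = 3.1           # R_L^2 * C_L / L_p, from (15)

def mos_cml_gate(delta_v, i_tail, c_load):
    """Apply (13)-(15) to one MOS-CML gate; SI units except width in um."""
    r_load = delta_v / i_tail                            # (14)
    bw_unpeaked = 1.0 / (2.0 * math.pi * r_load * c_load)  # (14)
    l_peak = c_load * r_load**2 / PEAKING_RATIO          # (15)
    width_um = (i_tail * 1e3) / TAIL_DENSITY_MA_PER_UM   # (13)
    return {
        "r_load": r_load,
        "bw_unpeaked": bw_unpeaked,
        "l_peak": l_peak,
        "width_um": width_um,
    }

def min_tail_current(delta_v, c_load, l_peak_max):
    """Smallest tail current whose peaking inductor still fits l_peak_max, from (15)."""
    return delta_v * math.sqrt(c_load / (PEAKING_RATIO * l_peak_max))

# Swing and tail current of the 130-nm MOS-CML latch in Table 3; the load is hypothetical.
gate = mos_cml_gate(delta_v=0.5, i_tail=1.5e-3, c_load=10e-15)

print(f"R_L = {gate['r_load']:.0f} ohm, W = {gate['width_um']:.1f} um")
print(f"BW (no peaking) = {gate['bw_unpeaked'] / 1e9:.1f} GHz, L_p = {gate['l_peak'] * 1e12:.0f} pH")
```

Feeding the computed `l_peak` back into `min_tail_current` returns the original tail current, which confirms that the two halves of (15) are the same constraint read in opposite directions. A lower swing lowers the minimum tail current for a given maximum realizable inductance.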

Table 3: Scaling of CMOS, MOS-, and BiCMOS-CML fanout-of-1 latches

| Latch family  | Rate (Gb/s) | $V_{DD}$ (V) | $\Delta V$ (V) | $I_T$ (mA) | $P_D$ (mW) |
|---------------|-------------|--------------|----------------|------------|------------|
| 130-nm CMOS   | 5.5         | 1.2          | 1.2            | -          | -          |
| 130-nm MOSCML | 40          | 1.8          | 0.5            | 1.5        | 2.7        |
| BiCMOS CML    | 40          | 1.8          | 0.2            | 0.83 (1.5) | 1.5 (2.7)  |
| BiCMOS ECL    | 50          | 2.5          | 0.25           | 4          | 10         |
| 90-nm CMOS    | 7.5         | 1            | 1              | -          | -          |
| 90-nm MOSCML  | 40          | 1.2          | 0.38           | 2.75       | 3.3        |
| 65-nm CMOS    | 11.5        | 1            | 1              | -          | -          |
| 65-nm MOSCML  | 60          | 1            | 0.35           | 2.5        | 2.5        |
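The power column of Table 3 appears to follow from the static relation $P_D = V_{DD} I_T$. The short Python check below tests that reading against the CML and ECL rows. The relation is an assumption inferred from the numbers, not a statement made in the text.

```python
# (family, V_DD in V, I_T in mA, reported P_D in mW), copied from Table 3.
TABLE3_CML = [
    ("130-nm MOSCML", 1.8, 1.5, 2.7),
    ("BiCMOS CML", 1.8, 0.83, 1.5),
    ("BiCMOS ECL", 2.5, 4.0, 10.0),
    ("90-nm MOSCML", 1.2, 2.75, 3.3),
    ("65-nm MOSCML", 1.0, 2.5, 2.5),
]

def static_power_mw(vdd_v, i_tail_ma):
    """Static gate power under the ASSUMED relation P_D = V_DD * I_T."""
    return vdd_v * i_tail_ma

def check_table(rows, tol_mw=0.05):
    """Return, per family, whether V_DD * I_T matches the reported power within tol_mw."""
    return {
        name: abs(static_power_mw(vdd, i_tail) - p_reported) <= tol_mw
        for name, vdd, i_tail, p_reported in rows
    }

for name, ok in check_table(TABLE3_CML).items():
    print(f"{name}: {'consistent' if ok else 'mismatch'}")
```

Every listed row is consistent with this relation to within rounding, which supports the argument that lower tail currents translate directly into lower power.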

As proof of concept, a 2.5-V, 45-Gb/s broadband retimer was fabricated in 130-nm SiGe BiCMOS technology (Fig. 11). It employs the SiGe HBT TIA discussed in *Section II*, the SiGe BiCMOS ECL logic family, an output driver with 5.5-ps rise and fall times capable of 80-Gb/s operation [1, 10], and a 2.5-V broadband clock path consisting of 3 EF-INV stages that can be driven with a single-ended clock signal at 49 GHz. Eye diagrams at 10, 45 and 49 Gb/s with adjustable output swing up to $2 \times 600\ \text{mV}_{pp}$ are reproduced in Figs. 12 and 13.



Fig. 10. 40-Gb/s 130-nm MOS- and BiCMOS-CML Latches.



Fig. 11. Broadband 49 Gb/s 2.5-V retimer layout



Fig. 12. a) 10 Gb/s and b) 45 Gb/s input (top) and 2x280mVpp output (bottom) after retiming



Fig. 13. a) 45 Gb/s, and b) 49 Gb/s 2x600mVpp retimed output

## IV. MM-WAVE OSCILLATOR TOPOLOGIES

The cross-coupled VCO topologies of Fig. 14 have been very popular [18,19] in (SOI) CMOS technology due to the low bias current required to achieve negative resistance and oscillation at frequencies as high as 60 GHz in 90-nm SOI [18]. However, at mm-wave frequencies, even CMOS designers [20] have recognized the benefits of the Colpitts topology. The latter has been favoured in bipolar implementations [21] as a result of its lower parasitic capacitance and built-in buffering of the resonant tank from the load.

For each of the three topologies above, one can derive the expressions for the maximum oscillation frequencies and find a direct link to the fundamental device characteristics of a given semiconductor technology.





Fig. 15. a) 70-GHz 65-nm CMOS and b) 35-GHz 130-nm BiCMOS Colpitts VCO schematics

$$\omega_{osc}(n-MOS) \leq \frac{g'_{m}Q_{eff}}{C'_{gs} + 4C'_{gd} + C'_{db} + \frac{C_{L}}{W}}$$
(16)

$$\omega_{osc}(CMOS) \leq \frac{2}{3} \frac{g'_{m} Q_{eff}}{C'_{gs} + 4C'_{gd} + C'_{db} + \frac{C_{L}}{W}}$$
(17)

$$\omega_{osc}(Colpitts) \le \frac{g'_{m} Q_{eff}}{C'_{gs} + C'_{sb}}$$
(18)

$Q_{eff}$ is the effective quality factor of the L-C-varactor tank, which includes the loading effect of the transistor. $C_L$ is the load capacitance, and $g'_m$, $C'_{gs}$, $C'_{gd}$, $C'_{sb}$, and $C'_{db}$ represent the transconductance and parasitic capacitances of the transistor per unit gate width. Since only $g'_m$ improves with scaling while the rest remain largely unchanged over nodes and foundries, $\omega_{osc}$ will also scale if the MOSFET gate width and current remain constant. $C'_{sb}$ has no equivalent in HBTs and both $C'_{sb}$ and $C'_{db}$ are small in SOI, thus explaining why record $\omega_{osc}$ values are obtained with HBT and SOI processes. It is interesting to note that: (i) the load capacitance places an upper bound on the $\omega_{osc}$ of cross-coupled topologies, but does not affect the Colpitts topology; (ii) if $C_L$ is ignored, the transistor parasitic capacitances, tank Q, and $g'_m$ ultimately limit $\omega_{osc}$; (iii) the maximum possible oscillation frequency depends neither on the tail current $I_{SS}$ nor on the transistor size, as long as a small enough inductor L with adequate Q can be realized and the load is negligible; and (iv) for the same Q and $C_L=0$, the n-MOS cross-coupled and the Colpitts VCOs have almost the same maximum oscillation frequency, while that of the CMOS cross-coupled VCO is reduced to 2/3 of this value, as in (17).

If $R_g$ and $g_{ds}$ are accounted for, the effective quality factor is reduced further, and $Q_{eff}$ in (16)-(18) should be replaced with the value given by

$$\frac{1}{Q_{eff}} = \frac{1}{Q_{tank}} + \frac{R_g}{\omega_{osc}L} + \frac{g_{ds}}{\omega_{osc}C_T}$$
(19)

where  $C_T$  is the sum of all capacitances across the tank L. Expressions (16)-(18) which resemble  $f_T$  now evolve into  $f_{MAX}$ , an intuitively pleasing result.
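The bounds (16)-(18) and the correction (19) are simple enough to evaluate numerically, which helps when comparing topologies for a given process. The Python sketch below implements them as written. All per-unit-width device values and tank values in the example are hypothetical placeholders.

```python
def omega_max_nmos_xcoupled(gm, q_eff, cgs, cgd, cdb, c_load=0.0, width=1.0):
    """Upper bound (16); gm and capacitances are per unit gate width."""
    return gm * q_eff / (cgs + 4.0 * cgd + cdb + c_load / width)

def omega_max_cmos_xcoupled(gm, q_eff, cgs, cgd, cdb, c_load=0.0, width=1.0):
    """Upper bound (17): 2/3 of the n-MOS cross-coupled bound."""
    return (2.0 / 3.0) * omega_max_nmos_xcoupled(gm, q_eff, cgs, cgd, cdb, c_load, width)

def omega_max_colpitts(gm, q_eff, cgs, csb):
    """Upper bound (18); independent of the load capacitance."""
    return gm * q_eff / (cgs + csb)

def effective_q(q_tank, r_g, g_ds, omega_osc, l_tank, c_total):
    """Effective quality factor including gate resistance and output conductance, eq. (19)."""
    inverse_q = 1.0 / q_tank + r_g / (omega_osc * l_tank) + g_ds / (omega_osc * c_total)
    return 1.0 / inverse_q

# Hypothetical per-um device values: 1 mS/um, fF/um-range capacitances, Q_eff = 10.
w_nmos = omega_max_nmos_xcoupled(1e-3, 10.0, cgs=1e-15, cgd=0.4e-15, cdb=1e-15)
w_cmos = omega_max_cmos_xcoupled(1e-3, 10.0, cgs=1e-15, cgd=0.4e-15, cdb=1e-15)
w_colpitts = omega_max_colpitts(1e-3, 10.0, cgs=1e-15, csb=1e-15)

# Hypothetical tank: Q = 10, 50 pH, 100 fF, R_g = 2 ohm, g_ds = 1 mS, at 70 GHz.
q_loaded = effective_q(10.0, 2.0, 1e-3, 2 * 3.141592653589793 * 70e9, 50e-12, 100e-15)

print(f"n-MOS: {w_nmos:.3e} rad/s, CMOS: {w_cmos:.3e} rad/s, Colpitts: {w_colpitts:.3e} rad/s")
print(f"Q_eff with R_g and g_ds: {q_loaded:.2f}")
```

Because (19) itself contains $\omega_{osc}$, a full solution would iterate between (19) and the chosen bound. The sketch evaluates (19) at a fixed frequency only.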

Finally, a link can be found between phase noise $L(f_m)$, equivalent transistor input noise current $I_n$, oscillation amplitude $V_{OSC}$, transistor bias current $I_{BIAS}$, and the $C_1/C_2$ ratio.

$$V_{osc} = V_1 \left( 1 + \frac{C_1}{C_2} \right) = \frac{2I_{BIAS}Q}{C_2 \omega_{osc}}$$
(20)

$$L(f_m) = \frac{|I_n|^2 \omega_{osc}^2}{I_{BIAS}^2 \omega_m^2 4 Q^2} \frac{1}{\frac{C_1^2}{C_2^2}} \frac{1}{\left(\frac{C_1}{C_2} + 1\right)^2}$$
(21)

From the phase noise analysis and design point of view, an oscillator can be treated exactly as a low-noise amplifier which needs to be noise- and impedance-matched to the signal source impedance. In the VCO case, the signal source impedance is represented by the tank impedance at resonance. In addition, the transistor must be biased in such a manner as to ensure maximum linearity, as in a class A power amplifier. With these observations, VCO design for the lowest phase noise, using either the Colpitts or the cross-coupled topologies, becomes rather trivial and algorithmic: (i) set the tank voltage $V_{osc}$ to the maximum allowed by the breakdown voltage of the technology (1.2 $V_{p-p}$ for 130-nm MOSFETs, 1 $V_{p-p}$ for 90-nm and 65-nm MOSFETs, and 3 $V_{p-p}$ for SiGe HBTs [21]), (ii) select the minimum inductor value that can be reliably fabricated with a Q > 10 at $\omega_{osc}$, (iii) bias the transistor at the optimal minimum noise figure current density (0.15 mA/µm in n-MOSFETs irrespective of foundry and technology node), and (iv) size the transistor and the $C_1/C_2$ ratio such that the noise impedance of the transistor matches that of the tank at $\omega_{osc}$, without changing $V_{osc}$. Step (iv) typically requires several iterations, especially if $\omega_{osc}$ is close to the transistor $f_T/f_{MAX}$. Linearization is usually not required in MOSFET implementations because deep submicron MOSFETs exhibit almost bias-independent $g_m$, $C_{gs}$, and $C_{gd}$. In the case of bipolar VCOs, linearization is a must and can be accomplished elegantly as in cascode LNAs, without degrading phase noise, by using inductive emitter degeneration. A survey of mm-wave CMOS and SiGe HBT VCOs reveals systematically 6-10 dB lower phase noise values achieved with bipolar VCOs over those of SOI/CMOS VCOs, due to the 2-3 times larger voltage swings afforded by higher breakdown voltages in SiGe HBTs.
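A first pass through steps (i) and (iii) can be sketched in Python by inverting (20) for the bias current and evaluating (21). The sketch does not attempt the iterative noise matching of step (iv). The tank capacitance, quality factor, noise current density, and offset frequency in the example are hypothetical values chosen only to exercise the formulas.

```python
import math

def bias_current_for_swing(v_osc, q, c2, omega_osc):
    """Invert (20): I_BIAS = V_osc * C2 * omega_osc / (2 * Q)."""
    return v_osc * c2 * omega_osc / (2.0 * q)

def phase_noise_dbc_hz(i_n_sq, i_bias, q, omega_osc, omega_m, c1_over_c2):
    """Phase noise from (21), returned in dBc/Hz; i_n_sq is |I_n|^2 in A^2/Hz."""
    ratio = i_n_sq * omega_osc**2 / (i_bias**2 * omega_m**2 * 4.0 * q**2)
    ratio /= c1_over_c2**2 * (c1_over_c2 + 1.0) ** 2
    return 10.0 * math.log10(ratio)

def nmos_width_um(i_bias, density_ma_per_um=0.15):
    """Gate width that biases an n-MOSFET at the optimum noise current density."""
    return i_bias * 1e3 / density_ma_per_um

# Hypothetical 70-GHz example: 1-V swing, Q = 10, C2 = 50 fF, C1/C2 = 1.
omega_osc = 2.0 * math.pi * 70e9
omega_m = 2.0 * math.pi * 1e6
i_bias = bias_current_for_swing(1.0, 10.0, 50e-15, omega_osc)
pn = phase_noise_dbc_hz(1e-22, i_bias, 10.0, omega_osc, omega_m, 1.0)

print(f"I_BIAS = {i_bias * 1e3:.2f} mA, W = {nmos_width_um(i_bias):.1f} um")
print(f"L(1 MHz) = {pn:.1f} dBc/Hz")
```

Since (20) makes the swing proportional to $I_{BIAS}$ and (21) scales as $1/I_{BIAS}^2$, a 2-3 times larger swing corresponds to roughly 6-9.5 dB lower phase noise with everything else held fixed, in line with the 6-10 dB advantage quoted for bipolar VCOs.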

## V. ON-CHIP HIGH-SPEED SERIAL LINKS AT 80-100 Gb/s

According to the 2003 ITRS, the continued push to higher frequencies and larger chip sizes has created a gap between the interconnect needs and projected interconnect performance [4].

At the moment, the biggest problem is wiring delay, the ramifications of which are likely to be synchronous clock domains that span only a small fraction of a chip [22]. Several solutions have already been proposed aimed at reducing the interconnect delay or making it irrelevant. Near-term solutions such as the introduction of copper wires and low-k dielectrics will help reduce the delay. In the long term, asynchronous clocking and Network-on-Chip (NOC) concepts will help avoid the issue altogether. However, these solutions do not address another problem of long, on-chip, high-speed interconnects, namely inter-symbol interference (ISI).

To illustrate this problem, an RGLC-model for a 3.6-mm long microstrip line was fitted to the measured characteristics up to 94 GHz, as in Fig. 16. The attenuation increases almost linearly with frequency, reaching 2.5 dB at 90 GHz. Simulated eye diagrams for a $2^{10}-1$ PRBS signal over a 5-cm version of the microstrip line at 100 Gb/s are reproduced in Fig. 17(a). For line lengths longer than 3 cm the eye is completely closed. Fig. 17(b) shows the eye after being processed by a 7-tap, 2.5-ps spaced, transversal equalizer (i.e. a modified version of the FFE presented in [23] for operation at 100 Gb/s). Thus, ignoring noise generated by the equalizer itself, electrical equalization can be used to extend the distance over which data can be reliably transmitted on-chip using conventional microstrip lines to more than 5 cm.
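The severity of the loss on a 5-cm line can be estimated by scaling the measured point. The Python sketch below assumes that the attenuation in dB is exactly linear in frequency through the origin and proportional to line length. The text only states "almost linearly" for a 3.6-mm line, so the numbers are rough estimates, and the function names are illustrative.

```python
# Measured reference point from the text: 2.5 dB at 90 GHz for a 3.6-mm line.
REF_ATTENUATION_DB = 2.5
REF_FREQ_GHZ = 90.0
REF_LENGTH_MM = 3.6

def line_attenuation_db(freq_ghz, length_mm):
    """Estimated attenuation, ASSUMING dB loss linear in frequency and in length."""
    return REF_ATTENUATION_DB * (freq_ghz / REF_FREQ_GHZ) * (length_mm / REF_LENGTH_MM)

def tap_spacing_ui(tap_spacing_ps, data_rate_gbps):
    """Equalizer tap spacing expressed as a fraction of the unit interval."""
    return tap_spacing_ps * data_rate_gbps / 1000.0

loss_nyquist_5cm = line_attenuation_db(50.0, 50.0)  # 50 GHz is Nyquist for 100 Gb/s
spacing = tap_spacing_ui(2.5, 100.0)

print(f"Estimated loss at 50 GHz over 5 cm: {loss_nyquist_5cm:.1f} dB")
print(f"Tap spacing: {spacing:.2f} UI")
```

Under these assumptions the 5-cm line loses on the order of 19 dB at the 50-GHz Nyquist frequency, which is consistent with a completely closed eye, and the 2.5-ps taps are spaced at a quarter of the 10-ps unit interval.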

However, for up to 1 cm of on-chip interconnect, an even simpler solution exists that relies on inductive peaking. For the first time, an 80-GHz driver with output amplitude and pre-emphasis control is shown in Fig. 18. It operates from a 2.5-V supply and consumes 200 mW. The chip microphotograph highlights the use of silicon inductors, smaller than $20\,\mu\text{m} \times 20\,\mu\text{m}$, which operate above 90 GHz, and of production $55\,\mu\text{m} \times 70\,\mu\text{m}$ pads. The measured differential gain, S<sub>21</sub>, shown in Fig. 19, increases linearly by 7 dB from 10 GHz to 65 GHz, peaking above 10 dB in the 65-GHz to 75-GHz range. More than 10 dB of gain control is achieved over the entire frequency range. The output return loss, better than -10 dB up to 94 GHz, is also shown in Fig. 19 and remains unchanged as a function of the pre-emphasis control current.

Fig. 16. Measured vs. modeled attenuation and characteristic impedance for an on-chip 3.6-mm long microstrip line.

Fig. 17. 100 Gb/s eye at the input and output of a 7-tap distributed feed forward equalizer [23] after passing through a 5-cm long microstrip line.



Fig. 18. 80-GHz Driver with peaking control for pre-emphasis at 80 Gb/s.

## VI. CONCLUSIONS

Algorithmic design methodologies have been developed for the main circuit building blocks that make up a wireline transceiver. The theory was experimentally verified on 40-Gb/s SiGe BiCMOS preamplifiers, a 49-Gb/s retimer, and an 80-GHz output driver with pre-emphasis, all fabricated in 130-nm SiGe BiCMOS technology and operating from a 2.5-V supply. The prospects of 90-nm and 65-nm CMOS technology for low-voltage/low-power 40-Gb/s and 80-Gb/s transceivers have also been investigated, and proof-of-concept building blocks are currently in fabrication. More importantly, CMOS low-noise preamplifier and CML gate designs have been shown to scale almost unchanged in terms of transistor size and current from the 90-nm to the 65-nm node, while their noise and bandwidth improve.

#### ACKNOWLEDGEMENTS

The authors thank NSERC, Micronet, STMicroelectronics, and Gennum for financial support and STMicroelectronics Crolles for chip fabrication. An equipment grant from CFI and OIT, and CAD tools from CMC are also acknowledged. Special thanks go to Bernard Sautreuil of STMicroelectronics for help in this project.

#### References

- [1] T. Dickson et al., "A 72Gb/s 2<sup>31</sup>-1 PRBS Generator in SiGe BiCMOS Technology," ISSCC Digest, pp.342-343, Feb. 2005.
- [2] S.P. Voinigescu, et al., "A comparison of Si CMOS, SiGe BiCMOS, and InP HBT technologies for high-speed and millimeter-wave ICs," *SiRF-2004*, pp. 111-114, Sept. 2004.
- [3] S.P. Voinigescu, "RF and High-Speed Integrated Circuits," ECE1364S, lecture notes and midterm exam, University of Toronto, 2005.
- [4] "International Technology Roadmap for Semiconductors 2003 Edition: Interconnect," ITRS, 2003/2004.

- [5] G.D. Vendelin, et al., Microwave Circuit Design Using Linear and Nonlinear Techniques, Toronto, John Wiley & Sons, 1990.
- [6] S.P. Voinigescu et al., "A scalable high-frequency noise model for bipolar transistors with application to optimal transistor sizing for low-noise amplifier design," *IEEE J. Solid-State Circuits*, vol. 32, no. 9, Sept. 1997.
- [7] A.N. Karanicolas, "A 2.7-V 900-MHz CMOS LNA and Mixer," *IEEE J. Solid-State Circuits*, vol. 31, no. 12, pp.1939-1944, Dec. 1996.
- [8] C. Kromer, et al., "A 100mW 4x10Gb/s Transceiver in 80nm CMOS for High-density Optical interconnects," *ISSCC Digest*, pp.334-335, 2005.
- [9] H. Tran et al., "6-kΩ, 43-Gb/s differential transimpedance-limiting amplifier with auto-zero feedback and high dynamic range," *IEEE GaAs IC Symp. Tech. Dig.*, pp. 241-244, Nov. 2003.
- [10] M. Meghelli, "A 108Gb/s 4:1 Multiplexer in 0.13 µm SiGe-Bipolar Technology," ISSCC Digest, pp.236-237, Feb. 2004.
- [11] M. Laurens et al., "A 150 GHz $f_T/f_{MAX}$ 0.13 µm SiGe:C BiCMOS technology," *Proc. IEEE BCTM*, Sept. 2003.
- [12] G. Freeman et al., "Transistor Design and Application Considerations for >200 GHz SiGe HBTs," *IEEE Trans. ED*, vol. 50, no. 3, pp.645-655, 2003.
- [13] T.O. Dickson et al., "A 2.5-V, 40-Gb/s Decision Circuit Using SiGe BiCMOS Logic," *Dig. Symp. VLSI Circuits*, pp. 206-209, June 2004.
- [14] M. Rodwell, et al., "Transistor and Circuit Design for 100-200 GHz ICs," *IEEE CSICS* Technical Digest, pp.207-210, Oct. 2004.
- [15] J. Kim, et al., "Circuit Techniques for a 40Gb/s Transmitter in 0.13µm CMOS," *ISSCC Digest*, pp.150-151, Feb. 2005.
- [16] K. Kanda, et al., "40Gb/s 4:1 MUX/1:4 DEMUX in 90nm Standard CMOS," ISSCC Digest, pp.152-153, Feb. 2005.
- [17] D. Kehrer et al., "A 60Gb/s 2:1 Selector in 90nm CMOS," *IEEE CSICS Digest*, pp.105-108, Oct. 2004.
- [18] F. Ellinger, et al., "60 GHz VCO with Wideband Tuning Range Fabricated on VLSI SOI CMOS Technology," *IEEE MTT-S Digest*, pp.1329-1332, June 2004.
- [19] J. Kim, et al., "A 44GHz Differentially Tuned VCO with 4GHz Tuning Range in 0.12µm SOI CMOS Technology," *ISSCC Digest*, pp.416-417, Feb. 2005.
- [20] P.-C. Huang, et al., "A 114GHz VCO in 0.13µm CMOS Technology," ISSCC Digest, pp.404-405, Feb. 2005.
- [21] C. Lee, et al., "SiGe BiCMOS 65-GHz BPSK Transmitter and 30 to 122 GHz LC-Varactor VCOs with up to 21% Tuning Range," *IEEE CSICS*, Technical Digest, pp.179-182, Oct. 2004.
- [22] S. Kumar, et al., "A network on chip architecture and design methodology," in IEEE Computer Society Annual Symposium on VLSI, pp. 105–112, April 2002.
- [23] A. Hazneci and S. P. Voinigescu, "49-Gb/s, 7-Tap Transversal Filter in 0.18µm SiGe BiCMOS for Backplane Equalization," *IEEE CSICS*, Technical Digest, pp.101-104, Oct.2004.



Fig. 19. Measured S parameters for 80-GHz driver as a function of pre-emphasis control.