# Coding Schemes for Chip-to-Chip Interconnect Applications

Kamran Farzan, Member, IEEE, and David A. Johns, Fellow, IEEE

Abstract-Increasing demand for high-speed interchip interconnects requires faster links that consume less power. The Shannon limit for the capacity of these links is at least an order of magnitude higher than the data rate of the current state-of-the-art designs. Channel coding can be used to approach the theoretical Shannon limit. Although there are numerous capacity-approaching codes in the literature, the complexity of these codes prohibits their use in high-speed interchip applications. This work studies several suitable coding schemes for chip-to-chip communication and backplane applications. These coding schemes achieve 3-dB coding gain in the case of an additive white Gaussian noise (AWGN) model for the channel. In addition, a more realistic model for the channel is developed here that takes into account the effect of crosstalk, jitter, reflection, inter-symbol interference (ISI), and AWGN. Interestingly, the proposed signaling schemes are significantly less sensitive to such interference. Simulation results show coding gains of 5-8 dB for these methods with three typical channel models. In addition, low-complexity decoding architectures for implementation of these schemes are presented. Finally, circuit simulation results confirm that high-speed implementations of these methods are feasible.

Index Terms-Chip-to-chip communications, coding, high speed, power efficient, signaling scheme.

### I. INTRODUCTION

Advances in integrated circuit (IC) fabrication technology, coupled with aggressive circuit design, have led to an exponential growth in speed and integration levels. However, to improve overall system performance, the communication speed between systems and ICs must increase accordingly. Currently, communication bus links in various applications approach Gb/s data rates. These applications include high-speed network switching, local area networks, memory buses, and multiprocessor interconnection networks. It is also likely that many high-speed digital signals will be transmitted between analog and digital chips.

High-speed circuits as well as low-loss matched transmission lines are necessary to maintain a high performance and to minimize crosstalk, reflection, and dispersion in a high-speed chip-to-chip link. Achieving a highly dense system by bringing the chips closer together is only a partial solution since denser systems require denser interconnects, which in turn, cause more

D. A. Johns is with the Edward S. Rogers, Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada.

Digital Object Identifier 10.1109/TVLSI.2006.874369

crosstalk [1]. Simulation results for a parallel bus interface show that crosstalk can inject 140 mV (pp) errors into the victim line for an 800 mV aggressor step, which translates to a crosstalk as large as 20% of the aggressor step amplitude [2]. Therefore, crosstalk is extremely important in interchip communication applications. Indeed, it is the dominant noise in most microstrip interconnects. In such cases, the channel capacity is independent of the spectral power density (SPD) of the transmitted signal and depends mainly on the channel and crosstalk frequency response [3]. Consequently, crosstalk can be the limiting factor for the interconnect capacity. Since crosstalk is proportional to the transmitted signal amplitude, increasing the transmitted signal power or, equivalently, the transmitted signal-to-noise ratio (SNR) would also increase the noise due to crosstalk and residual reflections and, therefore, cannot increase the noise margin significantly.

In the case where the noise can be modeled as Gaussian, one can derive the required SNR for a given bit error rate (BER). For example, assuming a BER of $10^{-15}$, which is a reasonable value for chip-to-chip interconnects, the required SNR can be shown to be 18.4 dB [4]. This means that if the signal amplitude is 100 mV, the standard deviation of the permitted noise in the system could be as high as 12 mV. To reduce the BER, in general, one needs to either increase the signal amplitude or reduce the noise by using special circuit techniques. Both solutions require more power, and since off-chip drivers can consume up to 70% of the power of a large pin count digital chip [5], reducing the power consumed by interconnect circuitry is extremely important.
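As a rough cross-check of these figures, the sketch below evaluates the SNR and the Gaussian tail probability for the 100 mV amplitude and 12 mV noise deviation quoted above. It assumes that the BER is given by $Q(A/\sigma)$ and that the SNR is $20\log_{10}(A/\sigma)$; the exact definitions used in [4] are not stated here, so this is an illustration rather than a reproduction of that derivation.

```python
import math

def q_function(x: float) -> float:
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

amplitude_mv = 100.0   # signal amplitude quoted in the text
sigma_mv = 12.0        # noise standard deviation quoted in the text

# Assumed definitions: SNR = 20*log10(A/sigma), BER = Q(A/sigma).
snr_db = 20.0 * math.log10(amplitude_mv / sigma_mv)
ber = q_function(amplitude_mv / sigma_mv)

print(f"SNR = {snr_db:.1f} dB, BER = {ber:.1e}")
```

Under these assumptions the SNR comes out at about 18.4 dB and the resulting BER lies below the $10^{-15}$ target.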

There is still a significant gap between the Shannon limit and the data rates of the current state-of-the-art designs [3]. Introducing some redundancy at the transmitter (channel coding) can be used as an attempt to approach the Shannon limit and to find a low-power scheme [6], [7]. Finding good codes is a simple task. Indeed, randomly generated codes with a large block size can be very good codes. The problem lies in the fact that while encoding is always a rather simple task, the decoding complexity increases exponentially with the block size and, thus, quickly becomes unmanageable [8]. Therefore, instead of making the code more and more complex, the search should focus on finding low-complexity codes with good coding gain. In chip-to-chip communication applications, where high-speed implementation is the main concern, this becomes even more important.

Section II introduces several suitable coding schemes for interchip communication and backplane applications. A simple coding scheme for binary (2-PAM) signaling is proposed, which is significantly less sensitive to crosstalk, jitter, ISI, and residual reflections than the regular 2-PAM scheme. Several multilevel

Manuscript received April 18, 2005; revised October 19, 2005. This work was supported by Semiconductor Research Corp. (SRC), the Microelectronic Network (Micronet), and the Natural Sciences and Engineering Research Council (NSERC).

K. Farzan is with Snowbush Microelectronics, Toronto, ON M5G 1Y8, Canada (e-mail: farzan@snowbush.com).



Fig. 1. (a) General block diagram of a 2-PAM signaling scheme. (b) General block diagram of the proposed coding scheme (3LINE-PAM2).

coding schemes are also proposed that are motivated by the Gigabit Ethernet scheme [9]. The approach is to transmit information in 5-PAM or 6-PAM instead of 4-PAM and use techniques such as coded modulation [10] to achieve a moderate coding gain. Section III provides a realistic model for the channel. This model is used for three typical channels. Simulation results are shown in Section IV for one binary signaling scheme and one multilevel signaling scheme using these channel models. Finally, Section V explains low-complexity architectures for analog implementation of these coding schemes. It should be noted that although these schemes can be applied to both single-ended and fully differential architectures, most figures show the single-ended architecture only to simplify the illustration.

# II. CODING SCHEMES FOR CHIP-TO-CHIP COMMUNICATION

Although the 50-year-old edifice of coding theory has resulted in numerous capacity-approaching codes, the search for low-complexity coding schemes for practical implementation is still an active research topic [8]. In chip-to-chip communication applications, the main challenge is to come up with low-complexity coding schemes that can be implemented at high speed. This section investigates several suitable coding schemes for chip-to-chip communication applications. The use of coding in interchip applications can be divided into two categories: two-level signaling and multilevel signaling.

# A. Coding Schemes for Two-Level Signaling

As shown in Fig. 1(a), in a 2-PAM (binary) signaling scheme, two lines are required for transmitting two bits of information. Symbols (-1, -1), (-1, 1), (1, -1), and (1, 1) can be used to send the information bits 00, 01, 10, and 11. The minimum squared Euclidean distance (MSED) in this constellation is four. To achieve an appreciable coding gain, a signaling scheme with more than two lines could be used. A simple scheme, 3LINE-PAM2, is to use codewords (-1, -1, -1), (-1, 1, 1), (1, -1, 1), and (1, 1, -1) for transmitting 00, 01, 10, and 11, respectively. The MSED of these codewords (MSED = 8) is twice that of the uncoded 2-PAM signaling scheme (MSED = 4) and, therefore, it provides a 3 dB coding gain. This gain is achieved at the cost of adding one more line to the interconnect link, as shown in Fig. 1(b).
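The MSED values above can be checked by brute force. The following sketch enumerates all codeword pairs for both constellations and converts the MSED ratio to decibels; the helper name `msed` is introduced here for illustration only.

```python
import itertools
import math

def msed(points):
    """Minimum squared Euclidean distance over all distinct pairs."""
    return min(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in itertools.combinations(points, 2)
    )

uncoded_2pam = [(-1, -1), (-1, 1), (1, -1), (1, 1)]
three_line_pam2 = [(-1, -1, -1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)]

msed_uncoded = msed(uncoded_2pam)
msed_coded = msed(three_line_pam2)
gain_db = 10 * math.log10(msed_coded / msed_uncoded)

print(msed_uncoded, msed_coded, round(gain_db, 2))
```

This distance-based figure compares codewords at equal per-line amplitude and does not include the power spent on the third line.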



Fig. 2. Simulation results for 3LINE-PAM2 and regular 2-PAM in the case of AWGN.

For decoding of the received signal, the Euclidean distance of the received signal to each of the transmitted codewords [(-1,-1,-1),(-1,1,1),(1,-1,1),(1,1,-1)] should be calculated and the one that has the smallest distance is decoded as the output. For example, the decoder output would be 00 if the codeword (-1,-1,-1) has the smallest distance to the received signal. Although this seems to significantly increase the complexity of the receiver, a low-complexity method, which needs only six comparators and several logic gates, is proposed in Section V.

A Simulink model is used here to verify the coding gain of 3LINE-PAM2 for an additive white Gaussian noise (AWGN) channel [11]. As shown in Fig. 2, the proposed coding scheme provides roughly 2.8 dB coding gain at a BER of approximately $10^{-6}$. The two curves in this figure diverge slightly and the full 3-dB gain is expected to be obtained at higher SNRs. However, the extra line in this signaling scheme needs an extra 1.7 dB of power, which reduces the overall gain. Nevertheless, this method can significantly reduce the required SNR in the presence of crosstalk. The AWGN model for the channel is modified to the one in Fig. 3, which takes into account the effect of crosstalk. As shown in this figure, we model the crosstalk by taking the derivative of the transmitted signal in the discrete time domain. Crosstalk amplitude can be adjusted by a gain factor g.
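A minimal sketch of such a crosstalk term is given below. It takes the first difference of an aggressor sequence and scales it by g, then adds it to the victim samples together with Gaussian noise. The function names, the single-aggressor arrangement, and the zero initial difference are assumptions made for illustration; the exact wiring of Fig. 3 is not reproduced here.

```python
import random

def crosstalk(aggressor, g):
    """Discrete-time derivative (first difference) of the aggressor, scaled by g."""
    prev = aggressor[0]          # assumption: no transition before the first sample
    out = []
    for sample in aggressor:
        out.append(g * (sample - prev))
        prev = sample
    return out

def victim_received(victim, aggressor, g, sigma, rng):
    """Victim samples plus coupled crosstalk and additive Gaussian noise."""
    xt = crosstalk(aggressor, g)
    return [v + x + rng.gauss(0.0, sigma) for v, x in zip(victim, xt)]

rng = random.Random(0)
aggressor = [-1, -1, 1, 1, -1]
victim = [1, 1, 1, 1, 1]

xt = crosstalk(aggressor, g=0.1)
rx = victim_received(victim, aggressor, g=0.1, sigma=0.0, rng=rng)
print(xt)
print(rx)
```

With g = 0.1, a full-swing aggressor transition from -1 to 1 perturbs the victim by 0.2, which is the kind of amplitude-proportional disturbance the modified model is meant to capture.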

Two sets of simulations have been performed to determine the performance of the proposed method in the presence of crosstalk. Fig. 4 shows the simulation results for two different values of the crosstalk coefficient [g = 0.1 in Fig. 4(a) and g = 0.2 in Fig. 4(b)]. As shown in this figure, the proposed method achieves a 4 dB gain at a BER of $10^{-7}$ over ordinary 2-PAM signaling when g = 0.1. It should be mentioned that a crosstalk coefficient of 0.1 could represent a practical case in which the capacitance of each line to ground is roughly nine times the coupling capacitance between lines in a bus. A system that can tolerate a larger crosstalk coefficient can have a higher channel density and, therefore, take up less board and/or chip area.

Interestingly, the performance improvement for the case of g = 0.2 is roughly 8 dB at BER =  $10^{-3}$ . These results



Fig. 3. Modified channel model to take into account the effect of crosstalk.



Fig. 4. Performance of the proposed method in the presence of crosstalk: (a) g = 0.1 and (b) g = 0.2.

show that the proposed method is significantly less sensitive to crosstalk than the regular binary signaling scheme, which in turn justifies the use of one more line.

[Figure panels: PAM5 partitioning (levels +2, 0, -2 in subset B; levels +1, -1 in subset A) and Gigabit Ethernet structure (transmitter, receiver).]

| 4-D subsets          | # of points | MSED |
|----------------------|-------------|------|
| S0: AAAA & BBBB      | 97          | 4    |
| S1: AAAB & BBBA      | 78          | 4    |
| S2: AABB & BBAA      | 72          | 4    |
| S3: AABA & BBAB      | 78          | 4    |
| S4: ABBA & BAAB      | 72          | 4    |
| S5: ABBB & BAAA      | 78          | 4    |
| S6: ABAB & BABA      | 72          | 4    |
| S7: ABAA & BABB      | 78          | 4    |
| M = {S0, S2, S4, S6} | 313         | 2    |
| N = {S1, S3, S5, S7} | 312         | 2    |
| {M, N}               | 625         | 1    |

Fig. 5. Subset partitioning in 4-D space with 5-PAM in each dimension.

### B. Coding Schemes for Multilevel Signaling

Although the proposed coding scheme in Section II-A provides a significant gain, especially in the presence of crosstalk, the overhead of the 3LINE-PAM2 method precludes its use in many applications where the total number of signal traces between two chips is limited. Multilevel signaling such as 4-PAM can be used to reduce the number of required signal traces in a bus. In this section, several coding schemes for multilevel signaling are proposed that are based on the Gigabit-Ethernet coding scheme. Therefore, a brief explanation of Gigabit-Ethernet and coded-modulation is necessary.

1) Coded Modulation and Gigabit-Ethernet Coding: In Gigabit Ethernet, 1 Gb/s throughput is achieved with four pairs of twisted pair cables. The IEEE 802.3ab standard settled on a base-band 5-level PAM (5-PAM) combined with trellis coding [6], [12]. This scheme makes use of 5-level PAM  $(\{-2, -1, 0, 1, 2\})$  on each pair of wires to code two bits of information. Transmitting two bits of information needs only four levels and, therefore, the extra level in the 5-PAM scheme provides a code redundancy that can be used for improving the performance. The four pairs of cables form a four-dimensional (4-D) constellation (each pair represents one dimension). The total number of the points in the constellation is  $5^4 = 625$ , but only 256 points are necessary for transmitting eight bits. This redundancy can be used to achieve a 1.5-dB coding gain with symbol-to-symbol detection and 4.5 dB with a sequence detector, such as a Viterbi decoder, over the uncoded ordinary 4-PAM signaling [12].

Partitioning the set of points in each dimension into two subsets  $A : \{-1,1\}$  and  $B : \{-2,0,2\}$  is the basic idea behind this method. As shown in Fig. 5, eight 4-D subsets (S0 to S7) can be formed by means of this partitioning. The intrasubset MSED for each subset is four. Notice that the set M ( $M = \{S0, S2, S4, S6\}$ ), which has 313 points, can be used to construct a constellation for transmitting eight bits. The MSED of this constellation is two, which is twice the MSED of ordinary 4-PAM. The scaled version of this constellation, which has the same MSED as 4-PAM, has roughly the same BER as 4-PAM. However, it can be shown that it needs 1.5 dB less transmitted power than the transmitted power of 4-PAM. This idea can be

 

TABLE I BIT-RATE COMPARISON FOR DIFFERENT 5-PAM CONSTELLATIONS

| # dimensions | $\log_2$(# points with MSED = 4) | bits/symbol/(# of lines) | $(bits/symbol)_n$ | scheme name |
|--------------|----------------------------------|--------------------------|-------------------|-------------|
| 4            | 6.5                              | 1.5                      | 0.75              | 4LINE-PAM5  |
| 5            | 8.18                             | 1.6                      | 0.8               | 5LINE-PAM5  |
| 6            | 10.18                            | 1.666                    | 0.833             | 6LINE-PAM5  |
| 7            | 12.26                            | 1.714                    | 0.857             |             |
| 8            | 14.6                             | 1.75                     | 0.875             |             |
| 16           | 32.15                            | 2                        | 1                 |             |

used to construct a signaling scheme, hereafter referred to as Coded-Modulation-PAM5 scheme, which provides 1.5-dB gain over the ordinary 4-PAM signaling. Moreover, a trellis encoder in the transmitter and a Viterbi decoder in the receiver can be used to achieve an overall 4.5-dB coding gain over the uncoded 4-PAM scheme [9].

2) Multilevel Coding Schemes for Chip-to-Chip Communication: As mentioned before, using a Viterbi decoder in the receiver leads to a 4.5-dB gain over 4-PAM. Unfortunately, the complexity of the Viterbi decoder prevents its use for high-speed chip-to-chip interconnects. One possible solution is to use the Coded-Modulation-PAM5 scheme. However, the 1.5 dB gain of this scheme over ordinary 4-PAM is only a modest gain.

As shown in Fig. 5, the intrasubset MSED of each subset (S0 to S7) is four, which is twice the MSED of set M. Therefore, using only one subset provides a better coding gain. However, the number of points in subset S0 (97), which has the maximum number of points, is far from the required number for transmitting eight bits (256).
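The point counts and distances listed in Fig. 5 can be reproduced by enumeration. The sketch below builds the eight 4-D subsets from the per-dimension partition A = {-1, 1}, B = {-2, 0, 2}, forms the set M, and computes the MSED by brute force. Helper names such as `points` and `msed` are introduced here for illustration.

```python
import itertools

A = (-1, 1)        # 5-PAM subset A
B = (-2, 0, 2)     # 5-PAM subset B
LEVELS = {"A": A, "B": B}

def points(pattern):
    """All 4-D points whose coordinates follow a pattern such as 'AABB'."""
    return list(itertools.product(*(LEVELS[c] for c in pattern)))

def msed(pts):
    """Minimum squared Euclidean distance over all distinct pairs."""
    return min(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in itertools.combinations(pts, 2)
    )

SUBSETS = {
    "S0": ("AAAA", "BBBB"), "S1": ("AAAB", "BBBA"),
    "S2": ("AABB", "BBAA"), "S3": ("AABA", "BBAB"),
    "S4": ("ABBA", "BAAB"), "S5": ("ABBB", "BAAA"),
    "S6": ("ABAB", "BABA"), "S7": ("ABAA", "BABB"),
}

subset_points = {
    name: points(p1) + points(p2) for name, (p1, p2) in SUBSETS.items()
}
M = [pt for name in ("S0", "S2", "S4", "S6") for pt in subset_points[name]]

sizes = {name: len(pts) for name, pts in subset_points.items()}
msed_s0 = msed(subset_points["S0"])
msed_m = msed(M)

print(sizes, len(M), msed_s0, msed_m)
```

The enumeration makes the trade-off concrete: S0 alone keeps an MSED of 4 but has only 97 points, while M reaches 313 points at the price of an MSED of 2.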

Adding one dimension, i.e., one extra line, to the constellation increases the redundancy, so different schemes or constellations with MSED = 4 could be found by adding more dimensions to the constellation. Table I provides some information that could be used to find such a scheme. This table shows the logarithm of the number of codewords with MSED = 4 (second column) for different numbers of dimensions (first column). The third column shows the number of bits per symbol per line. The number of bits per symbol per line for each scheme is divided by the corresponding value for 4-PAM to get the normalized bits/symbol, (bits/symbol)<sub>n</sub>, for each constellation. This parameter is shown in the fourth column.
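The third and fourth columns of Table I follow from the second by simple arithmetic, sketched below. The sketch assumes that only a whole number of bits can be mapped onto a constellation, so the logarithm is rounded down, and that uncoded 4-PAM carries two bits per symbol per line; the rounding rule is inferred from the table values rather than stated in the text.

```python
import math

BITS_PER_SYMBOL_4PAM = 2  # uncoded 4-PAM carries two bits per line

def rate_columns(dimensions, log2_points):
    """Return (bits/symbol/line, normalized bits/symbol) for one table row."""
    whole_bits = math.floor(log2_points)   # assumption: only whole bits are mapped
    per_line = whole_bits / dimensions
    return per_line, per_line / BITS_PER_SYMBOL_4PAM

# (dimensions, log2(# points with MSED = 4)) copied from Table I
table_i = [(4, 6.5), (5, 8.18), (6, 10.18), (7, 12.26), (8, 14.6), (16, 32.15)]
rows = [(d, *rate_columns(d, lg)) for d, lg in table_i]

for d, per_line, normalized in rows:
    print(d, round(per_line, 3), round(normalized, 3))
```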

Each row in this table represents a constellation or, equivalently, a signaling scheme for chip-to-chip communication. The names of some of the more useful schemes for this application are shown in the far-right column of Table I. For example, the third row of Table I shows that a six-dimensional (6-D) constellation can be used for transmitting 10 bits. This method is called 6LINE-PAM5 scheme throughout this paper. The same idea can be applied to the first and the second row of this table. This introduces two new schemes: 4LINE-PAM5 (transmitting six bits over four lines) and 5LINE-PAM5 (transmitting eight bits over five lines).

On account of the structure of these constellations (schemes), the minimum distance of the codewords in these constellations is more than the minimum distance of the codewords in 4-PAM for a given transmitted power. This means that for the same BER, this scheme requires less power compared to the conventional 4-PAM scheme. Actually, it can be shown that these constellations can provide roughly 3 dB gain over the uncoded

TABLE II BIT-RATE COMPARISON FOR DIFFERENT 6-PAM CONSTELLATIONS

| # dimensions | $\log_2$(# points with MSED = 4) | bits/symbol/(# lines) | $(bits/symbol)_n$ | scheme name |
|--------------|----------------------------------|-----------------------|-------------------|-------------|
| 4            | 7.34                             | 1.75                  | 0.875             | 4LINE-PAM6  |
| 5            | 8.92                             | 1.6                   | 0.8               |             |
| 6            | 11.51                            | 1.83                  | 0.92              |             |
| 7            | 14.09                            | 2                     | 1                 | 7LINE-PAM6  |

4-PAM signaling scheme. As shown in Table I, there is a tradeoff between complexity and data rate; reducing the number of dimensions in the constellation results in a lower data rate and lower complexity.

An alternative approach is to use 6-PAM modulation. Here, the set of points in each dimension can be partitioned into two subsets  $A : \{-1.5, .5, 2.5\}$  and  $B : \{-2.5, -.5, 1.5\}$ . Table II, which is similar to Table I, summarizes the possible schemes in this case. Again, each row in this table is a possible scheme for this application. For instance, the first row introduces a method, hereafter referred to as 4LINE-PAM6, for transmitting seven bits over four lines with 6-PAM modulation in each line.

In this scheme, patterns AAAA and BBBB are used for constructing a constellation with MSED = 4. There are 162 points in this constellation and, therefore, seven bits can be transmitted with this scheme. From the original 162-point constellation, the 34 points that have the highest energy have been removed to form a 128-point constellation. It can be shown that this scheme provides roughly a 3 dB gain over the uncoded 4-PAM while its data rate is only 13% less than that of 4-PAM. The last row in Table II represents a scheme for transmitting 14 bits over 7 lines. Therefore, it has the same throughput as the 4-PAM scheme $[(bits/symbol)_n = 1]$, and it also provides about a 3 dB gain over 4-PAM. However, its complexity is much higher than that of the 4LINE-PAM6 scheme.
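The construction just described can be sketched as follows. The code builds the 162 AAAA and BBBB points, keeps the 128 with the lowest energy, and estimates the gain over uncoded 4-PAM by equalizing the MSED and comparing average energies. Which of the equal-energy points are dropped is not specified in the text, so ties are broken by list order here, and the resulting gain may differ slightly from the value the authors report.

```python
import itertools
import math

A = (-1.5, 0.5, 2.5)    # 6-PAM subset A
B = (-2.5, -0.5, 1.5)   # 6-PAM subset B

def energy(point):
    return sum(x * x for x in point)

def msed(pts):
    """Minimum squared Euclidean distance over all distinct pairs."""
    return min(
        sum((a - b) ** 2 for a, b in zip(p, q))
        for p, q in itertools.combinations(pts, 2)
    )

# Patterns AAAA and BBBB give 3**4 + 3**4 = 162 points.
full = list(itertools.product(A, repeat=4)) + list(itertools.product(B, repeat=4))

# Keep the 128 lowest-energy points (tie-breaking by list order is an assumption).
constellation = sorted(full, key=energy)[:128]

avg_energy = sum(energy(p) for p in constellation) / len(constellation)
d2 = msed(constellation)

# Uncoded 4-PAM on four lines with unit level spacing has MSED = 1.
pam4_energy = 4 * sum(x * x for x in (-1.5, -0.5, 0.5, 1.5)) / 4

# Scaling the coded constellation down to MSED = 1 divides its energy by d2.
gain_db = 10 * math.log10(pam4_energy / (avg_energy / d2))

print(len(full), len(constellation), d2, round(gain_db, 2))
```

Under these assumptions the estimate lands close to the roughly 3 dB figure quoted above.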

Using an uncoded 3-level PAM on four lines results in a  $3^4$ -point constellation. Therefore, another option is to transmit six bits over four lines using 3-PAM in each line. However, the gain of this method over the ordinary 4-PAM is roughly 2 dB and (bits/symbol)<sub>n</sub> = 0.75, which is not as good as the corresponding values in 4LINE-PAM6.

This section has introduced several signaling schemes that can be used in chip-to-chip communication. Table III summarizes the performance of some of the schemes that have been studied. The second column in this table shows the performance improvement of each method over the 4-PAM scheme and the third column shows the normalized data rate for each method. To obtain the performance improvement of each scheme, we make the MSED of each scheme equal to the MSED of 4-PAM, thereby obtaining approximately the same BER, and calculate the extra power that the traditional 4-PAM scheme needs.

Among the schemes in Table III, the 4LINE-PAM6 method is the best signaling scheme for high-speed interchip applications since it is a low-complexity method that has the second largest (bits/symbol)<sub>n</sub>. A low-complexity method for an analog implementation of 4LINE-PAM6 is proposed in Section V-B.

To verify the results of Table III for 4LINE-PAM6 scheme and compare the performance of this method with the performance of Coded-Modulation-PAM5 and regular 4-PAM, a model in Simulink is developed. This model uses the AWGN

TABLE III Performance Comparison for Different Schemes

| scheme     | gain over 4-PAM       | $(bits/symbol)_n$ | complexity |
|------------|-----------------------|-------------------|------------|
| PAM3       | 2 dB                  | 0.75              | low        |
| 4LINE-PAM6 | 3.02 dB               | 0.875             | low        |
| 4LINE-PAM5 | 3.1 dB                | 0.75              | low        |
| 5LINE-PAM5 | 2.5 dB                | 0.8               | moderate   |
| 6LINE-PAM5 | 3.17 dB               | 0.833             | moderate   |
| 7LINE-PAM6 | $\simeq 3 \text{ dB}$ | 1                 | high       |



Fig. 6. Simulation result for 4-PAM, 5-PAM, 4LINE-PAM6, and Coded-Modulation-PAM5 schemes.

model for the channel. Fig. 6 shows the simulation results, the symbol error rate (SER) versus the SNR, for several signaling schemes. As shown in this figure, the performance of the 4LINE-PAM6 method is roughly 2.7 dB better than that of the 4-PAM scheme at an SER of about $10^{-3}$. However, the expected gain for this method is 3 dB. The reason for this small difference (0.3 dB) is that each point in the 4LINE-PAM6 constellation has more nearest neighbors (points at the minimum distance from it) than a point in the 4-PAM constellation. Fortunately, at high SNRs, where the SER-versus-SNR curve has a larger slope, this difference would be even smaller and almost the full 3-dB gain over 4-PAM can be achieved.

It should be mentioned that some interconnect applications are peak-power limited; hence, the comparison between the proposed method and 4-PAM should be performed with the same peak power for both methods. Fig. 7 shows the simulation results in this case. The horizontal axis shows the noise attenuation in dB and the vertical axis shows the SER. The expected gain in this case is about 1.6 dB less than the expected gain in the previous case. This 1.6 dB comes from the fact that if the peak signal power in both 4-PAM and 4LINE-PAM6 schemes is the same, the average power of a 4LINE-PAM6 scheme is 1.6 dB less than that of a conventional 4-PAM scheme.

As shown in Fig. 7, simulation results show that the gain in this case is roughly 1.4 dB as we expected. A similar approach to the one in Section II-A is used to take crosstalk into account. Fig. 8 shows a performance improvement of about 2.5 dB and



Fig. 7. Simulation result for 4-PAM, 4LINE-PAM6 in peak-power-limited case.

4 dB for g = 0.05 and g = 0.07, respectively. The above 4 dB gain translates to roughly 5.6 dB power saving. The rest of this paper specifies coding gain in terms of power saving at a given BER; to obtain the gain for peak-power-limited applications, 1.6 dB should be deducted from the quoted coding gain.

The reported gain for the 4LINE-PAM6 scheme is the difference between the required SNR of the 4LINE-PAM6 scheme and that of the regular 4-PAM scheme at a certain BER. However, since the 4LINE-PAM6 scheme transmits seven bits over four lines and the regular 4-PAM scheme transmits eight bits over four lines, we should deduct 0.57 dB from the reported gain for the 4LINE-PAM6 scheme throughout this paper. It should also be noted that, to avoid excessively long simulations, the simulations are not extended to high SNRs.
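One plausible reading of the 0.57 dB correction is a per-bit normalization: the ratio of bits carried by the two schemes over the same four lines, expressed in decibels. The sketch below computes that ratio; treating the correction as $10\log_{10}(8/7)$ is an assumption made here, and the function name is hypothetical.

```python
import math

def rate_penalty_db(bits_reference, bits_coded):
    """Power-equivalent penalty for carrying fewer bits on the same lines."""
    return 10 * math.log10(bits_reference / bits_coded)

# 4-PAM carries 8 bits over four lines; 4LINE-PAM6 carries 7 bits.
penalty_db = rate_penalty_db(8, 7)
print(f"{penalty_db:.2f} dB")
```

The result is about 0.58 dB, in line with the 0.57 dB deduction quoted in the text.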

# III. REALISTIC CHANNEL MODEL FOR CHIP-TO-CHIP APPLICATIONS

Fig. 9 shows the general block diagram of a chip-to-chip communication system. The transmitter is modeled by a voltage source, an output impedance  $(Z_s)$ , and a package. Similarly the receiver is modeled by a receiver package and an impedance  $(Z_L)$ , which is the input impedance of the receiver. Different kinds of transmission lines, such as microstrip and stripline, can be used for PCB traces between two chips. In general, PCB traces can be modeled as transmission lines.

For perfect termination,  $Z_L$  and  $Z_s$  should be equal to the characteristic impedance of the transmission line [13]. In practice, it is very difficult to have perfect termination and, therefore, there would be some residual reflections. Attenuation of the transmission line is also another important parameter in this system, which causes inter-symbol interference (ISI). Throughout this paper, ISI refers to dispersion-induced ISI only to differentiate it from the intersymbol-interference due to the reflections.

So far, a channel model that takes into account the effect of crosstalk and/or AWGN has been used in the simulations. This would be a good model for a channel with perfect termination



Fig. 8. Performance of the 4LINE-PAM6 and 4-PAM methods in the presence of crosstalk in peak-power-limited case: (a) g = 0.05 and (b) g = 0.07.



Fig. 9. A general block diagram for a chip-to-chip communication system.

and no ISI. Nevertheless, the main sources of noise for this application in practical systems are usually ISI and residual reflections due to the imperfect termination. Consequently, a more realistic model for the channel should take into account the effect of ISI and reflection.

A 2-D field solver (W-element in HSPICE) is used to obtain the *RLCG* parameters for two typical transmission lines for interchip applications: microstrip and stripline with $Z_0 = 100\ \Omega$ differential ($Z_0 = 50\ \Omega$ single-ended). These RLCG parameters



Fig. 10. Eye diagram for a 10-Gb/s link at the receiver (channel: package, 300-mm microstrip, package) when $Z_s = 80\ \Omega$ and $Z_l = 120\ \Omega$.

are frequency dependent and, therefore, the ABCD representation, which can be obtained from [13], also takes into account the skin effect and the dielectric loss [14].

It is straightforward to obtain the two-port ABCD representation of package models and source and load impedances. Multiplying those two-port representations results in the two-port representation of the entire channel, and thereby the transfer function  $V_L/V_S$  in Fig. 9.
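The cascading step can be sketched with plain 2x2 matrix products. The code below multiplies the ABCD matrices of a series source impedance, a uniform transmission line, and a shunt load, and reads off $V_L/V_S$ from the overall A term. The line matrix uses the standard cosh/sinh form for a uniform line; package models are omitted, and the example values (a matched, lossless line with an arbitrary electrical length `theta`) are hypothetical rather than taken from the paper.

```python
import cmath

def matmul(m1, m2):
    """Product of two 2x2 ABCD matrices given as ((A, B), (C, D))."""
    (a1, b1), (c1, d1) = m1
    (a2, b2), (c2, d2) = m2
    return ((a1 * a2 + b1 * c2, a1 * b2 + b1 * d2),
            (c1 * a2 + d1 * c2, c1 * b2 + d1 * d2))

def series_impedance(z):
    return ((1, z), (0, 1))

def shunt_impedance(z):
    return ((1, 0), (1 / z, 1))

def transmission_line(gamma_l, z0):
    """Uniform line with propagation term gamma*length and impedance z0."""
    ch, sh = cmath.cosh(gamma_l), cmath.sinh(gamma_l)
    return ((ch, z0 * sh), (sh / z0, ch))

def transfer_function(z_s, z_l, stages):
    """V_L / V_S for a source impedance, cascaded stages, and a load."""
    total = series_impedance(z_s)
    for stage in stages:
        total = matmul(total, stage)
    total = matmul(total, shunt_impedance(z_l))
    # With the load folded in as a shunt element the output port is open,
    # so V_S = A * V_L and the transfer function is 1 / A.
    return 1 / total[0][0]

# Hypothetical matched, lossless example: gamma * length = j * theta.
theta = 0.7
h = transfer_function(100, 100, [transmission_line(1j * theta, 100)])
print(abs(h))
```

In a frequency sweep, `gamma_l` and `z0` would be recomputed at each frequency from the RLCG parameters, and package models would be added as further entries in `stages`.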

# IV. SIMULATION RESULTS

This section shows the simulation results for 3LINE-PAM2 and 4LINE-PAM6 schemes in typical chip-to-chip communication channels. The channel model in these simulations consists of two parts. The first part uses the magnitude and the phase of the channel transfer function obtained by the method presented in Section III to find the impulse response of the channel. This impulse response is used to find the output of the channel. The second part adds jitter to the clock and white Gaussian noise to the output of the first section. Therefore, this model is a general model that takes into account the effect of jitter, ISI, reflection and additive white Gaussian noise.

# A. Simulation Results for the 3LINE-PAM2 Scheme

A typical channel model for this application can be obtained with the proposed method in Section III. Fig. 10 shows the eye diagram of a 10-Gb/s link at the receiver for a channel comprised of a transmitter package, 0.3-m microstrip, and a receiver package for a binary signaling scheme. Here, the source and load impedances are selected to be 80  $\Omega$  and 120  $\Omega$ , respectively. Fig. 11 shows the magnitude of the transfer function for this channel. In addition, the channel model is modified to take into account the effect of clock jitter. Fig. 12 shows the performance of 3LINE-PAM2 for this model at 10 Gb/s. As shown in Fig. 12(a), the proposed method provides roughly 5 dB performance improvement at BER =  $10^{-3}$ . Significant performance improvement of about 8 dB is achieved by 3LINE-PAM2 when a 20 ps p-p jitter is added to the channel model [see Fig. 12(b)].

Simulation results in this section show that the 3LINE-PAM2 scheme is significantly less sensitive to jitter, ISI, and residual reflection than the ordinary 2-PAM signaling scheme. As shown in Fig. 10, the eye height of the received signal is roughly 0.3 V, which translates to a noise margin of only 0.15 V, for a 2 V



Fig. 11. Channel frequency response for the case of microstrip and  $Z_L = 120 \ \Omega, Z_S = 80 \ \Omega, d = 0.3$ m.

signal swing at the transmitter. Consequently, in an advanced technology in which the signal swing cannot be more than 1 V, the use of a coding scheme to achieve the required performance becomes more appealing. More specifically, the proposed method might even eliminate the need for an equalizer in a high-speed interchip application. Nevertheless, an equalizer can be used along with a coding scheme to further improve the performance of the system. Indeed, using both an equalizer and coding is common in most communication applications.

#### B. Simulation Results for the 4LINE-PAM6 Scheme

Two typical channel models for high-speed interchip communication applications are used in this section to determine the performance of 4LINE-PAM6 scheme in the presence of jitter, ISI, and residual reflections. One set of simulations for each channel has been performed in MATLAB to compare the performance of this method with the ordinary 4-PAM scheme. The corresponding simulation results for each channel are presented in this section. These results show that the 4LINE-PAM6 scheme is significantly less sensitive to jitter, ISI, and residual reflections.

1) Case I:  $(Z_l = 110 \Omega, Z_s = 90 \Omega, Z_0 = 100 \Omega$  for a 0.20-m Stripline): The first channel is composed of a 0.2-m stripline, transmitter package, receiver package, and source and load termination resistors. A 10% termination mismatch is considered for the source and load resistors ( $Z_l = 110 \Omega, Z_s = 90 \Omega$ ) to create some residual reflections. Fig. 13 shows the magnitude of the channel transfer function in this case. Since the data rate used for this simulation is 10 Gb/s (5 GS/s), the frequency range of interest is 0–2.5 GHz. The attenuation of the channel in this case is roughly 4.5 dB at 2.5 GHz and, therefore, this channel introduces moderate ISI.

Fig. 12. 3LINE-PAM2 simulation results: (a) without jitter and (b) with 20 ps p-p jitter.

Fig. 14(a) illustrates the SER versus SNR for the 4LINE-PAM6 and 4-PAM schemes, which shows roughly a 6 dB gain for the 4LINE-PAM6 scheme over 4-PAM at an SER of around $10^{-2}$. Since a reasonable SER for chip-to-chip communication is on the order of $10^{-15}$ and the two curves in Fig. 14(a) are slowly diverging, the gain at high SNRs would be even larger. Fig. 14(b) shows the performance of 4LINE-PAM6 in the presence of 20-ps p-p jitter. As shown in this figure, 4LINE-PAM6 shows roughly a 7 dB gain over 4-PAM at an SER of $10^{-2}$, which is better than the simulation result without jitter. Therefore, 4LINE-PAM6 is less sensitive to jitter than 4-PAM.

2) Case II ($Z_l = 105\ \Omega$, $Z_s = 95\ \Omega$, $Z_0 = 100\ \Omega$ for a 50-mm Microstrip): To assess the performance of 4LINE-PAM6 at 20 Gb/s, another model for the channel is used, which models a high-speed transmitter package, a high-speed receiver package, and a 50-mm microstrip. A 5% termination mismatch is assumed for the source and load resistors ($Z_l = 105\ \Omega$, $Z_s = 95\ \Omega$). Fig. 15 shows the magnitude of the channel transfer function in this case. The attenuation of the channel is roughly 6 dB at 5 GHz.

Fig. 13. Channel frequency response for the case of stripline with $Z_L = 110\ \Omega$, $Z_S = 90\ \Omega$, $d = 0.2$ m.

Fig. 16(a), which illustrates the SER versus SNR for the 4LINE-PAM6 and 4-PAM schemes, shows roughly a 4 dB gain for the 4LINE-PAM6 scheme over 4-PAM at an SER of $10^{-3}$. Fig. 16(b) shows the performance of 4LINE-PAM6 in the presence of 10-ps p-p jitter. As illustrated in this figure, the 4LINE-PAM6 scheme shows roughly a 4.7 dB gain over 4-PAM at an SER of $10^{-3}$, which is again better than the simulation result without jitter.

### V. ANALOG IMPLEMENTATION

As mentioned earlier, the main challenge in high-speed interconnect applications is to devise a low-complexity signaling scheme that not only provides some coding gain but also can be implemented at high speed. This section presents low-complexity architectures for the 3LINE-PAM2 and 4LINE-PAM6 schemes.

#### A. Analog Implementation of the 3LINE-PAM2 Scheme

In an optimal decoder for the 3LINE-PAM2 scheme, assuming all constellation points are equally likely, the distance between the received signal and each of the four points in the constellation is calculated, and the point at the minimum distance is taken as the decoded output. Assume the received signal is $(x, y, z)$. The squared Euclidean distances from this signal to the transmitted codewords (-1, -1, -1), (-1, 1, 1), (1, -1, 1), and (1, 1, -1) are $(x + 1)^2 + (y + 1)^2 + (z + 1)^2$, $(x + 1)^2 + (y - 1)^2 + (z - 1)^2$, $(x - 1)^2 + (y + 1)^2 + (z - 1)^2$, and $(x - 1)^2 + (y - 1)^2 + (z + 1)^2$, respectively. Expanding the squares, cancelling the common term $x^2 + y^2 + z^2 + 3$, and dividing by 2 leads to the following equivalent terms for the distances: $x + y + z$, $x - y - z$, $-x + y - z$, and $-x - y + z$.

Fig. 14. Case I: $Z_L = 110\ \Omega$, $Z_S = 90\ \Omega$, 0.2-m stripline: (a) without jitter and (b) with 20 ps peak-to-peak jitter.

Hence, the decoding algorithm is composed of two steps: the first step is to calculate these terms and the second step is to find the smallest one by means of six comparators. The first step is unnecessary, since the comparisons can be simplified further. For example, $x + y + z > x - y - z$ is equivalent to $y > -z$. Therefore, the transmitted information can be decoded by using only six comparators and several logic gates, as shown in more detail in Fig. 17. The receiver architecture comprises six comparators, three AND gates, and two OR gates. Moreover, as shown in Fig. 17, the encoder in the transmitter is simply an XOR gate. Hence, the low complexity of the transceiver architecture makes its high-speed implementation feasible.
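The comparator-only decision rule can be checked numerically. The following Python sketch is an illustration under stated assumptions, not the gate-level circuit of Fig. 17: the function names are hypothetical, the bit-to-level mapping (0 to -1, 1 to +1) is assumed, and the Boolean logic is one possible arrangement of the six comparator outputs.

```python
# The four 3LINE-PAM2 codewords listed in the text.
CODEWORDS = [(-1, -1, -1), (-1, 1, 1), (1, -1, 1), (1, 1, -1)]


def encode(a, b):
    """Map two data bits to three line levels; the third line carries a XOR b.

    The mapping 0 -> -1, 1 -> +1 is an assumption made for this sketch.
    """
    return tuple(2 * bit - 1 for bit in (a, b, a ^ b))


def decode_min_distance(x, y, z):
    """Reference decoder: codeword with the smallest squared Euclidean distance."""
    return min(
        CODEWORDS,
        key=lambda c: (x - c[0]) ** 2 + (y - c[1]) ** 2 + (z - c[2]) ** 2,
    )


def decode_comparators(x, y, z):
    """Decode using six comparisons on the line signals and Boolean logic only."""
    # Each flag is one pairwise comparison of the reduced terms
    # x+y+z, x-y-z, -x+y-z, -x-y+z after cancelling common parts.
    yz_neg = y + z < 0  # (-1,-1,-1) beats (-1, 1, 1)
    xz_neg = x + z < 0  # (-1,-1,-1) beats ( 1,-1, 1)
    xy_neg = x + y < 0  # (-1,-1,-1) beats ( 1, 1,-1)
    x_lt_y = x < y      # (-1, 1, 1) beats ( 1,-1, 1)
    x_lt_z = x < z      # (-1, 1, 1) beats ( 1, 1,-1)
    y_lt_z = y < z      # ( 1,-1, 1) beats ( 1, 1,-1)

    if yz_neg and xz_neg and xy_neg:
        return CODEWORDS[0]
    if (not yz_neg) and x_lt_y and x_lt_z:
        return CODEWORDS[1]
    if (not xz_neg) and (not x_lt_y) and y_lt_z:
        return CODEWORDS[2]
    return CODEWORDS[3]
```

For any received triple that is not exactly equidistant from two codewords, `decode_comparators` and `decode_min_distance` should return the same codeword, which is the sense in which calculating the four terms explicitly is unnecessary.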

#### B. Analog Implementation of the 4LINE-PAM6 Scheme

In an optimal decoder for the 4LINE-PAM6 scheme, assuming all points are equally likely, the distance between the received signal and each of the 128 points in the constellation is calculated, and the point at the minimum distance is taken as the decoded output. Obviously, this method is prohibitively complex, and a suboptimal method with lower complexity is more desirable. A low-complexity method for analog implementation of 4LINE-PAM6 is proposed here. This method shows negligible performance degradation (less than 0.02 dB) compared with the optimal scheme. Only the AAAA and BBBB patterns are used in the 4LINE-PAM6 method. Since each pattern has $3^4 = 81$ points, the total number of points is 162. The proposed decoder is actually an optimal decoder for this 162-point constellation. Therefore, its output is always a point in this constellation.



Fig. 15. Channel frequency response for the case of microstrip with high-speed package and $Z_L = 105\ \Omega$, $Z_S = 95\ \Omega$, $d = 50$ mm.

This method uses a simple architecture to determine the transmitted pattern (AAAA or BBBB) for the received signal. Once this is known, the decoding reduces to that of an ordinary 3-level PAM, which needs only two comparators. The first step, as shown in Fig. 18, is to find the distance from the received signal on each line to the closest point in subset $A = \{-2.5, -0.5, 1.5\}$ and in subset $B = \{-1.5, 0.5, 2.5\}$, denoted $d_A$ and $d_B$, respectively. The transmitted pattern can then be found from the inequality

$$d_{A1}^2 + d_{A2}^2 + d_{A3}^2 + d_{A4}^2 < d_{B1}^2 + d_{B2}^2 + d_{B3}^2 + d_{B4}^2.$$
(1)

If the output of the comparison in (1) is true, the transmitted pattern would be AAAA. The implementation of the decision in (1) needs circuitry that provides signals proportional to the square value of the  $d_{Ai}$  or  $d_{Bi}$ , which is not straightforward. Interestingly,  $d_{Bi}$  can be expressed in terms of  $d_{Ai}$  as follows:

$$d_{Bi} = \begin{cases} d_{Ai} - 1, & \text{if } y_i > 2.5; \\ 1 - d_{Ai}, & \text{if } -2.5 < y_i < 2.5; \\ 1 + d_{Ai}, & \text{if } y_i < -2.5 \end{cases}$$
(2)

where  $y_i$  is the received signal in the *i*th line. This leads to the following expression for the square value of  $d_{Bi}$ :

$$d_{Bi}^2 = (1 - m_i \times d_{Ai})^2$$
(3)

where

$$m_i = \begin{cases} -1, & \text{if } y_i < -2.5; \\ 1, & \text{if } y_i > -2.5. \end{cases}$$

Substituting (3) in (1) results in

$$\sum_{i=1}^{4} (m_i d_{Ai} - 0.5) < 0.$$
(4)



Fig. 16. Case II: $Z_S = 95\ \Omega$, $Z_L = 105\ \Omega$, 50-mm microstrip at 20 Gb/s: (a) without jitter and (b) with 10 ps peak-to-peak jitter.



Fig. 17. Transceiver architecture for the 3LINE-PAM2 method.

Fig. 18. Main idea of an analog implementation of the 4LINE-PAM6 scheme.

Therefore, it is sufficient to find $m_i d_{Ai} - 0.5$ for each line and then add the four values. This addition is straightforward in current-mode circuitry. It can be shown that $m_i d_{Ai} - 0.5$ is obtained by

$$(m_i d_{Ai} - 0.5) = \begin{cases} y_i + 2, & \text{if } y_i < -1.5\\ -y_i - 1, & \text{if } -1.5 < y_i < -0.5\\ y_i, & \text{if } -0.5 < y_i < 0.5\\ -y_i + 1, & \text{if } 0.5 < y_i < 1.5\\ y_i - 2, & \text{if } y_i > 1.5. \end{cases}$$
(5)
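The pattern test in (4) and the piecewise term in (5) can be exercised in software. The Python sketch below is a behavioral model under stated assumptions: the function names are hypothetical, signal levels are in the normalized units of subsets $A$ and $B$, and the exhaustive search over the 162-point constellation serves only as a reference for comparison.

```python
from itertools import product

A_LEVELS = (-2.5, -0.5, 1.5)
B_LEVELS = (-1.5, 0.5, 2.5)


def front_end(y):
    """Piecewise-linear term m*d_A - 0.5 of (5) for one line."""
    if y < -1.5:
        return y + 2.0
    if y < -0.5:
        return -y - 1.0
    if y < 0.5:
        return y
    if y < 1.5:
        return -y + 1.0
    return y - 2.0


def slice_to(levels, y):
    """Three-level PAM decision; the two thresholds are the level midpoints."""
    low, mid, high = levels
    if y < (low + mid) / 2:
        return low
    if y < (mid + high) / 2:
        return mid
    return high


def decode(received):
    """Select AAAA or BBBB with (4), then slice each line within that subset."""
    select_a = sum(front_end(y) for y in received) < 0
    levels = A_LEVELS if select_a else B_LEVELS
    return tuple(slice_to(levels, y) for y in received)


def decode_exhaustive(received):
    """Reference decoder: nearest of the 162 points in AAAA and BBBB."""
    points = list(product(A_LEVELS, repeat=4)) + list(product(B_LEVELS, repeat=4))
    return min(
        points,
        key=lambda p: sum((y - c) ** 2 for y, c in zip(received, p)),
    )
```

The slicing thresholds that result from the midpoints (-1.5 and 0.5 for subset $A$, -0.5 and 1.5 for subset $B$) coincide with the breakpoints of (5), so the same four comparisons per line serve both the pattern selection and the final decision in this model.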

1) Receiver Architecture: Equations (4) and (5) can be used to detect the transmitted pattern of the received signal. Once the transmitted pattern is known, only two comparators per line are required to decode the received signal. Fig. 19 shows the general block diagram of the receiver that implements this algorithm. The detail of the front-end block for each line is shown in Fig. 20. As shown in this figure, two comparators decode the signal for the AAAA pattern and the other two comparators decode the signal for the BBBB pattern. The $I_{out}$ output of this block is proportional to $m_i d_{Ai} - 0.5$ for each line. Fortunately, the thresholds required to obtain $I_{out}$ are identical to those used in these comparators [see Fig. 20 and (5)]. Therefore, no additional comparators are needed to obtain $I_{out}$ for each line. As shown in Fig. 19, a comparator is used to perform the comparison in (4). The output of this comparator, the "Select" signal, specifies the transmitted pattern for the received signal.

There are 64 points in each subconstellation, AAAA or BBBB. This means that decoding a point in each subconstellation specifies six bits, and the seventh bit is the "Select" signal. As shown in Fig. 19, the output of the first stage of the receiver has eight bits corresponding to each subconstellation. These eight bits are mapped to six bits using an $8 \times 6$ digital decoder. In other words, two sets of six bits corresponding to the two subconstellations are decoded individually. Six $2 \times 1$ multiplexers are used to select one of these sets based on the "Select" signal.

Fig. 19. Block diagram of an analog implementation of 4LINE-PAM6 scheme.

Fig. 20. Detail of the receiver block (L1-L4) for each line.

2) Structure of $8 \times 6$ Decoder: As mentioned earlier, two $8 \times 6$ decoders map the output of the first stage of the receiver to 12 bits (two sets of six bits). To further simplify the structure, as shown in Fig. 19, each of the two $8 \times 6$ digital decoders is decomposed into two $4 \times 3$ digital decoders. Since the design methodology is similar for all of these $4 \times 3$ digital decoders, only the design of the top decoder in Fig. 19 will be explained.

The four inputs of this decoder $(Bit1_A, Bit2_A, Bit3_A, Bit4_A)$ are the outputs of the "L1 Receiver" and "L2 Receiver" blocks in Fig. 19. The first six columns of Table IV show the relation between the signals on line1 and line2 and these four bits. Since the signal on line1 cannot be simultaneously larger than 0.5 and smaller than -1.5, Bit1<sub>A</sub> and Bit2<sub>A</sub> can take only three combinations (00, 10, 11). The same argument holds for Bit3<sub>A</sub> and Bit4<sub>A</sub>. Therefore, the four input bits of this smaller decoder have nine possible combinations. The all-zero case is chosen to be invalid since it corresponds to the constellation point (-2.5, -2.5), which is far from the origin and, thus, needs more power.

TABLE IV Mapping Design for $4 \times 3$ Decoder

| Signal1 | Signal2 | B1 | B2 | B3 | B4 | out1 | out2 | out3 |
|---------|---------|----|----|----|----|------|------|------|
| -2.5    | -0.5    | 0  | 0  | 1  | 0  | 1    | 0    | 0    |
| -2.5    | 1.5     | 0  | 0  | 1  | 1  | 1    | 1    | 0    |
| -0.5    | -2.5    | 1  | 0  | 0  | 0  | 1    | 1    | 1    |
| -0.5    | -0.5    | 1  | 0  | 1  | 0  | 1    | 0    | 1    |
| -0.5    | 1.5     | 1  | 0  | 1  | 1  | 0    | 1    | 0    |
| 1.5     | -2.5    | 1  | 1  | 0  | 0  | 0    | 1    | 1    |
| 1.5     | -0.5    | 1  | 1  | 1  | 0  | 0    | 0    | 1    |
| 1.5     | 1.5     | 1  | 1  | 1  | 1  | 0    | 0    | 0    |

The remaining eight possibilities can be mapped to the three output bits of the decoder. This mapping needs to be carefully designed to reduce the BER. Mapping based on a Gray code is a well-known scheme for a constellation with uniformly distributed points. However, for systems with irregular constellations, such as the one handled by this decoder, this common method of mapping does not apply. For a constellation with M points, there are a total of M! different mappings, and the search for the optimal mapping is impractical for large M. A suboptimal algorithm for this purpose is proposed in [15].

Fortunately, in this case, M is equal to 8 and looking for an optimal mapping is possible. The goal is to assign small Hamming distances to small Euclidean distances. A good mapping that needs only low-complexity circuitry for its implementation is shown in Table IV. Fig. 21 shows the required combinational circuitry for this mapping. As shown in this figure, this circuit is simple and needs only five AND gates and two OR gates.
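The design goal stated above can be inspected directly on Table IV. The Python sketch below, offered as an illustration rather than as the authors' search procedure, copies the signal pairs and output bits from the table and reports the Hamming distance between the labels of every pair of points at the minimum Euclidean distance. The function names are hypothetical.

```python
from itertools import combinations

# (signal on line1, signal on line2) -> (out1, out2, out3), copied from Table IV.
TABLE_IV = {
    (-2.5, -0.5): (1, 0, 0),
    (-2.5, 1.5): (1, 1, 0),
    (-0.5, -2.5): (1, 1, 1),
    (-0.5, -0.5): (1, 0, 1),
    (-0.5, 1.5): (0, 1, 0),
    (1.5, -2.5): (0, 1, 1),
    (1.5, -0.5): (0, 0, 1),
    (1.5, 1.5): (0, 0, 0),
}


def hamming(u, v):
    """Number of bit positions in which two labels differ."""
    return sum(a != b for a, b in zip(u, v))


def sq_dist(p, q):
    """Squared Euclidean distance between two constellation points."""
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


def nearest_neighbor_report(table):
    """Return (point, point, Hamming distance) for all minimum-distance pairs."""
    pairs = list(combinations(table, 2))
    d_min = min(sq_dist(p, q) for p, q in pairs)
    return [
        (p, q, hamming(table[p], table[q]))
        for p, q in pairs
        if sq_dist(p, q) == d_min
    ]


if __name__ == "__main__":
    for p, q, h in nearest_neighbor_report(TABLE_IV):
        print(p, q, h)
```

Because symbol errors are most likely to land on a nearest neighbor, the Hamming distances in this report indicate how many bit errors a typical symbol error would cause under the tabulated mapping.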

3) Transmitter Architecture: It seems that the 4LINE-PAM6 scheme needs a 6-PAM transmitter. Nevertheless, having only AAAA and BBBB patterns for the transmitted signals also simplifies the transmitter structure. Since the subsets $A = \{-2.5, -0.5, 1.5\}$ and $B = \{-1.5, 0.5, 2.5\}$ are shifted versions of each other, the structure of the transmitter is basically a 3-PAM transmitter. As shown in Fig. 22(a), a current-mode three-level transmitter can be used in each line to generate signals for all different possibilities of the first six bits. The last input bit specifies the transmitted pattern. If this bit is "1," a fixed current is added to all lines.

Similar to the decoder in the receiver, an encoder is needed to map the six input bits to eight bits, two bits for each line, to make the data ready for transmission. This can be done with two $3 \times 4$ digital encoders. The required circuitry for these digital encoders is shown in Fig. 22(b). This simple circuitry needs only two AND, two NAND, and two OR gates.
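The shifted-subset idea behind the transmitter can be sketched in a few lines of Python. This is a behavioral illustration only: the level indices 0 to 2 are a hypothetical stand-in for the two bits per line produced by the $3 \times 4$ encoders of Fig. 22(b), whose gate-level mapping is not reproduced here, and levels are in the normalized units of subsets $A$ and $B$.

```python
A_LEVELS = (-2.5, -0.5, 1.5)
PATTERN_OFFSET = 1.0  # fixed shift that moves subset A onto subset B


def line_levels(indices, pattern_bit):
    """Return the four line levels for per-line level indices and the pattern bit.

    indices: four integers in 0..2 selecting a level of the three-level driver
             (hypothetical interface for this sketch).
    pattern_bit: 0 selects the AAAA pattern, 1 adds the fixed offset (BBBB).
    """
    shift = PATTERN_OFFSET if pattern_bit else 0.0
    return tuple(A_LEVELS[i] + shift for i in indices)


# Example: the same three-level selection sent in either pattern.
aaaa = line_levels((0, 1, 2, 0), 0)
bbbb = line_levels((0, 1, 2, 0), 1)
```

In this model, the only difference between the two patterns is the common offset applied to all four lines, which mirrors the fixed current added in the current-mode driver.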

Fig. 21. Detail of the $4 \times 3$ decoder.

4) Simulation Results: Fig. 23 shows the simulation results for this analog implementation method and for an optimal decoder, both obtained with MATLAB [11] for an AWGN channel. The performance of this method is almost identical to the performance of the optimal method. In particular, the performance of this method is roughly 0.02 dB worse than the performance of the optimal method at an SER of around $10^{-3}$. Simulation results for a digital implementation of 4LINE-PAM6 with 4-bit quantization are also shown in Fig. 23 to illustrate the advantage of this analog implementation over a digital implementation with 4-bit quantization.

As shown in Fig. 23, the performance of a digital implementation with 4-bit quantization is around 1 dB worse than the performance of the optimal implementation of 4LINE-PAM6. At the same time, this digital implementation needs much more circuitry than the analog one. This analog implementation needs only 17 comparators, whereas the digital implementation needs four 4-bit analog-to-digital converters (60 comparators).

It is also useful to compare the complexity of this method with the complexity of the ordinary 4-PAM schemes. The ordinary 4-PAM scheme needs three comparators for each line, whereas the analog implementation of 4LINE-PAM6 requires four comparators and one transconductance amplifier for each line. Thus, a small increase in the complexity results in a large performance improvement.

#### C. Circuit-Level Simulations

The receiver architecture for the 4LINE-PAM6 method was designed and simulated in a 0.18-$\mu$m CMOS technology with Spectre [16]. As shown in Fig. 20, the receiver architecture needs operational transconductance amplifiers (OTAs) to convert the received signal from voltage to current. Simulation results show that 5–6 bit linearity is sufficient for this block. Since high-speed implementation is the primary concern here, a simple architecture that can work at high speed and satisfies the required linearity condition is chosen.

Fig. 22. Transmitter structure. (a) Transmitter architecture. (b) Encoder detail.

Fig. 23. Simulation results for different implementations of the 4LINE-PAM6 scheme.

Fig. 24. Schematic of the transconductance amplifier.

Fig. 24 shows the schematic of such an amplifier. As shown in this figure, the conventional architecture for a high-speed OTA is modified to have an extra feature. This new feature enables the OTA to be turned off by pulling up the "Enable" signal, which, in turn, steers the tail currents into two dummy branches and makes the output current equal to zero. Compared with the conventional method of turning off the OTA, which uses a switch in series with the current mirrors, this switching architecture not only increases the speed but also reduces the required voltage headroom. Simulation results for this block show that its linearity is roughly six bits.

This modified OTA is especially useful for high-speed implementation of the 4LINE-PAM6 receiver. As illustrated in (5), the output of the front-end blocks in the receiver has two terms. The first term is proportional to the received signal (" $y_i$ " or " $-y_i$ ") and the other term is a constant whose value depends on the range in which the received signal falls. Fig. 25 shows the circuit realization of the front-end block of the 4LINE-PAM6 receiver. As shown in this figure, two OTAs are used to generate currents proportional to " $y_i$ " and " $-y_i$ ." Based on the input signal, one of them is turned on and the other one is turned off. The "Constant Current Generator" block generates the constant term in (5). Instead of turning off unnecessary current sources, they are steered into dummy branches to increase the speed of the circuit. Simulation results show that the proposed circuit is functional up to 2 GS/s. Using an interleaving technique, it should be possible to speed up the circuit to 4 GS/s by means of two parallel circuits at 2 GS/s.

As stated before, the 4LINE-PAM6 method can reduce the transmitted power by roughly 6 dB. Since a current of 20 mA is required for a differential signal swing of 1 V in a typical 4-PAM driver with 50-$\Omega$ source and load termination, the 4LINE-PAM6 method can save roughly 10 mA in the transmitter. Simulation results show that, for the case of 2 GS/s, the required supply current for the overhead circuitry in the 4LINE-PAM6 receiver is roughly 1 mA. Therefore, not only the transmitted power, but also the total power of the transceiver can be lowered by using the 4LINE-PAM6 scheme.
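The current budget in this paragraph can be reproduced with a short calculation. The Python sketch below is a back-of-the-envelope check under stated assumptions: the signal swing is taken to be proportional to the drive current into a fixed termination (so a power ratio in decibels maps to current through a factor of 20), the driver and the receiver overhead are assumed to share one supply voltage, and all names are hypothetical.

```python
def scaled_current_ma(reference_ma, power_reduction_db):
    """Drive current after lowering the transmitted power by power_reduction_db.

    Assumes swing is proportional to drive current into a fixed termination,
    so transmitted power scales with the square of the current.
    """
    return reference_ma * 10 ** (-power_reduction_db / 20.0)


REFERENCE_4PAM_MA = 20.0  # from the text: 1-V differential swing, 50-ohm terminations
POWER_REDUCTION_DB = 6.0  # from the text
RX_OVERHEAD_MA = 1.0      # from the text: receiver overhead at 2 GS/s

coded_ma = scaled_current_ma(REFERENCE_4PAM_MA, POWER_REDUCTION_DB)
saved_ma = REFERENCE_4PAM_MA - coded_ma
net_saved_ma = saved_ma - RX_OVERHEAD_MA  # assumes a common supply voltage

print(f"coded driver: {coded_ma:.2f} mA, saved: {saved_ma:.2f} mA, "
      f"net of receiver overhead: {net_saved_ma:.2f} mA")
```

Under these assumptions the saving in driver current remains well above the receiver overhead, which is consistent with the statement that the total transceiver power can be lowered.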

### VI. CONCLUSION

This paper proposed several coding schemes for chip-to-chip applications. These coding schemes can be used as an attempt to approach the theoretical Shannon limit. The main contribution here is to propose coding schemes with low-complexity decoders. These coding schemes achieve roughly 3 dB coding gain in the case of an AWGN model for the channel. Moreover, a realistic model for the channel is developed that takes into account the effect of crosstalk, jitter, reflection, ISI, and AWGN. The proposed signaling schemes are significantly less sensitive to those noise sources. In particular, two coding schemes (3LINE-PAM2 and 4LINE-PAM6) that show better performance were highlighted, and simulation results show that they provide a coding gain of 5–8 dB in the presence of jitter, ISI, and residual reflections. These methods are significantly less sensitive to crosstalk, which is the dominant noise in most microstrip interconnects. Finally, the presented low-complexity architectures for analog implementations of 3LINE-PAM2 and 4LINE-PAM6 make their high-speed implementations feasible. This was also confirmed by circuit-level simulation for the 4LINE-PAM6 receiver at 2 GS/s.

Fig. 25. Detail of circuit implementation for receiver blocks (L1-L4 in Fig. 19).

### ACKNOWLEDGMENT

The authors would like to thank Prof. F. Kschischang for his helpful comments, which led to the development of the proposed low-complexity decoding method in the receiver.

### REFERENCES

- [1] O. Kwon and R. Pease, "Closely packed microstrip lines as very high-speed chip-to-chip interconnects," *IEEE Trans. Compon., Hybrids, Manufact. Technol.*, vol. CHMT-10, no. 3, pp. 314–320, Sep. 1987.
- [2] J. L. Zerbe, P. S. Chau, C. W. Werner, W. F. Stonecypher, H. J. Liaw, G. J. Yeh, T. P. Thrush, S. C. Best, and K. S. Donnelly, "A 2 Gb/s/pin 4-PAM parallel bus interface with transmit crosstalk cancellation, equalization, and integrating receiver," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, 2001, pp. 66–67.
- [3] M. V. Ierssel, T. Esmailian, A. Sheikholeslami, and S. Pasupathy, "Signaling capacity of FR4 PCB traces for chip-to-chip communication," in *Proc. IEEE Int. Symp. Circuit Syst.*, 2003, pp. 85–88.
- [4] W. J. Dally and J. W. Poulton, *Digital Systems Engineering*. Cambridge, U. K.: Cambridge Univ. Press, 2000.
- [5] N. Tan and S. Eriksson, "Low-power chip-to-chip communication circuits," *Electron. Lett.*, vol. 30, no. 21, pp. 1732–1733, Oct. 1994.
- [6] E. A. Lee and D. G. Messerschmitt, *Digital Communication*. Norwell, MA: Kluwer, 1994.
- [7] K. Farzan and D. A. Johns, "Power-efficient chip-to-chip signaling schemes," in *Proc. IEEE Int. Symp. Circuit Syst.*, 2002, pp. 560–563.

- [8] S. Benedetto, G. Montorsi, and D. Divsalar, "Concatenated convolutional codes with interleavers," *IEEE Commun. Mag.*, vol. 41, no. 8, pp. 102–108, Aug. 2003.
- [9] M. Hatamian, "Design consideration for gigabit ethernet 1000Base-T twisted pair transceivers," in *Proc. IEEE Custom Integr. Circuits Conf.*, 1998, pp. 335–342.
- [10] G. Ungerboeck, "Channel coding with multilevel/phase signals," *IEEE Trans. Inf. Theory*, vol. IT-28, no. 1, pp. 55–67, Jan. 1982.
- [11] Matlab Website, [Online]. Available: http://www.mathworks.com
- [12] L. F. Wei, "Trellis coded modulation with multi-dimensional constellations," *IEEE Trans. Inf. Theory*, vol. 33, no. 4, pp. 483–501, Jul. 1987.
- [13] T. Starr, J. M. Cioffi, and P. J. Silverman, *Understanding Digital Subscriber Line Technology*. Englewood Cliffs, NJ: Prentice-Hall, 1999.
- [14] T. C. Carusone, *High-Speed Link Model*. [Online]. Available: http://www.eecg.toronto.edu/~tcc
- [15] N. Gup and L. B. Milstein, "Mapping design for general multidimensional communication systems," in *Proc. Military Commun. Conf.*, 1999, pp. 35–39.
- [16] K. S. Kundert, *The Designer's Guide to SPICE & SPECTRE*. Norwell, MA: Kluwer, 1995.



Kamran Farzan (S'01–M'05) was born in Isfahan, Iran, in 1971. He received the B.A.Sc. degree from Isfahan University of Technology (IUT), Isfahan, Iran, in 1994, the M.A.Sc. degree from the University of Tehran, Tehran, Iran, in 1997, and the Ph.D. degree in the area of high-speed chip-to-chip communication from the University of Toronto, Toronto, ON, Canada, in 2004.

From 1994 to 1995, he worked in the Electrical and Computer Engineering Research Center (ECERC), Isfahan University of Technology (IUT).

From 1997 to 1999, he worked as a Senior Researcher in the area of digital signal processing at ECERC. Since 2003, he has been with Snowbush Microelectronics, Toronto, ON, Canada, where he works in the area of analog design for high-speed transmitter/receiver and read channel analog front end.

Dr. Farzan is the recipient of the Inventors Recognition Awards of Semiconductor Research Corporation (SRC) and the 2003 Analog Devices Outstanding Student Designer Award.



David A. Johns (S'81–M'89–SM'94–F'01) received the B.A.Sc., M.A.Sc., and Ph.D. degrees from the University of Toronto, Toronto, ON, Canada, in 1980, 1983, and 1989, respectively.

In 1988, he was hired at the University of Toronto where he is currently a Full Professor. He has ongoing research programs in the general area of analog integrated circuits with particular emphasis on digital communications. His research work has resulted in more than 40 publications. He is the co-author of a textbook titled *Analog Integrated Circuit Design*  (Wiley, 1997), and has given numerous industrial short courses. Together with academic experience, he also has spent a number of years in the semiconductor industry and is a co-founder of Snowbush Microelectronics.

Dr. Johns is recipient of the 1999 IEEE Darlington Award. He served as a Guest Editor of the IEEE JOURNAL OF SOLID-STATE CIRCUITS and an Associate Editor for IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS PART II from 1993 to 1995, and for IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS PART I from 1995 to 1997. He was elected to Adcom for SSCS in 2002.