# A 5-Gbit/s CMOS Optical Receiver With Integrated Spatially Modulated Light Detector and Equalization

Tony Shuo-Chun Kao, Faisal A. Musa, and Anthony Chan Carusone, Senior Member, IEEE

Abstract—This paper presents an optical receiver with a monolithically integrated photodetector in 0.18-$\mu$m CMOS technology using a combination of spatially modulated light (SML) detection and an analog equalizer. A transimpedance amplifier employing negative Miller capacitance is introduced to increase its bandwidth without causing gain peaking. To provide sufficient reverse-bias voltage to the photodetector's p-n junction, the transimpedance amplifier is operated from a 3.3-V supply, while the remaining circuit blocks are powered from a 1.8-V supply. The on-chip SML detector achieves a net responsivity of 0.052 A/W. Occupying a core area of 0.72 mm<sup>2</sup>, the fully integrated optical receiver achieves 4.25 and 5 Gbits/s with power consumption values of 144 and 183 mW, respectively.

*Index Terms*—CMOS integrated circuits, equalizers, monolithically integrated photodiode, negative Miller capacitance, photodetector, transimpedance amplifier.

## I. INTRODUCTION

OPTICAL RECEIVERS with monolithically integrated photodetectors can be used in short-distance communication systems, in-car fiber-optic networks, and board-to-board links. Optical interfaces are also required in optical storage systems such as CD-ROM, DVD, and Blu-Ray Disc. In all these applications, a photodetector is necessary to convert light into an electrical signal for further processing.

Commercially, high-speed photodetectors have mainly been implemented in III-V technologies such as GaAs [1] and InP-InGaAs [2] for long-haul optical communication systems. Although the cost of the entire system is high due to the use of these expensive technologies, the cost per user remains low thanks to the large number of users per channel. However, for short-distance communication, a low-cost solution must be investigated as the communication channels are often shared by a much smaller number of users. This motivates the use of 850-nm low-cost vertical-cavity surface-emitting lasers (VCSELs) at the transmitter side as the source. To further reduce the cost, the entire optical receiver can be implemented in standard CMOS technology. Not only does this save the extra overhead and cost of assembling a multichip solution, but it also eliminates ground-bounce issues, ESD problems, and the parasitics associated with bond pads and bond wires.

T. S.-C. Kao was with The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada. He is now with Gennum Corporation, Burlington, ON L7L 5M4, Canada.

F. A. Musa and A. Chan Carusone are with The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON M5S 3G4, Canada (e-mail: faisal.a.musa@gmail.com).

Digital Object Identifier 10.1109/TCSI.2010.2050231

In general, light detection in CMOS technology is achieved by a reverse-biased p-n junction that creates a depletion region to collect the electron-hole pairs generated by the incident photons. However, the penetration depth of 850-nm light is much larger than the depth at which the depletion region occurs. Consequently, many carriers generated deep in the silicon substrate must diffuse all the way up to the depletion region. This slow diffusion mechanism limits the data rate to only a few hundred Mbit/s if no compensation techniques are employed [3], [4].

Several methods have been developed to compensate for the slow diffusive carriers and improve the speed of monolithically integrated photodetectors in CMOS. In many cases, this can be achieved by using a buried oxide layer [5] or a silicon-on-insulator process [6], but this requires modification to the standard CMOS process that leads to an increase in cost. Similarly, many photodetectors are built in a BiCMOS process due to its superior performance for optical receivers [7], but this approach disallows monolithic integration into a large-scale CMOS IC. In purely standard CMOS technology, speed can be improved by applying a high reverse-bias voltage to the photodetector's p-n junction, often higher than the power supply, to generate a very thick depletion region (e.g., 6 V in [8] and 13.9 V in [9]). This approach, however, seriously impacts the reliability of the photodetector.

A spatially modulated light (SML) detector topology [10]–[12], which consists of alternately covered and exposed diodes, can be used to eliminate the slow diffusive carriers at the expense of reduced responsivity because part of the photodetector is covered and hence reflects incident light. Equalization has also been applied to solve this problem. In [4], a high-pass analog equalizer is used to improve the low intrinsic bandwidth of the photodiode without SML detection. The schematic of the analog equalizer is shown in Fig. 1, and the transfer function of this equalizer [G(s)] is

$$G(s) \approx -\frac{R_D}{R_S} - \frac{sR_DC_2}{1 + sR_2C_2} - \frac{sR_DC_3}{1 + sR_3C_3} - \frac{sR_DC_4}{1 + sR_4C_4}.$$
 (1)

Since SML detection is not used in [4], the slow roll-off response of the photodetector begins at a very low frequency of around 1 MHz due to the slow diffusive carriers. Therefore, a high-order equalizer is needed to compensate for the photodetector's slow roll-off of approximately 4 dB/dec from 1 to at least 700 MHz. With SML detection (used in this paper), the slow roll-off of the photodetector starts at around 100 MHz. Thus, a single parallel RC branch for degeneration is sufficient to compensate for the photodetector losses over the much narrower frequency range of 100 MHz-3 GHz. The reduced number of resistors and capacitors also helps lower the impact of component spread, thus obviating the need for adaptive equalization. The presented work combines the SML detection with equalization. The high reverse-bias voltage is avoided to ensure the reliability of the photodetector. Although this combination has been implemented previously, the highest data rate achieved

Manuscript received January 06, 2010; revised March 08, 2010; accepted April 02, 2010. Date of publication August 16, 2010; date of current version November 10, 2010. This work was supported by Broadcom Corporation. This paper was recommended by Associate Editor S. Pavan.



Fig. 1. Schematic of the third-order equalizer implemented in [4].

was only 3.125 Gbits/s at a sensitivity of -4.2 dBm with a bit error rate (BER) of  $10^{-12}$  [13]. This work achieves a maximum data rate of 5 Gbits/s at a sensitivity of -3 dBm with a BER of  $10^{-12}$ . To extend the state-of-the-art data rate to beyond 4 Gbits/s, a wide-bandwidth transimpedance amplifier (TIA) with a large transimpedance gain is implemented as first reported in [14]. The wide bandwidth is achieved by employing negative Miller capacitance [15], [16] in the core amplifier stage. Between the TIA and equalizer, an ac-coupled active feedback stage based on a Cherry–Hooper topology [17] improves the common-mode rejection ratio of the TIA and increases the gain. However, compared to the active feedback stage in [17], this paper does not employ inductive peaking or negative Miller capacitance in the active feedback stage.

This paper is organized as follows. Section II discusses the SML detector and equalization. A block diagram of the entire optical receiver is presented in Section III. Section IV describes the circuit implementation of the TIA, and other blocks are discussed in Section V. Section VI presents the measurement results. Concluding remarks are given in Section VII.

## II. SML DETECTOR AND EQUALIZATION

A simplified cross section of the SML detector consisting of an exposed and a covered photodiode with a light-blocking metal is shown in Fig. 2. When the light is incident on the surface of the detector, carriers generated in the depletion region are immediately collected by the exposed photodiode. Carriers generated in the deep substrate will diffuse toward the depletion region and have equal probability of reaching either the exposed or the covered photodiode. Hence, when the signal currents collected by these two photodiodes are subtracted, the slow diffusive carriers are cancelled, and a faster response is obtained. The cancellation, however, reduces the responsivity of the photodetector, and a low-noise TIA is necessary to amplify differential



Fig. 2. Simplified cross section of the SML detector.

currents in the range of a few microamperes from the SML detector without degrading the sensitivity.

Using p-diffusion, a guard ring is realized around the photodetector, as shown in Fig. 2. The guard ring isolates the currents generated by the photodetector from the rest of the circuitry. Metal 2 was used to block incident light. A higher metal layer would have resulted in a lower parasitic capacitance, but the incident light is blocked more effectively by using a metal close to the active circuitry.

With an area of 75  $\mu$ m × 75  $\mu$ m, the simulated photodiode capacitance is 0.5 pF. This is in close agreement with the simulated photodiode capacitance reported in [18] at the same technology node and at the same TIA supply voltage of 3.3 V.

The total current response of the SML detector $(J_{\text{total}})$ [see (4)] is the sum of the drift component $(J_{\text{drift}})$ in the depletion region and the diffusive components in both the p-substrate $(J_{\text{diff},p\text{-substrate}})$ and the n-well $(J_{\text{diff},n\text{-well}})$, which are defined in (2) and (3):

$$J_{\text{total}} = J_{\text{drift}} + J_{\text{diff,p-substrate}} + J_{\text{diff,n-well}}.$$
 (4)

Note that the diffusion current densities ( $J_{\rm diff,p-substrate}$  and  $J_{\rm diff,n-well}$ ) denote the current density difference between the exposed and covered photodiodes. As reported in [19] for a typical SML detector in 0.25- $\mu$ m CMOS, below 1 GHz,  $J_{\rm diff,p-substrate}$  is approximately 84% of the total current density,  $J_{\rm diff,n-well}$  is approximately 10% of the total current density, and  $J_{\rm drift}$  is approximately 6% of the total current density.

Due to the large electric field in the depletion region, the bandwidth of $J_{\text{drift}}$ will always be much larger than that of $J_{\text{diff,p-substrate}}$ and $J_{\text{diff,n-well}}$. Therefore, $J_{\text{drift}}$ is assumed to be frequency independent. The current responses of $J_{\text{diff,p-substrate}}$ and $J_{\text{diff,n-well}}$ are shown in (2) and (3), respectively [19]. The procedure to derive these expressions is briefly discussed in the Appendix. In these equations, $L_p$ and $L_n$ denote the diffusion lengths of holes and electrons, respectively; $\tau_p$ and $\tau_n$ are the lifetimes of holes and electrons, respectively; $l_x$ is the distance between the surface and the p-n

$$J_{\text{diff,p-substrate}} = q\alpha L_n e^{-\alpha l_x} \sum_{n=1}^{\infty} \frac{4}{\pi^2 (2n-1)^2} \frac{1}{\sqrt{\left(\frac{(2n-1)2\pi L_n}{l}\right)^2 + 1 + j\omega\tau_n} + \alpha L_n} \tag{2}$$

$$J_{\text{diff,n-well}} = q \frac{L_p^2}{l} \frac{32}{\pi^2} \frac{(1 - e^{-\alpha l_x})}{l_x} \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \frac{\frac{2l_x}{l_y} \left(\frac{1}{2n-1}\right)^2 + \frac{l_y}{2l_x} \left(\frac{1}{2m-1}\right)^2}{\left(\frac{(2n-1)\pi L_p}{2l_x}\right)^2 + \left(\frac{(2m-1)\pi L_p}{l_y}\right)^2 + 1 + j\omega\tau_p} \tag{3}$$



Fig. 3. Normalized  $J_{\rm diff,p-substrate}$  for different wavelengths. For these plots,  $L_n = 0.311 \,\mathrm{cm}$ ,  $l_x = 2 \,\mu\mathrm{m}$ ,  $\tau_n = 2.5 \times 10^{-3} \,\mathrm{s}$ ,  $\alpha = 600 \,\mathrm{cm^{-1}}$ , 2400 cm<sup>-1</sup>, and 57 000 cm<sup>-1</sup> for  $\lambda = 850$ , 680, and 430 nm, respectively.



Fig. 4. Normalized  $J_{\rm diff,n-well}$  for different  $l_y$ 's. For these plots,  $L_p=0.15~{\rm cm}, l=8~\mu{\rm m}, l_x=2~\mu{\rm m},$  and  $\tau_p=2.5\times10^{-3}~{\rm s}.$ 

junction;  $l_y$  is the width of the n-well, and l is the periodicity of the structure.

Fig. 3 shows normalized  $J_{\text{diff,p-substrate}}$  [i.e., (2)] for different wavelengths. The absorption coefficient ( $\alpha$ ) determines how deep the light penetrates into a material and is a function of wavelength. Longer wavelength light gives a smaller value of  $\alpha$ , resulting in a lower bandwidth because the light penetrates deeper into the substrate and carriers generated further down must slowly diffuse up to the junction.
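As a rough illustration of this wavelength dependence, the following Python sketch converts the absorption coefficients listed in the Fig. 3 caption into 1/e penetration depths and estimates the fraction of light absorbed above the junction depth $l_x = 2\ \mu$m. It assumes simple exponential (Beer-Lambert) absorption and ignores surface reflection; the function names are illustrative only.

```python
import math

# Absorption coefficients in cm^-1, taken from the Fig. 3 caption.
ALPHA_PER_CM = {850: 600.0, 680: 2400.0, 430: 57000.0}

L_X_UM = 2.0  # junction depth l_x in micrometers (Fig. 3 caption)


def penetration_depth_um(alpha_per_cm):
    """1/e penetration depth in micrometers (1 cm = 1e4 um)."""
    return 1.0e4 / alpha_per_cm


def fraction_absorbed_above(depth_um, alpha_per_cm):
    """Fraction of entering light absorbed between the surface and depth_um.

    Assumes pure exponential decay of the optical intensity with depth.
    """
    return 1.0 - math.exp(-alpha_per_cm * depth_um * 1.0e-4)


for wavelength_nm, alpha in ALPHA_PER_CM.items():
    depth = penetration_depth_um(alpha)
    shallow = fraction_absorbed_above(L_X_UM, alpha)
    print(f"{wavelength_nm} nm: 1/alpha = {depth:.2f} um, "
          f"absorbed above l_x = {shallow:.1%}")
```

Under these assumptions, most of the 850-nm light is absorbed well below the junction, which is consistent with the lower bandwidth described above for longer wavelengths.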

Fig. 4 shows normalized $J_{\rm diff,n-well}$ [i.e., (3)] for different n-well widths $l_y$. As $l_y$ is reduced, holes arrive at the junction more quickly, thus resulting in a higher bandwidth. It can also be observed that, although $l_y = 1\ \mu$m results in the highest bandwidth, it also leads to a lower responsivity and has less effect on the overall bandwidth. The final dimension of $l_y = 2.6\ \mu$m was chosen as a compromise between bandwidth and responsivity. The spacing between neighboring n-wells was the minimum allowable distance set by the design rules, i.e., $1.4\ \mu$m. Hence, this results in $l = 8\ \mu$m. The actual layout of the SML detector consists of alternating fingers of the exposed and covered photodiodes with a total area of $75\ \mu$m $\times\ 75\ \mu$m to facilitate coupling to multimode fibers.



Fig. 5. Frequency response of the SML detector, the equalizer, and the equalized SML detector.

The normalized total current response of the SML detector $(J_{\text{total}})$ is shown in Fig. 5. Note that the total current shows more or less the same frequency behavior as $J_{\text{diff,p-substrate}}$ in Fig. 3 since it is the dominant component. This is consistent with the results reported in [19]. The slow roll-off at high frequencies is due to the summation of the different current responses. The resulting intrinsic bandwidth of the SML detector has been improved to 700 MHz (relative to the case when only the exposed photodiodes are present [19]) but is insufficient for the target 5-Gbit/s operation. Hence, an additional analog peaking equalizer is used to compensate for the frequency roll-off of the SML detector. The peaking in the equalizer is modeled with a zero and two poles in its transfer function

$$H(s) = \frac{1 + \frac{s}{\omega_z}}{\left(1 + \frac{s}{\omega_{p1}}\right)\left(1 + \frac{s}{\omega_{p2}}\right)}$$
(5)

where  $\omega_z$ ,  $\omega_{p1}$ , and  $\omega_{p2}$  denote the zero and the two poles of H(s), respectively. The second pole of the equalizer is used to model the output parasitic capacitance of the equalizer that arises during circuit design. The normalized frequency response of the equalizer and the equalized SML detector is shown in Fig. 5 together with that of the SML detector. The boosting of the equalizer extends the bandwidth of the SML detector to 3 GHz when  $\omega_z = 2.2$  Grad/s,  $\omega_{p1} = 3.89$  Grad/s, and  $\omega_{p2} = 33$  Grad/s. With the combination of the SML detector and an analog equalizer, a higher speed of operation is achievable. Note that Fig. 5 shows the results of a system-level simulation without considering the bandwidth limitation due to the TIA. Practically, one of the poles in (5) may be due to the TIA's finite bandwidth.
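To make the shape of (5) concrete, the following Python sketch evaluates $|H(j\omega)|$ for the zero and pole values quoted above and locates the peak boost. It models only the equalizer; the SML detector response of (2)-(4) is not included, and the sweep range is an arbitrary choice.

```python
import math

# Zero and pole frequencies quoted in the text, in rad/s.
OMEGA_Z = 2.2e9
OMEGA_P1 = 3.89e9
OMEGA_P2 = 33e9


def equalizer_response(omega):
    """Complex value of H(j*omega) from (5)."""
    s = 1j * omega
    return (1 + s / OMEGA_Z) / ((1 + s / OMEGA_P1) * (1 + s / OMEGA_P2))


def gain_db(omega):
    """Magnitude of H(j*omega) in decibels."""
    return 20 * math.log10(abs(equalizer_response(omega)))


# Log-spaced sweep from 10 MHz to 10 GHz (sweep range is an arbitrary choice).
freqs_hz = [10 ** (7 + 3 * k / 300) for k in range(301)]
gains_db = [gain_db(2 * math.pi * f) for f in freqs_hz]
peak_db = max(gains_db)
peak_freq_hz = freqs_hz[gains_db.index(peak_db)]

# Boost that the zero and first pole alone would give if omega_p2 were absent.
ideal_boost_db = 20 * math.log10(OMEGA_P1 / OMEGA_Z)

print(f"peak boost {peak_db:.2f} dB near {peak_freq_hz / 1e9:.2f} GHz")
print(f"boost without the second pole: {ideal_boost_db:.2f} dB")
```

The second pole always keeps the peak boost below the ratio $\omega_{p1}/\omega_z$, which is why the output parasitic capacitance matters when choosing the zero and first pole.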

## III. OPTICAL RECEIVER

As shown in Fig. 6, the entire optical receiver front end begins with the SML detector that converts the incident optical power into two currents. When the subtraction of these two signals is performed, the signal due to slow diffusive carriers from the substrate can be removed, and the bandwidth of the photodetector can be improved. A differential TIA converts the two currents from the photodiodes into two voltages. The photodiodes are dc coupled to the TIA to avoid the extra parasitic capacitance due to ac coupling. A supply voltage of 3.3 V was used



Fig. 6. Optical receiver block diagram.



Fig. 7. Schematic of the RGC stage followed by the shunt-shunt feedback TIA used in other works [8], [13], [20].

in the TIA to provide a high reverse-bias voltage for the photodiodes so that large photodetector bandwidth and responsivity can be achieved simultaneously. AC coupling with a cutoff frequency of 100 kHz was used to connect the TIA to the rest of the circuit blocks that operate from a 1.8-V supply. A differential gain stage follows the ac coupling to further improve common-mode rejection. An equalizer and a postamplifier (PA) follow to further remove intersymbol interference (ISI) and increase the signal swing, respectively. For testing purposes, an output buffer was used to drive the signal to the oscilloscope.

## IV. TRANSIMPEDANCE AMPLIFIER

Among all the building blocks in the whole receiver, the TIA is the most critical one in this design. Since the current signals from the photodetector are only tens of microamperes, the performance of the TIA often limits the overall performance of the entire receiver. Hence, tradeoffs between transimpedance gain, bandwidth, noise, and stability must be examined.

A regulated cascode (RGC) stage has often been inserted before the TIA to present the photodetector with a low input resistance [8], [13], [20], [21]. The schematic of an RGC stage is shown in Fig. 7. The small-signal input resistance at the source of M1 is approximately [21]

$$R_{\rm in} \approx \frac{1}{g_{m1}(1+g_{mB}R_B)}.$$
(6)

The small input resistance provides a smaller input time constant $(R_{in}C_{in})$ than what is achievable without the RGC. Thus, the RGC stage acts as a buffer between the photodetector capacitance and the TIA and reduces the dependence of the stability of the TIA on the value of the photodetector capacitance. An RGC stage proves useful in applications where the optical receiver has to operate with a variety of detectors. However, the additional thermal noise sources of the RGC stage reduce the receiver's sensitivity, and the RGC stage limits the available reverse-bias voltage for the photodetector, further degrading its sensitivity. For these reasons, an RGC stage was not used in this paper.

Since the TIA is placed right at the input, its noise contribution often dominates the overall receiver. A large transimpedance gain is therefore desirable as it reduces the input-referred noise contribution from later stages. The bandwidth of the TIA must also be chosen carefully. A small bandwidth causes ISI and reduces the eye opening, while a large bandwidth increases the total integrated noise and reduces the input sensitivity. For a data rate  $R_b$ , a TIA bandwidth of  $0.7 \cdot R_b$  is a common compromise between ISI and noise [22]. For 4-Gbit/s operation, a target bandwidth of approximately 2.8 GHz was established. Since the bandwidth of the photodetector is only 700 MHz, a TIA bandwidth of 2.8 GHz is large enough to carry the signal to the equalizer. In addition, the selected bandwidth also ensures that peaking at  $0.5 \cdot R_b$  from the equalizer is not attenuated. Moreover, as the TIA generally employs feedback, stability must be examined. A minimum phase margin of 60° was targeted to minimize overshoot and hence ensure the quality of the eye diagram.

The shunt-shunt feedback TIA has been frequently used in optical receiver front ends and is shown at the front end of the receiver in Fig. 6. Denoting the low-frequency transimpedance gain as  $R_T$  and the low-frequency voltage gain of the core amplifier as  $A_C$ , we can approximate  $R_T$  as

$$R_T \approx \frac{R_F A_C}{A_C + 1}.\tag{7}$$

Hence, $R_T$ is approximately equal to $R_F$ for large $A_C$. The sensitivity of an optical receiver at a BER of $10^{-12}$ $(P_S)$ is related to the input-referred noise current of the TIA $(I_{in,TIA})$ (assuming that no noise is generated by the photodetector itself) and the responsivity of the photodetector $(R_{PD})$ as follows [16]:

$$P_S = \frac{7I_{\rm in,TIA}}{R_{\rm PD}}.$$
(8)

Assuming that the transimpedance gain of the TIA is approximately equal to  $R_F$ , the input-referred noise current can be calculated from the output noise voltage using the following relation:

$$I_{\rm in,TIA} = \frac{V_{n,o}}{R_F} \tag{9}$$

where  $V_{n,o}$  is the total noise voltage at the output of the TIA. Combining (8) and (9) leads to

$$R_F = \frac{7V_{n,o}}{R_{\rm PD}P_S}.$$
(10)

Thus, a large transimpedance  $(R_F)$  is essential for low  $P_S$ . However, the bandwidth of the receiver degrades with high transimpedance.



Fig. 8. Schematic of the TIA.

With an average input optical power of $P_S = -5$ dBm and assuming a photodetector responsivity of $R_{\rm PD} = 0.03$ A/W, as reported in [23], a photodiode current of 9 $\mu$A is anticipated. Thus, to achieve a swing of at least 50 mV at the TIA output, we require $R_F$ to be approximately 5.6 k$\Omega$. Using (10), this provides a BER of $10^{-12}$ as long as the TIA output noise is restricted below 7.5 mV rms.
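The arithmetic behind these figures can be reproduced with a short Python sketch. It uses only the values quoted above and the factor of 7 from (8); because it does not round the intermediate photocurrent, its results differ slightly from the rounded numbers in the text.

```python
P_S_DBM = -5.0   # average input optical power (from the text)
R_PD = 0.03      # assumed photodetector responsivity in A/W (from the text)
R_F = 5.6e3      # feedback resistance in ohms (from the text)
Q_FACTOR = 7.0   # factor in (8) for a BER of 1e-12


def dbm_to_watts(p_dbm):
    """Convert a power level in dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)


optical_power_w = dbm_to_watts(P_S_DBM)
photocurrent_a = R_PD * optical_power_w

# Output swing, taking the transimpedance gain as approximately R_F.
output_swing_v = R_F * photocurrent_a

# Rearranging (10): V_n,o = R_F * R_PD * P_S / 7.
max_output_noise_v = R_F * R_PD * optical_power_w / Q_FACTOR

print(f"photocurrent      : {photocurrent_a * 1e6:.2f} uA")
print(f"TIA output swing  : {output_swing_v * 1e3:.1f} mV")
print(f"output noise limit: {max_output_noise_v * 1e3:.2f} mV rms")
```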

The schematic of the TIA is shown in Fig. 8. The transistor size is specified as  $N_f \times W_f/L$ , where  $N_f, W_f$ , and L denote the number of fingers, the finger width, and the finger length, respectively. The gate-source and gate-drain voltages of all transistors are kept within 1.8 V to guarantee reliability of operation with the 3.3-V supply. To increase  $A_C$ , two identical differential-pair amplifiers were cascaded to implement the core amplifier. The number of stages was limited to two as a compromise between gain and stability. A capacitor can be added in parallel with  $R_F$  to eliminate gain peaking but with reduced TIA bandwidth [24]. Instead, in this paper, a negative Miller capacitance [15], [16]  $C_m$  is employed in the core amplifier, which increases the pole frequency of this stage and leads to an increase in the overall TIA bandwidth. The negative Miller capacitance reduces the net parasitic capacitance at the intermediate node between cascaded stages. However, the negative Miller capacitance cannot be increased without bound as this will deteriorate the stability of the TIA. To understand the effect of the negative Miller capacitance on the stability of the loop, we model the TIA, as shown in Fig. 9. Here,  $C_m$  denotes the negative Miller capacitance,  $C_{in}$  denotes the input capacitance of the TIA (including the photodiode capacitance and the input parasitic capacitance of the first amplifier stage),  $C_{p1}$  is the parasitic capacitance at the intermediate node between the first and second stages,  $C_{p2}$  is the parasitic capacitance at the



Fig. 9. TIA model used for analysis.

TIA output, $R_{L1}$ is the output resistance of the first stage, $R_{L2}$ is the output resistance of the second stage, and $A_1 = g_{m1}R_{L1}$ and $A_2 = g_{m2}R_{L2}$ are the gains of the two stages. Note that the TIA is a three-pole system, with the input pole being the dominant pole due to the large photodiode capacitance (= 500 fF). In order to achieve minimum input-referred noise, the TIA front-end transistors were sized to match the photodiode capacitance [16], [25]–[27]. Therefore, the input pole (including photodiode and TIA input parasitics) with $R_F = 5.6\ \text{k}\Omega$ is at approximately 28 MHz. The core amplifier is a cascade of two identical stages, and the intermediate-node parasitic capacitance $C_{p1}$ is 680 fF, resulting in a pole at 1.8 GHz (with $C_m = 0$ and $R_{L1} = R_{D1} || r_{ds1} = 130\ \Omega$). The load of the TIA was sized to set the output pole to 7.2 GHz (with $C_m = 0$ and $R_{L2} = 130\ \Omega$). Note that, with increasing $C_m$, we expect the intermediate pole to move away from the origin and the output pole to move toward the origin. Fig. 10 shows the effect of increasing $C_m$ on the poles of the system.
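The pole locations quoted above follow from $f = 1/(2\pi RC)$. The Python sketch below checks the 1.8-GHz intermediate pole and, as an assumption not stated in the text, back-calculates the capacitances implied by the 28-MHz input pole and the 7.2-GHz output pole.

```python
import math

R_F = 5.6e3            # feedback resistance (from the text)
R_L1 = R_L2 = 130.0    # stage output resistances (from the text)
C_P1 = 680e-15         # intermediate-node capacitance (from the text)
F_INPUT_POLE = 28e6    # input pole frequency (from the text)
F_OUTPUT_POLE = 7.2e9  # output pole frequency with C_m = 0 (from the text)


def pole_hz(resistance, capacitance):
    """Pole frequency of a single RC time constant."""
    return 1.0 / (2 * math.pi * resistance * capacitance)


def cap_for_pole(resistance, frequency_hz):
    """Capacitance that places an RC pole at frequency_hz."""
    return 1.0 / (2 * math.pi * resistance * frequency_hz)


f_intermediate = pole_hz(R_L1, C_P1)
c_in = cap_for_pole(R_F, F_INPUT_POLE)     # inferred, not given in the text
c_p2 = cap_for_pole(R_L2, F_OUTPUT_POLE)   # inferred, not given in the text

print(f"intermediate pole: {f_intermediate / 1e9:.2f} GHz")
print(f"implied C_in     : {c_in * 1e12:.2f} pF")
print(f"implied C_p2     : {c_p2 * 1e15:.0f} fF")
```

The implied input capacitance of roughly 1 pF is consistent with a 0.5-pF photodiode loaded by front-end transistors of comparable capacitance, as the noise-matching argument above suggests.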



Fig. 10. Effect of  $C_m$  on the poles of the TIA. (a)  $C_m = 0$ . (b)  $C_m > 0$ .

Specifically, the loop gain is

$$LG(s) = \frac{-A_1 A_2 \left(1 + s \frac{C_m}{g_{m2}}\right)}{(1 + s\tau_{\rm in})(1 + s\tau_1)(1 + s\tau_2)}$$
(11)

where

$$\tau_{\rm in} = C_{\rm in} R_F \tag{12}$$

$$\tau_1 = 0.5 \left( \tau_A + \sqrt{\tau_A^2 - 4\tau_B^2} \right) \tag{13}$$

$$\tau_2 = 0.5 \left( \tau_A - \sqrt{\tau_A^2 - 4\tau_B^2} \right) \tag{14}$$

$$\tau_A = R_{L1}C_{p1} + R_{L2}C_{p2} + C_m \left(R_{L1}(1-A_2)+R_{L2}\right) \tag{15}$$

$$\tau_B = \sqrt{R_{L1}R_{L2}\left(C_{p2}C_{p1} + C_m(C_{p1} + C_{p2})\right)}. \tag{16}$$

In these equations,  $\tau_{\rm in}$  denotes the input time constant,  $\tau_1$  is the intermediate time constant, and  $\tau_2$  is the output time constant. Both  $\tau_1$  and  $\tau_2$  are related to  $C_m$  through time constants  $\tau_A$  and  $\tau_B$ . Since  $\tau_A > \tau_B$  when  $C_m = 0$ ,  $\tau_1 > \tau_2$ . As  $C_m$  increases,  $\tau_A$  decreases and  $\tau_B$  increases, thus causing  $\tau_1$  to decrease and  $\tau_2$  to increase. At some point,  $\tau_1$  equals  $\tau_2$ . A further increase in  $C_m$  causes  $\tau_1$  and  $\tau_2$  to become complex, thus degrading the stability of the system.
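The dependence of $\tau_1$ and $\tau_2$ on $C_m$ can be explored numerically. The Python sketch below computes $\tau_A$ and $\tau_B$ from their definitions and then $\tau_1$ and $\tau_2$ as half of $\tau_A$ plus or minus $\sqrt{\tau_A^2 - 4\tau_B^2}$, using a complex square root so that the complex-pole regime is handled. The second-stage gain `A_2` and the output capacitance `C_P2` are assumptions (the latter is back-calculated from the 7.2-GHz output pole), so the value of $C_m$ at which the time constants merge in this sketch should not be read as a property of the fabricated design.

```python
import cmath

R_L1 = R_L2 = 130.0   # stage output resistances (from the text)
C_P1 = 680e-15        # intermediate-node capacitance (from the text)
C_P2 = 170e-15        # assumed: back-calculated from the 7.2-GHz output pole
A_2 = 6.3             # assumed second-stage gain; not given in the text


def tau_a(c_m, a_2=A_2):
    """Sum of the two time constants, tau_1 + tau_2."""
    return R_L1 * C_P1 + R_L2 * C_P2 + c_m * (R_L1 * (1 - a_2) + R_L2)


def tau_b(c_m):
    """Geometric mean of the two time constants, sqrt(tau_1 * tau_2)."""
    return (R_L1 * R_L2 * (C_P1 * C_P2 + c_m * (C_P1 + C_P2))) ** 0.5


def time_constants(c_m, a_2=A_2):
    """Return (tau_1, tau_2); both are complex once tau_A < 2 * tau_B."""
    root = cmath.sqrt(tau_a(c_m, a_2) ** 2 - 4 * tau_b(c_m) ** 2)
    return 0.5 * (tau_a(c_m, a_2) + root), 0.5 * (tau_a(c_m, a_2) - root)


for c_m_ff in (0, 5, 10, 20, 40):
    c_m = c_m_ff * 1e-15
    t1, t2 = time_constants(c_m)
    regime = "real" if tau_a(c_m) >= 2 * tau_b(c_m) else "complex"
    print(f"C_m = {c_m_ff:3d} fF: tau_1 = {t1 * 1e12:.1f} ps, "
          f"tau_2 = {t2 * 1e12:.1f} ps ({regime})")
```

With $C_m = 0$ the expressions reduce to $\tau_1 = R_{L1}C_{p1}$ and $\tau_2 = R_{L2}C_{p2}$ regardless of the assumed gain, which provides a simple sanity check.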

The phase margin of the TIA can be expressed as

$$PM = 180^{\circ} - \tan^{-1}(\omega_t \tau_{\rm in}) + \tan^{-1}\left(\frac{\omega_t C_m}{g_{m2}}\right) - \tan^{-1}(\omega_t \tau_1) - \tan^{-1}(\omega_t \tau_2) \quad (17)$$

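where $\omega_t$ is the unity-gain frequency of the loop gain. Neglecting the phase contribution of the zero in (17) (since $g_{m2}$ is large) and approximating the second term in (17), $\tan^{-1}(\omega_t \tau_{\rm in})$, as 90° (since $\tau_{\rm in}$ is large) results in the following expression, where the second equality uses $\tau_1 + \tau_2 = \tau_A$ and $\tau_1 \tau_2 = \tau_B^2$: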

$$PM \approx \tan^{-1} \left( \frac{1 - \omega_t^2 \tau_1 \tau_2}{\omega_t (\tau_1 + \tau_2)} \right) = \tan^{-1} \left( \frac{1 - \omega_t^2 \tau_B^2}{\omega_t \tau_A} \right).$$
(18)

As  $C_m$  is increased,  $\tau_A$  decreases and  $\omega_t$  increases slightly, resulting in little net change in phase margin. However, at larger values of  $C_m$ , the value of  $\tau_B$  increases, approaching  $1/\omega_t$ , thus causing the phase margin to degrade.

The TIA's closed-loop transfer function is

$$A(s) = \frac{-R_{F0}}{1 + K_1 s + K_2 s^2 + K_3 s^3}$$
(19)

where

$$R_{F0} = \frac{R_F A_1 A_2}{1 + A_1 A_2} \tag{20}$$

 TABLE I

 PERFORMANCE COMPARISON OF DIFFERENT VALUES OF  $C_m$ 

| Metric               | Source             | $C_m = 0$ fF | $C_m = 90$ fF | $C_m = 150$ fF |
|----------------------|--------------------|--------------|---------------|----------------|
| DC gain (dB$\Omega$) | -                  | 74.6         | 74.6          | 74.6           |
| Bandwidth (GHz)      | Using (19)         | 1.7          | 2.5           | 3.0            |
|                      | Circuit simulation | 2.0          | 2.9           | 3.3            |
| Peaking (dB)         | Using (19)         | 0.8          | 0.13          | 16             |
|                      | Circuit simulation | 0.48         | 0.18          | 10.38          |
| PM (degrees)         | Using (17)         | 57           | 64.5          | 28             |
|                      | Circuit simulation | 59           | 64            | 34             |



Fig. 11. Frequency response of the TIA with different values of  $C_m$ .



Fig. 12. TIA: Output noise spectrum density.

$$K_1 = \frac{\tau_{\rm in} + \tau_1 + \tau_2}{1 + A_1 A_2} \tag{21}$$

$$K_2 = \frac{\tau_1 \tau_2 + \tau_1 \tau_{\rm in} + \tau_2 \tau_{\rm in}}{1 + A_1 A_2} \tag{22}$$

$$K_3 = \frac{\tau_1 \tau_2 \tau_{\rm in}}{1 + A_1 A_2}.$$
 (23)
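A minimal Python sketch of how (19)-(23) can be evaluated is given below for the $C_m = 0$ case, where the time constants follow directly from the pole frequencies quoted earlier. The loop gain $A_1 A_2$ is not stated in the text; the value of 40 used here is an assumption chosen only for illustration, and the bandwidth search is a simple log-spaced sweep.

```python
import math

R_F = 5.6e3                         # feedback resistance (from the text)
TAU_IN = 1 / (2 * math.pi * 28e6)   # input pole at about 28 MHz (from the text)
TAU_1 = 130.0 * 680e-15             # R_L1 * C_p1 with C_m = 0 (from the text)
TAU_2 = 1 / (2 * math.pi * 7.2e9)   # output pole at 7.2 GHz (from the text)
LOOP_GAIN = 40.0                    # A_1 * A_2: assumed, not given in the text


def closed_loop_gain(f_hz, loop_gain=LOOP_GAIN):
    """Evaluate A(s) of (19) at s = j*2*pi*f using K_1..K_3 from (21)-(23)."""
    s = 2j * math.pi * f_hz
    k1 = (TAU_IN + TAU_1 + TAU_2) / (1 + loop_gain)
    k2 = (TAU_1 * TAU_2 + TAU_1 * TAU_IN + TAU_2 * TAU_IN) / (1 + loop_gain)
    k3 = TAU_1 * TAU_2 * TAU_IN / (1 + loop_gain)
    r_f0 = R_F * loop_gain / (1 + loop_gain)  # (20)
    return -r_f0 / (1 + k1 * s + k2 * s ** 2 + k3 * s ** 3)


def bandwidth_hz(loop_gain=LOOP_GAIN, points=4000):
    """First frequency in a log sweep where |A| falls 3 dB below its dc value."""
    dc = abs(closed_loop_gain(0.0, loop_gain))
    for k in range(points + 1):
        f = 10 ** (7 + 3.5 * k / points)  # 10 MHz to about 31.6 GHz
        if abs(closed_loop_gain(f, loop_gain)) < dc / math.sqrt(2):
            return f
    return None


dc_gain_db_ohm = 20 * math.log10(abs(closed_loop_gain(0.0)))
print(f"dc transimpedance: {dc_gain_db_ohm:.1f} dBOhm")
print(f"-3 dB bandwidth  : {bandwidth_hz() / 1e9:.2f} GHz")
```

Changing `LOOP_GAIN` shows the usual tradeoff in this model: a larger loop gain extends the bandwidth but increases peaking, since the non-dominant poles are unchanged.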

Using (19), the performance of the TIA with different values of  $C_m$  is computed and compared with circuit-level simulations. Table I summarizes the comparison results. Good agreement between circuit-level simulations and the TIA model is observed. Thus, increasing  $C_m$  improves the phase margin, bandwidth, and peaking of the TIA up to a certain limit. A further increase in  $C_m$  degrades the stability and should be avoided. For this design,  $C_m = 90$  fF was chosen as the optimal point since it results in minimal peaking. Simulation results also indicate that  $\pm 10\%$ 



Fig. 13. Schematic of the differential gain stage.



Fig. 14. Schematic of the equalizer.

variation in the value of $C_m$ resulted in less than $\pm 6\%$ variation in TIA bandwidth and less than $\pm 1.2\%$ variation in phase margin. Thus, the design is robust in the presence of $\pm 10\%$ variations in $C_m$ around its chosen nominal value of 90 fF. Fig. 11 shows the circuit-level simulations of the TIA response for different values of $C_m$. Note that $C_m = 90$ fF ensures a flat frequency response with a simulated transimpedance gain of 74.6 dB$\Omega$ and a bandwidth of 2.9 GHz. A TIA gain of 74.6 dB$\Omega$ and a bandwidth of 2.9 GHz compare favorably with the design in [13], which reports a (simulated) TIA gain of 60 dB$\Omega$ and a bandwidth of 2.58 GHz at the same technology node.

Fig. 12 shows the simulated output noise spectrum density of the TIA for  $C_m = 90$  fF. The simulation was performed assuming a photodiode capacitance of 0.5 pF using the *Pnoise* tool in *Spectre*. From the plot, the total integrated output noise



Fig. 15. Equalizer frequency response over process and temperature variations. The dark line corresponds to a slow corner at 100 °C and with 20% increase in resistor values, the dash-dotted line corresponds to a typical corner with nominal resistor values, and the dashed line corresponds to a fast corner with 20% decrease in resistor values.

of the TIA is calculated as 1.06 mV rms. Combining this with a transimpedance gain of 5.6 k $\Omega$ , the input-referred rms noise current is 0.19  $\mu$ A. The input-referred noise of the TIA does not include the effect of equalization and other subsequent receiver blocks.

## V. INTEGRATED FRONT-END BLOCKS

This section describes the receiver blocks following the ac-coupled TIA stage. From Fig. 6, these blocks are the differential gain stage, equalizer, PA, and output buffer.

### A. AC Coupling

Since the TIA is the only block using a 3.3-V supply, ac coupling provides a shift down in the common-mode voltage going into the subsequent stage. The values of $R_{\rm ac}$ and $C_{\rm ac}$ were chosen so that the cutoff frequency is 100 kHz, which is low enough to pass the lowest frequency content of a $2^{31} - 1$ pattern at 3–5 Gbits/s. To reduce the load on the TIA, a combination of lower $C_{\rm ac}$ and higher $R_{\rm ac}$ was used, as shown in Fig. 6.



Fig. 16. Equalizer input and output eye diagrams over process corners at 4 Gbits/s. (a) Input to equalizer: pattern-dependent jitter = 30 ps. (b) Equalizer output with typical corner: pattern-dependent jitter = 5.5 ps. (c) Equalizer output with slow corner at  $100^{\circ}$ C and with 20% increase in resistor values: pattern-dependent jitter = 7.8 ps. (d) Equalizer output with fast corner at  $20^{\circ}$ C and with 20% decrease in resistor values: pattern-dependent jitter = 6.8 ps.

### B. Differential Gain Stage

The purpose of the differential gain stage is to improve the common-mode rejection of the optical receiver. It also provides additional gain, which is important to reduce the noise contribution of later stages.

The schematic of the differential gain stage is shown in Fig. 13. To achieve a high gain–bandwidth product (GBW), it is based on a Cherry–Hooper amplifier with active feedback [17]. The transfer function of this amplifier is

$$\frac{V_{\text{out}}}{V_{\text{in}}}(s) = \frac{A_{vo}\omega_n^2}{s^2 + 2\zeta\omega_n s + \omega_n^2}$$
(24)

where

$$A_{vo} = \frac{g_{m1}g_{m3}R_{D1}R_{D2}}{1 + g_{m3}g_{m5}R_{D1}R_{D2}}$$
(25)

$$\zeta = \frac{1}{2} \frac{R_{D1}C_1 + R_{D2}C_2}{\sqrt{R_{D1}R_{D2}C_1C_2(1 + g_{m5}g_{m3}R_{D1}R_{D2})}} \tag{26}$$

$$\omega_n^2 = \frac{1 + g_{m5}g_{m3}R_{D1}R_{D2}}{R_{D1}R_{D2}C_1C_2}.$$
(27)

In the aforementioned equations, $g_{m1}$, $g_{m3}$, and $g_{m5}$ denote the transconductances of M1, M3, and M5, respectively. As seen from (25), to achieve high gain, $g_{m5}$ must be small, and $g_{m1}$ must be large. $g_{m3}$ must also be large to support a high bandwidth, as depicted by (27). A low $g_{m5}$ was achieved with a lower bias current, and high $g_{m1}$ and $g_{m3}$ were achieved with wider device sizes for M1 and M3 in order to save power. The differential gain stage had a simulated dc gain of 10 dB, a bandwidth of 12.4 GHz, and 28 dB of common-mode rejection. Since the TIA output signals are not fully differential, this common-mode rejection is provided by the differential gain stage.
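The interplay of (24)-(27) can be sketched numerically in Python. The paper does not list the component values of this stage, so every number below is a placeholder chosen only for illustration; the sketch shows how $A_{vo}$, $\zeta$, and $\omega_n$ follow from the device parameters.

```python
import math

# Placeholder values for illustration only; none are taken from the paper.
G_M1 = G_M3 = 20e-3   # transconductances of M1 and M3 in siemens
G_M5 = 5e-3           # transconductance of the feedback device M5
R_D1 = R_D2 = 200.0   # load resistances in ohms
C_1 = C_2 = 50e-15    # node capacitances in farads

feedback_term = G_M3 * G_M5 * R_D1 * R_D2

a_vo = G_M1 * G_M3 * R_D1 * R_D2 / (1 + feedback_term)               # (25)
omega_n = math.sqrt((1 + feedback_term) / (R_D1 * R_D2 * C_1 * C_2))  # (27)
zeta = 0.5 * (R_D1 * C_1 + R_D2 * C_2) / math.sqrt(
    R_D1 * R_D2 * C_1 * C_2 * (1 + feedback_term))                    # (26)


def stage_gain(f_hz):
    """Evaluate the second-order transfer function (24) at s = j*2*pi*f."""
    s = 2j * math.pi * f_hz
    return a_vo * omega_n ** 2 / (s ** 2 + 2 * zeta * omega_n * s + omega_n ** 2)


print(f"dc gain : {20 * math.log10(a_vo):.1f} dB")
print(f"f_n     : {omega_n / (2 * math.pi) / 1e9:.1f} GHz")
print(f"zeta    : {zeta:.2f}")
```

Raising `G_M5` in this sketch lowers the dc gain while increasing $\omega_n$ and reducing $\zeta$, which matches the gain and bandwidth tradeoff described above.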



Fig. 17. Architecture of the PA.

### C. Equalizer

To compensate for the slow roll-off of the photodetector (< 20 dB/dec), the equalizer provides a corresponding roll-up. This slow roll-up can be realized using a higher order equalizer to generate multiple poles and zeros.

With the SML detection employed in this paper, the slow roll-off of the photodetector starts at a much higher frequency of about 100 MHz. Hence, the equalizer only needs to provide the corresponding roll-up over a much narrower frequency band. For this reason, the well-known technique of capacitive degeneration using single parallel RC branches is sufficient to achieve data rates up to 5 Gbits/s while minimizing area. The reduced numbers of resistors and capacitors also help lower the component spread and avoid the use of adaptive equalization. The schematic of the equalizer used in this paper is shown in Fig. 14. The transfer function of the equalizer is

$$\frac{V_{\text{out}}}{V_{\text{in}}}(s) = \frac{g_{m1}R_D}{1 + \frac{g_{m1}R_S}{2}} \cdot \frac{1 + \frac{s}{\omega_{z1}}}{\left(1 + \frac{s}{\omega_{p1}}\right)\left(1 + \frac{s}{\omega_{p2}}\right)}$$
(28)

where  $\omega_{z1} = 1/(R_S C_S)$ ,  $\omega_{p1} = (1 + (g_{m1}R_S/2))/(R_S C_S)$ , and  $\omega_{p2} = 1/(R_D C_L)$ . To achieve a slow roll-up response with the equalizer shown in Fig. 14, it is sufficient to place  $\omega_{p1}$  close to  $\omega_{z1}$ . The degeneration capacitors are MOS devices in accumulation mode, realized by placing NMOS structures inside an n-well.

Fig. 18. PA: Frequency response.

Fig. 19. Output buffer: Schematic.

The frequency response of the equalizer over process and temperature variations is shown in Fig. 15. The equalizer had a simulated dc gain of 0 dB with a maximum high-frequency gain of 5.8 dB. Transient simulation was used to verify the performance of the equalizer by comparing the eye diagrams before and after the equalizer at 4 Gbits/s, as shown in Fig. 16. A poor-quality eye was introduced at the input to the equalizer by sending a portion of a  $2^{31} - 1$  pseudorandom bit sequence (PRBS) pattern through a low-pass filter with cutoff at 1.3 GHz. Fig. 16(a) shows the input eye to the equalizer with a pattern-dependent jitter of 30 ps. The eyes at the output of the equalizer for typical, slow, and fast corners had pattern-dependent jitters of 5.5, 7.8, and 6.8 ps, respectively, and are shown in Figs. 16(b)-(d). Current densities were maintained across corners by adjusting control voltages associated with the bias circuitry. Measurement results with the equalizer activated and deactivated are shown in Section VI, Figs. 24 and 26.
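The shape of the response in (28) can be checked with a short numerical sweep. This is only a sketch: the component values below are hypothetical, picked so that the dc gain is 0 dB and  $\omega_{p1} = 2\omega_{z1}$ , and they are not the values used on the chip. The function name `equalizer_gain_db` is likewise introduced here for illustration.

```python
import math

def equalizer_gain_db(f_hz, gm1, r_d, r_s, c_s, c_l):
    """Magnitude of the transfer function (28) in dB at frequency f_hz."""
    s = 1j * 2 * math.pi * f_hz
    w_z1 = 1.0 / (r_s * c_s)                        # zero set by the degeneration RC
    w_p1 = (1.0 + gm1 * r_s / 2.0) / (r_s * c_s)    # first pole, above the zero
    w_p2 = 1.0 / (r_d * c_l)                        # output pole
    dc_gain = gm1 * r_d / (1.0 + gm1 * r_s / 2.0)
    h = dc_gain * (1 + s / w_z1) / ((1 + s / w_p1) * (1 + s / w_p2))
    return 20 * math.log10(abs(h))

# Hypothetical component values (assumptions, not taken from the paper)
GM1, R_D, R_S, C_S, C_L = 5e-3, 400.0, 400.0, 0.5e-12, 20e-15

freqs = [10 ** (6 + 0.1 * k) for k in range(46)]    # 1 MHz up to about 32 GHz
gains = [equalizer_gain_db(f, GM1, R_D, R_S, C_S, C_L) for f in freqs]
dc_gain_db = gains[0]
peak_gain_db = max(gains)

print(f"gain at 1 MHz = {dc_gain_db:.3f} dB")
print(f"peak gain     = {peak_gain_db:.2f} dB")
```

Because the boost of a single degenerated stage is bounded by  $\omega_{p1}/\omega_{z1} = 1 + g_{m1}R_S/2$ , the peak gain in this sketch cannot exceed about 6 dB, which is the same order as the 5.8 dB simulated figure quoted above.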

### D. PA

The purpose of the PA is to amplify the small signal coming out of the equalizer to a level that is large enough to ensure the correct operation of any subsequent circuitry. For the proposed system architecture, it is important to ensure good linearity until equalization is performed; after the equalizer, however, clipping can be tolerated as the binary information is still preserved. Hence, a limiting amplifier (LA) is used. With a typical received average optical power of -5 dBm from the VCSEL source, a photodiode responsivity of 0.03 A/W, and a combined gain of approximately 80 dB $\Omega$  from the TIA through the equalizer, an input signal of 95 mV is expected at the LA input.

Fig. 20. Output buffer: Frequency response.
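The 95-mV estimate follows directly from the three quoted figures. The sketch below reproduces the arithmetic; the helper names are introduced here for illustration only.

```python
def dbm_to_watts(p_dbm):
    """Convert optical power from dBm to watts."""
    return 1e-3 * 10 ** (p_dbm / 10)

def dbohm_to_ohms(g_dbohm):
    """Convert transimpedance gain from dB-ohm to ohms (20*log10 convention)."""
    return 10 ** (g_dbohm / 20)

p_avg = dbm_to_watts(-5.0)              # about 316 uW average optical power
i_pd = 0.03 * p_avg                     # about 9.5 uA with 0.03 A/W responsivity
v_la = i_pd * dbohm_to_ohms(80.0)       # 80 dB-ohm corresponds to 10 kOhm

print(f"P_avg = {p_avg * 1e6:.1f} uW")
print(f"I_pd  = {i_pd * 1e6:.2f} uA")
print(f"V_LA  = {v_la * 1e3:.1f} mV")
```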

The architecture of the PA is shown in Fig. 17. It consists of an input stage that performs offset cancellation, four identical gain cells to provide the necessary gain, and an offset cancellation network to low-pass filter the output and sense any dc offset. Each gain cell was implemented using the same topology as the one for the subtracter, shown in Fig. 13, but with 2.8, 2.8, and 0.4 mA flowing through M7, M8, and M9, respectively. Although a greater overall GBW can be achieved with a higher number of stages, power consumption increases and noise performance degrades as the gain per stage drops. Since the bandwidth requirement is less than one-tenth of the nMOS devices' peak  $f_T$ , four stages were sufficient.

Fig. 21. Die photograph of the fully integrated optical receiver (dimension marked on the photograph: 0.86 mm).

Fig. 22. Photograph of the evaluation PC board.

The frequency response of the PA is shown in Fig. 18. The PA has a simulated dc gain of 39 dB and a small-signal bandwidth of 3.7 GHz through all five stages. The low-frequency cutoff due to the offset cancellation is 40 kHz, which is low enough to avoid baseline wander with a  $2^{31} - 1$  PRBS pattern.
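To get a feel for what the cascade demands of each stage, the sketch below applies the textbook bandwidth-shrinkage relation for  $n$  identical first-order stages,  $BW_{\text{total}} = BW_{\text{stage}}\sqrt{2^{1/n} - 1}$ . This is a simplifying assumption: the actual gain cells are second-order Cherry–Hooper stages and the input stage need not match the gain cells, so the numbers are only indicative.

```python
import math

def per_stage_bandwidth(total_bw_hz, n_stages):
    """Bandwidth each stage needs when n identical first-order stages are cascaded."""
    shrink = math.sqrt(2 ** (1.0 / n_stages) - 1.0)
    return total_bw_hz / shrink

N_STAGES = 5                 # input stage plus four gain cells
TOTAL_GAIN_DB = 39.0         # simulated dc gain quoted above
TOTAL_BW_HZ = 3.7e9          # simulated small-signal bandwidth quoted above

# Equal split of gain across stages is an assumption made for this estimate
gain_per_stage_db = TOTAL_GAIN_DB / N_STAGES
bw_stage_hz = per_stage_bandwidth(TOTAL_BW_HZ, N_STAGES)

print(f"gain per stage      = {gain_per_stage_db:.1f} dB")
print(f"bandwidth per stage = {bw_stage_hz / 1e9:.1f} GHz")
```

Under these assumptions each stage would need roughly 2.6 times the overall bandwidth, which illustrates why adding stages trades per-stage gain against power and noise.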

### E. Output Buffer

The schematic of the output buffer is shown in Fig. 19. An  $f_T$  doubler was used to halve the effective input capacitance [22]. As the signal goes through the PA, it has such a high amplitude at the input of the output buffer that the small-signal bandwidth becomes inapplicable. However, the small-signal bandwidth can be used as a conservative lower limit, and a target of 5 GHz was established assuming a 400-fF output load, which models the pad and package parasitic capacitances. A small-signal bandwidth of 5.4 GHz was achieved, as shown by the frequency response of the output buffer in Fig. 20.

## VI. MEASUREMENT RESULTS

The optical receiver was implemented in a one-poly six-metal 0.18- $\mu$ m CMOS technology without the use of additional masks or antireflective coatings to enhance the optical performance of the photodetector. It occupies a core area of 0.72 mm<sup>2</sup>. The die photograph of the circuit is shown in Fig. 21. An output buffer with 50- $\Omega$  on-chip termination was employed to provide matching. The chip was decapsulated and mounted on a printed circuit board to allow access to the photodetector, as shown in Fig. 22.

Fig. 23. Net responsivity of the photodetector at different reverse-bias voltages.

Fig. 24. Measured eye diagrams in LP mode at 4.25 Gbits/s for a  $2^{31} - 1$  PRBS pattern and an average  $P_{\rm opt}$  of -3 dBm with (a) equalization off and (b) equalization on (horizontal scale: 100 ps/div; vertical scale: 50 mV/div).

Fig. 25. Measured eye diagrams in LP mode for a  $2^{31} - 1$  PRBS pattern with varying input optical power and data rates. (a) 2.5 Gbits/s and optical power of -9.5 dBm. (b) 3.125 Gbits/s and optical power of -8.5 dBm. (c) 2.5 Gbits/s and optical power of -3 dBm. (d) 3.125 Gbits/s and optical power of -3 dBm.

A stand-alone SML detector with the same dimensions as the one attached to the TIA of the optical receiver was connected to its own package pins for dc responsivity measurement. A 50/125- $\mu$ m multimode fiber was used to couple the light emitted from an 850-nm VCSEL source (HFE4192-582) to the on-chip photodetector. The extinction ratio for the laser source is ten, as recommended in the laser data sheet. A power meter (Noyes OPM4) and a multimeter (Keithley 2400) were used to measure the input optical power and the output current in order to determine the dc responsivity of the photodetector. With an input power of -3 dBm or, equivalently, 500  $\mu$ W, the currents from the exposed and covered photodiodes were 68.4 and 42.5  $\mu$ A, respectively, when the reverse-bias voltage was set to 2 V. This leads to a net responsivity<sup>1</sup> of 0.052 A/W. The net responsivity of the photodetector was also measured at other reverse-bias voltages, and the result is shown in Fig. 23. As expected, the net responsivity increases with the reverse-bias voltage since increasing the reverse-bias voltage increases the width of the depletion region in the photodiode. For a fixed optical power, this causes more carriers to be captured by the electric field in the depletion region, resulting in less recombination of electron-hole pairs, thus translating to a higher responsivity [28].

<sup>1</sup>The net responsivity refers to the difference between the responsivities of the exposed and covered photodiodes.
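The net responsivity quoted above is simply the difference of the two measured photocurrents divided by the optical power. A short sketch of that arithmetic, with a helper name introduced here for illustration:

```python
def net_responsivity(i_exposed_a, i_covered_a, p_opt_w):
    """Net responsivity in A/W: difference of the two photocurrents per unit optical power."""
    return (i_exposed_a - i_covered_a) / p_opt_w

# -3 dBm converted to watts; the text rounds this to 500 uW
p_opt_exact = 1e-3 * 10 ** (-3.0 / 10)

# Measured currents at 2-V reverse bias, using the rounded 500-uW input power
r_net = net_responsivity(68.4e-6, 42.5e-6, 500e-6)

print(f"-3 dBm = {p_opt_exact * 1e6:.1f} uW")
print(f"net responsivity = {r_net:.4f} A/W")
```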

A pattern generator (MP1701A) was then used to modulate the 850-nm VCSEL with NRZ data. A BER tester (Anritsu MP1800A) was used for BER measurements. The optical receiver was operated in two modes that differ only in the current consumption of the TIA. Hence, two sets of measurements were obtained. The first set of measurements (low power or LP) corresponds to a total current consumption of 9 mA in the TIA. The second set of measurements (high speed or HS) corresponds to a total current consumption of 20 mA in the TIA. A  $2^{31} - 1$  PRBS pattern was used to modulate the VCSEL.

For the LP mode, the highest data rate with a BER of less than  $10^{-12}$  and an average optical input power ( $P_{opt}$ ) of -3 dBm is 4.25 Gbits/s. In order to verify the performance improvement provided by the equalizer, the boosting must be eliminated. This was achieved by pulling both  $V_{RS}$  and  $V_{CS}$  to 1.8 V. The eye diagrams with equalization off and on are shown in Fig. 24(a) and (b), respectively. The eye diagrams at 2.5 and 3.125 Gbits/s are shown in Fig. 25(a) and (b) for average optical input power values of -9.5 and -8.5 dBm, respectively. Increasing the optical input power to -3 dBm improves the output eyes at 2.5- and 3.125-Gbit/s data rates, as shown in Fig. 25(c) and (d), respectively.



Fig. 26. Measured eye diagrams in HS mode at 5 Gbits/s for a  $2^{31} - 1$  PRBS pattern and an average  $P_{\text{opt}}$  of -3 dBm with (a) equalization off and (b) equalization on (horizontal scale: 75 ps/div; vertical scale: 50 mV/div).

For the HS mode, the maximum data rate with a BER of less than  $10^{-12}$  and an average  $P_{\rm opt}$  of  $-3~\rm dBm$  is increased to 5 Gbits/s due to a larger TIA bandwidth. Shown in Fig. 26(a) and (b) are the eye diagrams with equalization off and on, respectively. The boosting of the equalizer was deactivated using the same approach as in the LP mode. The measured BER as a function of average  $P_{opt}$  at different data rates is shown in Fig. 27(a) and (b) for the LP and HS modes, respectively. It can be observed from Fig. 27(a) that the BER improvement becomes small at higher data rates with increasing average  $P_{\text{opt}}$ . This suggests that the BER in the LP mode is not limited by noise at these data rates but rather by the bandwidth of the receiver. To enable a higher data rate, the bandwidth of the TIA is increased with a higher current consumption in the HS mode. As shown in Fig. 27(b), significant improvement in BER is observed with increasing average  $P_{\rm opt}$  at 4.25 and 5 Gbits/s.

Fig. 28 shows the optical sensitivity with a BER of less than  $10^{-12}$  at different data rates in both modes. The sensitivity is better in the LP mode at lower data rates because an increased TIA bandwidth in the HS mode leads to a higher total integrated noise. Therefore, while the HS mode is suitable for higher data rates, the LP mode is preferred for data rates below 4.25 Gbits/s from both power consumption and sensitivity perspectives.

Fig. 27. Measured BER as a function of average  $P_{\rm opt}$  at different data rates under (a) LP mode and (b) HS mode.

Fig. 28. Optical sensitivity with a BER of less than  $10^{-12}$  for different data rates in both modes, with the HS mode consuming 36 mW more than the LP mode.
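The 36-mW difference quoted in the caption of Fig. 28 is consistent with the two TIA current settings, given that the TIA runs from the 3.3-V supply. The sketch below shows the arithmetic; it accounts for the TIA only and assumes the full current difference is drawn from the 3.3-V rail.

```python
V_TIA_SUPPLY = 3.3        # volts, TIA supply stated in the abstract
I_TIA_LP = 9e-3           # amperes, TIA current in LP mode
I_TIA_HS = 20e-3          # amperes, TIA current in HS mode

# Extra power drawn by the TIA when switching from LP to HS mode
extra_power_mw = (I_TIA_HS - I_TIA_LP) * V_TIA_SUPPLY * 1e3

print(f"extra TIA power in HS mode = {extra_power_mw:.1f} mW")
```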

TABLE II

|                | Technology   | Responsivity | Highest Supply | Data Rate  |
|----------------|--------------|--------------|----------------|------------|
| [4]            | 0.18-µm CMOS | -            | 1.8 V          | 3 Gbps     |
| [8]            | 0.18-µm CMOS | -            | 6 V            | 2.5 Gbps   |
| [9]            | 0.18-µm CMOS | 0.38 A/W     | 13.9 V         | 5 Gbps     |
| [12]           | 0.6-µm CMOS  | 0.1 A/W      | 5 V            | 250 Mbps   |
| [13]           | 0.18-µm CMOS | 0.07 A/W     | 3.3 V          | 3.125 Gbps |
| [23]           | 0.18-µm CMOS | 0.03 A/W     | 1.8 V          | 1.2 Gbps   |
| This work (LP) | 0.18-µm CMOS | 0.05 A/W     | 3.3 V          | 4.25 Gbps  |
| This work (HS) | 0.18-µm CMOS | 0.05 A/W     | 3.3 V          | 5 Gbps     |

TABLE III DETAILED COMPARISON OF FULLY INTEGRATED OPTICAL RECEIVERS, INCLUDING PHOTODETECTOR, TIA, AND PA

|                | Power  | Data Rate                                     | Sensitivity                                | Area                 |
|----------------|--------|-----------------------------------------------|--------------------------------------------|----------------------|
| [8]            | 138 mW | 2.5 Gbps                                      | -4.5 dBm                                   | $0.53 \text{ mm}^2$  |
| [13]           | 175 mW | 3.125 Gbps                                    | -4.2 dBm                                   | $0.7 \text{ mm}^2$   |
| [23]           | 250 mW | 1.2 Gbps                                      | -8 dBm                                     | $4.5 \text{ mm}^2$   |
| This work (LP) | 144 mW | 2.5 Gbps<br>3.125 Gbps<br>4.25 Gbps           | -9.5 dBm<br>-8.5 dBm<br>-3 dBm             | $0.72 \text{ mm}^2$  |
| This work (HS) | 183 mW | 2.5 Gbps<br>3.125 Gbps<br>4.25 Gbps<br>5 Gbps | -7.5 dBm<br>-6.8 dBm<br>-4.5 dBm<br>-3 dBm | $0.72 \text{ mm}^2$  |

## VII. CONCLUSION

By combining an SML detector and an analog equalizer in a monolithically integrated photodetector, a maximum data rate of 5 Gbits/s was achieved. To the authors' knowledge, it is the fastest photodetector integrated in a standard CMOS technology using standard supplies at or below 3.3 V. A low-noise TIA with high bandwidth and high transimpedance gain has also been proposed. By employing a negative Miller capacitance, the bandwidth of the TIA can be extended while keeping a flat frequency response.

The measurement results of the optical receiver operating in two modes are compared with those of recently published photodetectors built in standard CMOS technology in Table II. Among the references mentioned in Table II, only [8], [13], [23], and this paper have integrated the TIA and PA together with the photodetector on-chip. A more detailed comparison between these fully integrated optical receivers is summarized in Table III. Although 5 Gbits/s was reported in [9], it used a very high supply voltage of 13.9 V to reverse-bias its photodiode through an external bias-T. In addition, the sensitivity at 5 Gbits/s was not reported in [9]. Moreover, since the TIA and PA were not integrated on chip with the photodetector, an external TIA was used for testing.

In conclusion, the optical receiver achieves better sensitivity at 2.5 and 3.125 Gbits/s at a BER of less than  $10^{-12}$  compared to [8] and [13] in both LP and HS modes. When operating in the LP mode, the optical receiver does so with less power consumption than that in [13]. The improvement in sensitivity justifies the design of the proposed low-noise TIA compared to the TIA with an RGC input stage [8], [13]. The reported work achieves sensitivities of -9.5 and -7.5 dBm in the LP and HS modes, respectively, at 2.5 Gbits/s. These values are considerably poorer than, for example, the sensitivity of -20.1 dBm at 2.5 Gbits/s reported for an optical receiver with a p-i-n detector [29], because the large intrinsic region in the p-i-n photodiode, where a high electric field is present, provides higher sensitivity and bandwidth [28].

## APPENDIX

This section derives (2) and (3) for  $J_{\text{diff,p-substrate}}$  and  $J_{\text{diff,n-well}}$ , respectively. They are treated in detail in [19]. The transport of minority carriers (i.e., electrons) in the p-substrate can be expressed as [19]

$$\frac{\partial n_p}{\partial t} = D_n \frac{\partial^2 n_p}{\partial x^2} + D_n \frac{\partial^2 n_p}{\partial y^2} - \frac{n_p}{\tau_n} + f(t, y) e^{-\alpha x}$$
(29)

where  $n_p$  is the minority-carrier concentration in the region below the space-charge region normalized to the equilibrium concentration,  $D_n$  is the diffusion coefficient of electrons in the p-doped layer,  $\tau_n$  is the minority-carrier lifetime,  $\alpha$  is the absorption coefficient, and f(t, y) is the electron generation rate at the lower border of the space-charge region (x = 0).

To evaluate the current density  $J_{\text{diff,p-substrate}}$  using (29), first, the carrier profile is computed by taking the Laplace transform with respect to the variable t. From this carrier profile, a current density profile is calculated at the border of the space-charge region. Integrating the current density profile over the junction lengths of the exposed and covered photodiodes and subtracting the two results yields (2).
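As an illustration of the first step, the Laplace transform of (29) with respect to t can be written out explicitly. This is a sketch under the assumption of zero initial carrier concentration,  $n_p(x, y, 0) = 0$ , which is not stated here; the symbols  $N_p(s, x, y)$  and  $F(s, y)$  are introduced only to denote the transforms of  $n_p$  and  $f$ . The complete derivation is given in [19].

$$sN_p = D_n \frac{\partial^2 N_p}{\partial x^2} + D_n \frac{\partial^2 N_p}{\partial y^2} - \frac{N_p}{\tau_n} + F(s, y) e^{-\alpha x}$$

Collecting the terms in  $N_p$  gives

$$D_n \left( \frac{\partial^2 N_p}{\partial x^2} + \frac{\partial^2 N_p}{\partial y^2} \right) - \left( s + \frac{1}{\tau_n} \right) N_p = -F(s, y) e^{-\alpha x}$$

so the time dependence enters only through the factor  $s + 1/\tau_n$ , and the remaining problem is a spatial one in x and y.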

The transport of minority carriers (i.e., holes) in the n-well can be expressed as [19]

$$\frac{\partial p_n}{\partial t} = D_p \frac{\partial^2 p_n}{\partial x^2} + D_p \frac{\partial^2 p_n}{\partial y^2} - \frac{p_n}{\tau_p} + g(t)$$
(30)

where  $p_n$  is the minority-carrier concentration in the region above the space-charge region (i.e., n-well) normalized to the equilibrium concentration,  $D_p$  is the diffusion coefficient of holes in the n-well,  $\tau_p$  is the minority-carrier lifetime, and g(t) is expressed as

$$g(t) = \frac{\Phi_o(t)}{l_x(1 - e^{-\alpha x})}.$$
(31)

Here,  $\Phi_o(t)$  denotes the incident-light flux density. In this case,  $p_n$  and g(t) are rewritten as a product of two Fourier series, i.e., one of a square wave in the x-direction (having index n), and the other of a square wave in the y-direction (having index m). For each set of indices m and n, a carrier profile can be calculated, and the current profiles can be determined from these carrier profiles. The total contributed current is the integral of the current through the two sidewalls and the bottom layers. It can be expressed as the sum of the contributions for each of the indices m and n as in (3).

## ACKNOWLEDGMENT

The authors would like to thank CMC for the fabrication services.

## REFERENCES

- [1] J. Choi, B. Sheu, and O.-C. Chen, "A monolithic GaAs receiver for optical interconnect systems," *IEEE J. Solid-State Circuits*, vol. 29, no. 3, pp. 328–331, Mar. 1994.
- [2] M. Bitter, R. Bauknecht, W. Hunziker, and H. Melchior, "Monolithic InGaAs–InP p-i-n/HBT 40-Gb/s optical receiver module," *IEEE Photon. Technol. Lett.*, vol. 12, no. 1, pp. 74–76, Jan. 2000.
- [3] C. Hermans and M. Steyaert, "A high-speed 850-nm optical receiver front-end in 0.18-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 41, no. 7, pp. 1606–1614, Jul. 2006.
- [4] S. Radovanovic, A.-J. Annema, and B. Nauta, "A 3-Gb/s optical detector in standard CMOS for 850-nm optical communication," *IEEE J. Solid-State Circuits*, vol. 40, no. 8, pp. 1706–1717, Aug. 2005.

- [5] M. Ghioni, F. Zappa, V. Kesan, and J. Warnock, "A VLSI-compatible high-speed silicon photodetector for optical data link applications," *IEEE Trans. Electron Devices*, vol. 43, no. 7, pp. 1054–1060, Jul. 1996.
- [6] C. Schow, L. Schares, S. Koester, G. Dehlinger, R. John, and F. Doany, "A 15-Gb/s 2.4-V optical receiver using a Ge-on-SOI photodiode and a CMOS IC," *IEEE Photon. Technol. Lett.*, vol. 18, no. 19, pp. 1981–1983, Oct. 1, 2006.
- [7] J. Sturm, M. Leifhelm, H. Schatzmayr, S. Groiss, and H. Zimmermann, "Optical receiver IC for CD/DVD/blue-laser application," *IEEE J. Solid-State Circuits*, vol. 40, no. 7, pp. 1406–1413, Jul. 2005.
- [8] W.-Z. Chen and S.-H. Huang, "A 2.5 Gbps CMOS fully integrated optical receiver with lateral PIN detector," in *Proc. IEEE CICC*, Sep. 2007, pp. 293–296.
- [9] W.-K. Huang, Y.-C. Liu, and Y.-M. Hsin, "Bandwidth enhancement in Si photodiode by eliminating slow diffusion photocarriers," *Electron. Lett.*, vol. 44, no. 1, pp. 52–53, Jan. 2008.
- [10] K. Ayadi, M. Kuijk, P. Heremans, G. Bickel, G. Borghs, and R. Vounckx, "A monolithic optoelectronic receiver in standard 0.7-μm CMOS operating at 180 MHz and 176-fJ light input energy," *IEEE Photon. Technol. Lett.*, vol. 9, no. 1, pp. 88–90, Jan. 1997.
- [11] C. Rooman, M. Kuijk, R. Windisch, R. Vounckx, G. Borghs, A. Plichta, M. Brinkmann, K. Gerstner, R. Strack, P. Van Daele, W. Woittiez, R. Baets, and P. Heremans, "Inter-chip optical interconnects using imaging fiber bundles and integrated CMOS detectors," in *Proc. 27th ECOC*, 2001, vol. 3, pp. 296–297.
- [12] C. Rooman, D. Coppee, and M. Kuijk, "Asynchronous 250 Mb/s optical receivers with integrated detector in standard CMOS technology for optocoupler applications," in *Proc. 25th ESSCIRC*, Sep. 1999, pp. 234–237.
- [13] W.-Z. Chen, S.-H. Huang, G.-W. Wu, C.-C. Liu, Y.-T. Huang, C.-F. Chin, W.-H. Chang, and Y.-Z. Juang, "A 3.125 Gbps CMOS fully integrated optical receiver with adaptive analog equalizer," in *Proc. IEEE ASSCC*, Nov. 2007, pp. 396–399.
- [14] T. Kao and A. Chan Carusone, "A 5-Gbps optical receiver with monolithically integrated photodetector in 0.18-μm CMOS," in *Proc. IEEE RFIC*, Jun. 2009, pp. 451–454.
- [15] P. Gray, P. Hurst, S. Lewis, and R. Meyer, Analysis and Design of Analog Integrated Circuits. New York: Wiley, 2001.
- [16] E. Sackinger, Broadband Circuits for Optical Fiber Communication. New York: Wiley, 2005.
- [17] S. Galal and B. Razavi, "10-Gb/s limiting amplifier and laser/modulator driver in 0.18-μm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2138–2146, Dec. 2003.
- [18] M. Grözing, M. Jutzi, W. Nanz, and M. Berroth, "A 2-Gb/s 0.18-μm CMOS front-end amplifier for integrated differential photodiodes," in *Proc. Top. Meeting Silicon Monolithic Integr. Circuits RF Syst.*, Jan. 2006, pp. 361–364.
- [19] J. Genoe, D. Coppee, J. Stiens, R. Vounckx, and M. Kuijk, "Calculation of the current response of the spatially modulated light CMOS detector," *IEEE Trans. Electron Devices*, vol. 48, no. 9, pp. 1892–1902, Sep. 2001.
- [20] T. Vanisri and C. Toumazou, "The opto-electronic high-frequency transconductor and circuit applications," in *Proc. IEE Colloq. RF Des. Scene*, Feb. 1996, pp. 2/1–2/211.
- [21] S. M. Park and H.-J. Yoo, "1.25-Gb/s regulated cascode CMOS transimpedance amplifier for gigabit Ethernet applications," *IEEE J. Solid-State Circuits*, vol. 39, no. 1, pp. 112–121, Jan. 2004.
- [22] B. Razavi, *Design of Integrated Circuits for Optical Communication Systems*. New York: McGraw-Hill, 2003.
- [23] C. Hermans, F. Tavernier, and M. Steyaert, "A gigabit optical receiver with monolithically integrated photodiode in 0.18-μm CMOS," in *Proc. 32nd ESSCIRC*, Sep. 2006, pp. 476–479.
- [24] H. Zimmermann, *Silicon Optoelectronic Integrated Circuits*. Berlin, Germany: Springer-Verlag, 2004.
- [25] S. Smith and S. Personick, "Receiver design for optical communication systems," in *Semiconductor Devices for Optical Communications*, H. Kressel, Ed. Berlin, Germany: Springer-Verlag, 1980.
- [26] P. Nicholson, Nuclear Electronics. New York: Wiley, 1982.
- [27] A. A. Abidi, "On the choice of optimum FET size in wide-band transimpedance amplifiers," *J. Lightwave Technol.*, vol. 6, no. 1, pp. 64–66, Jan. 1988.

- [28] C. Hermans and M. Steyaert, Broadband Opto-Electrical Receivers in Standard CMOS. Berlin, Germany: Springer-Verlag, 2007.
- [29] R. Swoboda and H. Zimmermann, "2.5 Gb/s silicon receiver OEIC with large diameter photodiode," *Electron. Lett.*, vol. 40, no. 8, pp. 505–507, May 2004.



**Tony Shuo-Chun Kao** received the B.A.Sc. degree in electrical engineering from the University of Waterloo, Waterloo, ON, Canada, in 2006 and the M.A.Sc. degree in electrical engineering from the University of Toronto, Toronto, ON, in 2009.

He joined Gennum Corporation, Burlington, ON, as an Analog IC Designer in 2010, where he is currently working on high-speed transceiver design supporting various industry standard specifications. His research interests include broadband data communication circuits and monolithically integrated photodetectors.

Mr. Kao is the recipient of the Best Paper Award (Gold Leaf) at the 2009 Microsystems and Nanoelectronics Research Conference.



**Faisal A. Musa** received the Ph.D. degree from The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, Toronto, ON, Canada, in 2007.

During the summer of 2004, he was with the Circuits Research Laboratory, Intel Corporation, Hillsboro, OR, where he worked on the design of high-speed clock recovery systems. From 2006 to 2008, he was with Gennum Corporation, Burlington, ON, where he worked on the design and verification of high-speed integrated circuits for video and data communication applications. Since 2008, he has been with The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, first as a Lecturer and then as a Research Associate. His research interests include modeling, design, and implementation of high-speed integrated circuits for chip-to-chip communications, electronic dispersion compensation in optical links, and integrated optical receivers in CMOS and multirate delta–sigma modulators.



**Anthony Chan Carusone** (S'96–M'02–SM'08) received the B.A.Sc. and Ph.D. degrees from the University of Toronto, Toronto, ON, Canada, in 1997 and 2002, respectively.

In 2008, he was a Visiting Researcher with the University of Pavia, Pavia, Italy, and, later, with the Circuits Research Laboratory, Intel Corporation, Hillsboro, OR. Since 2001, he has been with The Edward S. Rogers Sr. Department of Electrical and Computer Engineering, University of Toronto, where he is currently an Associate Professor.

Prof. Chan Carusone was a coauthor of the Best Paper at the 2005 IEEE Compound Semiconductor Integrated Circuits Symposium and the Best Student Papers at both the 2007 and 2008 IEEE Custom Integrated Circuits Conferences. While with the University of Toronto, he was the recipient of the Governor-General's Silver Medal. He is an appointed member of the Administrative Committee of the IEEE Solid-State Circuits Society and the Board of Governors of the IEEE Circuits and Systems Society, a member and Past Chair of the Analog Signal Processing Technical Committee of the IEEE Circuits and Systems Society, and a member and Past Chair of the Wireline Communications Subcommittee of the IEEE Custom Integrated Circuits Conference. He serves as a Guest Editor for both the IEEE JOURNAL OF SOLID-STATE CIRCUITS and the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS I: REGULAR PAPERS. Since 2006, he has been a member of the editorial board of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II: EXPRESS BRIEFS and is currently the Editor-in-Chief.