# A 19.6-Gbps CMOS Optical Receiver with Local Feedback IIR DFE

Alireza Sharif-Bakhtiar, Anthony Chan Carusone Department of Electrical and Computer Engineering University of Toronto, Canada

*Abstract*—This paper describes a low power optical receiver for discrete photodiodes. The receiver utilizes an input stage bandwidth of only 2GHz, affording high gain with low power consumption while limiting input-referred noise. The resulting ISI is eliminated and data recovered using an IIR DFE. The IIR DFE utilizes a local feedback to relax the timing criteria of the DFE loop. The 65-nm CMOS chip consumes 14.7 mW at 19.6Gbps.

### I. INTRODUCTION

By reducing the cost and power consumption of optical transceivers, optical links may replace copper in emerging high-volume wireline communication applications for high-performance computing and networking. Directly-modulated 850nm VCSEL-based links over multimode fiber and receivers utilizing discrete photodetectors offer the lowest cost optoelectronic components and optical packaging suitable for 20+Gb/s communication. However, lowering the receiver power consumption while maintaining adequate sensitivity above 10Gbps remains a challenge. Conventional optical receivers consist of a transimpedance amplifier (TIA) followed by a high-gain amplifier chain serving as a limiting amplifier [1], [2]. In such systems the overall bandwidth needs to be at least approximately 70% of the input bit-rate ($f_{bit}$) to maintain signal integrity. As the data rate increases, the power required to increase the front-end bandwidth without losing gain or sensitivity becomes prohibitive.

Recently-reported energy-efficient optical receivers use alternative architectures to permit front-end bandwidths below $0.2 f_{bit}$ while maintaining signal integrity. In [3], [4], a double-sampling technique was used to realize a 2-tap (differentiating) FFE that equalizes the low-bandwidth (integrating) front-end. Excellent energy efficiency is achieved, down to 0.17pJ/bit in [4]. However, this technique inherently introduces noise enhancement due to the use of an FFE. Moreover, [3] omits the RS latch that would be required to recover NRZ data and requires multiple clock phases whose generation is not accounted for in the reported power consumption, whereas the power consumption in [4] benefits from the combination of 28-nm CMOS technology and a silicon photonic photodiode at 1310nm with only 8fF capacitance ($C_{PD}$).

Alternatively, optical receivers in [5], [6] equalize the bandwidth limitations of the receiver front-ends using a decision feedback equalizer (DFE). DFEs do not introduce noise enhancement. However, the required feedback loop timing limits the maximum data rate to 9Gb/s in 90-nm CMOS [6].

This work presents a bandwidth-limited TIA in combination with a novel implementation of an infinite impulse response (IIR) DFE. The DFE uses local feedback to relax its feedback timing constraint and increase its maximum operating speed to 19.6Gbps while consuming 0.18 pJ/bit. The overall receiver consumes 0.75 pJ/bit.

### II. CIRCUIT IMPLEMENTATION

**Front-end** The front-end comprises a pseudo-differential regulated cascode TIA [7], a programmable gain amplifier (PGA) and an offset cancellation loop. The TIA (Fig. 1) frequency response is dominated by its output pole. As a result, the overall response of the TIA can be approximated by a first-order filter with bandwidth $f_{TIA}$ (inversely proportional to $R_D$ times the output capacitance of the TIA) and a transimpedance gain of $R_D$. Assuming that the DFE removes all the inter-symbol interference (ISI) introduced by the front-end, increasing $R_D$ increases the signal main cursor amplitude and is beneficial. However, for large values of $R_D$ where $f_{TIA} < 0.15 f_{bit}$, further increases in $R_D$ grow the main cursor amplitude very little, whereas the rms noise at the output of the TIA continues to grow in proportion to $\sqrt{R_D}$. Hence, an optimum value of $R_D$ maximizes the ratio of signal power in the main cursor of the received pulse response to noise power. For this receiver, the optimum occurs at $f_{TIA} \approx 0.1 f_{bit}$. A similar front-end bandwidth was adopted for the DFE-based optical receiver in [6].
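The diminishing return from raising $R_D$ can be illustrated with a minimal numerical sketch. It assumes an ideal single-pole front-end driven by a one-bit-wide current pulse and sampled at the end of the bit; the $\sqrt{R_D}$ noise scaling is taken from the text above. The function names `main_cursor` and `doubling_rd` are hypothetical, and this idealized model only shows the trend: it is not expected to reproduce the optimum reported for this receiver, which depends on the actual noise sources.

```python
import math

def main_cursor(bw_ratio):
    """Main cursor of a single-pole front-end, normalized to I*T/C_out.

    bw_ratio = f_TIA / f_bit. Assumes a 1-UI current pulse of amplitude I,
    sampled at the end of the bit (an idealization, not the measured response).
    """
    x = 2 * math.pi * bw_ratio          # x = T / (R_D * C_out)
    return (1 - math.exp(-x)) / x

def doubling_rd(bw_ratio):
    """Return (cursor gain, rms noise gain) when R_D doubles, i.e. f_TIA halves."""
    cursor_gain = main_cursor(bw_ratio / 2) / main_cursor(bw_ratio)
    noise_gain = math.sqrt(2)           # rms noise grows as sqrt(R_D)
    return cursor_gain, noise_gain

for ratio in (0.7, 0.3, 0.15):
    c, n = doubling_rd(ratio)
    print(f"f_TIA = {ratio:.2f} f_bit: cursor x{c:.2f}, rms noise x{n:.2f}")
```

Under these assumptions, doubling $R_D$ from a wideband starting point grows the main cursor faster than the noise, while doubling it from $f_{TIA} = 0.15 f_{bit}$ grows the noise faster than the cursor.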

**DFE** Assuming 20Gbps input data and $f_{TIA} = 2$GHz, the DFE has to compensate for about 15dB loss at the Nyquist frequency. Given the first-order response of the TIA, the design uses an IIR DFE. However, closing the feedback timing in a conventional IIR DFE becomes difficult and power hungry at 20Gbps. The proposed DFE utilizes a local feedback to mitigate this problem. The DFE is shown in Fig. 2 in the two clock half-cycles. In Fig. 2a, when "CLK" is low, the gate-source voltages on input transistors M<sub>1</sub> compare the DFE input voltage ($V_A$) to the voltage across the local $R_F C_F$ feedback ($V_F$). At this point the output nodes are charged to $V_{DD}$. In Fig. 2b, at the rising edge of "CLK", the cross-coupled pair ($M_3$ and $M_4$) trips to one side or the other depending on the comparison $V_A - V_F$ in step (a). The charge on the output node that goes low (right side in Fig. 2b) is injected onto the corresponding local feedback capacitor $C_F$, providing a shift in the feedback voltage of $\pm \Delta V_F$ whose polarity depends upon the received bit. The resistor in the local feedback continuously discharges $C_F$ with the desired time-constant ($\tau$), providing an exponential decay in $V_F$ that corresponds to first-order IIR filtering of all past recovered bits. The feedback delay of this structure reduces simply to the settling time of the cross-coupled pair. Note that the same $R_F C_F$ network is shared by two time-interleaved latches, allowing the IIR DFE to be incorporated into a half-rate receiver without explicitly multiplexing the recovered data back to full rate in the feedback path, as in past IIR DFEs, e.g. [8].
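The equalization principle can be sketched with a simple behavioral model. This is a full-rate, noise-free abstraction, not the half-rate circuit of Fig. 2. It assumes a single-pole front-end sampled at the end of each bit, a feedback step $\Delta V_F$ equal to the main cursor, and a feedback time constant perfectly matched to the front-end. The names `nyquist_loss_db` and `recover` are hypothetical.

```python
import math
import random

def nyquist_loss_db(f_tia, f_bit):
    """Loss of a single-pole low-pass at f_bit/2 relative to DC, in dB."""
    return 10 * math.log10(1 + (f_bit / 2 / f_tia) ** 2)

def recover(bits, f_tia=2e9, f_bit=20e9, use_dfe=True):
    """Slice a single-pole-filtered NRZ stream, optionally with a 1st-order IIR DFE.

    bits are +1/-1; voltages are normalized so the settled level is +/-1.
    """
    a = math.exp(-2 * math.pi * f_tia / f_bit)  # per-UI decay of the front-end
    delta_vf = 1 - a    # assumed ideal feedback step, equal to the main cursor
    v_a = 0.0           # DFE input V_A, sampled at the end of each UI
    v_f = 0.0           # local feedback voltage V_F
    decisions = []
    for b in bits:
        v_a = a * v_a + (1 - a) * b             # single-pole TIA response
        d = 1 if v_a - v_f > 0 else -1          # latch compares V_A with V_F
        decisions.append(d)
        if use_dfe:
            # The decision injects +/-delta_vf, then the RC network decays it
            # over one UI (time constant assumed matched to the front-end).
            v_f = a * (v_f + delta_vf * d)
    return decisions

rng = random.Random(0)
# A run of identical bits followed by a transition stresses the unequalized slicer.
bits = [1] * 6 + [-1] + [rng.choice((-1, 1)) for _ in range(2000)]
errors_dfe = sum(d != b for d, b in zip(recover(bits), bits))
errors_plain = sum(d != b for d, b in zip(recover(bits, use_dfe=False), bits))
print(f"Nyquist loss: {nyquist_loss_db(2e9, 20e9):.1f} dB")
print(f"bit errors with IIR DFE: {errors_dfe}, without: {errors_plain}")
```

In this idealized setting the feedback voltage tracks the accumulated post-cursor ISI exactly, so the slicer sees only the main cursor. A mismatched time constant or step size would leave residual ISI, which is one way to read the spread among the bathtub curves for different time-constants in Fig. 3a.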

### III. MEASUREMENT RESULTS

The receiver was fabricated in 65-nm CMOS. The CMOS chip was then wirebonded to a COSEMI BPD2010 photodiode (and a dummy load). The PD is intended for 10Gbps operation, but the IIR DFE allows its use beyond that speed. An 850-nm light source (VCSEL) is used for the measurements. The bathtub curves at 17Gbps for different DFE time-constants are shown in Fig. 3a. The bathtub curve with the best time-constant shows a 40% opening at BER = $10^{-12}$. Fig. 3b shows the optical sensitivity at different data rates. The sensitivity is limited by the sensitivity of the DFE latches (not the front-end noise) and reaches -5.9 dBm and -4.9 dBm optical modulation amplitude (OMA) at 18Gbps and 19.6Gbps, respectively. The receiver, including the clock buffers, consumes 0.65 pJ/bit at 18Gbps and 0.75 pJ/bit at 19.6Gbps. The power efficiency of the DFE alone is 180 fJ/bit at the latter data rate. The comparison table summarizes the relevant CMOS receivers with discrete photodiodes.
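The reported figures can be cross-checked with a few unit conversions. The sketch below is only arithmetic on numbers quoted in this paper; the helper names are hypothetical, and the 0.5 A/W responsivity is the value listed in the comparison table.

```python
def energy_per_bit_pj(power_mw, rate_gbps):
    """Energy per bit in pJ: mW divided by Gb/s gives pJ/bit."""
    return power_mw / rate_gbps

def power_mw(energy_pj_per_bit, rate_gbps):
    """Power in mW from energy per bit (pJ) and data rate (Gb/s)."""
    return energy_pj_per_bit * rate_gbps

def oma_dbm_to_current_ua(oma_dbm, responsivity_a_per_w):
    """Peak-to-peak photocurrent in uA for a given OMA in dBm."""
    oma_mw = 10 ** (oma_dbm / 10)
    return oma_mw * responsivity_a_per_w * 1e3   # mW * A/W = mA, then mA -> uA

print(f"{energy_per_bit_pj(14.7, 19.6):.2f} pJ/bit at 19.6 Gb/s")
print(f"{power_mw(0.65, 18):.1f} mW total at 18 Gb/s")
print(f"{power_mw(0.18, 19.6):.2f} mW for the DFE alone at 19.6 Gb/s")
print(f"{oma_dbm_to_current_ua(-3.5, 0.5):.0f} uA pp at -3.5 dBm OMA")
```

The 14.7 mW quoted in the abstract corresponds to 0.75 pJ/bit at 19.6Gbps, and an OMA of -3.5 dBm at 0.5 A/W gives a photocurrent of the same order as the value quoted in the Fig. 3 caption.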

**Acknowledgements** The authors would like to thank Fujitsu Laboratories of America for their support, and CMC for CAD, fabrication, and packaging.

#### REFERENCES

- [1] Takemoto, T., et al., "A 25-to-28 Gb/s High-Sensitivity (-9.7 dBm) 65 nm CMOS Optical Receiver for Board-to-Board Interconnects," JSSC, Oct. 2014.
- [2] Lee, Benjamin G., et al., "Latch-to-latch CMOS-driven optical link at 28 Gb/s," CLEO, 2014.
- [3] Nazari, M.H.; Emami-Neyestanak, A., "An 18.6Gb/s double-sampling receiver in 65nm CMOS for ultra-low-power optical communication," ISSCC, 2012.
- [4] Saeedi, S.; Emami, A., "A 25Gb/s 170µW/Gb/s optical receiver in 28nm CMOS for chip-to-chip optical communication," RFIC, 2014.
- [5] Rylyakov, A.V., et al., "A new ultra-high sensitivity, low-power optical receiver based on a decision-feedback equalizer," OFC/NFOEC, 2011.
- [6] Proesel, J., et al., "Optical Receivers Using DFE-IIR Equalization," ISSCC, 2013.
- [7] Kromer, C., et al., "A low-power 20-GHz 52-dBΩ transimpedance amplifier in 80-nm CMOS," JSSC, June 2004.
- [8] Shahramian, S.; Chan Carusone, A., "A 10Gb/s 4.1mW 2-IIR + 1-discrete-tap DFE in 28nm-LP CMOS," ESSCIRC, 2014.



Fig. 1: Receiver block diagram with TIA and PGA schematics in the inset



Fig. 2: IIR DFE operation (a) when CLK = "0" (b) at the rising edge of CLK



Fig. 3: (a) Bathtub curves at 17Gbps (PRBS7) for different IIR DFE time-constants, OMA = -3.5dBm ($230\mu$A<sub>pp</sub>) (b) Optical sensitivity



Fig. 4: (a) Optical eye (top) and half-rate electrical output at 19.6Gbps (b) CMOS chip wirebonded to the PD chip

Comparison Table

|                           | [2]                          | [3]                          | [5]             | [6]       | This Work     | This Work     |
|---------------------------|------------------------------|------------------------------|-----------------|-----------|---------------|---------------|
| Technology                | 32nm SOI                     | 65nm CMOS                    | 90nm CMOS       | 90nm CMOS | 65nm CMOS     | 65nm CMOS     |
| Architecture              | TIA + Latched DMUX + TX FFE  | RC BW-limited Double Sampler | TIA + 2-tap DFE | DFE-IIR   | TIA + IIR DFE | TIA + IIR DFE |
| Data Rate (Gb/s)          | 28                           | 18.6                         | 4               | 9         | 18            | 19.6          |
| C<sub>PD</sub> (fF)       | 85                           | 120                          | 140             | 140       | 200           | 200           |
| PD Responsivity (A/W)     | 0.55                         | 1                            | 0.55            | 0.55      | 0.5           | 0.5           |
| Sensitivity (dBm OMA)     | -6                           | -4.7\*                       | -22             | -5        | -5.9          | -4.9          |
| Power Efficiency (pJ/bit) | 1.95                         | 0.4\*\*                      | 1.15            | 0.93      | 0.65          | 0.75          |
| Area (mm<sup>2</sup>)     | 0.012                        | 0.0028                       | 0.0045          | 0.004     | 0.027         | 0.027         |

\* Coupling losses de-embedded. \*\* Does not include clock generation and SR latch.