# A 0.002-mm<sup>2</sup> 6.4-mW 10-Gb/s Full-Rate Direct DFE Receiver With 59.6% Horizontal Eye Opening Under 23.3-dB Channel Loss at Nyquist Frequency

Yong Chen, Member, IEEE, Pui-In Mak, Senior Member, IEEE, Li Zhang, and Yan Wang

Abstract—This paper reports a full-rate direct decision-feedback-equalization (DFE) receiver with circuit techniques to widen the data eye opening with competitive power and area efficiencies. Specifically, a current-reuse active-inductor (AI) linear equalizer is merged into a clocked-one-tap DFE core for joint elimination of pre-cursor and long-tail post-cursors. Unlike the passive-inductor designs that are bulky and untunable, the AI linear equalizer offers orthogonally tunable low- and high-frequency de-emphasis. The clocked-one-tap DFE resolves the first post-cursor via return-to-zero feedback data patterns for sharper data transition (i.e., horizontal eye opening), and is followed by a D-flip-flop slicer to maximize the data height (i.e., vertical eye opening). A 10-Gb/s DFE receiver was fabricated in 65-nm CMOS. Measured over an 84-cm printed circuit board differential trace with 23.3-dB channel loss at Nyquist frequency (5 GHz), the achieved figure-of-merit is 0.027 pJ/bit/dB (power consumption/data rate/channel loss). At a bit error rate of $10^{-12}$ under a $2^7 - 1$ pseudorandom binary sequence, the horizontal and vertical eye openings are 59.6% and 189.3 mV, respectively. The die size is 0.002 mm<sup>2</sup>.

Index Terms-Active inductor (AI), analog one-tap nonreturn-to-zero (NRZ) feedback, bathtub curve, bit error rate (BER), channel loss, clocked-one-tap return-to-zero (RZ) feedback, CMOS, decision feedback equalization (DFE) receiver, horizontal eye opening, pseudorandom binary sequence (PRBS), vertical eye opening.

## I. INTRODUCTION

FOR BETTER power and area efficiencies of massively parallel I/O links in microwave backplane data [1]-[6] and optical fiber transceivers [7], [8], circuit advances must be made to each decision-feedback-equalization (DFE) receiver node to reduce the inter-symbol interference (ISI) induced by the lengthy printed circuit board (PCB) traces, which induce severe frequency-dependent channel loss caused by the dielectric loss and skin effect. For instance, an 84-cm PCB differential

Manuscript received April 15, 2014; revised July 15, 2014; accepted September 22, 2014. Date of publication October 08, 2014; date of current version December 02, 2014. This work was supported by the National 973 Program of China under a grant, by the National 863 Program of China under grant, and by the Macau Science and Technology Development Fund (FDCT) SKL Fund.

Y. Chen, L. Zhang, and Y. Wang are with the Institute of Microelectronics, Tsinghua University, Beijing 100084, China (e-mail: chenyong@ieee.org).

P.-I. Mak is with the State-Key Laboratory of Analog and Mixed-Signal and FST-ECE, University of Macau, Macao, China (e-mail: pimak@umac.mo).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TMTT.2014.2360697

[Fig. 1 panels: (a) Sdd21 channel loss (dB) versus frequency (GHz) for the 84-cm FR4 microstrip traces; (b) normalized amplitude versus time (UI), marking the pre-cursor, main cursor, and long-tail post-cursors.]

Fig. 1. Characteristics of an 84-cm FR4 differential microstrip trace. (a) Measured channel loss is 14.1/18.7/23.3 dB at 3/4/5 GHz in the frequency domain. (b) Normalized pulse response sampled at 1-UI (100 ps) with precursor and long-tail post-cursors in the time domain.

trace can exhibit heavy channel loss (e.g., 23.3 dB at 5 GHz), as shown in Fig. 1(a). The normalized pulse response in Fig. 1(b) illustrates the existence of both precursor and long-tail post-cursors. Both can dominate the channel-induced ISI in the time domain and pollute the eye diagram. In fact, to meet a specified bit error rate (BER) (e.g., $\text{BER} = 10^{-12}$) in the bathtub curves, both vertical and horizontal eye openings are crucial. When optimizing the figure-of-merit (FOM) [1], defined as power consumption per data rate per channel loss (pJ/bit/dB), the clearness of the eye diagram must also be accounted for to justify the overall equalization efficacy.
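As a quick check of the FOM definition above, the following minimal sketch recomputes the headline figure from the values reported in this paper (6.4 mW, 10 Gb/s, 23.3 dB at 5 GHz). The function name is illustrative and not part of the original work.

```python
def fom_pj_per_bit_per_db(power_mw: float, data_rate_gbps: float, loss_db: float) -> float:
    """Return the FOM in pJ/bit/dB: power / data rate / channel loss."""
    energy_pj_per_bit = power_mw / data_rate_gbps  # mW / (Gb/s) = pJ/bit
    return energy_pj_per_bit / loss_db


# Values reported in this paper: 6.4 mW, 10 Gb/s, 23.3 dB at the 5-GHz Nyquist frequency.
fom_10g = fom_pj_per_bit_per_db(6.4, 10.0, 23.3)
print(f"FOM at 10 Gb/s: {fom_10g:.4f} pJ/bit/dB")
```

The result is roughly 0.027 pJ/bit/dB, in line with the figure quoted in the abstract.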

The long-tail post-cursors can last for a number of symbol periods, commonly entailing a large number of discrete taps to address them [2]. Regrettably, each tap calls for extra power and area, while adding more parasitic capacitance to the summation node. Recently, certain efforts [3]-[5], [9], [10] were shifted to the continuous-time infinite impulse response (IIR) filter for its better efficiency. A properly optimized IIR filter can suppress most long-tail post-cursors, but has to pre-evaluate its exponential function for matching with the long-tail response. Another concern of the IIR filter is the tight tradeoff among the number of taps, die areas (e.g., passive RC in the IIR filter) and power (e.g., current path associated with each tap). Another attempt [11] was to combine the equalization benefits of both the linear equalizer and one-tap speculative (unrolled) DFE. The linear equalizer, separately optimized from the DFE, is responsible for the long-tail post-cursors, leaving only the first post-cursor to be addressed by the one-tap speculative DFE. Still, this scheme could not address the precursor, which generally appears in the PCB channel loss [see Fig. 1(b)].



One major design consideration of the DFE is the receiver architecture. Compared with the half-rate DFEs [4], [5], [12], the full-rate direct DFEs [13], [14] are favored for their simple clocking and compact layout, fully benefitting from the $f_T$ and parasitic improvements of fine-linewidth CMOS technologies. Structurally, only a summer for the input and tap feedback signals, and a data flip-flop (DFF) as the slicer, are entailed. The main challenge is the settling accuracy of the feedback signal at the summer output before detecting the next symbol [i.e., within a one-unit-interval (1-UI) feedback loop delay]. Several solutions have been reported to improve the horizontal eye opening. For instance, the DFE receiver in [13] merges the adder and slicer, boosting the operation speed to 20 Gb/s with an FOM of 0.171 pJ/bit/dB. Another way is to use an unclocked DFE receiver [8], [14] with a variable delay to accurately tune the feedback path for optimizing the horizontal eye opening. Yet, the power and area are still penalized due to the separately designed linear equalizer and DFE core.

This paper describes a 65-nm CMOS 10-Gb/s full-rate direct DFE receiver suitable for most backplane communication systems that feature a general PCB channel-loss profile [see Fig. 1(a)] including precursor and long-tail post-cursors [see Fig. 1(b)]. It is a merged architecture combining a current-reuse *active inductor (AI) linear equalizer* to cope with both precursor and long-tail post-cursors, with a *clocked-one-tap DFE core* to resolve the residual first post-cursor. Unlike the traditional one-tap analog nonreturn-to-zero (NRZ) feedback, the clocked-one-tap is based on return-to-zero (RZ) feedback data patterns. Together with a DFF as the slicer, both signal amplitude and data-to-data transition are improved, rendering the techniques well suited for full-rate direct DFE that has a tighter timing margin than its half-rate counterparts [1].

This paper is organized as follows. The architecture of the proposed full-rate direct DFE receiver is introduced in Section II. The circuit implementation and simulation results are detailed in Section III. The experimental results are reported in Section IV. Finally, Section V draws the conclusions.

## II. PROPOSED FULL-RATE DIRECT DFE RECEIVER WITH MERGED ARCHITECTURE

To save die area, a realization without passive inductors is preferred, and this work focuses on an inductorless full-rate direct DFE receiver. A typical design [13], [14] (Fig. 2) normally consists of a linear equalizer and a full-rate one-tap direct DFE core that are separately designed and powered. The linear equalizer gain-boosts the input signal $(V_{I,DFE})$ (with severe ISI), while transferring the equalized signal to the DFE in the voltage mode. The summation path in the DFE converts the equalized voltage signal of the linear equalizer to the present current signal $(I_D)$ as the main tap signal, and simultaneously, the one-tap analog feedback path converts the previous voltage signal $(V_{O,DFE})$ to the previous current signal $(I_{tap})$ as the NRZ signal pattern. Next, the two current signals are summed in the current mode at $V_{\rm sum}$ so as to eliminate the first post-cursor. Finally, the DFF as the slicer [13] resolves and amplifies the equalized signal to a larger amplitude before conveying it to the subsequent circuitry. Notably, the sum operation in the DFE is a subtraction of current, rendering $I_{sum} < I_D$. It implies that the required signal



Fig. 2. Typical full-rate direct DFE with one-tap analog NRZ feedback. The linear equalizer and DFE core are separately powered.



Fig. 3. (a) Proposed full-rate direct DFE receiver combines an AI linear equalizer with a clocked-one-tap RZ feedback technique. (b) Same DFE receiver, but with the traditional one-tap analog NRZ feedback for comparison.

swing at the summation node is undemanding, even under a typical 1.2-V supply in 65-nm CMOS. Thus, merging more functions in one cascode cell for current reuse becomes a promising direction for power savings, while improving the performance owing to fewer V-to-I and I-to-V conversions, both of which add noise and nonlinearity.
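The subtraction performed at the summation node of a one-tap direct DFE can be illustrated with a symbol-rate behavioral sketch. It is not the circuit of this work: the cursor amplitudes `H_MAIN` and `H_POST1` are assumed values, the channel has only one post-cursor, and the tap weight is assumed to match that post-cursor exactly.

```python
import random


def one_tap_dfe(samples, tap_weight):
    """Behavioral one-tap direct DFE: subtract the weighted previous decision, then slice."""
    decisions, summed = [], []
    prev = 0.0  # no prior decision exists before the first symbol
    for x in samples:
        v_sum = x - tap_weight * prev        # summation node: main tap minus feedback tap
        d = 1.0 if v_sum >= 0 else -1.0      # slicer decision (role of the DFF)
        summed.append(v_sum)
        decisions.append(d)
        prev = d
    return decisions, summed


random.seed(0)
bits = [random.choice((-1.0, 1.0)) for _ in range(200)]

# Assumed (hypothetical) pulse response: main cursor and first post-cursor only.
H_MAIN, H_POST1 = 1.0, 0.4
rx = [H_MAIN * bits[n] + (H_POST1 * bits[n - 1] if n > 0 else 0.0)
      for n in range(len(bits))]

decisions, summed = one_tap_dfe(rx, H_POST1)

# Worst-case distance from the slicing threshold, before and after the feedback tap.
eye_before = min(abs(v) for v in rx[1:])
eye_after = min(abs(v) for v in summed[1:])
print(f"worst-case opening before DFE: {eye_before:.2f}, after DFE: {eye_after:.2f}")
```

In this idealized setting the worst-case opening grows from the main cursor minus the post-cursor to the full main cursor, which is why the swing needed at the summation node can stay small.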

This work introduces a compact linear equalizer featuring a current-reuse AI [15]. It is merged into the DFE core [see Fig. 3(a)] as the main tap path. The linear equalizer directly converts the input voltage signal with severe ISI into an equalized current signal, with the AI mainly responsible for the tunable gain peaking at high frequency. Targeting a signal swing



Fig. 4. Timing diagrams of the proposed full-rate direct DFE receiver [see Fig. 3(a)] and its one-tap analog NRZ counterpart [see Fig. 3(b)].

$\sim$200 mV<sub>pp</sub> at the summation node, one can properly elevate the supply to 1.5 V to absorb the voltage headroom consumed by the AI. With proper biasing, an elevated supply can create more design headroom to balance the signal swing and transistor overdrives without overstressing any device [16]. The AI linear equalizer can eliminate both precursor and long-tail post-cursors via its orthogonally tunable low- and high-frequency de-emphasis.

The first post-cursor is addressed by the clocked-one-tap RZ feedback associated with the DFE core. It converts the previously sliced data into an RZ tap current $I_{tap}$ under the corresponding tap weight. $I_D$ and $I_{tap}$ are then summed in the current mode and converted into a voltage $(V_{sum})$ via the load resistor $(R_L)$. The DFF slices the present data at $V_{sum}$ and delivers a clearly opened $V_{O,DFE}$ including the main cursor.

To show the effectiveness of the clocked-one-tap RZ feedback [see Fig. 3(a)], one can compare it with the traditional one-tap analog NRZ feedback [see Fig. 3(b)]. For brevity, only the internal-node timing diagrams are discussed. As illustrated in Fig. 4, the AI linear equalizer converts the unequalized voltage $V_{I,DFE}$ (i.e., with ISI) as the present symbol $(DI_1)$ into an equalized current $I_D$ as the present symbol $(DL_1)$. When the clock is high (e.g., Case 1), the previous symbol $(DD_0)$, i.e., the equalized data after DFE from the previous symbol $(DI_0)$ with ISI, mixes with $I_{tap}$ to produce the feedback current signal $(D_{0T})$ after the feedback delay time $(t_{\rm fb})$. The clock's falling edge is the sampling time. Before it, the equalized data after DFE $(DD_1 = DL_1 + D_{0T})$ must be settled within $t_{setup}$. The DFF as a slicer takes $t_{clk-Q}$ to deliver $V_{O,DFE}$. In fact, Fig. 3(a) and (b) share the same expression for the feedback loop delay $(t_{fl_{delay}})$ constraint as given by

$$t_{fl_{\text{delay}}} = t_{\text{fb}} + t_{\text{setup}} + t_{\text{clk}-Q} < 1 UI \tag{1}$$

where $t_{fl_{delay}}$ consists of two parts: the feedback delay time ($t_{\text{fb}}$) and the time required by the DFF (i.e., $t_{\text{setup}} + t_{\text{clk}-Q}$). Fig. 3(a) and (b) differ only in terms of $t_{\text{fb}}$, which will be detailed in Section III. Traditionally, shortening the critical feedback loop delay is the main focus for optimization, but this work proposes to change the feedback signal pattern for better equalization effects. The key difference between the two timing diagrams in Fig. 4 is the feedback current signal $I_{\text{tap}}$, with RZ (red line in online version) and NRZ (black line) patterns; the RZ pattern effectively improves the horizontal opening of the eye diagram at the summation node. For example, in Case 1, the equalized signal combines $DL_1$ and $D_{0T}$ after $t_{\rm fb}$. For the analog NRZ feedback, in contrast, the previous feedback signal $(D_{-1T})$ at $t_{\rm fb}$ cannot help the equalized signal at the summation node. In the proposed scheme, $D_{-1T}$ in the second half clock period contributes effectively to the summing process $(I_D + I_{tap})$, such that the equalized signal in Case 1 is lower before the rising edge and higher after it. As such, the transition edge of $V_{\rm sum}$ is more horizontally opened between two adjacent data crossings, resulting in a wider eye opening.
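The constraint in (1) can be read as a simple timing budget. The sketch below evaluates it for a 100-ps UI (10 Gb/s). All delay values are hypothetical placeholders: only the roughly 2.5-ps gap between the RZ and NRZ feedback delays is taken from the simulation results reported later in the paper, and `t_setup` and `t_clk_q` are assumed.

```python
UI_PS = 100.0  # one unit interval at 10 Gb/s


def feedback_loop_delay_ps(t_fb: float, t_setup: float, t_clk_q: float) -> float:
    """Left-hand side of (1): t_fb + t_setup + t_clk-Q, all in picoseconds."""
    return t_fb + t_setup + t_clk_q


def timing_margin_ps(t_fb: float, t_setup: float, t_clk_q: float, ui: float = UI_PS) -> float:
    """Slack left within one UI; the constraint in (1) holds when this is positive."""
    return ui - feedback_loop_delay_ps(t_fb, t_setup, t_clk_q)


# Assumed DFF timing (not given in the text) and assumed absolute feedback delays.
T_SETUP_PS, T_CLK_Q_PS = 20.0, 40.0
T_FB_NRZ_PS, T_FB_RZ_PS = 11.0, 13.5  # difference of 2.5 ps, as reported for the two schemes

margin_nrz = timing_margin_ps(T_FB_NRZ_PS, T_SETUP_PS, T_CLK_Q_PS)
margin_rz = timing_margin_ps(T_FB_RZ_PS, T_SETUP_PS, T_CLK_Q_PS)
print(f"NRZ margin: {margin_nrz} ps, RZ margin: {margin_rz} ps")
```

Under these assumptions both schemes satisfy (1), and the RZ feedback gives up only 2.5 ps of a 100-ps budget.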

Differing from the usual analog NRZ feedback, here the horizontal opening (i.e., small ISI or jitter) is first secured by the clocked RZ feedback; the DFF then serves as the slicer [13] to recover the amplitude. The final output is taken after the DFF, which can drive the subsequent circuit [e.g., clock data recovery (CDR)].

Different pattern generations are shown in Fig. 5. For a single swing from VSS to VDD, an RZ output signal can be generated by NRZ input and CLK signals, using just one AND gate [see Fig. 5(a)], resulting in an asymmetric eye diagram. Similarly, two AND gates can be utilized to generate the differential RZ output signals from -VDD to VDD [see Fig. 5(b)], leading to a symmetric eye diagram. Unlike those logic signal swings that normally draw more power and limit the data rate, this work employs a low-power alternative that works under a small-signal swing [see Fig. 5(c)]. When the clocked transistor is inserted between the common-source differential pair and bias current, the input NRZ voltage signal can be converted directly into an output RZ current signal in the differential mode.
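The logic-level pattern generation of Fig. 5(a) and (b) can be mimicked with a short behavioral sketch. Each bit is expanded into two half-UI samples, and the clock is assumed to be high during the first half of each bit; this phase choice and the function names are illustrative only. Levels are normalized, so 1 stands for VDD and -1 for -VDD.

```python
def nrz_to_rz_single(bits):
    """One AND gate: RZ = NRZ AND CLK, sampled twice per bit (clock high, then low)."""
    out = []
    for b in bits:
        for clk in (1, 0):
            out.append(b & clk)
    return out


def nrz_to_rz_differential(bits):
    """Two AND gates: RZ+ = D AND CLK, RZ- = (NOT D) AND CLK; returns RZ+ minus RZ-."""
    out = []
    for b in bits:
        for clk in (1, 0):
            rz_p = b & clk
            rz_n = (1 - b) & clk
            out.append(rz_p - rz_n)
    return out


nrz = [1, 0, 1, 1]
rz_single = nrz_to_rz_single(nrz)
rz_diff = nrz_to_rz_differential(nrz)
print(rz_single)
print(rz_diff)
```

The single-ended output only takes the values 0 and 1, which is the source of the asymmetric eye, whereas the differential output swings symmetrically around zero and returns to zero in every second half-UI.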



Fig. 5. Data pattern in time domain, eye diagram, and circuit implementation of: (a) NRZ single and RZ single. (b) NRZ single and RZ differential. (c) NRZ differential and RZ differential (proposed).

The clocked-one-tap RZ feedback is a dynamic feedback under a periodic clock control. Thus, the equalization effect can be improved substantially if the NRZ feedback signal pattern is replaced by its RZ feedback signal counterpart.

## III. CIRCUIT IMPLEMENTATION AND SIMULATION RESULTS

### A. Schematic of the Receiver

The schematic of the DFE receiver is depicted in Fig. 6. The input differential pair $(M_{1,2})$ with RC degeneration reduces the input capacitance and creates a first-order de-emphasis that is externally adjustable for more testability (e.g., adaptation to the loss of the PCB traces). Stacked atop the input differential pair is an AI sharing the same bias current. The cross-diode connection of M<sub>3,4</sub> creates a positive-feedback impedance converter [15], transferring the capacitive effect of $C_{\rm AI}/2$ at $V_{mp,n}$ into an inductive effect at $V_{\text{sump},n}$. The equivalent inductance $L_{\rm AI}$ is given by $C_{\rm AI}/2g_{m,\rm M3}$, where $g_{m,\rm M3}$ is the transconductance of $M_3$ ($M_4$). $C_{AI}$ is realized as a MOS varactor for tuning. With it, the intensity of the gain boosting at the higher frequency can be tuned to facilitate circuit adaptation at the system level, being highly insensitive to the channel-loss variations. Although the AI occupies certain voltage headroom, the flexibility and improvements that it brings are worthwhile, not to mention that the required signal swing at the summation node can indeed be small.

When the clock is high, the previous-symbol differential signal $(V_{O,\text{DFEp}} - V_{O,\text{DFEn}})$ mixes with $I_{B,\text{tap}}$ to generate the previous-symbol RZ current signal $(I_{\text{tap}})$, which is summed with the partially equalized present-symbol current signal $(I_D)$ at the summation node. The summed current $(I_{\text{sum}})$ multiplied by $R_L$ forms $V_{\text{sum}}$. Since $I_{\text{tap}}$ with the first-order post-cursor is subtracted from $I_D$ after the linear equalizer, a small signal swing results. It is noteworthy that the amplitude of $V_{\text{sum}}$



Fig. 6. Schematic of the proposed full-rate direct DFE receiver.

is uncritical here, but the adjacent data zero-crossings must be more widely opened, differing from the traditional analog NRZ feedback. As there is a DFF [13] to slice $V_{sum}$ between the zero crossings of two adjacent data, both the vertical and horizontal openings of $V_{O,DFE}$ can be cleared, inherently resolving the amplitude crossing of $V_{sum}$. The DFF recovers the amplitude of the equalized signal with little contribution to the eye opening (only 2-dB gain peaking from low to high frequencies). When the clock is low, the bias current $(I_{B,tap})$ of the tap-feedback path can be lent to the AI and $R_L$, rather than being injected directly into $V_{\rm sum}$. This technique reduces the parasitic contribution from the antiphase-clocked transistor at the summation node. This periodic injection of $I_{B,tap}$ into $R_L$ also regulates the common-mode voltage of $V_{sum}$, avoiding the need for a common-mode restoration circuit. In fact, all current paths are fully exploited at all times.

### B. AI Linear Equalizer

For simplicity, one can first study the AI linear equalizer separately. The influence of the DFE core on the AI linear equalizer will be discussed later. The derived transfer function $H(s)$ of the AI linear equalizer is given by

$$H(s) = A_{\rm DC} \frac{1 + \frac{s}{\omega_{z1}}}{\left(1 + \frac{s}{\omega_{p1}}\right) \left(1 + \frac{s}{Q\omega_0} + \frac{s^2}{\omega_0^2}\right)} \tag{2}$$

where

$$\begin{aligned}
A_{\rm DC} &= \frac{g_{m1}R_L}{1 + g_{m1}R_{\rm deg}} \\
\omega_{z1} &= \frac{1}{R_{\rm deg}C_{\rm deg}} \\
\omega_{p1} &= \frac{1 + g_{m1}R_{\rm deg}}{R_{\rm deg}C_{\rm deg}} \\
\omega_0 &= \sqrt{\frac{g_{m3}}{C_{\rm AI}C_LR_L}} \\
Q &= \frac{\sqrt{C_{\rm AI}C_Lg_{m3}R_L}}{C_Lg_{m3}R_L - C_{\rm AI}(g_{m3}R_L - 1)}
\end{aligned}$$



Fig. 7. Tunable gain response of the AI linear equalizer. (a) Low-frequency de-emphasis. (b) High-frequency de-emphasis.

where $A_{\rm DC}$ is the dc gain tunable by $V_{R,\rm set}$, and the low-frequency zero can be adjusted via $V_{C,set}$; $C_L$ denotes the parasitic capacitance at the summation node. The high-frequency gain generated from the complex poles is tunable by $V_{L,set}$ owing to the merging of the AI with the linear equalizer. Both $\omega_0$ and $Q$ are functions of one tunable element: $1/\sqrt{C_{AI}}$. An optimal $Q$ minimizes the phase distortion in the eye diagram. The AI is self-biased and occupies one diode-connected overdrive voltage. From simulations, the low-frequency gain can be tuned by around 14 dB [see Fig. 7(a)]. The AI has a high self-resonant frequency because it has no inner parasitic pole. With $M_{3,4}$ (20/0.06 $\mu$m) biased at 620 $\mu$A, $C_{AI}$ with $V_{L,set}$ ranging from 0.8 to 1.5 V can offer a gain peaking of around 15 dB near 5 GHz [see Fig. 7(b)]. With $C_{AI}$ ranging between 10–80 fF, the enhanced gain can be located at 4-1 GHz in the employed 65-nm CMOS technology.
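To see how $C_{AI}$ moves the complex poles, the expressions for $\omega_0$, $Q$, and $H(s)$ in (2) can be evaluated numerically. The sketch below does so for hypothetical component values; none of `G_M1`, `R_DEG`, `C_DEG`, `G_M3`, `R_L`, or `C_L` are taken from the paper, so the printed numbers illustrate the trend only and do not reproduce Fig. 7. Only the 10-to-80-fF range of $C_{AI}$ comes from the text.

```python
import math

# Hypothetical component values, chosen only for illustration (not from the paper).
G_M1, R_DEG, C_DEG = 8e-3, 500.0, 200e-15   # input pair and RC degeneration
G_M3, R_L, C_L = 4e-3, 500.0, 150e-15       # AI transconductance, load, node capacitance


def ai_params(c_ai):
    """Return (omega_0, Q) of the complex pole pair in (2) for a given C_AI."""
    w0 = math.sqrt(G_M3 / (c_ai * C_L * R_L))
    q = math.sqrt(c_ai * C_L * G_M3 * R_L) / (
        C_L * G_M3 * R_L - c_ai * (G_M3 * R_L - 1.0))
    return w0, q


def gain_db(f_hz, c_ai):
    """Magnitude of H(s) in (2) at s = j*2*pi*f, in dB."""
    a_dc = G_M1 * R_L / (1.0 + G_M1 * R_DEG)
    w_z1 = 1.0 / (R_DEG * C_DEG)
    w_p1 = (1.0 + G_M1 * R_DEG) / (R_DEG * C_DEG)
    w0, q = ai_params(c_ai)
    s = 2j * math.pi * f_hz
    h = a_dc * (1 + s / w_z1) / ((1 + s / w_p1) * (1 + s / (q * w0) + s**2 / w0**2))
    return 20.0 * math.log10(abs(h))


for c_ai in (10e-15, 40e-15, 80e-15):  # C_AI range stated in the text
    w0, q = ai_params(c_ai)
    print(f"C_AI = {c_ai * 1e15:.0f} fF: f0 = {w0 / (2 * math.pi) / 1e9:.2f} GHz, Q = {q:.2f}")
```

Because $\omega_0 \propto 1/\sqrt{C_{AI}}$, sweeping $C_{AI}$ by a factor of eight shifts the pole frequency by $\sqrt{8}$, independent of the other assumed values.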

A passive-inductor linear equalizer [17] can effectively extend the gain-bandwidth, but high-Q peaking brings in larger phase distortion and induces more jitter. The negative-capacitance linear equalizer [18] can also create a second zero to cancel the first pole for bandwidth extension, but another current path is required and the gain-peaking effect is limited. Unlike the aforesaid two techniques, the proposed AI linear equalizer offers wide-range-tunable gain peaking via complex poles, which are realized in the same current branch that creates the low-frequency zero. Moreover, as the Q of the AI is easily tunable, an optimal Q can be selected during the system-level adaptation to minimize the phase distortion in the equalized eye diagram. For a signal swing of ~200 mV<sub>pp</sub> at V<sub>sum</sub>, the equalization effect provided by the AI is much more significant than its added noise, which is confirmed by simulations.

As the AI is based on positive feedback, all poles must be in the left-half plane for stability. Thus, inequality (3), $C_L/C_{\rm AI} > 1 - 1/(g_{m3}R_L)$, must be satisfied so that the complex poles have negative real parts.

Fig. 8. Schematic and layout details of the DFF.

AC and transient simulations were carried out to verify the stability of the AI linear equalizer against process, voltage, and temperature variations.
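The stability condition in (3) can be traced back to the expression for $Q$ given with (2). The following short derivation is a reading of those formulas, assuming all element values are positive; it is not reproduced from the original paper. The second-order factor $1 + s/(Q\omega_0) + s^2/\omega_0^2$ has both roots in the left-half plane only when all of its coefficients are positive, i.e., when $Q > 0$. Since the numerator of $Q$ is positive, this requires a positive denominator:

$$\begin{aligned}
C_L g_{m3} R_L - C_{\rm AI}(g_{m3}R_L - 1) &> 0 \\
C_L g_{m3} R_L &> C_{\rm AI}(g_{m3}R_L - 1) \\
\frac{C_L}{C_{\rm AI}} &> 1 - \frac{1}{g_{m3}R_L}
\end{aligned}$$

where the last step divides both sides by $C_{\rm AI} g_{m3} R_L > 0$. For example, with $g_{m3}R_L = 2$ the condition reduces to $C_L > C_{\rm AI}/2$.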

### C. Power-Efficient DFF

A typical DFF consists of two latches involving two bias currents. This work adopts the pseudodynamic differential DFF [19] shown in Fig. 8, which lessens both the clock loading and power by 50%. By sharing the same clock, every clock path controls one data-set element and one data-store element. The optimized DFF has low parasitics and an ultra-compact layout $(11 \times 17 \ \mu m)$.

### D. Clocked-One-Tap RZ Feedback

As described above, the resultant feedback signal pattern of the clocked-one-tap RZ feedback can enhance the eye opening, being an effective solution to relax the stringent timing requirement of full-rate direct DFE receivers. Recalling Fig. 3(a) and (b), the traditional analog one-tap NRZ and clocked-one-tap RZ feedbacks differ in their critical feedback delay times, i.e., $t_{\rm fb,NRZ}$ and $t_{\rm fb,RZ}$. The feedback delay time is approximately given by the RC time constant at the summation node. The simplified models of Fig. 3(a) and (b) are given in Fig. 9(a) and (b), respectively. The equivalent parameters ($R_{eq}$ and $C_{eq}$) from the AI linear equalizer and $C_L$ from the DFF and feedback circuit are roughly identical between Fig. 9(a) and (b). Thus, the difference between $R_{eq,NRZ}$ and $R_{\rm eq,RZ}$ directly reflects the difference between $t_{\rm fb,NRZ}$ and $t_{\rm fb,RZ}$. The two feedback circuits are time variant, as the transistors $(M_{f1,2})$ commutate the tap current $(I_{B,tap})$ similarly to a single-balanced mixer. When the clock is high, the clocked-one-tap feedback path transfers the test signal, while the antiphase-clocked transistors no longer affect the AI linear equalizer. A time-variant resistor $R_{eq,CLK}(t)$ is modeled in the feedback path between the feedback and tap transistors. It concerns the specific time period in which the test signal turns $M_{f1}$ ON and $M_{f2}$ OFF with the clock phases. Using small-signal analysis, one can derive $R_{eq,NRZ}$ and $R_{eq,RZ}$ as given in (4).

$$\frac{C_L}{C_{\rm AI}} > 1 - \frac{1}{g_{m3}R_L} \tag{3}$$

$$\begin{aligned}
R_{\rm eq,NRZ} &= (1 + g_{m,Mf1}r_{o,Mf1})\,r_{B,Itap} + r_{o,Mf1} \\
R_{\rm eq,RZ} &= (1 + g_{m,Mf1}r_{o,Mf1})\,[r_{B,Itap} + R_{\rm eq,CLK}(t)] + r_{o,Mf1}
\end{aligned} \tag{4}$$



Fig. 9. Simplified models to simulate the feedback delay time ( $t_{fb,NRZ}$  and  $t_{fb,RZ}$ ) in full-rate direct DFE. (a) Clocked-one-tap RZ feedback. (b) Traditional analog one-tap NRZ feedback. Note that the overall feedback loop is cut at the gate of  $M_{f1,2}$  and with a test signal (similar to the DFE output signal) applied. The extracted  $t_{fb,NRZ}$  and  $t_{fb,RZ}$  are obtained by comparing the time difference between the test signal and test output.


where $g_{m,Mf1}$ and $r_{o,Mf1}$ are the transconductance and output resistance of $M_{f1}$, respectively; $r_{B,Itap}$ is the output resistance of the bias current $I_{B,tap}$. From (4), the embedded resistor resulting from the clock slightly increases the equivalent output resistance $(R_{eq,RZ})$ of the clocked-one-tap feedback when compared with $R_{\rm eq,NRZ}$. Thus, for the clocked-one-tap RZ feedback, the delay time of the critical feedback loop is slightly longer. Several simulations of $t_{\rm fb,NRZ}$ and $t_{\rm fb,RZ}$ were conducted at 10 Gb/s using a $2^7 - 1$ pseudorandom binary sequence (PRBS). The test source is applied at the input of the feedback transistors, and the output signal is extracted at the summation node. Fig. 10(a) shows that both $t_{\rm fb,NRZ}$ and $t_{\rm fb,RZ}$ rise slightly when the data rate is increased. Although $t_{\rm fb,RZ}$ of the proposed scheme is slightly longer ($\sim$2.5 ps) than $t_{\rm fb,NRZ}$, the difference is negligible for a 1-UI of 100 ps. Fig. 10(b) shows $t_{\rm fb}$ versus $V_{L,\rm set}$. Varying $V_{L,\rm set}$ induces roughly 1-ps $t_{\rm fb}$ variation due to the phase shift, which is also negligible in this design.
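The two expressions in (4) differ only by the clock-related term, which the sketch below makes explicit. The small-signal values are hypothetical, and the time-variant $R_{eq,CLK}(t)$ is frozen at a single assumed value `r_clk`, so the numbers indicate the form of the relationship rather than the actual design.

```python
def r_eq_nrz(g_m: float, r_o: float, r_bias: float) -> float:
    """R_eq,NRZ from (4): (1 + g_m * r_o) * r_B,Itap + r_o."""
    return (1.0 + g_m * r_o) * r_bias + r_o


def r_eq_rz(g_m: float, r_o: float, r_bias: float, r_clk: float) -> float:
    """R_eq,RZ from (4): the clocked device adds r_clk in series with r_B,Itap."""
    return (1.0 + g_m * r_o) * (r_bias + r_clk) + r_o


# Hypothetical small-signal values (not from the paper).
G_M, R_O, R_BIAS, R_CLK = 2e-3, 10e3, 20e3, 500.0

r_nrz = r_eq_nrz(G_M, R_O, R_BIAS)
r_rz = r_eq_rz(G_M, R_O, R_BIAS, R_CLK)
print(f"R_eq,NRZ = {r_nrz / 1e3:.1f} kOhm, R_eq,RZ = {r_rz / 1e3:.1f} kOhm")
print(f"increase = {(r_rz - r_nrz) / r_nrz * 100:.1f} %")
```

The increase equals $(1 + g_{m,Mf1}r_{o,Mf1})R_{eq,CLK}$, which stays small as long as the clocked device's resistance is small compared with $r_{B,Itap}$, in keeping with the slightly longer $t_{\rm fb,RZ}$ reported above.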

[Fig. 10 panels: (a) $t_{\rm fb,NRZ}$ and $t_{\rm fb,RZ}$ (ps) versus data rate (Gb/s); (b) $t_{\rm fb,NRZ}$ and $t_{\rm fb,RZ}$ (ps) versus $V_{L,\rm set}$ (V). Curves: proposed clocked-one-tap RZ feedback and traditional one-tap analog NRZ feedback.]

### E. Simulation Results of the Receiver

The DFE receivers in Fig. 3(a) and (b) were simulated at a 10-Gb/s data rate under a $2^7 - 1$ PRBS and nearly sinusoidal clocks for performance comparison. For the traditional design, within 1 UI, $I_{tap}$ generated from the previous symbol at $V_{O,DFE}$ with the associated tap weight exhibits the NRZ pattern. Regrettably, due to the limited circuit bandwidth, the signal amplitude and slope of the rising/falling edges of $I_{sum}$ are both low [see blue lines (in online version) in Regions A and B of Fig. 11(a)], penalizing both horizontal and vertical eye openings. These simulated results should be read against the timing diagram in Fig. 4. In Fig. 11(a), the present symbol

Fig. 10. Simulated feedback delay time ( $t_{\rm fb}$ ) of the traditional one-tap analog NRZ feedback and proposed clocked-one-tap RZ feedback versus: (a) data rate and (b)  $V_{L,\rm set}$ .

$(DI_1)$ and previous symbol $(DI_0)$ are equal to 1 and 0, respectively. In the second 0.5-UI, the feedback current $I_{tap}$ after $t_{fb}$ adds to the present current $(I_D)$ generated from the present symbol $(DI_1)$. For this case, the proposed scheme is similar to the traditional one in terms of the feedback current $I_{tap}$. In the first 0.5-UI, the traditional NRZ feedback holds the original value (~100 $\mu$A), whereas the proposed RZ feedback generates a zero value. The tap-feedback path is opened in the second 0.5-UI,



Fig. 11. Simulated results in time domain at the internal signal nodes. (a) Several UIs. (b) Zoomed view around $t_{F0}$ and $t_{F1}$. (c) Zoomed view of $I_{sum}$ and $V_{O,DFE}$ around $t_{F2}$.



Fig. 12. Simulated eye diagram at the internal signal nodes. (a) Traditional one-tap analog NRZ feedback. (b) Proposed clocked-one-tap RZ feedback.

rendering the first latch [see Fig. 3(a)] no longer sensing $V_{\rm sum}$. As such, the DFE can be disabled in the second 0.5-UI when an RZ $I_{\rm tap}$ is created.

Data zero-crossing at the rising edge is determined by the two time instants $t_{F0}$ and $t_{F1}$ in Fig. 11(b). For the traditional NRZ feedback, the equalization process at $t_{F0}$ and $t_{F1}$ follows the points

$$\begin{aligned}
R_{1s} &= R_{1t} + R_{1D} \\
F_{1s} &= F_{1t} + F_{1D}.
\end{aligned} \tag{5}$$



Fig. 13. Simulated clock falling edge shifting effects on the equalized effect at  $V_{O,DFE}$ . (a) Vertical opening. (b) pk-to-pk jitter.

For the proposed RZ feedback, the equalization process at $t_{F0}$ and $t_{F1}$ follows the points

$$\begin{aligned}
R_{2s} &= R_{2t} + R_{2D} \\
F_{2s} &= F_{2t} + F_{2D}.
\end{aligned} \tag{6}$$

From (5) and (6), as $R_{1s}-F_{1s}$ is smaller than $R_{2s}-F_{2s}$ at the summation node, the rising/falling edge during the data zero-crossing from ZERO to ONE, or ONE to ZERO, should be substantially



Fig. 14. Simulated: (a) vertical opening and (b) pk-to-pk jitter versus data rate with clock falling edge being equal to zero UI.



Fig. 15. Die photograph of the fabricated DFE receiver (*left*) and its layout details (*right*).



Fig. 16. Experimental setup.

sharper under the RZ feedback. Benefitting from such an effect, the signal amplitude and slope of the rising/falling edges of $I_{sum}$ can be boosted [red lines (in online version) in Regions A and B of Fig. 11(c)]. Although three or more consecutive ZEROs or ONEs will induce ripple on the amplitude of $I_{sum}$ ($V_{sum}$), it is uncritical to the timing decision of the DFF when the clock's falling edges are chosen as the sampling points (e.g., $t_{F1}$ and $t_{F2}$). As verified by simulations, the equalized $V_{sum}$ still exhibits small jitter at the data crossing points. After *I*-to-*V* conversion at $R_L$, $V_{sum}$ (~100 mV) is retimed by the DFF, reproducing $V_{O,DFE}$ [Regions C and D in Fig. 11(a) and (c)], in which the amplitude crossing is eliminated and the slopes of the rising/falling edges are also boosted, as for $I_{sum}$.



Fig. 17. Measured eye diagrams at 6, 8, and 10 Gb/s data rates (vertical scale: 100 mV/div).



Fig. 18. Measured bathtub curves at: (a) 6 Gb/s, (b) 8 Gb/s, and (c) 10 Gb/s under  $2^7 - 1$  PRBS.

|                                            | This Work                                           |                                                |                                                  | VLSI'13 [4]                                    | ISSCC'10 [14]                                  | JSSC'09 [5]                                   | VLSI'07 [21]                                  |                                               | RFIC'07 [28]                                                |
|--------------------------------------------|-----------------------------------------------------|------------------------------------------------|--------------------------------------------------|------------------------------------------------|------------------------------------------------|-----------------------------------------------|-----------------------------------------------|-----------------------------------------------|-------------------------------------------------------------|
| Architecture                               | Merged Equalizer-DFE +<br>Clocked-1-Tap RZ Feedback |                                                |                                                  | 2-IIR-Tap<br>DFE                               | Equalizer +<br>Un-Clocked DFE                  | 1-Tap + IIR<br>DFE                            | 2-Tap DFE                                     |                                               | Heavy Duty<br>5-Tap DFE                                     |
| DFE Clock                                  | Full Rate                                           |                                                |                                                  | Half Rate                                      | Full Rate                                      | Half Rate                                     | Half Rate                                     |                                               | Half Rate                                                   |
| CMOS Process                               | 65nm                                                |                                                |                                                  | 65nm                                           | 45nm                                           | 65nm                                          | 65nm                                          |                                               | 65nm                                                        |
| Supply Voltage (V)                         | 1.5                                                 |                                                |                                                  | 1                                              | 1.1                                            | 1                                             | 1                                             | 0.9                                           | 1.2/1                                                       |
| Power (mW)                                 | 6.4                                                 |                                                |                                                  | 9.9                                            | 29.9 <sup>b</sup>                              | 6.8                                           | 2.7                                           | 2.4                                           | 50.7                                                        |
| Channel Loss (dB)                          | 14.1<br>@ 3GHz                                      | 18.7<br>@ 4GHz                                 | 23.3<br>@ 5GHz                                   | 35.0 <sup>a</sup><br>@ 5GHz                    | 36.0<br>@ 6GHz                                 | 23.2 <sup>a</sup><br>@ 5GHz                   | 18.0<br>@ 5GHz                                | 15.0<br>@ 5.5GHz                              | 20.0<br>@ 5GHz                                              |
| Data Rate (Gb/s)                           | 6                                                   | 8                                              | 10                                               | 10                                             | 12                                             | 10                                            | 10                                            | 11                                            | 10                                                          |
| Core Area (mm <sup>2</sup> )               | 0.002                                               | 0.002                                          | 0.002                                            | 0.0304                                         | 0.0493                                         | 0.0173                                        | 0.0012                                        |                                               | 0.0117                                                      |
| Horizontal Eye Opening @<br>BER, PRBS Data | 71.5% @<br>10<sup>-12</sup>, 2<sup>7</sup>-1 | 65% @<br>10<sup>-12</sup>, 2<sup>7</sup>-1 | 59.6% @<br>10<sup>-12</sup>, 2<sup>7</sup>-1 | 31% @<br>10<sup>-12</sup>, 2<sup>7</sup>-1 | 27% @<br>10<sup>-12</sup>, 2<sup>7</sup>-1 | 45% @<br>10<sup>-9</sup>, 2<sup>7</sup>-1 | 11% @<br>10<sup>-9</sup>, 2<sup>7</sup>-1 | 34% @<br>10<sup>-9</sup>, 2<sup>7</sup>-1 | >50% @<br>10<sup>-9</sup>, 2<sup>7</sup>-1 <sup>d</sup> |
| Power/Data Rate (pJ/bit)                   | 1.07                                                | 0.8                                            | 0.64                                             | 0.99                                           | 2.49                                           | 0.68                                          | 0.27                                          | 0.22                                          | 5.07                                                        |
| FOM <sup>c</sup> (pJ/bit/dB) [1]           | 0.076                                               | 0.043                                          | 0.027                                            | 0.028                                          | 0.069                                          | 0.029                                         | 0.015                                         | 0.015                                         | 0.25                                                        |

 TABLE I

 Chip Summary and Benchmark With the State-of-the-Art (Similar Data Rates)

<sup>a</sup> Includes 2 dB of testbench loss. <sup>b</sup> Power is read from the first author's thesis. <sup>c</sup> FOM = Power / Data Rate / Channel Loss at Nyquist in dB. <sup>d</sup> Read from plot and related to another work of the authors, J. Bulzacchelli et al. [29].

The RZ signals are generated by the clocked-one-tap feedback differential pair ( $M_{f1,2}$ ); the data ONE and ZERO correspond to the positive and negative RZ codes, respectively. Fig. 12(a) and (b) shows the eye diagrams of  $I_{tap}$,  $V_{sum}$, and  $V_{O,DFE}$  based on the time-domain results given in Fig. 11. The smaller jitter at the data crossing implies a wider horizontal eye opening. The amplitude crossing at the uncritical time, induced by three or more consecutive ZEROs or ONEs, is sliced by the DFF, which restores a clear eye diagram at  $V_{O,DFE}$.

The relative position between  $V_{I,DFE}$  and the clock falling edge (clock sampling time) has been described in Fig. 4. Before the clock falling edge, there is sufficient time  $(t_{\rm fb} + t_{\rm setup})$ for equalized data setup. After the clock falling edge, the DFF takes  $t_{\rm clk-Q}$  to deliver the equalized data. In fact, these timing parameters are variable, as the relative position between  $V_{I,DFE}$ and the clock falling edge can be varied. Fig. 13(a) and (b) shows the vertical opening and pk-to-pk jitter of the eye diagram at  $V_{O,DFE}$  when the clock falling edge shifts with respect to  $V_{I,DFE}$. The overall equalization effects (i.e., vertical opening and pk-to-pk jitter) at  $V_{O,DFE}$  are substantially improved. From simulations, the tolerable shift of the clock's falling edge is -0.1 to 0.2 UI for effective equalization.
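To put the simulated clock-shift tolerance in absolute time, the following minimal sketch converts the -0.1 to 0.2 UI window into picoseconds at 10 Gb/s. The helper name `ui_to_ps` is hypothetical and introduced only for illustration; the sole assumption is that 1 UI equals one bit period (100 ps at 10 Gb/s).

```python
def ui_to_ps(ui: float, data_rate_gbps: float) -> float:
    """Convert a fraction of a unit interval (UI) into picoseconds.

    Assumes 1 UI is one bit period, i.e. 1000 / data_rate_gbps picoseconds.
    """
    bit_period_ps = 1e3 / data_rate_gbps
    return ui * bit_period_ps


# Tolerable clock falling-edge shift quoted in the text: -0.1 to 0.2 UI.
DATA_RATE_GBPS = 10.0
early_limit_ps = ui_to_ps(-0.1, DATA_RATE_GBPS)
late_limit_ps = ui_to_ps(0.2, DATA_RATE_GBPS)
window_ps = late_limit_ps - early_limit_ps

print(f"tolerable shift: {early_limit_ps:.1f} ps to {late_limit_ps:.1f} ps")
print(f"total window:    {window_ps:.1f} ps")
```

Under this assumption the tolerance corresponds to roughly a 30-ps window around the nominal sampling instant at 10 Gb/s.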

When the clock falling edge is at zero UI, the overall DFE receiver is simulated over a wide range of data rates [see Fig. 14(a) and (b)]. Better vertical opening (242 mV at 10 Gb/s) and pk-to-pk jitter (6.45 ps at 10 Gb/s) are concurrently achieved at  $V_{O,DFE}$, a significant improvement over the values (125.4 mV, 10.43 ps) obtained using the traditional one-tap analog NRZ feedback.

For long-term reliability, simulations under power-up/down and full-swing conditions must be checked to ensure that the voltage trajectories at the device terminals stay within the reliability limits in both static and transient states [20].

## IV. EXPERIMENTAL RESULTS

The 10-Gb/s DFE receiver prototype was fabricated in 65-nm CMOS. It occupies a compact area of 0.002 mm<sup>2</sup> (Fig. 15). The measurements were conducted in a chip-on-board assembly. The on-chip test buffer is an over-designed three-stage CML amplifier so that it does not affect the results of the DFE receiver. The testbench (Fig. 16) is based on the J-BERT N4903B for PRBS generation on a pair of 84-cm-long FR4 microstrip traces, yielding a channel pulse response with precursor and post-cursors.

Recalling Fig. 1(a), an 84-cm PCB differential trace suffers from channel loss of 14.1/18.7/23.3 dB at 3/4/5 GHz. Thus, all unequalized eye diagrams were completely closed. As shown in Fig. 17, at 6/8/10-Gb/s data rates, the DFE receiver manifests a vertical eye opening of 290/236/189.3 mV and pk-to-pk jitter of 23.45/22.69/26.45 ps, respectively.

The bathtub curves shown in Fig. 18(a)–(c) show 72.8/65.7/60.9% horizontal eye opening at 6/8/10-Gb/s data rates under a BER of  $10^{-9}$  and  $2^7 - 1$  PRBS. For a BER of  $10^{-12}$ , the horizontal eye opening is still 71.5/65.0/59.6% at 6/8/10-Gb/s data rates. The DFE receiver excluding the buffer draws 4.26 mA ( $I_{\rm DFF} = 2.9$  mA,  $I_{B,\rm tap} = 0.12$  mA,  $2I_{B,LE} = 1.24$  mA) at 1.5 V.
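The current breakdown above can be cross-checked against the power and energy-per-bit rows of Table I. The sketch below is only an arithmetic check using the numbers quoted in the text; the variable names are illustrative and not taken from the paper.

```python
# Branch currents quoted in the text (mA), excluding the test buffer.
currents_ma = {
    "I_DFF": 2.9,       # D-flip-flop slicer
    "I_B_tap": 0.12,    # clocked-one-tap feedback bias
    "2*I_B_LE": 1.24,   # two linear-equalizer bias branches combined
}
SUPPLY_V = 1.5

total_ma = sum(currents_ma.values())
power_mw = total_ma * SUPPLY_V  # mA * V = mW

# mW divided by Gb/s gives pJ/bit directly (1e-3 W / 1e9 bit/s = 1e-12 J/bit).
energy_pj_per_bit = {rate: power_mw / rate for rate in (6, 8, 10)}

print(f"total current: {total_ma:.2f} mA")
print(f"power:         {power_mw:.2f} mW")
for rate, energy in energy_pj_per_bit.items():
    print(f"{rate:>2} Gb/s: {energy:.2f} pJ/bit")
```

The sum reproduces the 4.26 mA stated above, and the resulting power rounds to the 6.4 mW listed in Table I.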

The chip summary and comparison with the prior art (similar data rates) are given in Table I [4], [5], [14], [21] and plotted in Fig. 19(a) and (b) [2], [4], [5], [14], [21]–[29]. The 10–11-Gb/s half-rate DFE receiver in [21] has achieved a FOM [1] of 0.015 pJ/bit/dB in a tiny area of 0.0012 mm<sup>2</sup>, but the horizontal eye opening is limited to 11%/34% at  $10^{-9}$  BER under 18/15-dB channel loss at Nyquist frequency. Clocked at a 10-Gb/s data rate, this work measures the widest horizontal eye opening of 59.6% at  $10^{-12}$  BER, while keeping the FOM and core area among the best results reported.
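The FOM defined in the footnote of Table I (power divided by data rate and by channel loss at Nyquist in dB) can be recomputed from the table entries. The sketch below is a hedged arithmetic check, assuming the rounded 6.4-mW power for this work and the tabulated values for the 10-Gb/s entry of [21]; the function name `fom_pj_per_bit_per_db` is hypothetical.

```python
def fom_pj_per_bit_per_db(power_mw: float, rate_gbps: float, loss_db: float) -> float:
    """FOM = power / data rate / channel loss at Nyquist (pJ/bit/dB)."""
    return power_mw / rate_gbps / loss_db


# (label, power in mW, data rate in Gb/s, channel loss in dB), read from Table I.
entries = [
    ("this work, 6 Gb/s", 6.4, 6.0, 14.1),
    ("this work, 8 Gb/s", 6.4, 8.0, 18.7),
    ("this work, 10 Gb/s", 6.4, 10.0, 23.3),
    ("[21], 10 Gb/s", 2.7, 10.0, 18.0),
]

foms = {label: fom_pj_per_bit_per_db(p, r, loss) for label, p, r, loss in entries}

for label, fom in foms.items():
    print(f"{label:<20} FOM = {fom:.3f} pJ/bit/dB")
```

Because the channel loss appears in the denominator, the FOM of this work improves as the data rate rises: the same 6.4 mW is spread over more bits and a lossier Nyquist point.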



Fig. 19. Benchmark of this work against the state-of-the-art: (a) FOM versus horizontal eye opening and (b) core area versus technology nodes.

## V. CONCLUSION

A 0.002-mm<sup>2</sup> 10-Gb/s full-rate direct DFE receiver has been demonstrated in 65-nm CMOS. It is capable of restoring a horizontal eye opening of 59.6% at  $10^{-12}$  BER under 23.3-dB channel loss at Nyquist frequency. This result was enabled by an AI linear equalizer that eliminates the precursor and long-tail post-cursors via orthogonally tunable low- and high-frequency de-emphasis, and by clocked-one-tap RZ feedback data patterns with a D-flip-flop slicer that address the first post-cursor with sharper data transition and enhanced signal amplitude. Benchmarked against the prior art at similar data rates, this work exhibits the widest horizontal eye opening at a FOM of 0.027 pJ/bit/dB.

#### ACKNOWLEDGMENT

The authors thank J. Yuan, IMECAS, and Y. Zhang and S. Lian for test support in the Agilent Open Laboratory.

#### REFERENCES

- [1] S. Ibrahim and B. Razavi, "Low-power CMOS equalizer design for 20-Gb/s systems," *IEEE J. Solid-State Circuits*, vol. 46, no. 6, pp. 1321–1336, Jun. 2011.
- [2] T. O. Dickson *et al.*, "A 12-Gb/s 11-mW half-rate sampled 5-tap decision feedback equalizer with current-integrating summers in 45-nm SOI CMOS technology," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1298–1305, Apr. 2009.
- [3] Y. Huang et al., "A 6 Gb/s receiver with 32.7 dB adaptive DFE-IIR equalization," in *Int. Solid-State Circuits Conf. Tech. Dig.*, Feb. 2013, pp. 42–43.
- [4] O. Elhadidy *et al.*, "A 10 Gb/s 2-IIR-tap DFE receiver with 35 dB loss compensation in 65-nm CMOS," in *IEEE Symp. VLSI Circuits*, Jun. 2013, pp. 272–273.
- [5] B. Kim et al., "A 10-Gb/s compact low-power serial I/O with DFE-IIR equalization in 65-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3526–3538, Dec. 2009.

- [6] F. Bien et al., "A 10-Gb/s reconfigurable CMOS equalizer employing a transition detector-based output monitoring technique for band-limited serial links," *IEEE Trans. Microw. Theory Techn.*, vol. 54, no. 12, pp. 4538–4547, Dec. 2006.
- [7] M. Maeng *et al.*, "0.18-μm CMOS equalization techniques for 10-Gb/s fiber optical communication links," *IEEE Trans. Microw. Theory Techn.*, vol. 53, no. 11, pp. 3509–3519, Nov. 2005.
- [8] S. Chandramouli *et al.*, "10-Gb/s optical fiber transmission using a fully analog electronic dispersion compensator (EDC) with unclocked decision-feedback equalization," *IEEE Trans. Microw. Theory Techn.*, vol. 55, no. 12, pp. 2740–2746, Dec. 2007.
- [9] S. Shahramian *et al.*, "Decision feedback equalizer architectures with multiple continuous-time infinite impulse response filters," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 59, no. 6, pp. 326–330, Jun. 2012.
- [10] A. Palaniappan and S. Palermo, "A design methodology for power efficiency optimization of high-speed equalized-electrical I/O architectures," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 21, no. 8, pp. 1421–1431, Aug. 2013.
- [11] Y. Hidaka et al., "A 4-channel 1.25–10.3 Gb/s backplane transceiver macro with 35 dB equalizer and sign-based zero-forcing adaptive control," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3547–3559, Dec. 2009.
- [12] K. Lee and J.-Y. Sim, "Half-rate clock-embedded source synchronous transceivers in 130-nm CMOS," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 22, no. 10, pp. 2093–2102, Oct. 2014.
- [13] H. Wang and J. Lee, "A 21-Gb/s 87 mW transceiver with FFE/DFE/ analog equalizer in 65-nm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 45, pp. 909–920, Apr. 2010.
- [14] M. Pozzoni *et al.*, "A 12 Gb/s 39 dB loss-recovery unclocked-DFE receiver with bidimensional equalization," in *Int. Solid-State Circuits Conf. Tech. Dig.*, Feb. 2010, pp. 164–165.
- [15] A. Pirola *et al.*, "Current-mode WCDMA channel filter with in-band noise shaping," *IEEE J. Solid-State Circuits*, vol. 45, pp. 1770–1780, Sep. 2010.
- [16] P.-I. Mak and R. P. Martins, "A 2× V<sub>DD</sub>-enabled mobile-TV RF front-end with TV-GSM interoperability in 1-V 90-nm CMOS," *IEEE Trans. Microw. Theory Techn.*, vol. 58, no. 7, pp. 1664–1676, Jul. 2010.
- [17] S. Gondi and B. Razavi, "Equalization and clock and data recovery techniques for 10-Gb/s CMOS serial links," *IEEE J. Solid-State Circuits*, vol. 42, no. 9, pp. 1999–2011, Sep. 2007.
- [18] D. Lee *et al.*, "10 Gbit/s 0.0065 mm<sup>2</sup> 6 mW analogue adaptive equalizer utilizing negative capacitance," *Electron. Lett.*, vol. 45, no. 17, pp. 863–865, Aug. 2009.
- [19] A. Benachour, "Pseudo-dynamic differential flip-flop," U.S. Patent 6 140 845, Oct. 31, 2000.
- [20] P.-I. Mak and R. P. Martins, *High-/Mixed-Voltage Analog and RF Circuit Techniques for Nanoscale CMOS*, ser. Analog Circuits Signal Process. (ACSP). Berlin, Germany: Springer, 2012.
- [21] A. Rylyakov, "An 11 Gb/s 2.4 mW half-rate sampling 2-tap DFE receiver in 65 nm CMOS," in *IEEE VLSI Circuits Symp.*, Jun. 2007, pp. 272–273.
- [22] L. Chen *et al.*, "A scalable 3.6-to-5.2 mW 5-to-10 Gb/s 4-tap DFE in 32 nm CMOS," in *Int. Solid-State Circuits Conf. Tech. Dig.*, Feb. 2009, pp. 180–181.
- [23] M. Pozzoni *et al.*, "A multi-standard 1.5 to 10 Gb/s latch-based 3-tap DFE receiver with a SSC tolerant CDR for serial backplane communication," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1306–1315, Apr. 2009.
- [24] F. Spagna *et al.*, "A 78 mW 11.8 Gb/s serial link transceiver with adaptive RX equalization and baud-rate CDR in 32 nm CMOS," in *Int. Solid-State Circuits Conf. Tech. Dig.*, Feb. 2010, pp. 366–367.
- [25] T. Toifl *et al.*, "A 2.6 mW/Gbps 12.5 Gbps RX with 8-tap switched-capacitor DFE in 32 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 47, no. 4, pp. 897–910, Apr. 2012.
- [26] M. H. Nazari and A. Emami-Neyestanak, "A 15-Gb/s 0.5-mW/Gbps two-tap DFE receiver with far-end crosstalk cancellation," *IEEE J. Solid-State Circuits*, vol. 47, no. 10, pp. 2420–2432, Oct. 2012.
- [27] J. Bulzacchelli et al., "A 78 mW 11.1 Gb/s 5-tap DFE receiver with digitally calibrated current-integrating summers in 65 nm CMOS," in *Int. Solid-State Circuits Conf. Tech. Dig.*, Feb. 2009, pp. 368–369.
- [28] J. Bulzacchelli *et al.*, "Power-efficient decision-feedback equalizers for multi-Gb/s CMOS serial links," in *Proc. IEEE Radio Freq. Integr. Circuits Symp.*, Jun. 2007, pp. 507–510.
- [29] J. Bulzacchelli et al., "A 10-Gb/s 5-tap DFE/4-tap FFE transceiver in 90-nm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2885–2900, Dec. 2006.



Yong Chen (S'10–M'11) received the B.S. degree in electronic and information engineering from the Communication University of China, Beijing, China, in 2005, and the Ph.D. degree in microelectronics and solid-state electronics from the Institute of Microelectronics of Chinese Academy of Sciences (IMECAS), Beijing, China, in 2010.

Since 2010, he has been with the Computer-Aided Design (CAD) Division, Institute of Microelectronics, Tsinghua University, Beijing, China, as a Post-Doctoral Researcher. His current research interests include analog/mixed-signal/millimeter-wave integrated circuits, and high-speed wireline circuits and systems.



**Pui-In Mak** (S'00–M'08–SM'11) received the Ph.D. degree from the University of Macau (UM), Macao, SAR, China, in 2006.

He is currently an Associate Professor with the UM, and Coordinator of the Wireless and Biomedical Research Lines of the State Key Laboratory of Analog and Mixed-Signal VLSI. His research interests are analog and RF circuits and systems for wireless, biomedical, and physical chemistry applications. His group reported six state-of-the-art chips at the IEEE International Solid-State Circuits Conference: wideband receivers (2011, 2014), micro-power amplifiers (2012, 2014), and ultra-low-power ZigBee receivers (2013, 2014), and pioneered the world's first Intelligent Digital Microfluidic Technology (iDMF) with Nuclear Magnetic Resonance (NMR) and Polymerase Chain Reaction (PCR) capabilities. He authored *Analog-Baseband Architectures and Circuits for Multistandard and Low-Voltage Wireless Transceivers* (Springer, 2007) and *High-/Mixed-Voltage Analog and RF Circuit Techniques for Nanoscale CMOS* (Springer, 2012).

Prof. Mak has been involved with the IEEE including Distinguished Lecturer (20–2015) and a member of the Board of Governors (2009–2011) of the IEEE Circuits and Systems Society (CASS). He is an Editorial Board member of the IEEE Press (2014–2016), senior editor of the IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS (2014–2015), associate editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS (2010–2011 and 2014–present), associate editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS (2010–2013), and guest editor of the IEEE RFIC VIRTUAL JOURNAL (2014). He is a Technical Program Committee (TPC) member of the Asian Solid-State Circuits Conference (A-SSCC). He was the recipient of the IEEE DAC/ISSCC Student Paper Award (2005), the IEEE CASS Outstanding Young Author Award (2010), the National Scientific and Technological Progress Award (2011), and the Best Associate Editor for the TRANSACTIONS ON CIRCUITS AND SYSTEMS—II: EXPRESS BRIEFS (2012–2013). In 2005, he was decorated with the Honorary Title of Value for scientific merits by the Macau Government.



Li Zhang received the B.S. degree in electronic engineering from Tsinghua University, Beijing, China, in 1994, and the M.S. degree from the Institute of Microelectronics, Chinese Academy of Science, Beijing, in 1997.

Since 1997, she has been a Lecturer with the Institute of Microelectronics, Tsinghua University. Her research centers on device modeling and RF circuit design.



Yan Wang received the B.S. and M.S. degrees in electrical engineering from Xi'an Jiaotong University, Xi'an, China, in 1988 and 1991, respectively, and the Ph.D. degree in semiconductor device and physics from the Chinese Academy of Sciences, Beijing, China, in 1995.

Since 1999, she has been a Professor with the Institute of Microelectronics, Tsinghua University, Beijing, China. Her current research interests include modeling of RF/microwave (MW)/millimeter-wave (MMW) devices and components, the carrier transport models in scaled-down metal-oxide-semiconductor devices, and computer-aided-design software development for micro-devices and nano-devices.