# An Area-Efficient and Tunable Bandwidth-Extension Technique for a Wideband CMOS Amplifier Handling 50+ Gb/s Signaling

Yong Chen, *Member, IEEE*, Pui-In Mak, *Senior Member, IEEE*, Haohong Yu, Chirn Chye Boon, *Senior Member, IEEE*, and Rui P. Martins, *Fellow, IEEE* 

Abstract—This paper reports an area-efficient and tunable bandwidth (BW)-extension technique for a wideband CMOS amplifier to handle very high rate (50+ Gb/s) signaling while keeping a low jitter penalty. We identify its architectural advantages by correlating the performances with the frequency domain (magnitude and group delay (GD) responses) and time domain (impulse and step responses) and comparing them with the existing solutions. Specifically, our technique enables a flexible ac characteristic by introducing a tunable grounded active inductor in the bridged-shunt peaking topology, offering: 1) a high BW enhancement ratio (BWER =  $2.65 \times$ ); 2) BW-power scalability with small in-band gain variation; and 3) fine tunability of the passband gain without affecting the BW, GD, and power. The experimental prototype is a 65-nm CMOS four-stage differential amplifier occupying just 0.0077 mm<sup>2</sup>. It delivers a 15-dB gain over a 43-GHz BW with 45-mW power consumption. Small in-band gain variation (0.58 dB) and ripple (1.53 dB) are concurrently achieved with low in-band GD variation (17 to 35.3 ps) and ripple (18.3 ps). The achieved figure of merit of 5.48 [(dc Gain × BW)/Power] compares favorably with the prior art.

Index Terms—AC characteristic, bandwidth (BW), bridged-shunt peaking, CMOS, data-dependent jitter (DDJ), figure of merit (FOM), grounded active inductor (GAI), group delay ripple (GDR), intersymbol interference (ISI), shunt peaking, T-coil, wideband amplifier.

### I. INTRODUCTION

**T**O COPE with the next-generation 400-Gb/s Ethernet systems which typically consist of eight channels in

Manuscript received November 2, 2016; revised March 18, 2017; accepted May 20, 2017. Date of publication July 17, 2017; date of current version December 12, 2017. This work was supported in part by the University of Macau under Grant MYRG2017-00223-AMSV and in part by VIRTUS in Nanyang Technological University, Singapore. (*Corresponding author: Yong Chen.*)

Y. Chen was with VIRTUS, School of Electric and Electronic Engineering, Nanyang Technological University, Singapore 639798. He is now with the State Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macao, China (e-mail: ychen@umac.mo).

P.-I. Mak is with the State Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macao, China, and also with the Faculty of Science and Technology, Department of Electrical and Computer Engineering, University of Macau, Macao, China (e-mail: pimak@umac.mo).

H. Yu and C. C. Boon are with VIRTUS, School of Electric and Electronic Engineering, Nanyang Technological University, Singapore 639798.

R. P. Martins is with the State Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macao, China, with the Faculty of Science and Technology, Department of Electrical and Computer Engineering, University of Macau, Macao, China, and also with the Instituto Superior Técnico, Universidade de Lisbon, 1600-276 Lisbon, Portugal (e-mail: rmartins@umac.mo).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TMTT.2017.2720600

parallel [1], each transceiver should support a data rate higher than 50 Gb/s under challenging power and area budgets. Wideband amplifiers [2]–[5] with tens-of-GHz bandwidth (BW) are critical blocks that amplify the data with a low jitter penalty (i.e., a better eye diagram). Currently, inductive-peaking [6]–[23] and distributed [24], [25] topologies dominate such a role, owing to their large gain and BW capability in nanoscale CMOS. Yet, most of the above topologies are multistage designs to accumulate an acceptable gain. A large number of passive components (e.g., inductors, T-coils, transformers, and transmission lines) can severely raise the floorplan complexity due to magnetic coupling and routing parasitics, while penalizing the chip area (e.g., 0.94 mm<sup>2</sup> in [24] and 0.41 mm<sup>2</sup> in [25]).

For inductive wideband amplifiers, the effectiveness and flexibility of their BW-extension techniques are crucial. Single-passive-inductor topologies: shunt peaking [6]–[9], bridged-shunt peaking [10], [11], and series peaking [12]–[14] can improve the area-efficient BW-extension ratio (BWER), defined as the 3-dB BW  $(f_{-3 \text{ dB}})$  of the BW-extended amplifier over the basic RC amplifier. Yet, their BWER, limited to  $3.17 \times$  with 2-dB peaking [6], cannot be easily tuned. Their variants, such as the (bridged-) shunt-series peaking [15]–[19], show a better BWER of  $4\times$ , but require more passive inductors. One of them, located in the signal path, can strongly affect the in-band gain and group delay (GD) characteristics. Other alternatives: T-coil peaking [20], [21], transformer peaking [22], and synthesis-based peaking with LC ladder [23] can exhibit a high BWER close to  $5\times$ . Yet, the tradeoff is the increased number of the passive inductive components, while the promised BWER only occurs under some specified load conditions (Table I). Currently, Weiss et al. [17] allow adjusting the frequency response by changing the load resistor, but the penalty is a large in-band gain variation ( $\sim 10 \text{ dB}$ ).

This paper aims at an area-efficient BW-extension technique that offers a wide tunability of gain and GD characteristics to optimize the data eye, which is determined by both the signal amplitude and jitter. All existing BW-extension techniques only add more inductive components to pursue a higher BWER in the frequency domain, as summarized in [6]. In fact, the transient performances such as settling time, vertical opening, and jitter of the data eye should jointly be considered for high-speed systems. Although Walling *et al.* [26] described the time-domain characteristics (e.g., step response) associated with different gain-peaking techniques, it still remains unclear

0018-9480 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.

| BW-Extension<br>Techniques              | No. of<br>Zero/Pole | No. of Inductive<br>Element Per Stage | Limitation<br>by k <sub>c</sub> * | ac<br>Tunability | BWER<br>/0dB | BWER<br>/Peaking (dB) |
|-----------------------------------------|---------------------|---------------------------------------|-----------------------------------|------------------|--------------|-----------------------|
| Basic RC                                | 0 / 1               | 0                                     | No                                | No               | 1x           | 1x /                  |
| Shunt peaking                           | 1/2                 | 1                                     | No                                | No               | 1.72x        | 1.84x / 1.5           |
| Bridged-shunt peaking                   | 2/3                 | 1                                     | No                                | No               | 1.83x        | 1.83x / 1.5           |
| Series peaking                          | 0/3                 | 1                                     | Yes<br>(0.1 ~ 0.5)                | No               | 2.52x        | 3.17x / 2             |
| Bridged-shunt GAI<br>peaking (proposed) | 3 / 4               | 1                                     | No                                | Yes              | 2.56x        | 3.55x / 1.71          |
| Bridged-shunt series<br>peaking         | 2/5                 | 2                                     | Yes<br>(0.4 ~ 0.5)                | No               | 3.92x        | 4x / 2                |
| Asymmetric T-coil<br>peaking            | 1/4                 | 3**                                   | Yes<br>(0.1 ~ 0.4)                | No               | 3.93x        | 5.59x / 2             |

 TABLE I

 Summary of the Existing and Proposed BW-Extension Topologies (Listed According to Their BWER)

\* $k_c = C_1/(C_1 + C_2) = C_1/C_L$, where $C_1$ is the drain parasitic capacitance, $C_2$ is the load capacitance, and $C_L$ is the total capacitance.

\*\* The equivalent small-signal network with T-model of the transformer [6] involves three inductive elements



Fig. 1. Key parameters of wideband amplifiers. (a) Magnitude and (b) GD responses in the frequency domain. (c) Impulse and (d) step responses in the time domain, for a typical third-order transfer function (one real and two complex poles).

how the data-dependent jitter (DDJ) can be linked with the ac characteristics, and how the metrics in the frequency and time domains can be balanced to approach the best-quality data eye, i.e., small intersymbol interference (ISI) and small data timing jitter.

In this paper, we undertake an analytical approach to link up the magnitude and GD responses (frequency domain) with the impulse and step responses (time domain). Besides, we propose a design method to extract the GD ripple (GDR) and magnitude surplus at high frequency, such that the DDJ and ISI can effectively be predicted. We describe, as well, the use of the root locus method to determine the pole-zero locations for high-order transfer functions, simplifying the analysis of the ripple and peaking of the magnitude and GD responses for both the existing and proposed BW-extension techniques. The prototype is a 65-nm CMOS four-stage differential wideband amplifier with flexible ac characteristics using a tunable grounded active inductor (GAI). The theoretical BWER  $(2.56 \times)$  with 0-dB peaking, as shown in Table I, is competitive, and there is no limit about the load conditions.

Section II introduces a method to extract the DDJ and ISI, while Section III discusses the key features of the previous works, from the viewpoints of GD and BW to the DDJ. Section IV describes the proposed bridged-shunt peaking technique with GAI. Section V details the experimental amplifier, followed by its measurement results in Section VI. Finally, we draw the conclusions in Section VII.

### II. GDR AND MAGNITUDE SURPLUS (FREQUENCY DOMAIN) VERSUS DDJ AND ISI (TIME DOMAIN)

The transfer characteristics of most inductive-peaking techniques are complex high-order functions with multiple zeros and poles (Table I), which cannot be simply expressed in a closed-form equation. To address this, the root locus method is employed to study the DDJ in the data eye, correlating the mutual impacts between the frequency and time domains.

#### A. Frequency Response

For a high-order (e.g., third order) transfer function, its magnitude and GD responses can be sketched as shown in Fig. 1(a) and (b), respectively. When the complex poles have



Fig. 2. Time-domain performances. (a) Top eye diagram as reference eye [corresponding to the red curve in Fig. 1(a)] and bottom eye diagram with severe DDJ and ISI [corresponding to the blue curve in Fig. 1(a)]. (b) Predicted DDJ versus $\alpha$ and (c) predicted ISI versus $\alpha$, where $\alpha = e^{-f_{-3\,\text{dB}}/(\text{DataRate})}$.

a low damping ratio (complex-pole to real-pole frequencies: $f_{\text{complex}}/f_{\text{real}} > 2$), their contribution to the magnitude is insignificant. Thus, we expect a flat response $H_F(\omega)$ [the red curve in Fig. 1(a)] with no peaking on the magnitude (close to a Bessel response), which can serve as the reference for comparison. When all poles shift toward a higher frequency and $f_{\text{complex}}/f_{\text{real}}$ gets smaller, the complex poles take more effect, peaking up the high-frequency magnitude [the blue curve in Fig. 1(a)] and thereby raising the BWER. Meanwhile, the GDR rises sharply in the high-frequency range. We can divide the total magnitude response $|H_{\text{total}}(\omega)|$ into two parts: the flat magnitude response $|H_F(\omega)|$ and the surplus magnitude response $|H_{\Delta}(\omega)|$ that generates the peaking, as given in the following equation:

$$|H_{\text{total}}(\omega)| = |H_F(\omega)| \cdot |H_{\Delta}(\omega)|. \tag{1}$$

Also, we can subtract the flat phase  $\Phi_F(\omega)$  from the total phase  $\Phi_{\text{total}}(\omega)$  to obtain the residual phase  $\Phi_{\text{res}}(\omega)$ , which can be expanded in a Taylor series at a frequency  $\omega_0$  that shows the GD peaking

$$\Phi_{\rm res}(\omega) = \Phi_0 + \Phi_1(\omega - \omega_0) + \frac{\Phi_2}{2}(\omega - \omega_0)^2 + \cdots$$

where

$$\Phi_k = \left. \frac{\partial^k \Phi_{\rm res}(\omega)}{\partial \omega^k} \right|_{\omega = \omega_0}. \tag{2}$$

The second coefficient of the series, $\Phi_1$, is the GD evaluated at $\omega_0$ (up to the sign convention $\tau_g = -d\Phi/d\omega$).
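The link between the phase slope and the GD can be checked numerically. The following is a minimal sketch, assuming an illustrative third-order all-pole response whose normalized pole values are our own choice (they are not taken from the paper); it estimates $-d\Phi/d\omega$ with a central difference.

```python
import cmath

# Illustrative third-order all-pole response (one real pole, one complex pair).
# These normalized pole values are assumptions chosen only for demonstration.
POLES = [complex(-1.0, 0.0), complex(-0.5, 2.0), complex(-0.5, -2.0)]

def transfer(w, poles):
    """H(jw) for an all-pole system, normalized to unity gain at dc."""
    s = complex(0.0, w)
    h = complex(1.0, 0.0)
    for p in poles:
        h *= (-p) / (s - p)
    return h

def group_delay(w, poles, dw=1e-5):
    """Central-difference estimate of -dPhi/dw at the frequency w.

    The phase of the ratio H(w+dw)/H(w-dw) is used, so no phase
    unwrapping is needed as long as dw is small.
    """
    ratio = transfer(w + dw, poles) / transfer(w - dw, poles)
    return -cmath.phase(ratio) / (2.0 * dw)

if __name__ == "__main__":
    for w in (0.0, 1.0, 2.0, 3.0):
        print(f"w = {w:.1f} rad/s, GD = {group_delay(w, POLES):.4f} s")
```

Working with the phase of a ratio, rather than differencing two separately computed phases, avoids the $2\pi$ jumps that would otherwise corrupt the derivative near a phase wrap.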

#### B. Time Response

From the above, we obtain and plot the corresponding total impulse response  $h_{\text{total}}(t)$  in Fig. 1(c)

$$h_{\text{total}}(t) = \int_{-\infty}^{+\infty} |H_{\text{total}}(\omega)| e^{j(\omega t + \Phi_{\text{total}}(\omega))} d\omega. \tag{3}$$

The main cursor in $h_{\text{total}}(t)$ remains similar with and without gain peaking, but the long-tail postcursors differ [27]. For the step response $s_{\text{total}}(t)$, which is the integral of $h_{\text{total}}(t)$ as shown in Fig. 1(d), a wide BW reduces the rise time (from 10% to 90%), apart from the appearance of ringing or overshoot. This helps to reveal why the high-frequency GDR strongly impacts the ringing swing in the impulse response. Considering (1) and (2), two predicted impulse responses can be obtained as follows:

$$h_{\text{predict}_{\text{DDJ}}}(t) = \int_{-\infty}^{+\infty} |H_F(\omega)| e^{j(\omega t + \Phi_F(\omega) + \Phi_1(\omega))} d\omega \tag{4}$$

$$h_{\text{predict}_{\text{ISI}}}(t) = \int_{-\infty}^{+\infty} |H_{\text{total}}(\omega)| e^{j(\omega t + \Phi_{\text{total}}(\omega) - \Phi_1(\omega))} d\omega. \tag{5}$$

The DDJ can be predicted by (4) when the high-frequency GDR increases in the frequency domain. The ISI can be predicted by (5) including the surplus magnitude. As phase distortion gets more severe at high frequency, both DDJ and ISI are critical to the quality of the data eye.

#### C. DDJ and ISI

The impulse response of a general LTI system contains the useful information (main cursor) and interference information (other cursors). The latter can be divided into two cases.

- 1) The impulse response of an LTI system, under a limited BW, can suffer from a gradual long-tail characteristic [27], so that the prior symbols severely disturb the current data transition as the data rate increases. This interference is characterized by both ISI and DDJ, which reduce the voltage and time margins of the data eye, respectively. Meanwhile, the ISI and DDJ are correlated: any circuit nonideality can bring in ISI, and its effect on the timing shifts the threshold-crossing time of a data transition, resulting in the DDJ. The impact of the limited BW on DDJ with ISI has been studied elsewhere [28], [29] using either closed-form equations (first- and second-order LTIs) or the perturbation method (higher order LTI).
- 2) The impulse response with surplus magnitude and severe phase distortion [blue curve in Fig. 1(c)] can exhibit under-damped ringing, so that the previous symbols alternately disturb the present data transition. This also generates DDJ and ISI [Fig. 2(a)]. Here, we propose an intuitive method to predict the DDJ resulting from the phase distortion, and the ISI by considering the surplus magnitude. First, the total DDJ [Fig. 2(b)] is



Fig. 3. Existing wideband amplifiers (single stage). (a) Bridged-shunt peaking. (b) Series peaking. (c) Bridged-shunt series peaking. (d) Asymmetric T-coil peaking  $(L_1 \neq L_2)$ .

evaluated by convolving the test data [i.e., pseudorandom binary sequence (PRBS)] with $h_{\text{total}}(t)$. The DDJ<sub>total</sub> degrades significantly due to the strong interference of ringing as $\alpha$ increases, where $\alpha = e^{-f_{-3\,\text{dB}}/(\text{DataRate})}$ shows how the $f_{-3\,\text{dB}}$ is related to the achievable data rate [Fig. 2(b) and (c)]. Then, we utilize the flat function in (4) carrying the GDR to generate DDJ<sub>predict</sub>, which is consistent with DDJ<sub>total</sub> in Fig. 2(b). As a result, we can extract the DDJ originating from the phase distortion. Likewise, we apply the same convolution to (5) to predict the tendency of the total ISI, as shown in Fig. 2(c).
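The parameter $\alpha$ defined above converts directly between bandwidth and data rate. The following is a minimal sketch of that conversion; the function names are ours, and the 43-GHz/50-Gb/s pair and the $\alpha$ bound of 0.4 are used only as illustrative inputs.

```python
import math

def alpha(f_3db_hz, data_rate_bps):
    """alpha = exp(-f_3dB / DataRate), as defined in the text."""
    return math.exp(-f_3db_hz / data_rate_bps)

def max_data_rate(f_3db_hz, alpha_max):
    """Highest data rate that keeps alpha <= alpha_max for a given f_3dB."""
    return f_3db_hz / math.log(1.0 / alpha_max)

if __name__ == "__main__":
    # Illustrative inputs: a 43-GHz bandwidth driven at 50 Gb/s.
    print(f"alpha = {alpha(43e9, 50e9):.3f}")
    # Illustrative bound: keep alpha at or below 0.4.
    print(f"max rate = {max_data_rate(43e9, 0.4) / 1e9:.1f} Gb/s")
```

Because $\alpha$ depends only on the ratio $f_{-3\,\text{dB}}/\text{DataRate}$, a bound on $\alpha$ is equivalent to a bound on the data rate expressed as a multiple of the bandwidth.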

### III. GD AND DDJ OF THE EXISTING BW-EXTENSION TECHNIQUES

In this section, we discuss the prior BW-extension techniques [Fig. 3(a)-(d)], using their zero and pole locations to track the ripple in their GD responses, from which their DDJ can be estimated.

#### A. Bridged-Shunt Peaking

The RC amplifier is a common reference to show the BWER of each BW-extension technique. A simple load with only a resistor $(R_L)$ and a capacitor $(C_L = C_1 + C_2)$ has a limited BW of $\omega_0 = 1/R_L C_L$. As shown in Fig. 3(a), adding an inductor $(L_1)$ in series with $R_L$, known as shunt peaking, delays the signal current that flows into them from the transistor. Thus, the initial signal current charges $C_L$ more, leading to a wider BW and a sharper transition of the output voltage without timing delay. In practice, there is no pure shunt peaking, owing to the parasitic capacitance $(C_B)$ of the passive inductor at the node $V_m$. $C_B$ serves as the bridging capacitor, turning the topology into bridged-shunt peaking. Although enlarging $L_1$ extends the BW, the surplus inductance peaks up both the magnitude and GD responses. $C_B$ in this case helps to compensate the excessive inductive effect, reducing the in-band magnitude and GD peaking while preserving a reasonable BWER of 1.84×. Considering the optimal parameters [6, Fig. 5], the zeros and poles can be positioned in the $\sigma + j\omega$ plane according to [6, eq. (6)]. The GD response of a bridged-shunt network can be derived as follows:

$$\tau_{g\_BSP}(\omega) = \sum_{k=1}^{2} \frac{\sigma_{zk\_BSP}}{\sigma_{zk\_BSP}^{2} + (\omega \pm \omega_{zk\_BSP})^{2}} - \sum_{k=1}^{3} \frac{\sigma_{pk\_BSP}}{\sigma_{pk\_BSP}^{2} + (\omega \pm \omega_{pk\_BSP})^{2}} \tag{6}$$

When $k_B$ $(k_B = C_B/C_L)$ varies from 0.1 to 0.3, the two unequal real zeros turn into two complex zeros $(-1.667 \pm 2.295j)$ and gradually get closer to the complex poles $(-0.981 \pm 1.559j)$. The peaking of 0.68 dB is negated by $C_B$, and a maximally flat response is achieved while keeping a constant BWER of $1.84\times$. Meanwhile, the GDR still exists but drops from $2.08\times$ to $1.43\times$ [Fig. 4(a)]. In the time domain, when $\alpha > 0.6$, more ISI appears and brings up more DDJ (larger than 0.05 UI) for all the above cases shown in Fig. 5(a). Yet, for a BWER = $1.83\times$ with 0-dB peaking, a DDJ of 0.05 UI occurs earlier, at $\alpha = 0.48$.
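The shunt-peaking entries of Table I ($1.72\times$ at 0-dB peaking and $1.84\times$ at 1.5-dB peaking) can be reproduced from the ideal load of Fig. 3(a) with $C_B$ neglected. The following is a minimal sketch: the parameter $m = R_L^2 C_L/L_1$ follows a convention common in the shunt-peaking literature and is an assumption here, since the text does not define it, and frequency is normalized to $\omega_0 = 1/R_L C_L$.

```python
import math

def shunt_gain(w, m):
    """|Z(jw)|/R_L of (R_L + sL_1) in parallel with C_L, with C_B neglected.

    w is normalized to w0 = 1/(R_L*C_L); m = R_L^2*C_L/L_1 is an assumed
    normalization that is not defined in the text.
    """
    s = complex(0.0, w)
    return abs((1 + s / m) / (1 + s + s * s / m))

def bwer(m, w_max=5.0, step=1e-4):
    """First normalized frequency at which the gain falls below -3 dB."""
    for k in range(int(w_max / step) + 1):
        w = k * step
        if shunt_gain(w, m) < 1 / math.sqrt(2):
            return w
    return None

def peaking_db(m, w_max=5.0, step=1e-3):
    """Largest gain above the dc level on a frequency grid, in dB."""
    peak = max(shunt_gain(k * step, m) for k in range(int(w_max / step) + 1))
    return 20 * math.log10(peak)

if __name__ == "__main__":
    # m = 1 + sqrt(2) is the maximally flat case; m = 1.41 is a peaked case.
    for m in (1 + math.sqrt(2), 1.41):
        print(f"m = {m:.3f}: BWER = {bwer(m):.2f}x, peaking = {peaking_db(m):.2f} dB")
```

Because the RC reference has a normalized bandwidth of 1, the returned $-3$-dB frequency is the BWER itself.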

#### B. Series Peaking

The series peaking [Fig. 3(b)] is achieved by inserting a series inductor between the drive capacitor  $(C_1)$  and the load capacitor  $(C_2)$ .  $k_c = C_1/(C_1+C_2)$  is a parameter that indicates the load condition. The small-signal current charges  $C_1$  first due to the presence of the series inductor  $(L_2)$ . Thus, the initial capacitance to be charged is reduced from  $C_L$ , in the shunt peaking, to  $C_1$  (i.e.,  $k_c < 1$ ). This charging behavior improves BWER if carefully designed with the load condition  $k_c$ .  $L_2$ , located in the signal path, only affects the distribution of the poles but not the zeros. The capacitive splitting leads to a third-order equation [6, eq. (4)]. The three poles (one real and two complex poles) are solved and optimized for the BW-extended response under different  $k_c$  [6, Fig. 7]. The GD response is calculated as follows:

$$\tau_{g\_SP}(\omega) = -\sum_{k=1}^{3} \frac{\sigma_{pk\_SP}}{\sigma_{pk\_SP}^2 + (\omega \pm \omega_{pk\_SP})^2}. \tag{7}$$

A maximum BWER of $3.17 \times$ with a 2-dB in-band gain ripple is obtained when $k_c = 0.4$ and $m = 2.5$. With the optimized real pole (-1.19) and complex poles (-0.662 ± 2.9*j*), the step response shows a faster rise time, but the large



Fig. 4. Normalized GD of (a) bridged-shunt peaking plotted by (6) ([6, Fig. 5]). (b) Series peaking plotted by (7) ([6, Fig. 7]). (c) Bridged-shunt series peaking plotted by (8) ([6, Fig. 9]). (d) Asymmetric T-coil peaking plotted by (9) ([6, Fig. 11]).

ringing induces more ISI (>25% when $\alpha > 0.1$) and a DDJ of 0.05 UI for $\alpha = 0.4$. This implies that the data rate can only be $< f_{-3\,\text{dB}}/0.9$ for an acceptable openness of the data eye.

Alternatively, if $k_c = 0.3$ and $m = 2.4$, a different real pole (-1.3) and complex poles (-1.02 ± 2.78*j*) are obtained. In this case, the magnitude response with 0-dB peaking appears with a BWER of 2.52×, the ISI becomes <14.8%, and DDJ < 0.05 UI can be achieved even if $\alpha$ goes up to 0.57 [Fig. 5(b)]. Finally, the low-frequency real pole makes the GD decrease gradually, whereas the complex poles make it increase gradually, resulting in a high-frequency peaking as shown in Fig. 4(b).
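The quoted series-peaking poles can be used to cross-check the stated BWER and to evaluate the GD. The following is a minimal sketch, assuming the poles are normalized to $\omega_0 = 1/R_L C_L$ and the dc gain is normalized to unity; each root contributes one term, with the sign convention of (6) (zeros positive, poles negative).

```python
import math

# Normalized series-peaking poles quoted in the text (k_c = 0.4, m = 2.5).
POLES = [complex(-1.19, 0.0), complex(-0.662, 2.9), complex(-0.662, -2.9)]
ZEROS = []  # series peaking adds no finite zeros

def group_delay(w, zeros, poles):
    """Sum sigma/(sigma^2 + (w - w_r)^2) over zeros, minus the same over poles."""
    def term(r):
        return r.real / (r.real ** 2 + (w - r.imag) ** 2)
    return sum(term(z) for z in zeros) - sum(term(p) for p in poles)

def magnitude(w, zeros, poles):
    """|H(jw)| normalized to unity gain at dc (assumes no root at the origin)."""
    s = complex(0.0, w)
    num = math.prod(abs(s - z) / abs(z) for z in zeros)
    den = math.prod(abs(s - p) / abs(p) for p in poles)
    return num / den

def bandwidth_3db(zeros, poles, w_max=10.0, step=1e-3):
    """First normalized frequency at which the gain falls below -3 dB."""
    for k in range(int(w_max / step) + 1):
        w = k * step
        if magnitude(w, zeros, poles) < 1 / math.sqrt(2):
            return w
    return None

if __name__ == "__main__":
    print(f"BWER estimate: {bandwidth_3db(ZEROS, POLES):.2f}x")
    for w in (0.0, 1.5, 2.9):
        print(f"w = {w:.1f} rad/s, GD = {group_delay(w, ZEROS, POLES):.3f}")
```

With these poles, the first $-3$-dB crossing lies close to the quoted $3.17\times$, and the GD is largest near the complex-pole frequency, which is the high-frequency peaking described above.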

#### C. Bridged-Shunt Series Peaking

The bridged-shunt series peaking [Fig. 3(c)] combines the benefits of inductive peaking and capacitive splitting, simultaneously changing the initial charges in the resistive and capacitive paths. This allows a larger BWER of $\sim 4 \times$ when $k_c = 0.4$, but entails more passive inductors. The GD response is given by the following equation:

$$\tau_{g\_BSSP}(\omega) = \sum_{k=1}^{2} \frac{\sigma_{zk\_BSSP}}{\sigma_{zk\_BSSP}^{2} + (\omega \pm \omega_{zk\_BSSP})^{2}} - \sum_{k=1}^{5} \frac{\sigma_{pk\_BSSP}}{\sigma_{pk\_BSSP}^{2} + (\omega \pm \omega_{pk\_BSSP})^{2}}. \tag{8}$$

If the optimal parameters $k_c = 0.4$, $m_1 = 8$, and $m_2 = 2.4$ [6, Fig. 9] are set, excess gain peaking can be avoided. The two zeros $(-1.67 \pm 4.89j)$ and five poles $(-1.59$, $-1.58 \pm 2.8j$, and $-0.535 \pm 3.99j)$ are obtained by [6, eq. (5)]. The lowest real pole (-1.59) dominates the total magnitude, followed by the lower complex poles $(-1.58 \pm 2.8j)$ located in-band. Other complex poles and zeros contribute weakly to the total magnitude. A GD peaking appears at high frequency, but no peaking appears on the magnitude. When the lowest real pole shifts toward -2.1 and the lower complex poles go backward to $-2.23 \pm 2.18j$, the peaking happens on both the magnitude and GD responses for the optimal parameters $k_c = 0.4$, $m_1 = 6$, and $m_2 = 2.4$. As shown in Fig. 4(c), below $f_L$, the GD is approximately constant, but between $f_L$ and $f_H$, the GD peaks up regardless of the peaking on the magnitude at the transition. This GDR worsens the data eye. DDJ <0.05 UI and ISI >20% are achieved when $\alpha < 0.4$, as shown in Fig. 5(c).

#### D. Asymmetric T-Coil Peaking

The bridged-shunt series peaking has a large BWER of  $\sim 4 \times$ when  $k_c = 0.4$ . To improve this further, the magnetic coupling effect from the T-coil topology was developed. An asymmetric  $(L_1 \neq L_2)$  T-coil peaking amplifier [Fig. 3(d)] can generate a negative mutual inductance (-M), resulting in a coupling coefficient  $(-k_m = -M/(L_1L_2)^{1/2})$ . This topology can be viewed as a combination of series and shunt peaking by



Fig. 5. DDJ versus  $\alpha$ . (a) Bridged-shunt peaking ([6, Fig. 5]). (b) Series peaking ([6, Fig. 7]). (c) Bridged-shunt series peaking ([6, Fig. 9]). (d) Asymmetric T-coil peaking ([6, Fig. 11]).

swapping  $L_1$  and  $R_L$  with  $L_2$ , while magnetically coupling  $L_2$ and  $L_1$ . As in series peaking,  $L_2$  splits  $C_1$  and  $C_2$ , such that the signal current can flow to  $C_1$  first. After that, the signal current via  $L_2$  flowing to  $R_L$  including  $L_1$  and  $C_2$  is similar to the shunt peaking. Also, the negative magnetic coupling can boost the current to flow through  $C_2$ , improving the BWER. Under the optimal parameters [6, Fig. 11], there are one real zero and four poles [6, eq. (6)]. The corresponding GD response is analyzed as follows:

$$\tau_{g\_ATCP}(\omega) = \frac{\sigma_{z\_ATCP}}{\sigma_{z\_ATCP}^{2} + (\omega \pm \omega_{z\_ATCP})^{2}} - \sum_{k=1}^{4} \frac{\sigma_{pk\_ATCP}}{\sigma_{pk\_ATCP}^{2} + (\omega \pm \omega_{pk\_ATCP})^{2}}. \tag{9}$$

Under $k_c = 0.2$, $m_1 = 5.5$, $m_2 = 2.4$, and $k_m = 0.6$, we obtain two real poles (-1.68 and -3.96), two complex poles (-1.48 ± 4.16*j*), and one real zero (-2.88), such that a high BWER of 4.14× can be achieved with no peaking on the magnitude. The lower pole mainly makes the magnitude roll off faster in the vicinity of 1 rad/s, and the higher real zero holds the magnitude at -2 dB at 3 rad/s. The peaking effect provided by the other poles is weak compared with the main pole (-1.68). The GDR is ~1.41×, but the main pole (-1.68) brings a severe valley at 1.9 rad/s, and the GD peaking at 4.1 rad/s is synthesized by the zero and the other poles [Fig. 4(d)]. For another set of optimal parameters, the two complex poles at ~1.9 rad/s enhance the magnitude at 1 rad/s with respect to the former case. The real zero and higher complex poles both impact the high-frequency magnitude and GD responses with a larger BWER of ~5×. The maximum BWER is $5.59 \times$ with 2-dB peaking, but it generates a non-flat magnitude response [26, Fig. 19(a)]. The corresponding GD induces multiple ripples [26, Fig. 19(b)], with the maximum ripple of $1.21 \times$ for $k_c = 0.1$.

Fig. 5(d) shows the DDJ for the optimal parameters given in [6, Fig. 11]. For no peaking on the magnitude, DDJ can be <0.05 UI for  $\alpha$  < 0.6. For a fixed  $f_{-3 \text{ dB}}$ , only a lower data rate ( $\alpha$  < 0.3) is allowed to maintain a DDJ of 0.05 UI considering 2-dB peaking. It implies that the practical BWER should be less than the theoretical value.

#### E. Summary of Frequency- and Time-Domain Characteristics

In the frequency domain, the existing BW-extension techniques can be categorized into four groups.

- 1) A magnitude with peaking >0 dB must be accompanied by a large GD peaking.
- 2) A magnitude with the maximum peaking can easily approach the maximum BWER, but still with a large GD peaking.
- 3) A magnitude without peaking corresponds to a small GD peaking or in-band ripple.
- 4) A magnitude according to a Bessel roll-off has a constant GD.



Fig. 6. ISI versus $\alpha$. (a) Bridged-shunt peaking ([6, Fig. 5]). (b) Series peaking ([6, Fig. 7]). (c) Bridged-shunt series peaking ([6, Fig. 9]). (d) Asymmetric T-coil peaking ([6, Fig. 11]).



Fig. 7. Step responses with (a) time axis-linear and (b) time axis-log of the existing and proposed different BW-extension techniques.

In the time domain, the ISI (Fig. 6) and DDJ can be summarized correspondingly.

- 1) ISI >40% for all $\alpha$ and DDJ <0.05 UI for $\alpha < 0.3$ are impractical. The examples are: a BWER of 3.53× with 2-dB peaking in bridged-shunt peaking and a BWER of 2.56× with 3.3-dB peaking in series peaking.
- 2) ISI between 15% and 40% and DDJ <0.05 UI for $\alpha < 0.3$ limit the maximum data rate below $f_{-3\,\text{dB}}$ to keep a better data eye. In other words, a maximum BWER with gain peaking of 1 to 2 dB also brings severe ISI and DDJ when the data rate goes up to 1.43 times $f_{-3\,\text{dB}}$, bounding the maximum data rate.
- 3) ISI between 8% and 15% and DDJ <0.05 UI for $\alpha <$ 0.4 to 0.5 are a better choice for most applications.
- 4) Minimum ISI <5% and DDJ <0.01 UI serve as the reference, which tends toward the BW-limited case.

Fig. 7 shows the step responses of the discussed BW-extension techniques, including the proposed bridged-shunt-GAI peaking (to be described in the following section) for comparison. The *RC* amplifier and all variants of shunt peaking exhibit no timing delay, whereas the series-peaking variants do. Due to the extra current brought by the negative magnetic coupling, the timing delay of the asymmetric T-coil peaking is shorter than that of the other series variants. A wider BWER shortens the rise time in the step response, and the magnitude peaking with GDR indirectly maps to the ringing in the step response by (3).
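The mapping from pole locations to rise time and ringing can be illustrated with the series-peaking poles quoted in Section III-B. The following is a minimal sketch that uses a partial-fraction expansion in place of the Fourier integral in (3); it assumes distinct poles, unity dc gain, and time normalized to $R_L C_L$, and the helper names are ours.

```python
import cmath

# Normalized series-peaking poles quoted in Section III-B (k_c = 0.4, m = 2.5).
POLES = [complex(-1.19, 0.0), complex(-0.662, 2.9), complex(-0.662, -2.9)]

def step_response(t, poles):
    """Unit-step response of an all-pole, unity-dc-gain system with distinct poles.

    Partial fractions of H(s)/s give s(t) = 1 + sum_i r_i * exp(p_i * t),
    with r_i = K / (p_i * prod_{j != i} (p_i - p_j)) and K = prod_i (-p_i).
    """
    gain = complex(1.0, 0.0)
    for p in poles:
        gain *= -p
    total = complex(1.0, 0.0)
    for i, p in enumerate(poles):
        denom = p
        for j, q in enumerate(poles):
            if j != i:
                denom *= (p - q)
        total += gain / denom * cmath.exp(p * t)
    return total.real

def step_metrics(poles, t_end=20.0, n=20000):
    """Return (10%-90% rise time, overshoot above the final value) on a time grid."""
    ts = [t_end * k / n for k in range(n + 1)]
    ys = [step_response(t, poles) for t in ts]
    t10 = next(t for t, y in zip(ts, ys) if y >= 0.1)
    t90 = next(t for t, y in zip(ts, ys) if y >= 0.9)
    return t90 - t10, max(ys) - 1.0

if __name__ == "__main__":
    rise, overshoot = step_metrics(POLES)
    print(f"rise time = {rise:.3f} (normalized), overshoot = {100 * overshoot:.1f}%")
```

The decaying oscillatory term contributed by the complex-pole pair is the ringing referred to above, while the real pole sets the slower exponential approach to the final value.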

### IV. PROPOSED BRIDGED-SHUNT-GAI PEAKING

Our idea is to add an auxiliary path at the internal node, rather than altering the input–output nodes directly in the different variants of the series peaking. As shown in Fig. 8(a), the parasitic capacitance  $(C_p)$  at  $V_m$  mainly caused by  $L_1$  transforms the topology from shunt to bridged-shunt peaking. Thus,  $C_p$  can beneficially provide an extra signal current path, neutralizing the peaking provoked by  $L_1$ . Together, they form a parallel *LC* resonant tank with the impedance of  $Z_{LC}$ , which receives a small-signal current going through  $R_L$  to



Fig. 8. (a) Simplified schematic of the proposed bridged-shunt peaking topology with a tunable GAI. (b) Its single-ended equivalent model to derive the normalized transimpedance  $Z_{TI}$ .

generate the small-signal voltage at $V_m$. This small-signal voltage multiplies the negative admittance ($Y_{AI}$), which behaves like a voltage-controlled current source, to generate the negative small-signal current that passes through $R_L$ and charges $C_L$. The BW is enhanced without time delay, as there is no serial charging. The above extra current that charges $C_L$ is similar to the negative mutual coupling, providing an extra initial current. As only one passive inductor is involved, the area efficiency is high.

Such $Y_{AI}$ is realized by a tunable GAI, which is based on a positive-feedback gyrator $(M_2)$ and a varactor $(C_v)$. As shown in Fig. 8(b), $Y_{AI}$ is equivalent to the series connection of a negative capacitor $(-1/sC_v)$ and a transconductor $(-1/g_{m,M2})$. Since the GAI acts like a voltage-controlled current source, the signal voltage at node $V_m$ produces a current flowing from the GAI back to $V_m$, providing an extra signal current that charges $C_L$ for BW extension. The charging current is controlled by $g_{m,M2}$, modifying the BW of the amplifier. The negative capacitance can also be viewed as a tunable inductor in parallel with $L_1$. Thus, the effective inductance at $V_m$ becomes widely tunable to adjust the delay of the signal current passing into $R_L$, and hence the in-band *ac* characteristics. For simplicity, the parasitics of all MOSFETs are ignored. The normalized transimpedance of the proposed load network $Z_{\text{TI}}(s)$ is given in (10), shown at the bottom of this page, where $k_B = (C_p/C_L)$, $k_g = g_{m,M2}R_L$, and $k_v = (C_v/C_L)$ denote the optimization parameters. Two of them ($k_g$ and $k_v$) provide the new tunability offered by the GAI. The voltage bias $(V_{BN})$ alters the current flowing through M<sub>2</sub>, and hence $g_{m,M2}$ ($k_g$). The variable capacitance $C_v$ ($k_v$) is tuned by the control voltage ($V_C$). The tunability appears in the auxiliary path without directly introducing extra parasitic effects on the input–output nodes. Together, they allow tunable BW and ac characteristics, including the in-band gain and GD.
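As a rough reading of the equivalent model in Fig. 8(b), and assuming an ideal series combination with all MOSFET parasitics ignored (as in the text), the GAI admittance and its normalized form can be written as follows. The normalized frequency $s_n = sR_LC_L$ is a notational assumption of ours, chosen to match the definitions of $k_g$ and $k_v$; this is only a sketch and not a substitute for (10).

$$Y_{AI}(s) = -\left(\frac{1}{sC_v} + \frac{1}{g_{m,M2}}\right)^{-1} = -\frac{sC_v\, g_{m,M2}}{g_{m,M2} + sC_v}, \qquad R_L Y_{AI}(s_n) = -\frac{k_g k_v s_n}{k_g + k_v s_n}.$$

For $k_v |s_n| \gg k_g$, the admittance tends to $-g_{m,M2}$, a negative conductance set by $V_{BN}$; for $k_v |s_n| \ll k_g$, it tends to $-sC_v$, a negative capacitance set by $V_C$. This is in line with the separate roles of $k_g$ and $k_v$ described above.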

The numerator in (10) is a third-order polynomial with at least a real zero $z_1 = \sigma_{z_1}$ and two complex zeros $z_{2,3} = \sigma_{z_2} \pm j\omega_{z_2}$; meanwhile, the denominator in (10) contains two real poles $p_{1,2} = \sigma_{p1,2}$ and two complex poles $p_{3,4} = \sigma_{p3} \pm j\omega_{p3}$ according to the optimized parameters given in Fig. 9. The locations of these zeros and poles in the $\sigma + j\omega$ plane are directly decided by the coefficients of (10). One pole–zero pair ($p_1$ and $z_1$) cancels under any of the following optimized cases, and the array of the other zeros ($z_{2,3}$) and poles ($p_2$ and $p_{3,4}$) determines the ac characteristics. The phase response $\phi(\omega)$ of (10) is obtained by summing the phase contribution of each zero and pole [30]. The phase distortion is denoted by the GD, which is found by differentiating $\phi(\omega)$

$$\tau_g(\omega) = -\frac{d\phi(\omega)}{d\omega} = \sum_{k=1}^2 \frac{\sigma_{zk}}{\sigma_{zk}^2 + (\omega \pm \omega_{zk})^2} - \sum_{k=1}^3 \frac{\sigma_{pk}}{\sigma_{pk}^2 + (\omega \pm \omega_{pk})^2}.$$
 (11)
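The term-by-term structure of (11) can be checked numerically. The following is a minimal sketch, assuming $\sigma$ denotes the signed real part of each root (negative in the left half-plane); the root values are hypothetical placeholders rather than the optimized values of Fig. 9, and the function names are illustrative only. It compares the closed-form sum with a finite-difference derivative of the phase.

```python
import math

def group_delay(omega, zeros, poles):
    """Group delay of H(s) = prod(s - z_k) / prod(s - p_k) at s = j*omega.

    Each root is a complex number sigma + j*omega_r, where sigma is the
    signed real part (negative in the left half-plane), as assumed for (11).
    """
    def term(root):
        return root.real / (root.real ** 2 + (omega - root.imag) ** 2)
    return sum(term(z) for z in zeros) - sum(term(p) for p in poles)

def phase(omega, zeros, poles):
    """Phase of H(j*omega), summed factor by factor.

    For left-half-plane roots each factor angle stays within (-pi/2, pi/2),
    so the sum is continuous in omega.
    """
    def angle(root):
        # Phase of the factor (j*omega - root).
        return math.atan2(omega - root.imag, -root.real)
    return sum(angle(z) for z in zeros) - sum(angle(p) for p in poles)

def group_delay_numeric(omega, zeros, poles, h=1e-5):
    """Central-difference estimate of -d(phase)/d(omega)."""
    dphi = phase(omega + h, zeros, poles) - phase(omega - h, zeros, poles)
    return -dphi / (2 * h)

# Hypothetical normalized roots (NOT the optimized values of Fig. 9).
ZEROS = [complex(-1.5, 2.0), complex(-1.5, -2.0)]
POLES = [complex(-1.0, 0.0), complex(-0.8, 1.8), complex(-0.8, -1.8)]

if __name__ == "__main__":
    for w in (0.0, 1.0, 2.0, 3.0):
        print(w, group_delay(w, ZEROS, POLES), group_delay_numeric(w, ZEROS, POLES))
```

With this sign convention, a left-half-plane pole adds positive delay and a left-half-plane zero subtracts delay, which is why pole and zero placement shapes the GD peaking discussed next.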

# A. AC-Characteristic Optimization

The two tunable parameters ( $k_g$  and  $k_v$ ) in (10) allow the proposed BW-extension technique to target different performance goals. In case 1,  $k_g$  is tuned under a fixed  $k_v$ . At a large  $k_v$  of 16, tuning  $k_g$  from 0.5 to 0.75 raises the BWER from 2.37× to 3.14× [Fig. 9(a)]. The real pole  $p_2$  moves to a lower frequency, while the two complex zeros  $z_{2,3}$  and the two complex poles  $p_{3,4}$  shift to higher frequencies, but  $z_{2,3}$  exceeds  $p_{3,4}$ . In Fig. 9(a) and (b), the former dominates the downtrend of the magnitude and GD in the vicinity of 1.5 rad/s, and the latter generates the peaking of the magnitude and GD at >2 rad/s. Although the BWER gradually increases, the high-frequency peaking of the magnitude and GD introduces more ISI and DDJ. Conversely, a smaller  $k_g$  leads to a narrower BW, but it also reduces the power dissipation and flattens the gain response and in-band GD.

In case 2,  $k_v$  is tuned under a fixed  $k_g$ . At a large  $k_g$  of 0.75,  $k_v$  has a minor effect on the BW extension; therefore, it can be tuned solely to change the upper part of the magnitude and GD responses, optimizing the in-band ripple as shown in Fig. 9(c) and (d).

Case 1 can coarsely tune the ac features, whereas case 2 can finely tune the in-band ac response. Beyond the above cases, a flat gain response can be achieved together with a higher BWER  $(2.56\times)$  when compared with the shunt peaking  $(1.72\times)$  and bridged-shunt peaking  $(1.83\times)$ .
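The tuning cases above can be explored by evaluating (10) directly. The sketch below transcribes the coefficients of (10) as printed and searches for the first normalized $-3$-dB frequency. It assumes that the frequency axis is normalized to $\omega_0$, so the returned value plays the role of a BWER only if $\omega_0$ is the uncompensated bandwidth. The values of $m$ and $k_B$ are placeholders that are not given in this section, so the output is not expected to reproduce Fig. 9; all function names are illustrative.

```python
import math

def zti_coefficients(m, k_b, k_g, k_v):
    """Numerator and denominator coefficients of (10), highest power first."""
    d = m * k_g
    num = [
        k_b * k_v / d,
        (k_b * k_g + k_v * (1 - k_g)) / d,
        (k_g + m * k_v) / d,
        1.0,
    ]
    den = [
        k_b * k_v / d,
        (k_b * (k_g + k_v) + k_v * (1 - k_g)) / d,
        (k_g * (1 + k_b - k_v) + m * k_v) / d,
        (k_g + k_v) / d,
        1.0,
    ]
    return num, den

def polyval(coeffs, s):
    """Horner evaluation of a polynomial at the complex point s."""
    result = 0j
    for c in coeffs:
        result = result * s + c
    return result

def zti_mag(w, m, k_b, k_g, k_v):
    """|Z_TI(j*w)| with w normalized to omega_0."""
    num, den = zti_coefficients(m, k_b, k_g, k_v)
    s = 1j * w
    return abs(polyval(num, s) / polyval(den, s))

def bandwidth_3db(m, k_b, k_g, k_v, w_max=50.0, step=0.01):
    """First normalized frequency where |Z_TI| falls to 1/sqrt(2)."""
    target = 1 / math.sqrt(2)
    w_prev = 0.0
    for i in range(1, int(w_max / step) + 1):
        w = i * step
        if zti_mag(w, m, k_b, k_g, k_v) < target:
            lo, hi = w_prev, w  # crossing is bracketed; refine by bisection
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if zti_mag(mid, m, k_b, k_g, k_v) < target:
                    hi = mid
                else:
                    lo = mid
            return 0.5 * (lo + hi)
        w_prev = w
    raise ValueError("no -3 dB crossing found below w_max")

# k_g and k_v follow case 1 in the text; m and k_b are ASSUMED placeholders.
PARAMS = dict(m=2.0, k_b=0.3, k_g=0.5, k_v=16.0)

if __name__ == "__main__":
    print("normalized -3 dB frequency:", bandwidth_3db(**PARAMS))
```

Sweeping `k_g` with `k_v` fixed, or `k_v` with `k_g` fixed, mirrors cases 1 and 2 once the actual design values of $m$ and $k_B$ are substituted.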

#### B. Stability

A rearrangement of the bridged-shunt-GAI topology (Fig. 10) resembles a voltage-controlled oscillator topology, in which an *LC* tank is in parallel with an active circuit  $(-g_{m,M2})$ . Besides the parallel parasitic resistance  $(R_p)$  of  $L_1$ , another impedance path  $(Z_L)$  directly connected to the *LC* tank adds to the resistive loading, reducing the risk of oscillation. Assuming the *LC* tank is at resonance, the condition  $\max\{g_{m,M2}\} \times (R_p || \operatorname{Re}\{Z_L\}) < 1$  is always satisfied, and no oscillation is observed in any of the postlayout process–voltage–temperature (PVT) simulations.

$$Z_{\rm TI}(s) = \frac{\frac{s^3}{\omega_0^3} \frac{k_B k_v}{m k_g} + \frac{s^2}{\omega_0^2} \frac{k_B k_g + k_v (1 - k_g)}{m k_g} + \frac{s}{\omega_0} \frac{k_g + m k_v}{m k_g} + 1}{\frac{s^4}{\omega_0^4} \frac{k_B k_v}{m k_g} + \frac{s^3}{\omega_0^3} \frac{k_B (k_g + k_v) + k_v (1 - k_g)}{m k_g} + \frac{s^2}{\omega_0^2} \frac{k_g (1 + k_B - k_v) + m k_v}{m k_g} + \frac{s}{\omega_0} \frac{k_g + k_v}{m k_g} + 1}$$
(10)

Fig. 9. Simulation results. (a) and (b) At  $k_v = 16$ , the BWER rises from 2.37 ( $k_g = 0.5$ ) to 3.14 ( $k_g = 0.75$ ) and the magnitude responses are flat to maximize the BW. (c) and (d) At  $k_g = 0.75$ , a higher  $k_v$  helps to fine-tune the in-band gain.

Fig. 10. Schematic to study the stability of the proposed amplifier.

# C. In-Band Gain Variation

An example is the four-stage amplifier design in [17], in which tuning the resistance of a pMOS transistor in the linear region, placed in parallel with  $R_L$ , shifts the common-mode (CM) voltage at the intermediate nodes and the output node. This changes the transconductance of the next stage and ultimately the overall gain, resulting in a large in-band gain variation (~10 dB). Another example is the multistage LNA in [31], which uses a tunable active inductor in the main signal path and exhibits ~2-dB in-band ripple and ~4-dB in-band gain variation. Here, our tunability is achieved by varying  $k_g$  in the current domain and  $k_v$  in the voltage domain at little expense of in-band gain variation. The main current  $(I_m)$  and the scalable auxiliary current  $(I_a)$  merely share  $L_1$ , crossing at  $V_m$ , which is approximately a dc ground where only the small series parasitic resistance  $(R_s)$  of  $L_1$  exists (i.e., a smaller signal swing at  $V_m$ ). Although increasing  $V_{BN}$   $(k_g)$  for a larger BW draws more  $I_{AI}$  through  $R_s$ , the CM voltages at  $V_m$  and  $V_{OUT}$  drop only slightly owing to the small  $R_s$  (<15  $\Omega$ ). Considering the channel-length modulation of the MOSFET, although the transconductance per stage slightly decreases, the in-band gain varies very little under a fixed  $R_L$ . As suggested by simulations and confirmed by measurements, the in-band gain variation is <0.58 dB for the realized amplifier prototype to be described in Section VI.

#### D. Noise Induced by GAI

For high-frequency operation, mainly the thermal noise of the key transistors ( $M_1$ ,  $M_2$ , and  $M_{\text{IBN}}$ ) and the load resistor ( $R_L$ ) is considered. The input-referred noise (IRN) voltage can be derived as

$$\overline{V_{\text{in},n,\text{total}}^{2}} = \frac{4kT\gamma}{g_{m1}} \left\{ 1 + \frac{1}{g_{m1}R_{L}} \left| \frac{N_{n,R_{L}}(s)}{N(s)} \right|^{2} + \frac{k_{g}}{g_{m1}R_{L}} \left| \frac{\frac{s^{2}}{\omega_{0}^{2}} \frac{1}{m} \frac{k_{v}}{k_{g}}}{N(s)} \right|^{2} + \frac{ak_{g}}{g_{m1}R_{L}} \left| \frac{\frac{s}{\omega_{0}} \frac{1}{m}}{N(s)} \right|^{2} \right\}$$
(12)



Fig. 11. (a) Implemented four-stage differential amplifier and (b) schematic of the GAI.

where  $g_{mx}$  is the transconductance of the transistor  $(M_x)$  and  $a = g_{m,\text{IBN}}/g_{m2}$ . N(s) is the numerator of (10).  $N_{n,R_L}(s)$  is given in the following equation:

$$N_{n,R_L}(s) = \frac{s^3}{\omega_0^3} \frac{k_B k_v}{m k_g} + \frac{s^2}{\omega_0^2} \frac{k_B - k_v}{m} + \frac{s}{\omega_0} \frac{k_v}{k_g} + 1.$$
(13)

 $N_{n,R_L}(s)/N(s)$  follows an all-pass response. Owing to the  $L_{(1)}C_{(B)}$  tank, the third and fourth terms in (12), which are due to the GAI, exhibit a bandpass response, and  $k_g < 1$  reduces their contribution to the IRN. The signal gain  $(g_{m1}R_L > 1)$  per stage suppresses the total noise contribution of the GAIs.

# V. DESIGN OF A FOUR-STAGE AMPLIFIER WITH BRIDGED-SHUNT-GAI PEAKING

The proposed bridged-shunt-GAI peaking can be employed in high-speed amplifiers or drivers, flexibly replacing the conventional shunt-peaking topologies. Our prototype is a four-stage differential amplifier aiming to support a data rate of 50+ Gb/s [Fig. 11(a)]. With eight passive inductors aided by GAIs [Fig. 11(b)], we expect a total gain of 15 dB. In the initial sizing, the gain of the common-source amplifier is set first, and then the parasitic capacitances at the output node  $(C_L)$  and at  $V_m$  ( $C_{pR}$ ) can be extracted from the layout. Next, by setting  $L_1 = R_1 C_L / m$ ,  $L_1$  can be estimated. To further reduce the die area, a 3-D solenoid inductor [32], [33] formed by stacking three metal layers (Metals 4 to 6) is used, as shown in Fig. 12. The capacitance of  $L_1$  at node  $V_m$  is also extracted as  $C_{pL}$ , which forms  $C_p$  together with  $C_{pR}$ . Before the GAI is turned ON ( $k_g = 0$ ), an initial size of  $M_n$  is inserted at  $V_m$ . It is verified that the ac response, with flat magnitude and minimum GDR, is consistent with and without the GAI.  $k_g > 0$  is achieved by tuning  $V_{BN}$ , maximizing the BW while preserving a low in-band gain ripple ( $\sim 2 \text{ dB}$ ). The varactor is tuned to optimize the in-band gain ripple. Full-wave electromagnetic simulations considering the coupling between the eight passive inductors were performed to adjust the coil parameters. Each inductor occupies  $20 \times 18 \ \mu m^2$  and has a self-resonant frequency of >75 GHz.

The simulated total in-band IRN and integrated input-referred noise are  $< 2.6 \text{ nV}/\sqrt{\text{Hz}}$  and < 0.5 mV under different BWs, respectively. In the total noise, the first gain stage dominates the noise contribution of the differential GAI [Fig. 11(a)], and this contribution gradually drops in the following stages, as shown in Fig. 13(a). As shown in Fig. 13(b), the noise contribution of the differential GAI in each gain stage exhibits a bandpass feature in accord with the third and fourth terms in (12); that is, the GAI noise passes through the  $L_{(1)}C_{(B)}$  tank before being injected into the output node. Below 10 GHz, the noise contribution from the differential GAI is <1%, and its maximum noise contribution rises to 18.72% at the maximum BW, when  $g_{m2}$  and  $g_{m,IBN}$  increase toward their maximum values by tuning  $V_{BN}$ .

Fig. 12. 3-D layout of the ultracompact ( $20 \times 18 \ \mu m^2$ ) 3-D solenoid inductor, and its HFSS-simulated inductance and Q-factor.

The large-signal behavior is studied using a PRBS input. The simulated 1-dB gain-compression points of the first-stage and fourth-stage outputs are 350 and 75 mV<sub>pp</sub> [Fig. 14(a)], respectively. They are dominated by the input differential pairs. As the input swing ( $V_{in}$ ) increases in the linear region, the DDJ gradually reduces for each stage output owing to the improved rise/fall edges [Fig. 14(b)]. Entering the compression region, the output swing and DDJ of the fourth-stage output appear to be clamped. The bandpass function at  $V_m$  introduces narrow pulses located at the rise/fall edges of  $V_{out}$ . Their swings are <66% of  $V_{out}$  over the two regions. Thus, the attribute of the GAI holds. The phase distortion of  $Z_{TI}(s)$  controlled by the GAI limits the minimum DDJ in the compression region, when compared with the amplitude compression behavior.

Fig. 13. Simulated noise contribution of the differential GAI in the total IRN. (a) At different stages at 40 GHz. (b) Under  $V_{\rm BN}$  tuning corresponding to different BWs.

Fig. 14. (a) Voltage transfer characteristics and (b) DDJ versus  $V_{\rm in}$  under  $V_{\rm BN} = 0.8$  V and  $V_C = 0.9$  V. A 10-Gb/s  $2^7 - 1$  PRBS signal is used as the input.

Postlayout simulations under PVT variations were conducted. The simulated maximum dc gain and BW are >14 dB and >42 GHz, respectively. The maximum in-band gain variation and in-band ripple are <0.8 dB and <2 dB, respectively. The tunability of the proposed BW-extension technique is valid for all cases.

Fig. 15. (a) Amplifier's chip photograph. (b) Its reference path with an identical test buffer to de-embed the actual performance of the amplifier. (c) Layout details. (d) Twenty chips were measured to confirm the robustness.  $GBW = dc Gain \times BW$ .

### VI. MEASUREMENT RESULTS

The amplifier and its reference path with an identical test buffer for performance de-embedment were fabricated in 65-nm CMOS. Their chip micrographs are shown in Fig. 15(a) and (b), respectively. The total core area of the amplifier is  $110 \times 70 \ \mu m^2$  [Fig. 15(c)]. The noise measurement was not available.

## A. Frequency Domain

Using a Keysight Network Analyzer (N5247A), the measured ac responses of the amplifier from the constant GD (group 4) to the maximum BW response (group 2) are shown in Fig. 16(a)–(d). At  $V_C = 0$  V, increasing  $V_{BN}$  (i.e.,  $k_g$ ) from 0 to 0.8 V extends the BW from 26.36 to 43.34 GHz, with the power consumption rising from 23 to 45 mW [Fig. 17(a)]. From the minimum to the maximum BW, the in-band GD variation increases from 17.3-18.4 ps to 17-35.3 ps, while the GDR goes up from 1.1 to 18.3 ps. The in-band ripple at maximum BW is 1.53 dB. The tunability brings in only <0.58 dB of in-band gain variation. At  $V_{\rm BN} = 0.8$  V, increasing  $V_C$  (i.e.,  $k_v$ ) from 0 to 1.3 V finely tunes the in-band response, while keeping the BW (~43 GHz), GD (14.1-33.7 ps), and power consumption (45 mW) roughly unchanged [Fig. 17(b)]. The maximum in-band ripple is 2.34 dB at  $V_C = 1.3$  V.
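As a quick consistency check on the measured end points of the  $V_{BN}$  sweep, the sketch below recomputes the GDR and the BW and power scaling from the numbers quoted above. It assumes that the GDR is the peak-to-peak in-band GD, which is consistent with the reported 1.1 and 18.3 ps; the dictionary and function names are illustrative only.

```python
# Measured end points of the V_BN sweep at V_C = 0 V, taken from the text.
MIN_BW = {"bw_ghz": 26.36, "power_mw": 23.0, "gd_ps": (17.3, 18.4)}
MAX_BW = {"bw_ghz": 43.34, "power_mw": 45.0, "gd_ps": (17.0, 35.3)}

def gd_ripple_ps(point):
    """Peak-to-peak in-band group delay (assumed definition of GDR)."""
    gd_min, gd_max = point["gd_ps"]
    return gd_max - gd_min

def scaling(key):
    """Ratio of a quantity at maximum BW to its value at minimum BW."""
    return MAX_BW[key] / MIN_BW[key]

if __name__ == "__main__":
    print("GDR at min BW (ps):", round(gd_ripple_ps(MIN_BW), 1))
    print("GDR at max BW (ps):", round(gd_ripple_ps(MAX_BW), 1))
    print("BW scaling:", round(scaling("bw_ghz"), 3))
    print("Power scaling:", round(scaling("power_mw"), 3))
```

The BW extends by roughly 1.64× while the power rises by roughly 1.96×, which quantifies the BW-power scalability of the  $V_{BN}$  control.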

The robustness of the proposed amplifier has been confirmed by measuring over 20 samples. The  $\sigma$  of all key parameters is <5% of its mean [Fig. 15(d)].

#### B. Time Domain

Due to equipment limits, direct time-domain measurements were done only up to a maximum data rate of 13.5 Gb/s with a  $2^7 - 1$  PRBS generated by the Agilent pattern generator (J-BERT N4903B), and the eye data were captured by the Keysight Real-Time Oscilloscope (DSA91304A). Besides the data rms jitter of  $\sim 1$  ps from the PRBS generator at data rates up to 13.5 Gb/s, the BW limitation introduced by the combination of the RF probes, bias tees, and cables also impacts the quality of the captured eye diagram [Fig. 18(a)]. The rms jitter is 1.84 ps at a data rate of 8 Gb/s, including the equipment output rms jitter of 1.03 ps and random jitter, e.g., from the supply noise.

Fig. 16. Measured (a)  $S_{DD21}$  and (b) GD under  $V_{BN}$  tuning. (c)  $S_{DD21}$  and (d) GD under  $V_C$  tuning. GD is calculated from the measured phase using a fit function in MATLAB.

Fig. 17. Power consumption, in-band peaking, and BW versus different voltage controls (a)  $V_{BN}$  and (b)  $V_C$ .
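The  $2^7 - 1$  PRBS stimulus used in the direct time-domain measurements can be reproduced in software for pseudosimulation. The following is a minimal sketch of a 7-bit linear-feedback shift register, assuming the commonly used polynomial  $x^7 + x^6 + 1$ ; the text does not state which polynomial the pattern generator implements, and the function name is illustrative.

```python
def prbs7(n_bits, seed=0x7F):
    """Generate n_bits of a PRBS7 pattern (assumed polynomial x^7 + x^6 + 1)."""
    state = seed & 0x7F
    if state == 0:
        raise ValueError("seed must be nonzero, otherwise the LFSR locks up")
    bits = []
    for _ in range(n_bits):
        bits.append((state >> 6) & 1)                 # output the MSB
        new_bit = ((state >> 6) ^ (state >> 5)) & 1   # XOR of taps 7 and 6
        state = ((state << 1) | new_bit) & 0x7F       # shift in the feedback bit
    return bits

if __name__ == "__main__":
    pattern = prbs7(127)
    print("period length:", len(pattern), "ones:", sum(pattern))
```

One period contains 127 bits with 64 ones and 63 zeros, and the pattern repeats every 127 bits, so a single period is enough to exercise all 7-bit data histories when estimating the DDJ.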

For data rates beyond 13.5 Gb/s, indirect time-domain measurements are pseudosimulated using the measured S-parameter data [26], with the equipment output rms jitter and the random jitter excluded. The de-embedded S-parameters of the amplifier were used. The DDJ and vertical opening are shown in Fig. 18(a) and (b), respectively. The steps to optimize the eye diagram are summarized as follows.

- 1) A constant GD and an approximately Bessel magnitude give the best DDJ [Fig. 17(a)], but this response is sensitive to the parasitics and PVT variations, and its BW-limited ISI on the magnitude introduces DDJ, especially at high data rates [Fig. 18(b)].
- 2) For a maximum BW response with a large high-frequency GD peaking, the severe ISI degrades the vertical opening and brings a large DDJ above the 50+ Gb/s data rate.
- 3) A better choice for the ac characteristic is a magnitude without peaking and with smaller GD peaking, balancing the DDJ and vertical opening.

At the data rate of 60 Gb/s in Fig. 18, the eye diagrams of the two extreme cases (1 and 2) are shown in Fig. 19(a) and (d), respectively. The eye diagrams belonging to the reasonable case (3) are shown in Fig. 19(b) and (c).

Fig. 18. (a) DDJ and (b) vertical opening versus data rate.

Fig. 19. 60-Gb/s eye diagrams of the four-stage amplifier by the indirect time-domain measurements under (a)  $V_{BN} = 0.0$  V, (b) 0.5 V, (c) 0.6 V, and (d) 0.8 V.

The chip summary and performance benchmark are given in Table II. We employ the figure of merit (FOM = [(dc Gain × BW)/Power]), which reveals the performance per stage, to compare this work with the recent art. This work succeeds in enhancing the flexibility of the ac characteristics, while achieving a better FOM and area efficiency. This work realizes tunability by varying  $k_g$  in the current domain and  $k_v$  in the voltage domain at little expense of in-band gain variation. Also, unlike other works that employ more than two passive inductors along the signal path, which causes more coupling and parasitics, this work extends the BW by adding a GAI as an independent auxiliary path.

TABLE II Chip Summary and Benchmark With the State-of-the-Art (Similar Data Rates)

| Parameters | This Work | TMTT'12 [25] | ISSCC'06 [17] | TMTT'08 [22] | JSSC'11 [23] | JSSC'04 [15] |
|---|---|---|---|---|---|---|
| Technology | 65nm CMOS | 65nm CMOS | 90nm CMOS | 130nm CMOS | 180nm CMOS | 180nm CMOS |
| Architecture | Bridged-shunt GAI peaking | Distributed | Shunt-series peaking<br>+ Capacitive<br>Enhancement | Asymmetric<br>Transformer<br>peaking | Synthesis-Based<br>Two-Port | Shunt-Series<br>Peaking<br>(Triple-Resonance) |
| Signal path style | Differential | Differential | Differential | Single-Ended | Differential | Differential |
| Number of stages N | 4 | 4 | 5 | 5 | 4 | 5 |
| Passive inductors per gain stage | 2 single-ended<br>passive inductors | 4 single-ended passive inductors | 4 single-ended passive inductors | 1 Single-Ended<br>Transformer | 4 single-ended passive inductors | 4 single-ended passive inductors |
| AC characteristics tunability | Yes<br>(by Voltage) | No | Yes<br>(by R<sub>L</sub>) | No | No | No |
| Max. in-band gain<br>variation (dB) | 0.58 | N/A | 10* | N/A | N/A | N/A |
| In-band ripple (dB) | 1.53 | 3* | 2.5* | 3.5* | 4* | 2* |
| In-band GD (ps) &<br>GD Ripple (ps) | 17 to 35.3<br>18.3 | 40 to 75<br>35 | N/A | 28.5 to 52.5<br>24 | N/A | N/A |
| Power (mW) @ V<sub>DD</sub> | 45 @ 1.2V | 74.1 @ 1.3V | 57 @ 1V | 79.5 @ 1.5V | 52 @ 1.8V | 190 @ 2.2V |
| Active area (mm<sup>2</sup>) | 0.0077 | 0.94 | 0.018 | 0.05 | 1.24 | 0.45 |
| FOM\*\* | 5.48 | 11.04 | 6.88 | 2.91 | 4.53 | 0.65 |
| DC gain (dB) | 15.1 | 22 | 19 | 10.3 | 18.5 | 15 |
| Max. BW (GHz) | 43.34 | 65 | 44 | 70.6 | 28 | 22 |

\* Extracted values from plots. \*\* FOM = [(DC Gain × BW) / Power], with the dc gain in V/V, the BW in GHz, and the power in mW.
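The FOM row of Table II can be reproduced from the dc gain, BW, and power rows. The sketch below assumes the dc gain is converted from decibels to a linear voltage ratio, with the BW in GHz and the power in mW; under that assumption the computed values agree with the tabulated ones to within rounding. The function and variable names are illustrative.

```python
# (dc gain in dB, max BW in GHz, power in mW, FOM listed in Table II)
TABLE_II = {
    "This work": (15.1, 43.34, 45.0, 5.48),
    "TMTT'12 [25]": (22.0, 65.0, 74.1, 11.04),
    "ISSCC'06 [17]": (19.0, 44.0, 57.0, 6.88),
    "TMTT'08 [22]": (10.3, 70.6, 79.5, 2.91),
    "JSSC'11 [23]": (18.5, 28.0, 52.0, 4.53),
    "JSSC'04 [15]": (15.0, 22.0, 190.0, 0.65),
}

def fom(gain_db, bw_ghz, power_mw):
    """FOM = (dc gain x BW) / power, with the gain as a linear voltage ratio."""
    gain_linear = 10 ** (gain_db / 20)
    return gain_linear * bw_ghz / power_mw

if __name__ == "__main__":
    for name, (gain_db, bw_ghz, power_mw, listed) in TABLE_II.items():
        print(f"{name}: computed {fom(gain_db, bw_ghz, power_mw):.2f}, listed {listed}")
```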

# VII. CONCLUSION

This paper has described an area-efficient and tunable BW-extension technique for a wideband CMOS amplifier to handle very high-rate signaling while keeping a high-quality data eye. An analytical method that can predict the ISI and DDJ due to the ripples on the magnitude and GD is proposed to systematically correlate the performances between the frequency and time domains. When compared with the existing BW-extension techniques with the same number of passive inductors per stage, the proposed bridged-shunt-GAI peaking topology exhibits the best BWERs with and without peaking on the magnitude. The proof-of-concept prototype is a four-stage differential amplifier fabricated in 65-nm CMOS, exhibiting a 15-dB gain and a 43-GHz BW, while occupying just 0.0077 mm<sup>2</sup>. Small in-band gain variation (0.58 dB) and ripple (1.53 dB) are achieved concurrently with low in-band GD variation (17 to 35.3 ps) and ripple (18.3 ps). The measured FOM is 5.48.

### ACKNOWLEDGMENT

The authors would like to thank C. Li for measurement support.

#### REFERENCES

- [1] P.-C. Chiang, H.-W. Hung, H.-Y. Chu, G.-S. Chen, and J. Lee, "60 Gb/s NRZ and PAM4 transmitters for 400 GbE in 65 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2014, pp. 42–43.
- [2] M.-S. Chen and C.-K. K. Yang, "A 50–64 Gb/s serializing transmitter with a 4-tap, *LC*-ladder-filter-based FFE in 65 nm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 50, no. 8, pp. 1903–1916, Aug. 2015.
- [3] J. Lee, P.-C. Chiang, P.-J. Peng, L.-Y. Chen, and C.-C. Weng, "Design of 56 Gb/s NRZ and PAM4 SerDes transceivers in CMOS technologies," *IEEE J. Solid-State Circuits*, vol. 50, no. 9, pp. 2061–2073, Sep. 2015.
- [4] J.-K. Kim, J. Kim, G. Kim, and D.-K. Jeong, "A fully integrated 0.13 μm CMOS 40-Gb/s serial link transceiver," *IEEE J. Solid-State Circuits*, vol. 44, no. 5, pp. 1510–1521, May 2009.
- [5] B. Raghavan et al., "A sub-2W 39.8–44.6 Gb/s transmitter and receiver chipset with SFI-5.2 interface in 40 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 48, no. 12, pp. 3219–3228, Dec. 2013.
- [6] S. Shekhar, J. S. Walling, and D. Allstot, "Bandwidth extension techniques for CMOS amplifiers," *IEEE J. Solid-State Circuits*, vol. 41, no. 11, pp. 2424–2439, Nov. 2006.
- [7] B. Hofer, Amplifier Frequency and Transient Response (AFTR) Notes. Portland, OR, USA: Tektronix, 1982.
- [8] S. S. Mohan, M. D. M. Hershenson, S. P. Boyd, and T. H. Lee, "Bandwidth extension in CMOS with optimized on-chip inductors," *IEEE J. Solid-State Circuits*, vol. 35, no. 3, pp. 346–355, Mar. 2000.
- [9] T. H. Lee, The Design of CMOS Radio-Frequency Integrated Circuits. Cambridge, U.K.: Cambridge Univ. Press, 2004.
- [10] H. A. Wheeler, "Wide-band amplifiers for television," *Proc. IRE*, vol. 27, no. 7, pp. 429–438, Jul. 1939.
- [11] F. A. Muller, "High-frequency compensation of RC amplifiers," Proc. IRE, vol. 42, pp. 1271–1276, Aug. 1954.

- [12] B. Analui and A. Hajimiri, "Bandwidth enhancement for transimpedance amplifiers," *IEEE J. Solid-State Circuits*, vol. 39, no. 8, pp. 1263–1270, Aug. 2004.
- [13] C.-H. Wu, C.-H. Lee, W.-S. Chen, and S.-I. Liu, "CMOS wideband amplifiers using multiple inductive-series peaking technique," *IEEE J. Solid-State Circuits*, vol. 40, no. 2, pp. 548–552, Feb. 2005.
- [14] J. Kim and J. F. Buckwalter, "Bandwidth enhancement with low group-delay variation for a 40-Gb/s transimpedance amplifier," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 57, no. 8, pp. 1964–1972, Aug. 2010.
- [15] S. Galal and B. Razavi, "40-Gb/s amplifier and ESD protection circuit in 0.18 μm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 39, no. 12, pp. 2389–2396, Dec. 2004.
- [16] J. Kim *et al.*, "Circuit techniques for a 40 Gb/s transmitter in 0.13  $\mu$ m CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2005, pp. 150–151.
- [17] J. R. M. Weiss *et al.*, "A DC-to-44-GHz 19 dB gain amplifier in 90 nm CMOS using capacitive bandwidth enhancement," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2006, pp. 2082–2091.
- [18] T. Toifl, M. Kossel, C. Menolfi, T. Morf, and M. Schmatz, "A 23 GHz differential amplifier with monolithically integrated T-coils in 0.09 μm CMOS technology," in *IEEE MTT-S Int. Microw. Symp. Dig.*, vol. 1. Jul. 2003, pp. 239–242.
- [19] J. Kim, J.-K. Kim, B.-J. Lee, and D.-K. Jeong, "Design optimization of on-chip inductive peaking structures for 0.13-μm CMOS 40-Gb/s transmitter circuits," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 56, no. 12, pp. 2544–2555, Dec. 2009.
- [20] S. Galal and B. Razavi, "Broadband ESD protection circuits in CMOS technology," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2334–2340, Dec. 2003.
- [21] M. Kossel, C. Menolfi, T. Morf, M. Schmatz, and T. Toifl, "Wideband CMOS transimpedance amplifier," *Electron. Lett.*, vol. 39, no. 7, pp. 587–588, Apr. 2003.
- [22] J.-D. Jin and S. S. Hsu, "A miniaturized 70-GHz broadband amplifier in 0.13-µm CMOS technology," *IEEE Trans. Microw. Theory Techn.*, vol. 56, no. 12, pp. 3086–3092, Dec. 2008.
- [23] D. Pi, B.-K. Chun, and P. Heydari, "A synthesis-based bandwidth enhancement technique for CMOS amplifiers: Theory and design," *IEEE J. Solid-State Circuits*, vol. 46, no. 2, pp. 392–402, Feb. 2011.
- [24] C.-Y. Hsiao, T.-Y. Su, and S. S. H. Hsu, "CMOS distributed amplifiers using gate-drain transformer feedback technique," *IEEE Trans. Microw. Theory Techn.*, vol. 61, no. 8, pp. 2901–2910, Aug. 2013.
- [25] A. Jahanian and P. Heydari, "A CMOS distributed amplifier with distributed active input balun using GBW and linearity enhancing techniques," *IEEE Trans. Microw. Theory Techn.*, vol. 60, no. 5, pp. 1331–1341, May 2012.
- [26] J. S. Walling, S. Shekhar, and D. J. Allstot, "Wideband CMOS amplifier design: Time-domain considerations," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 55, no. 7, pp. 1781–1793, Aug. 2008.
- [27] Y. Chen, P.-I. Mak, L. Zhang, and Y. Wang, "A 0.002-mm<sup>2</sup> 6.4-mW 10-Gb/s full-rate direct DFE receiver with 59.6% horizontal eye opening under 23.3-dB channel loss at Nyquist frequency," *IEEE Trans. Microw. Theory Techn.*, vol. 62, no. 12, pp. 3107–3117, Dec. 2014.
- [28] J. Buckwalter, B. Analui, and A. Hajimiri, "Predicting data-dependent jitter," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 51, no. 9, pp. 453–457, Sep. 2004.
- [29] B. Analui, J. F. Buckwalter, and A. Hajimiri, "Data-dependent jitter in serial communications," *IEEE Trans. Microw. Theory Techn.*, vol. 53, no. 11, pp. 3388–3397, Nov. 2005.
- [30] P. Staric and E. Margan, Wideband Amplifiers. Berlin, Germany: Springer-Verlag, 2006.
- [31] M. M. Reja, K. Moez, and I. Filanovsky, "An area-efficient multistage 3.0- to 8.5-GHz CMOS UWB LNA using tunable active inductors," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 57, no. 8, pp. 587–591, Aug. 2010.
- [32] K. Hijioka, A. Tanabe, Y. Amamiya, and Y. Hayashi, "Crosstalk analysis method of 3-D solenoid on-chip inductors for high-speed CMOS SoCs," in *Proc. Int. Interconnect Technol. Conf.*, Jun. 2008, pp. 186–188.
- [33] S. Kaeriyama *et al.*, "A 40 Gb/s multi-data-rate CMOS transmitter and receiver chipset with SFI-5 interface for optical transmission systems," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3568–3579, Dec. 2009.



Yong Chen (S'10–M'11) received the B.Eng. degree in electronic and information engineering from the Communication University of China, Beijing, China, in 2005, and the Ph.D. in Engineering degree in microelectronics and solid-state electronics from the Institute of Microelectronics of Chinese Academy of Sciences, Beijing, in 2010.

From 2010 to 2013, he was a Post-Doctoral Researcher with the Institute of Microelectronics, Tsinghua University, Beijing. From 2013 to 2016, he was a Research Fellow with the Nanyang Technological University, Singapore, where he was responsible for high-speed (40+ Gb/s) wireline communication and the low energy electronic systems project under the Singapore-MIT Alliance for Research and Technology on RF CMOS transceivers in VIRTUS/EEE. Since 2016, he has been an Assistant Professor with the State Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macao, China. His current research interests include analog/biomedical detection and RF integrated circuits, millimeter-wave systems and circuits, high-speed on-chip, and chip-to-chip electrical/optical interconnects.



**Pui-In Mak** (S'00–M'08–SM'11) received the Ph.D. degree from the University of Macau (UM), Macao, China, in 2006.

He is currently a Full Professor of electrical and computer engineering with the Faculty of Science and Technology, UM, and an Associate Director (Research) with the State Key Laboratory of Analog and Mixed-Signal VLSI, UM. His current research interests include analog and radio frequency circuits and systems for wireless and multidisciplinary innovations.

Prof. Mak was an Editorial Board member of the IEEE Press during 2014-2016, a member of the Board-of-Governors of the IEEE Circuits and Systems Society during 2009-2011, a Senior Editor of the IEEE JOURNAL ON EMERGING AND SELECTED TOPICS IN CIRCUITS AND SYSTEMS during 2014-2015, a Guest Editor of the IEEE RFIC VIRTUAL JOURNAL in 2014 and the IEEE JOURNAL OF SOLID-STATE CIRCUITS in 2018, and an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-I: REGULAR PAPERS during 2010-2011 and 2014-2015 and the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: EXPRESS BRIEFS during 2010-2013. He was the TPC Vice Co-Chair of ASP-DAC in 2016 and a TPC member of the A-SSCC during 2013-2016, the ESSCIRC in 2016, and the ISSCC in 2016. He was a Distinguished Lecturer of the IEEE Circuits and Systems Society during 2014-2015 and is the Distinguished Lecturer of the IEEE Solid-State Circuits Society during 2017-2018. He was the co-recipient of the DAC/ISSCC Student Paper Award'05, the CASS Outstanding Young Author Award'10, the National Scientific and Technological Progress Award'11, the Best Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II'12-13, the A-SSCC Distinguished Design Award'15, and the ISSCC Silkroad Award'16. In 2005, he was decorated with the Honorary Title of Value for scientific merits by the Macau Government.



**Haohong Yu** received the B.Eng. degree in electrical and electronic engineering from Nanyang Technological University, Singapore, in 2013, where he is currently pursuing the Ph.D. degree.

His current research interests include inductorless noise-canceling low-noise amplifiers and wideband amplifiers.



**Chirn Chye Boon** (M'09–SM'10) received the B.E. (Hons.) and Ph.D. degrees in electrical engineering from Nanyang Technological University (NTU), Singapore, in 2000 and 2004, respectively.

He was a Senior Engineer with Advanced RFIC, NTU. Since 2005, he has been with NTU, where he is currently an Associate Professor. He is involved in radio frequency and millimeter-wave circuits and systems design for biomedical and communications applications. He has conceptualized, designed, and silicon-verified 80 circuits/chips for biomedical and communication applications. Since 2010, he has been the Program Director of RF and millimeter-wave research in the S\$50 million research center of excellence, VIRTUS, NTU. He is the Principal Investigator for Industry/Government Research Grants of S\$8,646,178.22. He has authored over 100 refereed publications in the fields of RF and millimeter wave. He is a coauthor of *Design of CMOS RF Integrated Circuits and Systems* (World Sci., 2010).

Dr. Boon was the recipient of the year-2 Teaching Excellence Award and the Commendation Award for Excellent Teaching Performance from the School of Electrical and Electronic Engineering, NTU. He serves as a Committee member for various conferences. He is an Associate Editor for the IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS and a Golden Reviewer for IEEE ELECTRON DEVICES LETTERS.



**Rui P. Martins** (M'88–SM'99–F'08) was born in 1957. He received the bachelor's, master's, and Ph.D. degrees and Habilitation for Full Professor in electrical engineering and computers from the Department of Electrical and Computer Engineering, Instituto Superior Técnico (IST), TU of Lisbon, Lisbon, Portugal, in 1980, 1985, 1992, and 2001, respectively.

Since 1980, he has been with the Department of Electrical and Computer Engineering, IST, TU of Lisbon. Since 1992, he has been on leave from IST, TU of Lisbon (University of Lisbon since 2013) and is currently with the Faculty of Science and Technology (FST), Department of Electrical and Computer Engineering, University of Macau (UM), Macao, China, where he has been a Chair Professor since 2013. From 1994 to 1997, he was the Dean of the Faculty, FST, and he has been a Vice-Rector with the University of Macau since 1997. Since 2008, after the reform of the UM Charter, he was nominated after open international recruitment and reappointed (in 2013) as Vice-Rector (Research) until 2018. Within the scope of his teaching and research activities, he has taught 21 bachelor and master courses and, with UM, has supervised (or co-supervised) 40 theses, Ph.D. (19) and Masters (21). He was a Co-Founder of Chipidea Microelectronics, Macao (now Synopsys) in 2001/2002, and created the Analog and Mixed-Signal VLSI Research Laboratory of UM in 2003, elevated in 2011 to the State Key Laboratory of China (the first in Engineering in Macao), being its Founding Director. He has co-authored 6 books, 9 book chapters, and 377 papers in scientific journals (111) and conference proceedings (266), as well as an additional 60 academic works, for a total of 470 publications. He holds 16 U.S. patents and 2 Taiwan patents.

Dr. Martins was the Founding Chairman of the IEEE Macau Section during 2003-2005 and the IEEE Macau Joint-Chapter on Circuits and Systems (CAS)/Communications (COM) during 2005-2008 [2009 World Chapter of the Year of the IEEE CAS Society (CASS)]. He was a General Chair of the 2008 IEEE Asia-Pacific Conference on CAS and was a Vice-President for Region 10 (Asia, Australia, and the Pacific) of the IEEE CASS during 2009-2011. He was then a Vice-President (World) of Regional Activities and Membership of the IEEE CASS during 2012-2013, an Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: EXPRESS BRIEFS during 2010-2013, and nominated Best Associate Editor of the IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS-II: EXPRESS BRIEFS for 2012 to 2013. He was a member of the IEEE CASS Fellow Evaluation Committee in 2013 and 2014 and the CAS Society representative of the Nominating Committee, for the election in 2014, of Division I (CASS/EDS/SSCS)-Director of the IEEE. He was the General Chair of the ACM/IEEE Asia South Pacific Design Automation Conference 2016. He was a Nominations Committee member in 2016 and is currently the Chair of the IEEE Fellow Evaluation Committee (class of 2018), both of the IEEE CASS. He was the recipient of two government decorations: the Medal of Professional Merit from the Macao Government (Portuguese Administration) in 1999 and the Honorary Title of Value from the Macao SAR Government (Chinese Administration) in 2001. In 2010, he was elected, unanimously, as a corresponding member of the Portuguese Academy of Sciences (in Lisbon), being the only Portuguese Academician living in Asia.