## 16.3 A Single-Channel 5.5mW 3.3GS/s 6b Fully Dynamic Pipelined ADC with Post-Amplification Residue Generation

Zihao Zheng<sup>1,2</sup>, Lai Wei<sup>1,2</sup>, Jorge Lagos<sup>2</sup>, Ewout Martens<sup>2</sup>, Yan Zhu<sup>1</sup>, Chi-Hang Chan<sup>1</sup>, Jan Craninckx<sup>2</sup>, Rui P. Martins<sup>1,3</sup>

<sup>1</sup>University of Macau, Macau, China <sup>2</sup>imec, Leuven, Belgium <sup>3</sup>University of Lisboa, Lisbon, Portugal

Multi-GS/s ADCs are key blocks for ADC-based serial links and mm-wave 5G receivers. The fastest architecture is the flash ADC [1], but its complexity grows exponentially with resolution, which makes it energy- and area-inefficient. Interpolation techniques [2] can reduce the number of comparators but result in lower conversion speeds, while the aggressive interpolation factor in [3] also increases the calibration complexity. SAR ADCs are by far the most power-efficient, but they reach the required speeds only with time interleaving. Hence, pipelined architectures are the preferred choice, but here too, sampling rates above 1GS/s are not readily achieved, and the power consumption of the residue amplifier is critical. Previous work [5] explores a fully dynamic pipelined architecture, but it operates only up to a relatively low sampling rate of 550MS/s (per channel) owing to its complex residue-transfer implementation and calibration. In this work, the pipelined approach is revisited. Unlike the conventional architecture, which executes 3 serial operations (sampling, quantization and residue amplification) in one clock cycle, the presented post-amplification residue generation scheme allows the amplification and conversion to run in parallel. Leveraging a linearized dynamic amplifier and on-chip gain and offset calibration, the prototype achieves 34.2dB SNDR with a Nyquist input at 3.3GS/s. The 6b ADC consumes 5.5mW and occupies 0.0166mm<sup>2</sup> (including calibration), leading to a Walden FoM of 40fJ/conv.-step.
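The quoted Walden FoM can be cross-checked from the power, sampling rate and Nyquist SNDR given above. The short sketch below assumes the usual definitions, ENOB = (SNDR − 1.76)/6.02 and FoM = P/(2^ENOB · f<sub>s</sub>); the function names are illustrative and not taken from the paper.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (standard sine-wave formula)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom(power_w, fs_hz, sndr_db):
    """Walden figure of merit in J/conv.-step."""
    return power_w / (2.0 ** enob_from_sndr(sndr_db) * fs_hz)


# Values quoted in the text: 5.5mW, 3.3GS/s, 34.2dB SNDR at Nyquist.
fom = walden_fom(5.5e-3, 3.3e9, 34.2)
print(f"ENOB = {enob_from_sndr(34.2):.2f} b")
print(f"FoM  = {fom * 1e15:.1f} fJ/conv.-step")
```

With these inputs the result lands close to the 40fJ/conv.-step stated above.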

As illustrated in Fig. 16.3.1, the prototype ADC uses a simple architecture consisting of six 1b stages without a backend flash, for an aggregate resolution of 6b. Unlike the classical pipeline stage, where the amplification must wait for the quantizer decision and the DAC feedback, in this work the reference subtraction is performed at the DAC of the next stage, after signal amplification, as in [6]. The full signal is amplified instead of the residue, and afterwards a correspondingly larger DAC reference is subtracted to create the desired pipeline residue. With the proposed arrangement, the comparator and residue amplifier (RA) in the same stage can be triggered in parallel. At the falling edge of $\phi_{s}$, stage[N] ends its tracking phase and enters hold mode. Simultaneously, the quantization result from stage[N-1] is set up and fed to the DAC, generating the residue in stage[N] with a proper feedback voltage. After the corresponding DAC switching settles, the quantizer and amplifier in stage[N] are triggered together. In this configuration, unlike in the conventional counterpart, the amplifier amplifies the sampled input of the current stage rather than the residue. In parallel with the quantization and amplification in stage[N], the output of the current-stage amplifier is sampled in stage[N+1] during $\phi_A$. Concurrently, the quantization result D[N] is fed to DAC[N+1] to generate the amplified version of residue[N] in stage[N+1], and the procedure is repeated in each stage. The proposed parallelized operation increases the overall speed because each stage needs to accommodate only two basic operations, sampling and conversion/amplification, which reduces idle time in high-speed pipeline operation.
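The residue arithmetic described above can be illustrated with a static behavioral model. This is a minimal sketch under stated assumptions, not the authors' implementation: inputs are normalized to a ±1 full scale, the amplifiers are ideal, timing is ignored, and the inter-stage gains (1.5× for the first two stages, 2× afterwards) are taken from the amplifier description elsewhere in the text. The names `convert`, `INTERSTAGE_GAINS` and `span` are hypothetical.

```python
# Inter-stage gains of the five amplifiers between the six 1b stages
# (assumed ideal; 1.5x in the first two stages, 2x in the rest).
INTERSTAGE_GAINS = (1.5, 1.5, 2.0, 2.0, 2.0)


def convert(v_in, gains=INTERSTAGE_GAINS, full_scale=1.0):
    """Return the 6b code for an input in [-full_scale, +full_scale)."""
    v = v_in            # voltage held by the current stage
    span = full_scale   # the held voltage lies within [-span, +span]
    bits = []
    for gain in gains:
        # Comparator and amplifier act on the same held sample in parallel.
        bit = 1 if v >= 0.0 else 0
        bits.append(bit)
        amplified = gain * v                 # full signal, not the residue
        enlarged_ref = gain * span / 2.0     # reference scaled by the gain
        # The NEXT stage's DAC subtracts (or adds) the enlarged reference.
        v = amplified - enlarged_ref if bit else amplified + enlarged_ref
        span = enlarged_ref                  # new range of the residue
    bits.append(1 if v >= 0.0 else 0)        # last stage only quantizes
    code = 0
    for bit in bits:
        code = (code << 1) | bit
    return code


# Sweep the mid-point of every code bin and print the resulting codes.
print([convert(-1.0 + (k + 0.5) / 32.0) for k in range(64)])
```

Because `gain * v - gain * span / 2` equals `gain * (v - span / 2)`, the model yields the same residue as a conventional stage; only the point at which the reference is subtracted changes, which is what removes the DAC feedback from the amplifier's critical path.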

As the amplifier needs to interface with a larger input signal, its linearity becomes critical. The dynamic amplifier (DA) is used due to its good power efficiency, but its conventional implementation suffers from relatively poor linearity for large input swings. This nonlinearity mainly originates from the differential current $(I_{D1}-I_{D2})$ failing to follow the input linearly, as described by eq. 1 in Fig. 16.3.2, which shows that a nonlinear term prevents the differential current from increasing in proportion to the input amplitude. While a time-domain scheme [4] demonstrates a promising linearized result, it is not applicable to high-speed designs due to its relatively large delay (20 to 30ps), originating from the adaptive common-mode detection circuit. Therefore, a current-based linearization technique is proposed, as shown in Fig. 16.3.2. On top of the basic DA structure, an auxiliary pseudo-differential input pair M<sub>3</sub>-M<sub>4</sub> is added in parallel, which shares the same main clock through $M_5$ and $M_6$ and provides the compensation currents $I_{\text{D1Aux}}$ and $I_{\text{D2Aux}}$. According to the square-law model, the sum of $I_{\text{D1Aux}}$ and $I_{\text{D2Aux}}$ is approximately a constant plus a 2nd-order term $(V_{1+}+V_{1-})$, which is used to partially compensate the nonlinearity following eq. 2 in Fig. 16.3.2. Together with source degeneration (RD<sub>1</sub> & RD<sub>2</sub>), simulation results show that the proposed DA can maintain a relatively high and flat G<sub>m</sub> across a large input swing, leading to a 16dB SFDR improvement in this design and allowing an amplification time under 80ps. The linearized DA with 1.5× gain is used only in the first two stages, while a conventional DA with 2× gain is used in the remaining stages for lower input capacitance and better CMRR. The small inter-stage gain (1.5× / 2×) also ensures that the enlarged references remain within the supply.
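The statement that the summed auxiliary currents form a constant plus a 2nd-order term can be checked numerically with the textbook square-law model. The sketch below is only that check; it does not reproduce eq. 1 or eq. 2 of Fig. 16.3.2, and it assumes ideal clock switches, grounded sources, and both devices in saturation. The symbols `k`, `v_cm`, `v_d` and `v_th`, and the numeric values, are illustrative assumptions.

```python
def square_law_current(k, v_gs, v_th):
    """Square-law drain current I_D = k * (V_GS - V_T)^2, zero below V_T."""
    v_ov = max(v_gs - v_th, 0.0)
    return k * v_ov * v_ov


def aux_pair_sum(k, v_cm, v_d, v_th):
    """Sum of the two pseudo-differential pair currents for input v_d."""
    i_pos = square_law_current(k, v_cm + v_d / 2.0, v_th)
    i_neg = square_law_current(k, v_cm - v_d / 2.0, v_th)
    return i_pos + i_neg


def aux_pair_sum_closed_form(k, v_cm, v_d, v_th):
    """Constant term plus a term quadratic in the differential input."""
    return 2.0 * k * (v_cm - v_th) ** 2 + k * v_d ** 2 / 2.0


# Illustrative device values (assumptions, not taken from the paper).
K, V_CM, V_TH = 1e-3, 0.6, 0.35
for v_d in (0.0, 0.1, 0.2, 0.4):
    print(v_d, aux_pair_sum(K, V_CM, v_d, V_TH),
          aux_pair_sum_closed_form(K, V_CM, v_d, V_TH))
```

The two columns agree as long as `abs(v_d) / 2` stays below `v_cm - v_th`, that is, while both devices remain on.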

To save power, the offset and gain of all stages are foreground-calibrated on-chip (Fig. 16.3.3). During calibration, the differential inputs are shorted, and the comparator offsets are tuned through calibration voltages (V<sub>cal</sub>) from a dedicated R2R DAC with an auxiliary input pair. This offset calibration is done one comparator at a time, and the comparator decisions are determined by an 8-time majority voting logic. The adjustment of V<sub>cal</sub> stops when an equal probability of ones and zeros is reached. After compensation of the comparator offsets, the RAs undergo a similar calibration scheme, but with the offset compensated by a tunable load, which avoids any calibration-induced interference with the RA gain and linearity. The stage[N] comparator is used to observe the offset of the stage[N-1] RA, and the offset of the latter is tuned by adding/removing capacitors in a 6b capacitor bank (Cbank). The references in each stage are matched by the RA gain, which is controlled by a digitally tunable amplification time. The inter-stage gains are calibrated from the last to the first stage. To detect the gain of stage[N], with nulled offsets in the comparator and amplifier, a half-LSB voltage is generated in the DAC of stage[N]. Then, it is amplified and quantized by the current and/or subsequent stages. The quantization result D[N:5] is ideally $2^{(6-N)}-1$, which is the full scale of stage[N] to stage[5]. The calibration starts with a minimum gain configuration and increases the gain until D[N:5] reaches its target. As the output common-mode of the DA has a strong correlation with the gain, such calibration also ensures proper internal common-mode values.
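The control flow of the two calibration loops can be sketched as follows. This is a hedged illustration of the search logic only: `comparator` and `measure_code` are hypothetical callables standing in for the on-chip hardware, the "flip of the voted decision" is used here as a simple proxy for the equal-probability stopping condition, and none of the function names come from the paper.

```python
def majority_vote(decisions):
    """Return 1 when ones outnumber zeros; a tie counts as 0."""
    return 1 if 2 * sum(decisions) > len(decisions) else 0


def calibrate_offset(comparator, codes, votes=8):
    """Sweep the V_cal code and stop where the voted decision flips.

    comparator(code) is a hypothetical callable returning one 0/1 decision
    with the differential inputs shorted and the given V_cal code applied.
    """
    codes = list(codes)
    start = majority_vote([comparator(codes[0]) for _ in range(votes)])
    for code in codes[1:]:
        voted = majority_vote([comparator(code) for _ in range(votes)])
        if voted != start:
            return code      # decisions have crossed the 50% point
    return codes[-1]         # range exhausted without a flip


def gain_cal_target(stage, n_stages=6):
    """Ideal D[N:5]: full scale of the (n_stages - stage)-bit back end."""
    return 2 ** (n_stages - stage) - 1


def calibrate_gain(measure_code, stage, gain_settings):
    """Start at the minimum gain setting and increase until the target is hit.

    measure_code(setting) is a hypothetical callable returning D[N:5] with
    the half-LSB test voltage applied at the DAC of the given stage.
    """
    target = gain_cal_target(stage)
    settings = list(gain_settings)
    for setting in settings:
        if measure_code(setting) >= target:
            return setting
    return settings[-1]


# Targets for stage[0] .. stage[5].
print([gain_cal_target(n) for n in range(6)])
```

In keeping with the order described above, `calibrate_offset` would run once per comparator first, and `calibrate_gain` would then be called from the last amplifier back to the first, so that each measurement relies only on stages that are already calibrated.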

The ADC, fabricated in 28nm CMOS, with ~40fF input capacitance (excluding ESD), occupies an active area of 0.0166mm<sup>2</sup>, including on-chip calibration circuits (Fig. 16.3.7). The foreground on-chip calibration is performed once, and the obtained values are frozen throughout all measurements. Figure 16.3.4 plots the measured SFDR/SNDR and power vs. input frequency at 3.3GS/s, and vs. power supply (sample #2), as well as the Nyquist performance at 3.3GS/s for 3 randomly selected samples. The prototype consumes purely dynamic power, as shown by the linear relationship between power and clock frequency, with a slope of 1.5µW/MHz. The SNDR is 34dB up to 3.5GHz (2.12× Nyquist) and drops by 1dB from 3.5GHz to 6GHz. With a fixed calibration set obtained at 0.9V supply and no recalibration, the SNDR degrades by less than 3dB for a ±5% supply change. Figure 16.3.5 illustrates the measured spectrum (decimated by 225) at 3.3GS/s for a Nyquist 1.649GHz input, with and without calibration. Measured DNL and INL after calibration are +1.08 / -0.85 LSB and +1.11 / -1.04 LSB, respectively, for a 6.25mV LSB. Figure 16.3.6 compares the design with ADCs above 2GS/s, indicating that this work achieves a state-of-the-art FoM<sub>W</sub>.

## Acknowledgments:

The authors would like to thank Mr. Un Pang Lei and Chi Wai Tang for tape-out support. This work was funded by the Science and Technology Development Fund, Macau SAR (File no. 0003/2019/AFJ) and Research Grants of University of Macau (MYRG2019-00021-AMSV).

## References:

[1] V. H.-C. Chen and L. Pileggi, "An 8.5mW 5GS/s 6b Flash ADC with Dynamic Offset Calibration in 32nm CMOS SOI," *IEEE Symp. VLSI Circuits*, pp. C264-C265, June 2013.

[2] Y. Shu, "A 6b 3GS/s 11mW Fully Dynamic Flash ADC in 40nm CMOS with Reduced Number of Comparators," *IEEE Symp. VLSI Circuits*, pp. 26-27, June 2012.

[3] D. Oh et al., "A 65-nm CMOS 6-bit 2.5-GS/s 7.5-mW 8× Time-Domain Interpolating Flash ADC With Sequential Slope-Matching Offset Calibration," *IEEE JSSC*, vol. 54, no. 1, pp. 288-297, Jan. 2019.

[4] L. Yu, M. Miyahara and A. Matsuzawa, "A 9-bit 1.8 GS/s 44 mW Pipelined ADC Using Linearized Open-Loop Amplifiers," *IEEE JSSC*, vol. 51, no. 10, pp. 2210-2221, Oct. 2016.

[5] B. Verbruggen et al., "A 2.6mW 6b 2.2GS/s 4-Times Interleaved Fully Dynamic Pipelined ADC in 40nm digital CMOS," *ISSCC*, pp. 296-297, Feb. 2010.

[6] A. Nazemi, "A 10.3Gs/s 6bit (5.1 ENOB at Nyquist) Time-Interleaved/Pipelined ADC Using Open-Loop Amplifiers and Digital Calibration in 90nm CMOS," *IEEE Symp. VLSI Circuits*, pp. 14-15, June 2008.



## **ISSCC 2020 PAPER CONTINUATIONS**

