## 18.4 A 0.4V 430nA Quiescent Current NMOS Digital LDO with NAND-Based Analog-Assisted Loop in 28nm CMOS

Xiaofei Ma<sup>1,2</sup>, Yan Lu<sup>1</sup>, Rui P. Martins<sup>1,3</sup>, Qiang Li<sup>2</sup>

<sup>1</sup>University of Macau, Macau, China

<sup>2</sup>University of Electronic Science and Technology of China, Chengdu, China

<sup>3</sup>Instituto Superior Tecnico/University of Lisboa, Lisbon, Portugal

Ultra-low-power fully-integrated voltage regulators with fast load-transient performance are highly attractive for low-power systems-on-a-chip (SoCs). In such systems, the digital units working in the subthreshold region are more sensitive to supply variations. The digital low-dropout regulator (DLDO) is more suitable for low-supply-voltage operation than an analog LDO regulator. However, traditional DLDOs are either slow or power hungry, and need a large, area-consuming output capacitor to survive a fast load transient. When a higher clock frequency is used for faster response, both the current efficiency and the loop stability are degraded [1]. An analog-assisted (AA) loop was used in [2] to provide a high-pass loop in parallel with the slow digital loop for fast response. However, a large coupling capacitor (100pF) was still needed, trading off area with power and speed. An NMOS power stage as a source follower is sometimes used in replica LDOs and cascaded LDOs [3] for its intrinsic response to load transients; the NMOS source follower naturally provides more output current when $V_{OUT}$ drops. To improve upon the power-speed-area tradeoffs, this paper presents a DLDO that uses NMOS power switches and employs a NAND-gate-based high-pass analog path (NAP) to assist the slow low-power digital loop. With these two techniques, an FoM nearly two orders of magnitude better than the state of the art is achieved.

Figure 18.4.1 shows the DLDO with the NMOS intrinsic response and NAP, which, combined, achieve a fast response. As the NMOS switch array needs to be driven by a high voltage, a small 2× charge pump (CP) is employed. Because the CP only supplies the tiny quiescent current of the gate drivers and the level shifters (LSs), a 12pF total capacitance (including $C_{F1,2}$=3pF and $C_B$=6pF) is sufficient to meet the current demand. The dynamic part of the switch-driving current is filtered by $C_B$.
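As a quick check of the capacitor budget quoted above, reading $C_{F1,2}$=3pF as two flying capacitors of 3pF each (an interpretation of the shorthand; the symbol $C_{CP,total}$ is introduced here only for illustration):

$$C_{CP,total} = C_{F1} + C_{F2} + C_B = 3\,\text{pF} + 3\,\text{pF} + 6\,\text{pF} = 12\,\text{pF}$$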

For the PMOS switch array and the AA loop in [2], the  $g_m$  of the switch array is proportional to the number of turned-on PMOS switches. When there are only a few switches turned on in light load, the  $g_m$  is too small to compensate a large load transient, and consequently, the effectiveness of the AA loop decays. For our NMOS switch array, we add the NAP to the two highest bits (being off in light load) of the switch array, breaking the tie between the effective  $g_m$  and the number of switches turned on. As shown in Fig. 18.4.2, the output voltage is AC coupled to  $V_{CP}$ , which is DC-biased to  $2 \times V_{DD}$  by  $R_1$ . Then,  $V_{CP}$  is connected to a modified NAND gate of which the corresponding PMOS  $M_1$  has large size to amplify the coupled signal to the power NMOS's gate when the output has an undershoot. Since the  $C_c$  is connected to the NAP.

Upon a load transient event, the NMOS power switch responds first to increase or decrease the output current. Then, loop-1 starts increasing the output current if the output has an undershoot. At the next clock rising edge, loop-2 starts to adjust the output voltage to the preset value. Fig. 18.4.2 shows the simulated transient responses of the traditional PMOS DLDO, the NMOS DLDO without NAP, and the NMOS DLDO with NAP, when the load current jumps from 0.5mA to 20.5mA with 3ns edge time and zero $C_L$. The traditional PMOS DLDO would have a 426mV undershoot, while the NMOS DLDO's undershoot is 244mV, benefiting from the NMOS intrinsic response. The NMOS DLDO with the NAP loop (this work) has only 96mV of undershoot, showing a superior transient-response capability. Fig. 18.4.2 also shows the I-V characteristics of the turned-on NMOS and PMOS switches when both deliver $I_{OUT}$=11.5mA at $V_{OUT}$=450mV; the NMOS power stage can manage a 19% higher current than the PMOS one for the same $V_{OUT}$ variation of 20mV. Combined with the dead-zone control structure, an NMOS DLDO has a higher probability of working in the dead-zone region than a PMOS DLDO, reducing its dynamic power.
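To put the three simulated undershoot figures on a common scale, the short sketch below expresses each as a percentage reduction relative to the PMOS DLDO. The numbers are those quoted in the text; the helper name `reduction_pct` is hypothetical and used only for this illustration.

```python
# Simulated undershoots quoted in the text (mV) for a 0.5mA -> 20.5mA step,
# 3ns edge time, zero load capacitor.
undershoot_mv = {
    "PMOS DLDO": 426,
    "NMOS DLDO without NAP": 244,
    "NMOS DLDO with NAP": 96,
}


def reduction_pct(baseline_mv, value_mv):
    """Percentage by which value_mv is smaller than baseline_mv."""
    return 100.0 * (baseline_mv - value_mv) / baseline_mv


baseline = undershoot_mv["PMOS DLDO"]
reductions = {
    name: reduction_pct(baseline, mv) for name, mv in undershoot_mv.items()
}

for name, pct in reductions.items():
    print(f"{name}: {undershoot_mv[name]} mV ({pct:.1f}% below the PMOS baseline)")
```

Under these figures, the NMOS intrinsic response alone removes roughly 43% of the PMOS undershoot, and adding the NAP brings the total reduction to roughly 77%.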

Figure 18.4.3 shows the overall architecture of the NMOS DLDO with NAP. The digital control loop uses a weighted shift register (SR) and a coarse-fine tuning structure to get higher DC accuracy and smaller recovery time. The weights of the SR in the fine and coarse loops are optimized for small glitches, while reducing the number of registers. The fine loop contains an 8b SR, controlling eight 1x-size power switches. The coarse loop uses a 4b low SR and a 16b high SR with carry in/out between the two SRs. The low SR controls four 8x-size power switches and the high SR controls sixteen 32x-size power switches in the coarse loop. All the shift-register outputs go to the LSs to control the power stage in the $2 \times V_{DD}$ domain.
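The weighting of the three switch banks can be tabulated with a small sketch. It assumes each shift register is thermometer-coded, so that the number of asserted bits equals the number of turned-on switches; this is an assumption for illustration, as are the names `Bank`, `BANKS`, and `strength`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bank:
    name: str
    bits: int    # shift-register length; one power switch per bit
    weight: int  # switch size in units of the 1x fine switch


# Bank sizes and weights as stated in the text.
BANKS = (
    Bank("fine", 8, 1),
    Bank("coarse_low", 4, 8),
    Bank("coarse_high", 16, 32),
)


def strength(on_counts):
    """Total turned-on switch strength in 1x units.

    on_counts maps a bank name to the number of switches turned on
    (assumed thermometer code); banks not listed are taken as fully off.
    """
    total = 0
    for bank in BANKS:
        n = on_counts.get(bank.name, 0)
        if not 0 <= n <= bank.bits:
            raise ValueError(f"{bank.name}: {n} is outside 0..{bank.bits}")
        total += n * bank.weight
    return total


full_scale = strength({bank.name: bank.bits for bank in BANKS})
print(full_scale)  # 8*1 + 4*8 + 16*32 = 552
```

With all banks on, the array is equivalent to 552 unit switches, controlled by only 28 register bits; note also that the four 8x switches of the low SR together equal one 32x switch of the high SR.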

Figure 18.4.4 shows the simulated voltage waveforms during a load transient. Due to the control-loop logic-delay mismatches, large glitches, which may enlarge the voltage undershoot/overshoot, will happen during the carry operation between the different weighted switch arrays [2]. Here, we designed the low and high SRs to have different logic delays so voltage glitches are opposite in direction to the undershoot/overshoot. In the proposed architecture, the mode-selecting multiplexer of the coarse loop, which is the clock-gating stage for the low SR, is implemented with ultra-high-threshold-voltage (UHVT) devices; and, the NOR gate connected to the high SR for clock gating uses regular-threshold-voltage (RVT) devices. As such, edges of  $\text{CLK}_{\text{H}}$  arrive earlier than  $\text{CLK}_{\text{L}}$ , as shown in the zoomed-in region. So, the weighted shift-register structure will experience a carry for the high SR first, and then reset the low SR, turning the glitch polarity to the desired direction.
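The effect of the carry ordering on glitch polarity can be illustrated with a small behavioral sketch. It assumes, purely for illustration, that a carry-up turns on one additional 32x switch and clears the low SR, and it uses a hypothetical starting state (two high switches and three low switches on). The function names are not from the paper.

```python
LOW_W, HIGH_W = 8, 32  # coarse-loop switch weights from the text, in 1x units


def coarse_strength(high_on, low_on):
    """Turned-on coarse strength in 1x units."""
    return high_on * HIGH_W + low_on * LOW_W


def carry_up_sequence(high_on, low_on, high_first):
    """Return (before, intermediate, after) strengths for one carry-up.

    Assumed carry-up: the high SR gains one switch and the low SR is cleared.
    high_first selects which of the two updates lands first.
    """
    before = coarse_strength(high_on, low_on)
    after = coarse_strength(high_on + 1, 0)
    if high_first:
        # CLK_H edge arrives first: extra 32x switch on, low SR not yet reset.
        mid = coarse_strength(high_on + 1, low_on)
    else:
        # Low SR resets first: strength briefly drops below the starting value.
        mid = coarse_strength(high_on, 0)
    return before, mid, after


good = carry_up_sequence(2, 3, high_first=True)
bad = carry_up_sequence(2, 3, high_first=False)
print(good)  # (88, 120, 96)
print(bad)   # (88, 64, 96)
```

In this sketch, when the high SR updates first the intermediate strength (120) overshoots the final value, which supplies extra current during an undershoot. When the low SR resets first the strength dips to 64, which would deepen the undershoot.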

The double-tail comparator [4] with an extra 'valid' output is used for the three comparators. The 'valid' signal will be high after the comparator finishes the comparison and will be low during its reset period. Thus, the fine and coarse loops are triggered by the 'valid' signal, instead of being triggered by the global clock, to avoid toggling the SR incorrectly. The rationale for this is that it is challenging to maintain the right clock timing for all the SRs and comparators, due to the PVT variations, especially under a 400mV sub-threshold input voltage. In addition, the comparator in the fine loop stops working during coarse tuning and freeze mode to reduce the dynamic current consumption.

The test chip was fabricated in a 28nm bulk CMOS process. The proposed NMOS DLDO has a 50mV dropout voltage and can deliver 20mA with 0.35-to-0.5V output, 0.4-to-0.55V input, and a 4MHz clock. Fig. 18.4.5 shows the measured load transient response. With $V_{OUT}$=0.45V, $V_{IN}$=0.5V, and zero output capacitor, when the load current changes from 0.5mA to 20.5mA with 3ns edge time, the measured undershoot and overshoot voltages are 117mV and 49mV, respectively. With $V_{OUT}$=0.35V, $V_{IN}$=0.4V, zero output capacitor, and a 1MHz clock, when the load current changes from 1mA to 16mA with 3ns edge time, the measured undershoot and overshoot voltages are 111mV and 46mV, respectively. For dynamic voltage scaling, the measured reference up- and down-tracking speeds are 238mV/µs and 91mV/µs, respectively. The comparison table in Fig. 18.4.6 shows that this design has an FoM nearly two orders of magnitude better than the state of the art, owing to the low quiescent current and small required capacitance. Fig. 18.4.7 shows the chip micrograph.

## Acknowledgments:

This work is supported in part by the Macao Science and Technology Development Fund (FDCT) 093/2016/A and SKL-AMSV-2017-2019, and the Research Committee of University of Macau, and in part by National Natural Science Foundation of China (NSFC) under grant 61534002.

## References:

[1] S. B. Nasir, et al., "A 0.13µm Fully Digital Low-Dropout Regulator with Adaptive Control and Reduced Dynamic Stability for Ultra-Wide Dynamic Range," *ISSCC*, pp. 98–99, 2015.

[2] M. Huang, et al., "An Output-Capacitor-Free Analog-Assisted Digital Low-Dropout Regulator with Tri-Loop Control," *ISSCC*, pp. 342–343, 2017.

[3] Y. Lu, et al., "An NMOS-LDO Regulated Switched-Capacitor DC–DC Converter With Fast-Response Adaptive-Phase Digital Control," *IEEE Trans. Power Electron.*, vol. 31, no. 2, pp. 1294–1303, 2016.

[4] D. Schinkel, et al., "A Double-Tail Latch-Type Voltage Sense Amplifier with 18ps Setup+Hold Time," *ISSCC*, pp. 314–605, 2007.

[5] L. G. Salem, et al., "A 100nA-to-2mA Successive-Approximation Digital LDO with PD Compensation and Sub-LSB Duty Control Achieving a 15.1ns Response Time at 0.5V," *ISSCC*, pp. 340–341, 2017.

[6] Y. J. Lee, et al., "A 200-mA Digital Low Drop-Out Regulator With Coarse-Fine Dual Loop in Mobile Application Processor," *IEEE JSSC*, vol. 52, no. 1, pp. 64–76, Jan. 2017.

[7] F. Yang and P. K. T. Mok, "A Nanosecond-Transient Fine-Grained Digital LDO With Multi-Step Switching Scheme and Asynchronous Adaptive Pipeline Control," *IEEE JSSC*, vol. 52, no. 9, pp. 2463–2474, 2017.







Figure 18.4.3: Overall architecture of the proposed NMOS NAP-DLDO; schematics of the dead-zone comparator and double-tail comparator.







Figure 18.4.2: Circuit implementation with NAND-gate-based high-pass analog path; the transient waveforms of the traditional PMOS DLDO, NMOS DLDO without NAP, and NMOS DLDO with NAP; I-V characteristics of the turned-on NMOS and PMOS power switches.





|                                      | This Work   |            | [2] ISSCC'17 | [1] ISSCC'15 | [5] ISSCC'17 | [6] JSSC'17 | [7] JSSC'17 |
|--------------------------------------|-------------|------------|--------------|--------------|--------------|-------------|-------------|
| Process                              | 28nm        |            | 65nm         | 130nm        | 65nm         | 28nm        | 65nm        |
| Area [mm<sup>2</sup>]                | 0.0055      |            | 0.03         | 0.114        | 0.0023       | 0.021       | 0.158       |
| Type                                 | Digital     |            | Digital      | Digital      | Digital      | Digital     | Digital     |
| Architecture                         | SR/NMOS/NAP |            | SR/AA        | SR/RDS       | SR/PD/PWM    | SR/ADC      | Async.      |
| V<sub>IN</sub> [V]                   | 0.4-0.55    | 0.4        | 0.5-1        | 0.5-1.2      | 0.5-1        | 1.1         | 0.6-1       |
| V<sub>OUT</sub> [V]                  | 0.35-0.5    | 0.35       | 0.45-0.95    | 0.45-1.14    | 0.3-0.45     | 0.9         | 0.55-0.9    |
| F<sub>CLK</sub> [MHz]                | 4           | 1          | 10           | 400          | 240          | N.A.        | N.A.        |
| I<sub>Q_MIN</sub> [µA]               | 0.81        | 0.43       | 3.2          | 24-221       | 14           | 110         | 300         |
| C<sub>TOTAL</sub> [pF]               | 24          |            | 100          | 1000         | 400          | 23500       | 1500        |
| ΔV<sub>OUT</sub> [mV]                | 117         | 111        | 105          | 90           | 40           | 120         | 50          |
| ΔI<sub>LOAD</sub> / T<sub>EDGE</sub> | 20mA / 3ns  | 15mA / 3ns | 10mA / 1ns   | 1.4mA / N.A. | 1.06mA / 1ns | 180mA / 4µs | 100mA / 2ns |
| FoM* [ps]                            | 0.0057      | 0.0051     | 0.23         | 8600         | 199          | 7.75        | 2.3         |

$$^{*}\,\text{FoM} = \frac{C_{TOTAL} \times \Delta V_{OUT} \times I_{Q}}{\Delta I_{LOAD}^2}$$

Figure 18.4.6: Comparison with the state-of-the-art.
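The FoM entries for this work can be cross-checked numerically. The sketch below assumes the commonly used DLDO figure of merit, FoM = C<sub>TOTAL</sub> × ΔV<sub>OUT</sub> × I<sub>Q</sub> / ΔI<sub>LOAD</sub><sup>2</sup>, and takes its inputs from the two "This Work" columns of the comparison table. The function name `fom_ps` and its argument units are choices made for this illustration.

```python
def fom_ps(c_total_pf, dv_out_mv, i_q_ua, di_load_ma):
    """Figure of merit in picoseconds.

    Assumed definition: FoM = C_TOTAL * dV_OUT * I_Q / dI_LOAD**2,
    evaluated in SI units and then converted from seconds to ps.
    """
    c = c_total_pf * 1e-12   # F
    dv = dv_out_mv * 1e-3    # V
    iq = i_q_ua * 1e-6       # A
    di = di_load_ma * 1e-3   # A
    return c * dv * iq / di**2 * 1e12


# "This Work" columns of the comparison table.
this_work_4mhz = fom_ps(c_total_pf=24, dv_out_mv=117, i_q_ua=0.81, di_load_ma=20)
this_work_1mhz = fom_ps(c_total_pf=24, dv_out_mv=111, i_q_ua=0.43, di_load_ma=15)

print(f"{this_work_4mhz:.4f} ps")
print(f"{this_work_1mhz:.4f} ps")
```

Under this definition, both results round to the tabulated 0.0057ps and 0.0051ps, which suggests the table uses the same expression for this design.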

