## 20.5 A Dual-Symmetrical-Output Switched-Capacitor Converter with Dynamic Power Cells and Minimized Cross Regulation for Application Processors in 28nm CMOS

Junmin Jiang<sup>1,2</sup>, Yan Lu<sup>1</sup>, Wing-Hung Ki<sup>2</sup>, Seng-Pan U<sup>1,3</sup>, Rui P. Martins<sup>1,4</sup>

<sup>1</sup>University of Macau, Macao, China

<sup>2</sup>Hong Kong University of Science and Technology, Hong Kong, China

<sup>3</sup>Synopsys Macau Ltd, Macao, China

<sup>4</sup>Instituto Superior Tecnico, Universidade de Lisboa, Portugal

Multicore application processors in smartphones and smartwatches use power-saving techniques such as dynamic voltage and frequency scaling (DVFS) to extend battery life, and supply their cores with different voltages [1]. High-efficiency fully integrated switched-capacitor (SC) power converters with no external components are promising candidates [2]. Typically, SC converters with different specifications are designed independently (Fig. 20.5.1), leading to a large area overhead, as each converter has to handle its own peak output power. Recently, multi-output SC converters have been reported to tackle this issue. In [3], an on-demand strategy is used to control two outputs, each with a different loading range, and the outputs are not interchangeable. In [4], the two output voltages are fixed with voltage conversion ratios (VCRs) of 2x and 3x only. In [5], the controller is integrated, but the three output voltages still come from three individual SC converters. Without reallocating the capacitors in the power stages, capacitor utilization is low, as margins have to be reserved to cater for each converter's peak output power. This paper presents a fully integrated dual-output SC converter with dynamic power-cell allocation for application processors. The power cells are shared and can be dynamically allocated according to load demands. A dual-path VCO that works independently of power-cell allocation is proposed to realize a fast and stable regulation loop. The converter can deliver a maximum current of 100mA: one output can be adjusted to deliver 100mA while the other handles a very light load, or both outputs can be adjusted to deliver 50mA each with over 80% efficiency.

Figure 20.5.1 shows the dynamic power-cell allocation strategy. The converter consists of two channels, $CH_1$ and $CH_2$, with output voltages $V_{01}$ and $V_{02}$, respectively. Each output is regulated through frequency modulation. The switching frequencies of the two channels are $f_1$ and $f_2$. The goal is to adjust them to be equal so that both channels have the same power density and the converter achieves the best overall efficiency. Assume, for example, that the two channels start with the same number of power cells, but the load of $CH_1$ is larger than that of $CH_2$. To regulate the outputs properly, we initially have $f_1 > f_2$, so more power cells should be assigned to $CH_1$. This means the physical boundary should migrate to the right until $f_1$ and $f_2$ are approximately equal. By balancing the power densities of the two channels through their switching frequencies, both switching and parasitic losses are reduced. By dynamically adjusting both the number of power cells and the switching frequencies, the channels are able to provide sufficient power to the loads, and utilization of the capacitors is maximized.
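The boundary migration described above can be illustrated with a small behavioral sketch. It is not the authors' implementation: it assumes a toy model in which the switching frequency a channel needs is proportional to its load current per power cell (same proportionality constant for both channels), and the names `required_frequency`, `allocate`, `window` and `max_steps` are hypothetical. Only the total of 82 cells and the one-cell-per-step shifting come from the text.

```python
def required_frequency(load_ma, cells, k=1.0):
    """Toy model (assumption): needed frequency scales with load per cell."""
    return k * load_ma / cells


def allocate(load1_ma, load2_ma, total_cells=82, start_cells1=41,
             window=0.05, max_steps=200):
    """Move the CH1/CH2 boundary one cell per step until f1 and f2 agree
    to within a relative window. Returns (cells_ch1, cells_ch2, steps)."""
    n1 = start_cells1
    for step in range(max_steps):
        n2 = total_cells - n1
        f1 = required_frequency(load1_ma, n1)
        f2 = required_frequency(load2_ma, n2)
        # Frequencies are close enough: stop shifting.
        if abs(f1 - f2) <= window * max(f1, f2):
            return n1, n2, step
        if f1 > f2 and n2 > 1:
            n1 += 1          # CH1 is working harder: give it one more cell
        elif f1 < f2 and n1 > 1:
            n1 -= 1          # CH2 is working harder: give it one more cell
        else:
            return n1, n2, step  # boundary already at the edge
    # The guard avoids endless toggling if the window is too narrow.
    return n1, total_cells - n1, max_steps


if __name__ == "__main__":
    print(allocate(40, 10))
    print(allocate(50, 50))
```

In this toy model, a 40mA/10mA load split starting from 41 cells per channel settles at 65 and 17 cells, roughly the 4:1 ratio of the loads. The `max_steps` guard reflects why a hysteresis window is needed: with very few cells on one side, a single-cell step can overshoot the balance point.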

The power cells are connected to either  $CH_1$  or  $CH_2$  by channel selection switches. The boundary between the two channels is controlled by the outputs of the bidirectional shift register (SR) sel[1:m+n]. The direction of boundary shifting is determined by the frequency comparator. After each comparison, the boundary will only shift along adjacent power cells as sel[1:m+n] will only shift by one bit. As such, potential glitches due to reconnecting power cells are minimized. There are a total of 82 power cells, and they work with interleaving phases to reduce the output ripple voltage. The VCRs of the two outputs (R<sub>1</sub> and R<sub>2</sub>) are determined by the ratio selector that senses  $V_{REF}/V_{IN}$ .
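A minimal sketch of the one-bit boundary shift follows. It assumes the selection word is a thermometer code in which leading 1s mark CH<sub>1</sub> cells and trailing 0s mark CH<sub>2</sub> cells, and that each channel always keeps at least one cell; the bit ordering and the function names (`make_sel`, `shift_boundary`, `changed_cells`) are illustrative assumptions, not taken from the paper.

```python
def make_sel(cells_ch1, total_cells=82):
    """Thermometer code (assumed ordering): 1 -> CH1 cell, 0 -> CH2 cell."""
    return [1] * cells_ch1 + [0] * (total_cells - cells_ch1)


def shift_boundary(sel, grow_ch1):
    """Return a new selection word with the boundary moved by one cell."""
    boundary = sum(sel)          # number of CH1 cells = index of first 0
    new = list(sel)
    if grow_ch1 and boundary < len(sel) - 1:
        new[boundary] = 1        # hand the adjacent CH2 cell to CH1
    elif not grow_ch1 and boundary > 1:
        new[boundary - 1] = 0    # hand the adjacent CH1 cell to CH2
    return new


def changed_cells(old, new):
    """Indices of power cells that were reconnected by the shift."""
    return [i for i, (a, b) in enumerate(zip(old, new)) if a != b]


if __name__ == "__main__":
    sel = make_sel(41)
    grown = shift_boundary(sel, grow_ch1=True)
    print(sum(grown), changed_cells(sel, grown))
```

Because only one bit changes per comparison, at most one power cell is reconnected at a time, which is the property the text relies on to limit reconnection glitches.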

To enable the allocation while minimizing cross regulation, a dual-path voltage-controlled oscillator (VCO) is employed, as shown in Fig. 20.5.2. The VCO consists of 82 delay cells that generate the clock phases for each power cell. Each delay cell in CH<sub>1</sub> (DC<sub>1</sub>[n]) has a complementary delay cell in CH<sub>2</sub> (DC<sub>2</sub>[n]). The phases $\varphi_{1[n]}$ and $\varphi_{2[n]}$ are chosen by the MUX and then distributed to the power cell. If sel[n] = 1, DC<sub>1</sub>[n] of VCO (CH<sub>1</sub>) is enabled. At the same time, DC<sub>2</sub>[n] is shorted by the MUX and the clock phase is redirected to the next cell. In this way, the number of delay cells in each VCO is equal to the number of its power cells, and multiphase interleaving can take effect to reduce the output ripple voltage. The frequency of each VCO is controlled by its error amplifier, and the two outputs are separately regulated, regardless of the power-cell arrangement. As the regulation loop is much faster than power-cell allocation, stability is ensured. Each power cell consists of 2 flying capacitors and 8 power transistors, and the VCR can be $2/3 \times$ or $1/2 \times$. The configuration of each power cell is optimized to minimize the parasitic loss [6]. The channel selection switches, controlled by sel[n], connect the local output V<sub>0L</sub> to V<sub>01</sub> or V<sub>02</sub>.
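The way sel[n] splits the delay cells into two rings can be sketched as below. This is a behavioral illustration only: it assumes that the active delay cells of a ring produce phases spaced uniformly over one period, and the function names `ring_assignment` and `interleaved_phases` are hypothetical.

```python
def ring_assignment(sel):
    """Split cell indices into the two VCO rings.

    sel[i] == 1: DC1[i] is active and DC2[i] is bypassed (cell clocks with CH1).
    sel[i] == 0: DC2[i] is active and DC1[i] is bypassed (cell clocks with CH2).
    """
    ring1 = [i for i, s in enumerate(sel) if s == 1]
    ring2 = [i for i, s in enumerate(sel) if s == 0]
    return ring1, ring2


def interleaved_phases(ring):
    """Map each cell in a ring to a phase in degrees.

    Assumption: uniform spacing of 360/N degrees across the N active cells.
    """
    n = len(ring)
    return {cell: 360.0 * k / n for k, cell in enumerate(ring)}


if __name__ == "__main__":
    sel = [1, 1, 1, 0, 0]            # small example: 3 cells in CH1, 2 in CH2
    ring1, ring2 = ring_assignment(sel)
    print(interleaved_phases(ring1))
    print(interleaved_phases(ring2))
```

Since bypassed cells drop out of a ring, each ring always has exactly as many stages as its channel has power cells, so the interleaving stays evenly distributed after every reallocation.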

Figure 20.5.3 shows the control logic that consists of the frequency comparator and the power-cell shift register. First, the one-shot signals ($ck_{1os}$ and $ck_{2os}$) control $P_1$ and $P_2$ to charge $C_{C1}$ and $C_{C2}$ for one clock period only. The ready signals (ready<sub>1</sub> and ready<sub>2</sub>) are activated after charging is finished, and trigger the comparison between $V_{F1}$ and $V_{F2}$. After a short delay, $C_{C1}$, $C_{C2}$ and the logic are reset. For the comparison, if $V_{F1} < V_{F2}$, meaning that $f_1 > f_2$, the direction signal of the shift register is set as direct=0, and the selection signals will shift left by one bit. This frequency adjustment repeats until $f_1$ and $f_2$ are very close to each other. The frequency comparator will then issue stop=1, and shifting will be terminated. To ensure accurate charging, the current sources and capacitors ($C_{C1}$ and $C_{C2}$) are well matched. For robust control, offsets are added to the comparators to form a hysteresis window. The whole process is driven solely by $ck_1$ and $ck_2$, without an additional system clock.
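The comparator decision can be sketched behaviorally as follows. A capacitor charged by a constant current for one clock period reaches $V_F = I \cdot T / C = I / (C f)$, so the faster clock yields the lower voltage. The charging current, capacitance and offset below are placeholder values chosen for illustration, not figures from the paper; the function names and the direct=1 encoding for the opposite case are also assumptions.

```python
def ramp_voltage(freq_hz, i_charge_a=10e-6, c_farad=1e-12):
    """Voltage on C_C after one clock period of constant-current charging.

    i_charge_a and c_farad are placeholder values (assumptions).
    """
    return i_charge_a / (c_farad * freq_hz)


def frequency_comparator(f1_hz, f2_hz, offset_v=0.005):
    """Return the shift-register controls derived from V_F1 and V_F2.

    stop=1 when the voltages fall inside the hysteresis window; otherwise
    direct=0 if V_F1 < V_F2 (f1 > f2), and direct=1 in the opposite case
    (the latter encoding is assumed).
    """
    vf1 = ramp_voltage(f1_hz)
    vf2 = ramp_voltage(f2_hz)
    if abs(vf1 - vf2) <= offset_v:
        return {"stop": 1, "direct": None}
    return {"stop": 0, "direct": 0 if vf1 < vf2 else 1}


if __name__ == "__main__":
    print(frequency_comparator(120e6, 100e6))
    print(frequency_comparator(100e6, 102e6))
```

The offset plays the role of the hysteresis window: without it, two nearly equal frequencies would make the boundary toggle back and forth on every comparison.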

The proposed dual-output SC converter was fabricated in a 28nm CMOS process. The active area is 1.2×0.5mm<sup>2</sup>. Fig. 20.5.4 shows the measured waveforms of the steady-state outputs, reference tracking and load transient. Measured results demonstrate that two output voltages can be independently regulated and the two switching frequencies were adjusted to be very close. The measured reference up- and down-tracking speeds were 500mV/µs and 334mV/µs, respectively. No obvious cross regulation was observed at V<sub>02</sub>, while V<sub>01</sub> was undergoing reference tracking. With the load at V<sub>01</sub> switched from 4mA to 40mA, the settling time was within 500ns. The cross regulation at V<sub>02</sub> was less than 10mV at the rising edge and negligible at the falling edge, verifying that the dual-path VCO control minimized cross regulation.

Figure 20.5.5 shows the measured efficiencies versus the load currents $I_{01}$ and $I_{02}$. The peak efficiency was 83.3%, obtained with the load split evenly at 50mA per channel. With dynamic power-cell allocation, the converter consistently achieved over 80% efficiency when $I_{01}$ and $I_{02}$ were larger than 15mA. Dynamic allocation improved efficiency by 4.8% compared with operation without allocation. Fig. 20.5.6 shows the performance comparison, and Fig. 20.5.7 shows the chip micrograph. Via dynamic power-cell allocation, the dual-output SC converter achieves high efficiency over a broad load range for two outputs with minimized cross regulation.

## Acknowledgments:

This work is supported in part by the Macao Science and Technology Development Fund (FDCT) and the Research Committee of the University of Macau, and in part by the Research Grants Council of the Theme-Based Research Scheme (TRS) of Hong Kong under the project T23-612/12-R.

## References:

[1] A. Wang, et al., "Heterogeneous Multi-Processing Quad-Core CPU and Dual-GPU Design for Optimal Performance, Power, and Thermal Tradeoffs in a 28nm Mobile Application Processor," *ISSCC*, pp. 180-181, 2014.

[2] Y. Lu, et al., "A 123-Phase DC-DC Converter-Ring with Fast-DVS for Microprocessors," *ISSCC*, pp. 364-365, 2015.

[3] C. K. Teh and A. Suzuki, "A 2-Output Step-Up/Step-Down Switched-Capacitor DC-DC Converter with 95.8% Peak Efficiency and 0.85-to-3.6V Input Voltage Range," *ISSCC*, pp. 222-223, 2016.

[4] Z. Hua, et al., "A Reconfigurable Dual-Output Switched-Capacitor DC-DC Regulator With Sub-Harmonic Adaptive-On-Time Control for Low-Power Applications," *JSSC*, vol. 50, no. 3, pp. 724-736, Mar. 2015.

[5] W. Jung, et al., "A 60%-Efficiency 20nW-500µW Tri-Output Fully Integrated Power Management Unit With Environmental Adaptation and Load-Proportional Biasing for IoT Systems," *ISSCC*, pp. 154-155, 2016.

[6] J. Jiang, et al., "A 2-/3-Phase Fully Integrated Switched-Capacitor DC-DC Converter in Bulk CMOS for Energy-Efficient Digital Circuits with 14% Efficiency Improvement," *ISSCC*, pp. 366-367, 2015.











Figure 20.5.5: Measured efficiency versus loading currents with and without dynamic power allocation.



Figure 20.5.2: Circuit implementation of dual-path VCO, delay cell of dual-path VCO and power stage.



Figure 20.5.4: Measured waveforms of steady state output voltages, reference tracking and loading transient response.

| Work                                | [3] ISSCC'16                                    | [4] JSSC'15                                      | [5] ISSCC'16                                                                 | This work                                              |
|-------------------------------------|-------------------------------------------------|--------------------------------------------------|------------------------------------------------------------------------------|--------------------------------------------------------|
| Technology                          | 65nm                                            | 0.35µm                                           | 180nm                                                                        | 28nm                                                   |
| Topology                            | Step-Up/Down                                    | Step-Up                                          | Step-Down                                                                    | Step-Down                                              |
| Number of Outputs                   | 2                                               | 2                                                | 3                                                                            | 2                                                      |
| Passive Type                        | On-chip<br>Off-chip                             | Off-chip                                         | On-chip<br>(MIM+MOS)                                                         | On-chip<br>(MOM+MOS)                                   |
| V <sub>IN</sub>                     | 0.85-3.6V                                       | 1.1-1.8V                                         | 0.9-4V                                                                       | 1.3-1.6V                                               |
| V <sub>OUT</sub>                    | 0.1-1.9V                                        | 2V, 3V                                           | 0.6V, 1.2V, 3.3V                                                             | 0.4-0.9V                                               |
| I <sub>O, MAX</sub>                 | 10mA                                            | 24mA                                             | 100µA*                                                                       | 100mA                                                  |
| Total C <sub>FLY</sub>              | 1µF                                             | 9.4µF                                            | 3nF                                                                          | 8.1nF                                                  |
| η <sub>peak</sub>                   | 95.8%                                           | 89.5%                                            | 81%                                                                          | 83.3%                                                  |
| Power Density                       | N/A                                             | N/A                                              | 250µW/mm <sup>2</sup>                                                        | 150mW/mm <sup>2</sup>                                  |
| Max. Load per Output                | V <sub>01</sub> : 1mA<br>V <sub>02</sub> : 10mA | V <sub>01</sub> : 12mA<br>V <sub>02</sub> : 12mA | V <sub>01</sub> : 33µA<br>V <sub>02</sub> : 33µA<br>V <sub>03</sub> : 33µA * | V <sub>01</sub> : 0-100mA<br>V <sub>02</sub> : 100-0mA |
| Symmetrical Outputs                 | No                                              | No                                               | No                                                                           | Yes                                                    |
| *Extracted from measurement results |                                                 |                                                  |                                                                              |                                                        |

Figure 20.5.6: Comparison with prior art.
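
As a rough consistency check on the power density listed for this work, one can combine the table entries with the stated active area. This assumes the density is quoted at the highest output voltage and full load current over the 1.2×0.5mm<sup>2</sup> active area, which the text does not state explicitly:

$$
P_{\max} \approx 0.9\,\mathrm{V} \times 100\,\mathrm{mA} = 90\,\mathrm{mW}, \qquad
A = 1.2\,\mathrm{mm} \times 0.5\,\mathrm{mm} = 0.6\,\mathrm{mm^2}, \qquad
\frac{P_{\max}}{A} = \frac{90\,\mathrm{mW}}{0.6\,\mathrm{mm^2}} = 150\,\mathrm{mW/mm^2}
$$

Under that assumption the result agrees with the 150mW/mm<sup>2</sup> entry in the comparison table.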

