# A 0.032-mm<sup>2</sup> 0.15-V Three-Stage Charge-Pump Scheme Using a Differential Bootstrapped Ring-VCO for Energy-Harvesting Applications

Haidong Yi, Student Member, IEEE, Jun Yin, Member, IEEE, Pui-In Mak, Senior Member, IEEE, and Rui P. Martins, Fellow, IEEE

Abstract—This brief reports a compact and fully integrated three-stage charge-pump (CP) scheme with a 1:10 step-up ratio for energy-harvesting applications. To operate from a low-voltage input (e.g., from a thermoelectric or solar source), the CP scheme features a differential bootstrapped ring-VCO that generates six-phase clock signals with a boosted swing. Because the CP is driven by replicas of these swing-boosted clock signals, fewer CP stages are needed: a 1:10 step-up ratio is achieved with only a 3-stage CP, resulting in a higher power conversion efficiency (PCE). Using replicas of the clock signals also substantially reduces the dependency of the clock frequency on the load drivability. Fabricated in 65-nm CMOS, the chip delivers a measured output voltage of 0.87 V at a PCE of 38.8% under a 500-k$\Omega$ load and a 0.15-V input. The chip area is 0.032 mm<sup>2</sup>.

*Index Terms*—Bootstrapped, CMOS, energy harvesting, charge pump (CP), ring-VCO, reverse current, ultra-low voltage.

### I. INTRODUCTION

ENERGY harvesting is an essential feature that allows emerging smart sensors and Internet-of-Things (IoT) radios to be power-autonomous [1]–[4]. Solar and thermoelectric sources are promising, but they deliver very low output voltages, which hinders their utility. To address this, recent efforts have focused on fully-integrated DC-DC boost converters that can operate from a tiny input voltage ($V_{in}$) while offering a high output voltage ($V_{out}$) and a high power conversion efficiency (PCE).

In the literature, the inductive DC-DC converter facilitates operation at a small $V_{in}$ (20 mV) and demonstrates a high PCE by using synchronous rectification in the discontinuous mode [5]. Yet, despite its high voltage conversion ratio (>20), it calls for a large off-chip inductor (4.7 $\mu$H), impeding its integration level. Alternatively, a fully-integrated LC oscillator with an on-chip transformer [4] can be exploited to boost the clock swing, reducing both the startup voltage (85 mV) and the startup time owing to the high-frequency clock (138 MHz). Yet, the silicon area (1.8 mm<sup>2</sup>) and PCE (~0.057% at a 1-M$\Omega$ load) are penalized. The switched-capacitor DC-DC converter [6] exhibits a high PCE (34% at 0.18-V $V_{in}$) by using dynamic body bias and an adaptive circuit that automatically optimizes the dead-time according to the input voltage, but it still requires 6 off-chip capacitors (10 nF each). The self-oscillating doubler [7] allows full integration: several doublers are cascaded to build a harvester with a reconfigurable overall conversion ratio (9× to 23×). Also, its leakage-based frequency-control delay element secures a low idle power consumption (<3 nW) over a wide range of output power (5 nW to 5 $\mu$W). Yet, its PCE falls significantly at low $V_{in}$ (0.25 V) due to the power loss associated with the 4 cascaded stages.

Manuscript received December 6, 2016; revised February 3, 2017; accepted February 23, 2017. Date of publication March 1, 2017; date of current version January 29, 2018. This work was supported in part by the Macao Science and Technology Development Fund (FDCT) SKL Fund and in part by University of Macau under Grant SRG2014-00012-AMSV. This brief was recommended by Associate Editor T. B. Tarim.

H. Yi and P.-I. Mak are with State-Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macau, China, and also with the Faculty of Science and Technology—ECE, University of Macau, Macau, China (e-mail: pimak@umac.mo).

J. Yin is with State-Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macau, China (e-mail: junyin@umac.mo).

R. P. Martins is with State-Key Laboratory of Analog and Mixed-Signal VLSI, University of Macau, Macau, China, and also with Instituto Superior Técnico, Universidade de Lisboa, Lisbon, Portugal (e-mail: rmartins@umac.mo).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSII.2017.2676159

This brief describes a fully-integrated 3-stage charge pump (CP) scheme that measures only a 0.032-mm<sup>2</sup> chip area in 65-nm CMOS, and a high PCE of 38.8% at  $V_{in} = 0.15$  V. The key technique is a differential bootstrapped ring-VCO (BTRO) that can: 1) generate multi-phase clock signals with a boosted swing to reduce the required number of stages of the CP for a certain step-up ratio; 2) significantly reduce the dependency of the clock frequency on the load drivability by creating replicas of the boosted clock signals, and 3) reduce the loss due to the reverse current for a better PCE.

## II. TYPICAL AND PROPOSED CP SCHEMES

A fully-integrated DC-DC boost converter can be based on a multi-stage passive CP scheme (e.g., 3 stages in Fig. 1), which can deliver a high $V_{out}$ to power up or activate other circuits [6]. Even in deeply scaled CMOS technologies, it is hard to power up a ring-VCO-based clock generator at $V_{in} < 0.2$ V because of the trade-off between leakage power and threshold voltage. Also, to enhance the output drivability at low voltage, clock buffers are normally required. Yet, the power dissipated by the clock buffers and the loss of the CP elements substantially limit the voltage and power that can be extracted from $V_{in}$. To improve the PCE, this brief proposes a CP scheme (Fig. 2) featuring a 3-stage BTRO that offers a 6-phase clock with an enhanced output swing (~$3 \times V_{in}$; details later). The 6-phase clocks are evenly distributed across the 3 CP stages through a replica driver $D_2$, which shares the same bootstrap capacitors with the delay cell $D_1$ of the BTRO.

1549-7747 © 2017 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.

Fig. 1. A typical 3-stage CP scheme powered up by a small input voltage ($V_{in}$). Clock buffers are normally required between the clock generator and CP elements to decouple them and optimize the load drivability.

Fig. 2. Proposed 3-stage CP scheme with a differential BTRO.

Bootstrap clocking has been developed in [8] and [9] in a single-ended style. Our *differential* BTRO (Fig. 2) features three inverter delay-cell stages; each consists of two voltage triplers featuring two pairs of differential outputs. One pair drives the delay cells, so that the oscillation frequency of the BTRO can be optimized mainly for the PCE, whereas the other pair can be optimized solely to drive the CP elements. Ideally, the swing of the clock is tripled ($-V_{in}$ to $2V_{in}$), becoming adequate to switch the transistors (triode region) even at a very low $V_{in}$. In other words, the inverter is inherently a voltage tripler. Besides, the precharging transistors, $M_{p1}$ and $M_{n1}$, are controlled by one of the differential inputs, In+, rather than by the output Out+ as in the single-ended delay cell [8], [9]. In this way, the reverse current through $M_{p1}$ and $M_{n1}$, caused by the delay between In- and Out+, can be reduced. Unlike [6], where each CP element shares the same 2-phase clock, here the load at each BTRO output is naturally balanced. Thus, the loading associated with the CP elements can be fully absorbed by the bootstrap capacitors in the BTRO.

Since the design target is to generate a 0.9-V $V_{out}$ from a 0.15-V $V_{in}$ into a 500-k$\Omega$ load resistor while maximizing the PCE, we choose a 3-stage CP scheme that can ideally generate $V_{out} = V_{in} + 3V_{in} + 3V_{in} + 3V_{in} = 10V_{in}$. This leaves enough margin to guarantee a 0.9-V $V_{out}$ once the voltage drops on both the capacitors and the MOS transistors are considered, an issue discussed in Section III.
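The ideal stacking behind the 1:10 ratio can be checked with a short sketch. It assumes lossless stages and a clock swing of exactly $3V_{in}$ per stage; the function name `ideal_vout` and its parameters are illustrative and not part of the design.

```python
def ideal_vout(v_in, n_stages=3, swing_factor=3.0):
    """Ideal (lossless) CP output: V_in plus one boosted clock swing per stage."""
    return v_in + n_stages * swing_factor * v_in


v_in = 0.15      # V, targeted input voltage
v_target = 0.9   # V, targeted output voltage

v_ideal = ideal_vout(v_in)
print(f"ideal V_out = {v_ideal:.2f} V ({v_ideal / v_in:.0f} x V_in)")
print(f"margin left for capacitor and transistor drops = {v_ideal - v_target:.2f} V")
```

With only two stages the same function gives $7V_{in}$, which is why three stages are needed to reach the 0.9-V target with margin.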

#### III. DESIGN DETAILS AND OPTIMIZATION

## A. PCE Optimization

The design is implemented in 65-nm CMOS using general-purpose low-threshold transistors (*gplvt*). The threshold voltage is -267 mV for PMOS and 347 mV for NMOS. With $I_{in}$ denoting the input current and $R_L$ the load resistance, the PCE is defined as the ratio of the output dc power ($P_{out,dc}$) to the total input power ($P_{in}$),

$$PCE = \frac{P_{\text{out,dc}}}{P_{\text{in}}} = \frac{V_{\text{out}}^2}{V_{\text{in}}I_{\text{in}}R_{\text{L}}}. \tag{1}$$
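As a quick numerical reading of (1), the sketch below plugs in the measured operating point quoted in the abstract (0.87 V, 38.8%, 500 k$\Omega$, 0.15 V). The input current it prints is inferred from (1), not a value reported in this brief, and the function names are illustrative.

```python
def pce(v_out, v_in, i_in, r_load):
    """Eq. (1): output dc power divided by total input power."""
    return v_out**2 / (v_in * i_in * r_load)


def input_current(v_out, v_in, r_load, efficiency):
    """Eq. (1) rearranged for I_in at a known efficiency."""
    return v_out**2 / (v_in * r_load * efficiency)


v_out, v_in, r_load = 0.87, 0.15, 500e3   # V, V, ohm (measured operating point)
i_in = input_current(v_out, v_in, r_load, efficiency=0.388)

print(f"P_out,dc     = {v_out**2 / r_load * 1e6:.2f} uW")
print(f"implied I_in = {i_in * 1e6:.1f} uA")
print(f"PCE check    = {pce(v_out, v_in, i_in, r_load) * 100:.1f} %")
```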



Fig. 3. The simulated PCE versus (a) the sizes of  $M_{p1}$  and  $M_{p,D2}$  and (b) the size of  $M_{p,D2}$  for different sizes of  $M_{p1}$ .

Due to the limit on the maximum channel length supported by the device model, the driver $D_1$ (Fig. 2) consists of several cascoded NMOS and PMOS transistors with an inverted size ratio. The driver $D_2$ uses the nominal PMOS-NMOS topology. Here, the PCE is optimized at 0.15-V $V_{in}$. The ideal clock swing from -150 to 300 mV helps to reduce the turn-on resistance and leakage current of the transistors in $D_{1,2}$. Thus, the voltage drop on the transistors at a non-zero output current can be reduced, and so can the conduction loss. The output current also causes a voltage drop on the bootstrap and CP capacitors. According to [10], this voltage drop is given by

$$V_{\rm L} = \frac{I_{\rm out}}{2f_{\rm osc}C_{\rm b}}. \tag{2}$$

Here, $I_{out}$ is the output current, $f_{osc}$ is the oscillation frequency of the BTRO, and $C_b$ denotes the bootstrap capacitor or CP capacitor. Fig. 3 depicts the simulation model. $C_L$ serves as the power-storage capacitor at the output, and its size can be chosen to balance the startup time and output ripple requirements (details in Section III-B). Excluding the charge sharing due to the parasitic capacitors and the leakage currents, the output DC voltage can be calculated as

$$V_{\rm out} = 10V_{\rm in} - 3(V_{\rm mos} + 3V_{\rm L}), \tag{3}$$

where $V_{\text{mos}}$ is the total voltage drop on the transistors and $3V_{\text{L}}$ is the total voltage drop on the capacitors $C_{\text{bn}}$, $C_{\text{bp}}$ and $C_{\text{cp}}$, both taken over one BTRO stage and its CP element.
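Equations (2) and (3) can be combined numerically. The sketch below substitutes $I_{out} = V_{out}/R_L$ into (2) and solves (3) for $V_{out}$; it takes $V_{mos} \approx 107$ mV, $C_b = 2.5$ pF and $f_{osc} = 15.2$ MHz from the simulated design point quoted later in this section, and it ignores parasitic charge sharing and leakage just as (3) does. The function names are illustrative.

```python
def capacitor_drop(i_out, f_osc, c_b):
    """Eq. (2): voltage drop on one bootstrap or CP capacitor."""
    return i_out / (2.0 * f_osc * c_b)


def output_voltage(v_in, v_mos, f_osc, c_b, r_load):
    """Eq. (3) solved for V_out after substituting I_out = V_out / R_L into Eq. (2)."""
    k = 9.0 / (2.0 * f_osc * c_b * r_load)  # 3 stages x 3 capacitor drops per stage
    return (10.0 * v_in - 3.0 * v_mos) / (1.0 + k)


f_osc, c_b, r_load = 15.2e6, 2.5e-12, 500e3   # Hz, F, ohm
v_out = output_voltage(v_in=0.15, v_mos=0.107, f_osc=f_osc, c_b=c_b, r_load=r_load)
v_l = capacitor_drop(v_out / r_load, f_osc, c_b)

print(f"V_L   = {v_l * 1e3:.1f} mV")
print(f"V_out = {v_out * 1e3:.0f} mV")
```

Under these assumptions the result lands close to the simulated values of about 25 mV for $V_L$ and 954 mV for $V_{out}$.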

The power analysis of the BTRO has been reported in [8], where most of the power loss comes from the dynamic switching loss and the leakage current loss. In the proposed scheme, the BTRO also forms part of the power transfer path, so a finite output current introduces an important conduction loss term that must also be considered. Thus, the size of $D_2$ is crucial to maximizing the PCE. During the optimization, $M_{p1}$ and the PMOS transistors in $inv_p$ and $inv_n$ have the same size, as do $M_{n1}$ and the NMOS transistors in $inv_p$ and $inv_n$. The PMOS-NMOS ratio is 2:1. Insight can be obtained by sweeping the sizes of $M_{p1}$ and the PMOS transistor in $D_2$ ($M_{p,D2}$) with all capacitors set to 2.5 pF and the sizes of the transistors in $D_1$ and the CP fixed (Fig. 3). For a small transistor size, $V_{mos}$ is relatively large; the PCE is dominated by the conduction loss and can be improved by upsizing the transistors. On the other hand, if the transistors are enlarged too far, the PCE is degraded by the dynamic loss due to the increased gate capacitance. It can also be observed from Fig. 3 that the transistors in $D_2$ must be large to reduce the conduction loss, and that oversizing them beyond the optimal point does not affect the PCE much. The same observation applies to the transistors in the CP [Fig. 4(a)].

Fig. 4. The simulated PCE (a) versus the size of transistors in the CP and (b) versus $C_b$ with other parameters maintained constant.

Fig. 5. Simulated PCE versus $f_{osc}$ with optimized size of transistors in $inv_{p,n}$ and $D_1$ at different $C_b$.

The choices of $C_{\rm b}$ and $f_{\rm osc}$ affect $V_{\rm out}$ and thus the PCE. If only the conduction loss is considered, the proposed CP scheme can be modeled as a voltage source $V_{\rm s} = 10V_{\rm in}$ in series with an output resistance $R_{out}$ representing the total conduction loss from the transistors and capacitors [10]. The PCE considering only the conduction loss can then be estimated as $PCE_{cd} = R_L/(R_L + R_{out}) = V_{out}/V_s$. To reduce the conduction loss, a large $V_{out}$ is preferable, which can be achieved by choosing a large $f_{\rm osc}C_{\rm b}$ product to reduce the voltage drop $V_{\rm L}$ on the capacitors according to (2) and (3). As shown in Fig. 4(b), the simulated $V_{out}$ and PCE increase with $C_b$ but eventually saturate, since the voltage drop $V_{mos}$ and the conduction loss of the transistors become dominant. The simulated $V_{\text{out}}$ is ~1.05 V when $V_{\text{in}} = 0.15$ V and $C_{\text{b}} = 20$ pF, resulting in a $PCE_{cd}$ of 70%. However, the simulated overall PCE is degraded to 52% [Fig. 4(b)] due to the dynamic switching loss. Generally, the dynamic switching loss can be reduced by lowering $f_{\text{osc}}$. For a given $C_{\text{b}}$, as shown in Fig. 5, the dynamic switching loss dominates at a high $f_{osc}$ while the conduction loss dominates at a low $f_{osc}$. Thus, an optimal $f_{osc}$ exists that maximizes the PCE. Fig. 5 also shows that the optimal $f_{\rm osc}$ is lower at a larger $C_{\rm b}$, which lowers both the dynamic and leakage losses. On the other hand, the increased gate capacitance of $D_1$ leads to more dynamic switching loss, which makes the PCE improvement less effective when $C_b$ is large. In this brief, $C_{\rm b} = 2.5$ pF and $f_{\rm osc} = 15.2$ MHz are chosen to obtain a high PCE with a reasonable chip area. The corresponding $V_{\rm L}$ and $V_{\text{mos}}$ are ~25 and ~107 mV, respectively. Fig. 6(a) plots the transient waveforms of the critical nodes from pre-layout simulations. The simulated $V_{out}$, oscillation frequency of the BTRO, and PCE are 954 mV, 15.2 MHz and 47.3%, respectively.
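The series-resistance model used in this paragraph can be exercised with the numbers it quotes. The sketch below assumes the $C_b = 20$ pF simulation was run with the 500-k$\Omega$ target load (an assumption; the load for that sweep is not stated), and the resulting $R_{out}$ is an inferred value, not one reported in this brief. Function names are illustrative.

```python
def conduction_pce(v_out, v_s):
    """PCE_cd = V_out / V_s for a source V_s in series with R_out."""
    return v_out / v_s


def output_resistance(v_out, v_s, r_load):
    """R_out from the resistive divider V_out = V_s * R_L / (R_L + R_out)."""
    return r_load * (v_s / v_out - 1.0)


v_s = 10 * 0.15     # V, ideal source voltage 10 * V_in
v_out = 1.05        # V, simulated output at C_b = 20 pF
r_load = 500e3      # ohm, assumed load for this operating point

pce_cd = conduction_pce(v_out, v_s)
r_out = output_resistance(v_out, v_s, r_load)

print(f"PCE_cd = {pce_cd * 100:.0f} %")
print(f"implied R_out = {r_out / 1e3:.0f} kohm")
```

The gap between this conduction-only figure and the 52% overall PCE is what the text attributes to the dynamic switching loss.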

To stem the reverse current loss, we use one of the differential inputs In+, instead of the output Out+, to control the precharging transistors  $M_{p1}$  and  $M_{n1}$ . From simulations [Fig. 6(b)], the reverse current in  $M_{p1}$  is reduced significantly with  $M_{p1}$  controlled by In+, resulting in a 34% PCE improvement.

## B. Output Voltage Ripple and Startup Time

Fig. 6. Simulated transient (a) voltage waveforms at the critical nodes and (b) current waveforms in $M_{p1}$ with different control signals.

TABLE I

PCE VERSUS PARASITIC CAPACITORS AT DIFFERENT NODES

| Capacitor type (2.5 pF) | Conditions | Simulated PCE |
|-----------------|-------------------------------------------------|---------------|
| MIM capacitor | 25 fF at $N_{bp,n}$ ($N_{op,n}$) | 46.9% |
| MIM capacitor | 25 fF at Out±,cp ($N_{cp\pm}$) | 43.4% |
| MOM capacitor | 75 fF at $N_{bp,n}$ ($N_{op,n}$) | 45.7% |
| MOM capacitor | 75 fF at Out±,cp ($N_{cp\pm}$) | 36.6% |
| Ideal capacitor | No $C_p$ | 47.3% |

Fig. 7. A 2-phase cross-coupled CP voltage doubler (left) and its output voltage waveform with ideal switches (right).

To elucidate the relationship between the voltage ripple and the clock phase, let us first examine the operation of a 2-phase voltage doubler (Fig. 7). Similar to [11], we assume that the switches and all other conductive interconnects are ideal, and that the currents flowing between the input and output sources and the capacitors are impulsive. As a result, the charge transfer and redistribution occur immediately after the switches are closed. Fig. 7 shows the waveform of the output voltage. When $CLK_p$ is high, according to the conservation of charge, $C_{b1}(V_{o1} - V_{o2}) + C_L(V_{o1} - V_{o2}) = I_0(T/2)$, where $I_0$ is the load current and $T$ is the clock period. With $C_{b1} = C_{b2} = C_b$, the output voltage ripple can be expressed as:

$$V_{\rm R} = V_{\rm o1} - V_{\rm o2} = \frac{I_0 T}{2(C_b + C_L)}. \tag{4}$$

According to (4), a high-frequency clock and a large $C_L$ help reduce the output ripple. The simulated output ripple is shown in Fig. 6(a). The ripple frequency is twice the BTRO frequency. The output ripple $V_R$ is determined only by the rising and falling edges of the differential outputs of the final stage of the BTRO (i.e., $Out\pm$,cp3): each CP element is only responsible for charging the $C_{cp}$ of the following one, so each pair of differential clocks affects only the output node of its associated CP element. The simulated output ripple with different output capacitors is shown in Fig. 8. A large $C_L$ reduces the ripple, but at the cost of a longer startup time (Fig. 14). Thus, a loading capacitor of ~30 pF is chosen, giving an output ripple < 1 mV.
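The scaling in (4) can be illustrated with a few lines of code. This is a sketch of the idealized 2-phase doubler of Fig. 7 only; the load current and clock period below are illustrative assumptions rather than design values, so the printed numbers are not expected to reproduce the simulated ripple of the full scheme in Fig. 8.

```python
def ripple(i_0, period, c_b, c_load):
    """Eq. (4): ideal-switch output ripple of the 2-phase voltage doubler."""
    return i_0 * period / (2.0 * (c_b + c_load))


i_0 = 1e-6        # A, illustrative load current (assumption)
period = 100e-9   # s, illustrative clock period (assumption)
c_b = 2.5e-12     # F, pump capacitor

# Ripple for a few candidate output capacitors
ripples = {c_load: ripple(i_0, period, c_b, c_load)
           for c_load in (10e-12, 30e-12, 100e-12)}

for c_load, v_r in ripples.items():
    print(f"C_L = {c_load * 1e12:5.1f} pF -> V_R = {v_r * 1e3:.2f} mV")
```

Halving the clock period or roughly doubling $C_b + C_L$ halves the ripple, which is the trade-off against startup time discussed above.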



Fig. 8. Simulated output voltage ripple at different loading capacitors.



Fig. 9. Major parasitic capacitors inside the BTRO (left) and CP (right) elements.

#### C. Parasitic Effects

The parasitic capacitors from the transistors and interconnects should be minimized to reduce the power loss in each switching period. The crucial parasitic capacitors of the BTRO and CP elements are shown in Fig. 9. The voltage swings at nodes $N_{cp\pm}$ and $Out\pm$,cp of the CP are 3 times those at nodes $N_{bp,n}$ ($N_{op,n}$). Thus, the PCE of the entire CP scheme is affected mainly by the parasitic capacitors at nodes $N_{cp\pm}$ and $Out\pm$,cp, as confirmed by simulations (Table I) comparing the PCE obtained with MIM (~1% parasitic capacitance), MOM (~3% parasitic capacitance) and ideal capacitors. $C_p$ at nodes $N_{cp\pm}$ ($Out\pm$,cp) degrades the PCE the most. Thus, MIM capacitors are chosen in this brief: they favor the PCE by minimizing $C_p$ at nodes $N_{cp\pm}$ ($Out\pm$,cp).
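A rough way to see why the high-swing nodes dominate is the usual $C V^2 f$ estimate of switching loss (an assumption here, not a formula given in this brief). For the same parasitic capacitance $C_p$ and clock frequency $f_{osc}$, a node whose swing is three times a reference swing $\Delta V$ dissipates about nine times the power:

$$\frac{P_{cp}}{P_{bp}} \approx \frac{C_p\,(3\Delta V)^2 f_{osc}}{C_p\,(\Delta V)^2 f_{osc}} = 9.$$

This agrees in trend with Table I, where 25 fF costs about 0.4 percentage points of PCE at $N_{bp,n}$ ($N_{op,n}$) but about 3.9 points at $N_{cp\pm}$ ($Out\pm$,cp).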

The body and source nodes of $M_{p1}$ and $M_{n1}$ are tied together to make the parasitic diodes forward-biased when $C_{bp}$ ($C_{bn}$) is charged, and reverse-biased when $C_{bp}$ ($C_{bn}$) is discharged [12], [13]. This connection also reduces their threshold voltage. The same applies to the transistors in the CP. The bodies of the NMOS and PMOS transistors in $D_1$ and $D_2$ are connected to nodes $N_{bn}$ and $N_{bp}$, respectively. Thus, $M_{p1}$ and the PMOS transistors in $D_1$ and $D_2$ can be co-located in one deep n-well, and $M_{n1}$ and the NMOS transistors in $D_1$ and $D_2$ in another, to reduce the layout area.

## **IV. MEASUREMENT RESULTS**

Fig. 10. Chip micrograph of the fabricated 3-stage CP scheme.

Fig. 11. Comparison of measured and post-layout simulated PCE and $V_{\text{out}}$ versus $V_{\text{in}}$ at a 500-k$\Omega$ loading resistor.

Fig. 12. (a) Measured operating frequency of the BTRO with different loading resistors; (b) Measured $V_{\text{out}}$ and $P_{\text{in}}$ versus $V_{\text{in}}$ without loading current.

The 3-stage CP scheme is fabricated in 65-nm CMOS with a 0.032-mm<sup>2</sup> active area (Fig. 10), which is dominated by the capacitors. The transistors in the deep n-well occupy only ~5% of the area. Fig. 11 shows the measured PCE and $V_{out}$ as a function of $V_{in}$ from 0.1 to 0.3 V at a 500-k$\Omega$ load. The maximum $V_{in}$ of ~0.33 V is limited by the maximum $V_{gs}$ (i.e., $3V_{in} = 1$ V) of the transistors in $D_1$ and $D_2$ to maintain long-term reliability. To avoid the reliability issue when the input voltage of the thermoelectric or solar source is larger than 0.33 V, we could compare $V_{out}$ with a desired voltage (e.g., 0.9 V) and adjust the loading current according to the comparison result, keeping $V_{out}$ constant and the $V_{gs}$ of the transistors in $D_1$ and $D_2$ below the maximum allowed value.

The simulation results at process corners are also plotted for comparison. The measured PCE and $V_{out}$ are consistent with the simulation results in the TT corner. The circuit converts a 0.17-V $V_{in}$ into a 1.15-V $V_{out}$ with a maximum PCE of 42.1%. For the targeted $V_{in} = 0.15$ V, the PCE is 38.8%. The PCE drops when $V_{in} > 0.18$ V, owing to the increased oscillation frequency of the BTRO. Fig. 12(a) depicts the measured oscillation frequency of the BTRO versus $V_{in}$ at different loading resistors. The frequency increases with $V_{in}$ at a slope of ~2 MHz/10 mV. Thanks to the split-output driver of the BTRO, the measured oscillation frequency at $V_{in} = 0.15$ V increases by only 1 MHz (8.3%) when $R_L$ is increased by 10×, from 1 to 10 M$\Omega$.

TABLE II

PERFORMANCE SUMMARY AND COMPARISON

|                   | ESSCIRC'14 [4] | JSSC'14 [7] | JSSC'15 [6] | This work |
|-------------------|----------------|-------------|-------------|-----------|
| CMOS Process | 65 nm | 0.18 µm | 0.13 µm | 65 nm |
| No. of Stages and Key Technique | 5-stage CP with passive clock boost | 4-stage self-oscillating doubler | 3-stage CP with dynamic body bias and dead-time | 3-stage CP with differential BTRO |
| Conversion Ratio | 1:31 | 1:(9 to 23) | 1:4 | 1:10 |
| Area | 1.8 mm<sup>2</sup> | 0.86 mm<sup>2</sup> | 0.066 mm<sup>2</sup> | 0.032 mm<sup>2</sup> |
| PCE | 0.057% | 38% @ 0.25-V V<sub>in</sub><br>50% @ 0.45-V V<sub>in</sub> | 34% @ 0.18-V V<sub>in</sub><br>72.5% @ 0.45-V V<sub>in</sub> | 38.8% @ 0.15-V V<sub>in</sub><br>45% @ 0.2-V V<sub>in</sub> |
| Pros | Very small V<sub>in</sub> (85 mV) | High PCE @ large V<sub>in</sub>,<br>wide output power range (5 nW to 5 µW) | High PCE @ large V<sub>in</sub>,<br>high I<sub>out</sub> (21 µA @ 0.18-V V<sub>in</sub>) | Small V<sub>in</sub>, small area,<br>moderate PCE @ small V<sub>in</sub> |
| Cons | Area hungry (2 × transformers) | Low PCE @ small V<sub>in</sub> | Low PCE @ small V<sub>in</sub>,<br>6 off-chip capacitors | Low I<sub>out</sub> (1.74 µA) |

Fig. 13. Measured (a) $V_{\text{out}}$ and (b) PCE versus the loading current with different $V_{\text{in}}$.

Fig. 14. Measured and simulated startup times versus $C_L$ at $V_{in} = 0.15$ V.

The measured $V_{out}$ and $P_{in}$ versus $V_{in}$ without loading current are plotted in Fig. 12(b). A minimum $V_{in}$ of ~80 mV can be converted into a 0.322-V $V_{out}$, with a 148.8-nW total power consumption. Thus, at the system level, the CP scheme can act as a starter with an 80-mV $V_{in}$. The measured dependence of $V_{out}$ on the loading current at different $V_{in}$ is shown in Fig. 13(a). Due to the limited pump capacitor of 2.5 pF, the maximum output current is ~6 $\mu$A at a 0.2-V $V_{in}$, while $V_{out}$ decreases from 1.67 to 0.73 V when the output current goes up from 0.01 to 6 $\mu$A. The driving capability can be improved by connecting several CP schemes in parallel, which keeps the same PCE at the cost of chip area. At a $V_{in}$ of 0.1 V, the circuit can still output 0.79 V at a 0.1-$\mu$A loading current.

Fig. 13(b) shows the measured dependence of the PCE on the load current at different $V_{in}$. Optimized at $V_{in} = 0.15$ V, the maximum PCE is 40.8% with a 1.5-$\mu$A loading current. At 0.2-V $V_{in}$, the maximum PCE increases to 45%.

The measured startup times for  $V_{\text{out}}$  to reach 90% of the final voltage are 25 to 117  $\mu$ s over a range of 23.9 to 145.9-pF  $C_{\text{L}}$  (Fig. 14), which are consistent with the simulation results in the TT corner. Table II gives the performance summary and compares the work with the prior art. The proposed design has the advantage of very small chip area and is fully integrated.

## V. CONCLUSION

The design and implementation of an ultra-compact (0.032 mm<sup>2</sup>) and fully-integrated 3-stage CP scheme in 65-nm CMOS with a high step-up ratio (1:10) have been reported. A differential BTRO effectively generates a set of multi-phase swing-boosted outputs to drive the CP elements, reducing both the number of CP stages and the dependency of the clock frequency on the load drivability. Also, systematic device sizing and consideration of the timing and parasitic effects make a high PCE (38.8%) possible at a small input voltage (0.15 V), suitable for energy-harvesting applications.

## REFERENCES

- [1] M. Tabesh, M. Rangwala, A. M. Niknejad, and A. Arbabian, "A power-harvesting pad-less mm-sized 24/60GHz passive radio with on-chip antennas," in *IEEE Symp. VLSI Circuits Dig. Tech. Papers*, Honolulu, HI, USA, Jun. 2014, pp. 1–2.
- [2] J. Yi, W.-H. Ki, and C.-Y. Tsui, "Analysis and design strategy of UHF micro-power CMOS rectifiers for micro-sensor and RFID applications," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 54, no. 1, pp. 153–166, Jan. 2007.
- [3] Y.-C. Shih and B. P. Otis, "An inductorless DC–DC converter for energy harvesting with a 1.2-μW bandgap-referenced output controller," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 58, no. 12, pp. 832–836, Dec. 2011.
- [4] H. Fuketa *et al.*, "An 85-mV input, 50-μs startup fully integrated voltage multiplier with passive clock boost using on-chip transformers for energy harvesting," in *Proc. Eur. Solid State Circuits Conf. (ESSCIRC)*, Sep. 2014, pp. 263–266.
- [5] E. J. Carlson, K. Strunz, and B. P. Otis, "A 20 mV input boost converter with efficient digital control for thermoelectric energy harvesting," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 741–750, Apr. 2010.
- [6] J. Kim, P. K. T. Mok, and C. Kim, "A 0.15 V input energy harvesting charge pump with dynamic body biasing and adaptive dead-time for efficiency improvement," *IEEE J. Solid-State Circuits*, vol. 50, no. 2, pp. 414–425, Feb. 2015.
- [7] W. Jung *et al.*, "An ultra-low power fully integrated energy harvester based on self-oscillating switched-capacitor voltage doubler," *IEEE J. Solid-State Circuits*, vol. 49, no. 12, pp. 2800–2811, Dec. 2014.
- [8] Y. Ho, Y.-S. Yang, C. Chang, and C. Su, "A near-threshold 480 MHz 78 μW all-digital PLL with a bootstrapped DCO," *IEEE J. Solid-State Circuits*, vol. 48, no. 11, pp. 2805–2814, Nov. 2013.
- [9] Y. Ho, Y.-S. Yang, and C. Su, "A 0.2–0.6 V ring oscillator design using bootstrap technique," in *IEEE Asian Solid-State Circuits Conf. Dig. Tech Papers (A-SSCC)*, Nov. 2011, pp. 333–336.
- [10] J. F. Dickson, "On-chip high-voltage generation in NMOS integrated circuits using an improved voltage multiplier technique," *IEEE J. Solid-State Circuits*, vol. 11, no. 3, pp. 374–378, Jun. 1976.
- [11] M. D. Seeman and S. R. Sanders, "Analysis and optimization of switched-capacitor DC–DC converters," *IEEE Trans. Power Electron.*, vol. 23, no. 2, pp. 841–851, Mar. 2008.
- [12] Z. Hameed and K. Moez, "A 3.2 V –15 dBm adaptive threshold-voltage compensated RF energy harvester in 130 nm CMOS," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 62, no. 4, pp. 948–956, Apr. 2015.
- [13] H. Peng, N. Tang, Y. Yang, and D. Heo, "CMOS startup charge pump with body bias and backward control for energy harvesting step-up converters," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 61, no. 6, pp. 1618–1628, Jun. 2014.