### PURDUE UNIVERSITY



# Design of Scaled CMOS Circuits in the Nano-meter Regime: Leakage Tolerance and Computing with Leakage

# Kaushik Roy

Professor of Electrical & Computer Engineering, Purdue University

















- Leakage Power
  - Subthreshold, Gate, Junction, GIDL, Punchthrough, ....
- Dynamic Power
  - Due to charging/discharging of capacitive load
  - Short-circuit power due to direct path currents when there is a temporary connection between power and ground



# Leakage Power















# Subthreshold leakage (I<sub>sub</sub>)

Exponential dependence on Vgs and Vth.

$$I_{sub} = \frac{w_{eff}}{L_{eff}} \mu \sqrt{\frac{q \varepsilon_{si} N_{cheff}}{2 \Phi_s}} v_T^2 \exp\left(\frac{V_{gs} - V_{th}}{n v_T}\right) \left(1 - \exp\left(\frac{-V_{ds}}{v_T}\right)\right)$$
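To make the exponential dependence concrete, here is a minimal Python sketch that evaluates only the bias-dependent factors of the expression above. The geometry and doping prefactor is lumped into a hypothetical constant `i0`, and the values chosen for `n`, the temperature, and the bias points are illustrative assumptions, not numbers from the slides.

```python
import math

K_B = 1.380649e-23      # Boltzmann constant [J/K]
Q = 1.602176634e-19     # elementary charge [C]


def thermal_voltage(temp_k=300.0):
    """Thermal voltage v_T = kT/q in volts."""
    return K_B * temp_k / Q


def i_sub(vgs, vth, vds, n=1.5, temp_k=300.0, i0=1.0):
    """Bias-dependent part of I_sub; i0 stands in for the geometry/doping prefactor."""
    vt = thermal_voltage(temp_k)
    return i0 * math.exp((vgs - vth) / (n * vt)) * (1.0 - math.exp(-vds / vt))


def subthreshold_swing(n=1.5, temp_k=300.0):
    """Gate-voltage change (in volts) that changes I_sub by one decade."""
    return n * thermal_voltage(temp_k) * math.log(10.0)


if __name__ == "__main__":
    s = subthreshold_swing()
    print(f"swing ~ {1e3 * s:.1f} mV/decade")
    # Lowering Vth by one swing raises the off-current (Vgs = 0) tenfold.
    ratio = i_sub(0.0, 0.3 - s, 1.0) / i_sub(0.0, 0.3, 1.0)
    print(f"I_off ratio = {ratio:.2f}")
```

Under these assumptions, every reduction of Vth by one subthreshold swing multiplies the off-state current by ten, which is why the Vth shifts discussed next matter so much for leakage.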

### Vth modulation

- Short-channel effect: Vth reduction due to
  - Increase in Vds (DIBL)
  - Reduction in channel length (Vth roll-off)
- Body effect: negative Vbs increases Vth
- Quantum confinement effect increases Vth

$$V_{th} = V_{FB} + \left(\Phi_{s0} - \Delta\Phi_{s}\right) + \gamma \sqrt{\Phi_{s0} - V_{bs}} \left(1 - \lambda \frac{W_{dm}}{L_{eff}}\right) + V_{nce} + V_{QM}$$
$$\Delta\Phi_{s} = \left[2(V_{bi} - \Phi_{s0}) + V_{ds}\right] \times \left[e^{-L/2l_{c}} + 2e^{-L/l_{c}}\right] \quad \text{and} \quad l_{c} = \sqrt{\frac{\varepsilon_{si}}{\varepsilon_{ox}\,\eta}\, T_{ox} W_{dm}}$$
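The short-channel term can be explored numerically. The sketch below evaluates the characteristic length and the surface-potential lowering from the expressions above. The oxide thickness, depletion width, built-in potential, and surface potential used in the example are assumed round numbers for illustration only, and the function names are hypothetical.

```python
import math

EPS_SI = 11.7   # relative permittivity of silicon
EPS_OX = 3.9    # relative permittivity of SiO2 (only the ratio enters l_c)


def characteristic_length(t_ox, w_dm, eta=1.0):
    """l_c = sqrt((eps_si / (eps_ox * eta)) * T_ox * W_dm), lengths in metres."""
    return math.sqrt(EPS_SI / (EPS_OX * eta) * t_ox * w_dm)


def delta_phi_s(length, vds, v_bi, phi_s0, l_c):
    """Surface-potential lowering: grows as L shrinks (roll-off) or Vds rises (DIBL)."""
    decay = math.exp(-length / (2.0 * l_c)) + 2.0 * math.exp(-length / l_c)
    return (2.0 * (v_bi - phi_s0) + vds) * decay


if __name__ == "__main__":
    # Assumed values: T_ox = 2 nm, W_dm = 30 nm, V_bi = 1.0 V, Phi_s0 = 0.8 V.
    l_c = characteristic_length(2e-9, 30e-9)
    for length_nm in (50, 70, 100):
        d = delta_phi_s(length_nm * 1e-9, 1.0, 1.0, 0.8, l_c)
        print(f"L = {length_nm} nm: delta_phi_s = {d:.3f} V")
```

A larger lowering of the surface potential means a lower Vth, so the loop shows the roll-off trend as the channel length is reduced.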













































# **Proposed Solution**

Use a Shannon-expansion-based design and supply gating to

- Reduce the quiescent current
- Improve the leakage yield
- Reduce test power
- Improve the test coverage/test length
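As a reminder of the underlying identity, Shannon expansion rewrites a function around a control input x as f = x·f(x=1) + x'·f(x=0). The sketch below checks this for a hypothetical 3-input majority function. The comment about gating the unselected cofactor reflects an assumed reading of how expansion and supply gating combine, not a detail stated on this slide.

```python
from itertools import product


def cofactor(f, index, value):
    """Return f with input `index` fixed to `value` (a Shannon cofactor)."""
    def g(*bits):
        fixed = list(bits)
        fixed[index] = value
        return f(*fixed)
    return g


def shannon_expand(f, index):
    """Rebuild f as x*f(x=1) + (not x)*f(x=0) around input `index`."""
    f1, f0 = cofactor(f, index, 1), cofactor(f, index, 0)

    def expanded(*bits):
        # Only the cofactor selected by the control input is evaluated;
        # assumption: with supply gating, the other cofactor block is switched off.
        return f1(*bits) if bits[index] else f0(*bits)
    return expanded


def majority(a, b, c):
    """Hypothetical example function: 3-input majority."""
    return int(a + b + c >= 2)


if __name__ == "__main__":
    expanded = shannon_expand(majority, 0)
    ok = all(expanded(*v) == majority(*v) for v in product((0, 1), repeat=3))
    print("expansion matches original:", ok)
```

Because only one cofactor contributes to the output for any given input vector, the idle half of the logic is a natural candidate for gating.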







# **Test Power**

- Sources of test power
  - Scan registers
  - Combinational circuits
- Combinational circuits consume 78% of test power

### Advantages of SBS

- No changes required in scan register and test application procedure
- Can reduce both switching and leakage power
- At-speed testing can be performed easily
- Other techniques can be integrated for power saving in registers



## **Test Coverage/Test Length**

- High test coverage is needed because
  - New failure mechanisms have emerged
  - Defect density has increased
- Cost of ATE prohibits exhaustive testing of chip

- Circuits employing BIST for periodic self-test require high coverage with shorter test time

### Advantages of SBS

- Reduction in number of faults due to smaller area after multi-level expansion in some cases
- Increased observability of internal nodes



Avg. reduction of 20% (21%) in test time with deterministic (random) patterns

# Supply Gating in Scan Design: Low-power Scan Operation















































# Leakage Reduction with OBB

- Leakage savings ranged from 14-55% compared to the zero body bias case for nominal 70nm and 50nm transistors in Taurus device simulations.

| Tech. | Temp (°C) | $V_{B}$ (V) | I<sub>OFF</sub> (normalized) | I<sub>ON</sub> (normalized) | $I_{ON}/I_{OFF}$ | Leakage Reduction |
|-------|-----------|-------------|------------------------------|-----------------------------|------------------|-------------------|
| 70nm  | 25        | 0           | 1                            | 97115                       | 97115            |                   |
| 70nm  | 25        | -0.16       | 0.57                         | 91005                       | 159657           | 43%               |
| 70nm  | 70        | 0           | 5.14                         | 120673                      | 23477            |                   |
| 70nm  | 70        | -0.20       | 2.30                         | 118269                      | 51421            | 55%               |
| 50nm  | 25        | 0           | 1                            | 3478                        | 3478             |                   |
| 50nm  | 25        | 0.15        | 0.55                         | 3992                        | 7258             | 45%               |
| 50nm  | 70        | 0           | 2.51                         | 4044                        | 1611             |                   |
| 50nm  | 70        | 0.09        | 2.15                         | 4286                        | 1993             | 14%               |
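The leakage-reduction column follows directly from the normalized I<sub>OFF</sub> values in the table above. The short sketch below recomputes it. The dictionary layout and function name are illustrative choices, and the numbers are copied from the table.

```python
# (technology, temperature in °C): (I_OFF at zero body bias, I_OFF with OBB), normalized
I_OFF = {
    ("70nm", 25): (1.00, 0.57),
    ("70nm", 70): (5.14, 2.30),
    ("50nm", 25): (1.00, 0.55),
    ("50nm", 70): (2.51, 2.15),
}


def leakage_reduction_pct(i_zbb, i_obb):
    """Percentage drop in off-current relative to the zero-body-bias case."""
    return 100.0 * (1.0 - i_obb / i_zbb)


REDUCTIONS = {key: round(leakage_reduction_pct(*pair)) for key, pair in I_OFF.items()}

if __name__ == "__main__":
    for (tech, temp), pct in REDUCTIONS.items():
        print(f"{tech} @ {temp} C: {pct}% lower I_OFF with OBB")
```

The recomputed values span 14% to 55%, matching the range quoted on the slide.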





# Variation Reduction Results

- OBB reduces mean leakage by 30-37%
- OBB reduces the spread of leakage values by 40-71%
- Taurus device simulation results for 50nm NMOS with Gaussian-distributed parameter variations

Leakage variation by device parameter:

| Device Variation   | µ @ ZBB [A] | µ @ OBB [A] | σ @ ZBB | σ @ OBB |
|--------------------|-------------|-------------|---------|---------|
| Length             | 1.14e-7     | 7.97e-8     | 3.89e-8 | 2.32e-8 |
| VDD                | 1.20e-7     | 7.87e-8     | 3.19e-8 | 1.33e-8 |
| Peak Halo Doping X | 1.27e-7     | 7.96e-8     | 1.96e-8 | 5.70e-9 |
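The quoted ranges for the mean and the spread can be cross-checked against the tabulated µ and σ values. The sketch below does that arithmetic. The tuple layout and names are illustrative, and the numbers are taken from the table above.

```python
# parameter: (mean @ ZBB, mean @ OBB, sigma @ ZBB, sigma @ OBB)
VARIATION = {
    "Length": (1.14e-7, 7.97e-8, 3.89e-8, 2.32e-8),
    "VDD": (1.20e-7, 7.87e-8, 3.19e-8, 1.33e-8),
    "Peak Halo Doping": (1.27e-7, 7.96e-8, 1.96e-8, 5.70e-9),
}


def pct_drop(before, after):
    """Percentage reduction going from `before` to `after`."""
    return 100.0 * (1.0 - after / before)


MEAN_DROP = {k: pct_drop(v[0], v[1]) for k, v in VARIATION.items()}
SIGMA_DROP = {k: pct_drop(v[2], v[3]) for k, v in VARIATION.items()}

if __name__ == "__main__":
    for name in VARIATION:
        print(f"{name}: mean -{MEAN_DROP[name]:.0f}%, sigma -{SIGMA_DROP[name]:.0f}%")
```

The smallest and largest results reproduce the 30-37% and 40-71% figures in the bullets.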











## MTCMOS (cont'd)

- Advantages:
  - Effective for standby leakage reduction
  - Easily implemented based on existing circuits
  - 1-V MTCMOS DSP chip for mobile phone application (1996)
- Disadvantages:
  - Increases area and delay
  - If data retention is required in standby mode, an extra high-$V_{th}$ memory circuit is needed

















# SRAM Leakage Reduction Schemes

| Schemes           | Source Biasing          | Fwd/Reverse Body-Biasing (V<sub>PWELL</sub>, V<sub>NWELL</sub>) | Dynamic V<sub>DD</sub> (V<sub>DL</sub>) | Floating Bitlines (V<sub>BL</sub>, V<sub>BLB</sub>) | Negative Word Line (V<sub>WL</sub>) |
|-------------------|-------------------------|------------------------------------------------------------------|-----------------------------------------|------------------------------------------------------|-------------------------------------|
| Leakage reduction | Sub: ↓↓<br/>Gate: ↓↓    | Sub: ↓↓<br/>BTBT: ↑ (RBB)                                        | Sub, gate: ↓<br/>*Bitline leak: -       | Sub: ↓<br/>Gate: ↓                                   | Sub: ↓<br/>*Gate: ↑                 |
| Delay             | *Delay increase         | No delay increase                                                | No delay increase                       | No delay increase                                    | No delay increase                   |
| Overhead          | Low transition overhead | Large transition overhead                                        | Large transition overhead               | *Precharge latency overhead                          | *Low charge pump efficiency         |
| Stability         | Impact on SER           | No impact on SER                                                 | *Worst SER                              | No impact on SER                                     | No impact on SER, voltage stress    |











# 2x16K-Byte SRAM Testchip

[Die photo: 16KB SRAM array for leakage probing, SNM test, row decoder, column I/O, address buffers, and self-decay test interface]

| Parameter         | Value                        |
|-------------------|------------------------------|
| Technology        | 180nm 6-metal CMOS           |
| Chip Size         | 3.3 x 2.9 mm<sup>2</sup>     |
| Supply Voltage    | 1.8V                         |
| Threshold Voltage | NMOS: 0.53V<br/>PMOS: -0.53V |
| Read Access Cycle | 984MHz @ 1.8V, RT            |
| Active Current    | 0.14mW/MHz @ 1.8V            |
| Standby Current   | 7.27μA (16KB array)          |

Kim, Roy, ISSCC'05









- MEDICI: gate/BTBT leakage is also modeled





# Computing with Leakage for Ultralow Power: Digital Subthreshold Logic







































| Implementation      | Clock frequency | Vdd    | Energy/Operation | # of Transistors |
|---------------------|-----------------|--------|------------------|------------------|
| + Sub-CMOS          | 748 kHz         | 650 mV | 19.1 nJ          | 31k              |
| + Sub-CMOS          | 22 kHz          | 450 mV | 2.47 nJ          | 111k             |
| + Sub-Pseudo NMOS   | 22 kHz          | 400 mV | 1.77 nJ          | 86k              |

Parallel architecture lowers the clock rate and reduces power dissipation by 87%.

Pseudo-NMOS logic style provides another 28% reduction.
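The two percentages quoted above match the ratios of the energy-per-operation column in the table. The sketch below shows that arithmetic. Rows are identified only by their position and supply voltage, since the implementation labels are incomplete in the table.

```python
# Rows copied from the table: (Vdd [V], clock [Hz], energy per operation [J])
ROWS = [
    (0.650, 748e3, 19.1e-9),   # first row: Sub-CMOS at 748 kHz
    (0.450, 22e3, 2.47e-9),    # second row: Sub-CMOS at 22 kHz
    (0.400, 22e3, 1.77e-9),    # third row: Sub-Pseudo NMOS at 22 kHz
]


def reduction_pct(before, after):
    """Percentage reduction going from `before` to `after`."""
    return 100.0 * (1.0 - after / before)


parallel_saving = reduction_pct(ROWS[0][2], ROWS[1][2])
pseudo_nmos_saving = reduction_pct(ROWS[1][2], ROWS[2][2])

if __name__ == "__main__":
    print(f"lower clock rate and Vdd: {parallel_saving:.0f}% less energy/operation")
    print(f"pseudo-NMOS on top:       {pseudo_nmos_saving:.0f}% less energy/operation")
```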

