

### Below the Threshold - Modelling, Technology, and Performance

Joachim Rodrigues

Department of Electrical and Information Technology

Lund University, Sweden

SOS Workshop 2010

Sep 22<sup>nd</sup>, 2010

## People and Sponsors

### People working on low power

- Yasser Sherazi
- Dr. Omer Can Akgun
- Dr. Peter Nilsson
- Dr. Joachim Rodrigues





#### This work is financed by

Vinnova (SoS), SSF (UPD), Vetenskapsrådet.

... more people who contributed

Yusuf Leblebici (EPFL), Jens Sparsø (DTU)

### Motivation and Sub-V<sub>T</sub> Basics



#### Why Sub-threshold?

- Scale V<sub>DD</sub> aggressively and reduce energy dissipation.
- Optimum supply voltage value for lowest energy per operation.
- Low-energy applications where speed is not a constraint.
- Leakage energy decreases exponentially, dynamic energy decreases quadratically.
- Delay increases exponentially ⇒ Scientific Challenge.
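
The existence of an optimum supply voltage can be illustrated with a textbook first-order model. The sketch below is not the energy model from this talk. It assumes that the on-current scales as exp((V<sub>DD</sub> - V<sub>T</sub>)/(n·U<sub>T</sub>)), that the cycle time is proportional to C·V<sub>DD</sub> divided by the on-current, and that the off-current is constant. Under those assumptions the leakage energy per operation is proportional to C·V<sub>DD</sub>²·exp(-V<sub>DD</sub>/(n·U<sub>T</sub>)). All parameter values (`N_UT`, `ALPHA`, `DEPTH_K`, `C_EFF`) are hypothetical, and the sweep starts at 150 mV because the model is not meaningful within a few U<sub>T</sub> of zero.

```python
import math

# All parameter values below are illustrative assumptions, not data from the talk.
N_UT = 1.5 * 0.026   # assumed slope factor n times thermal voltage U_T, in volts
ALPHA = 0.1          # assumed switching activity
DEPTH_K = 50.0       # assumed logic depth times a delay fitting constant
C_EFF = 1e-12        # assumed total switched capacitance, in farads


def energy_per_operation(vdd):
    """First-order energy per operation (joules) at supply voltage vdd (volts)."""
    dynamic = ALPHA * C_EFF * vdd ** 2
    # Leakage current integrated over a cycle whose length grows as exp(-vdd / N_UT).
    leakage = DEPTH_K * C_EFF * vdd ** 2 * math.exp(-vdd / N_UT)
    return dynamic + leakage


def minimum_energy_point(v_min=0.15, v_max=1.0, step=0.001):
    """Sweep V_DD on a grid and return (vdd, energy) at the lowest energy."""
    steps = int(round((v_max - v_min) / step))
    candidates = [v_min + i * step for i in range(steps + 1)]
    best = min(candidates, key=energy_per_operation)
    return best, energy_per_operation(best)


v_opt, e_opt = minimum_energy_point()
print(f"minimum-energy point: {v_opt * 1e3:.0f} mV, {e_opt * 1e15:.2f} fJ per operation")
```

With these assumed values the minimum falls inside the swept range rather than at either end. Its position depends on the ratio of leakage to switching activity, so it shifts with the design and the technology.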

## Sub-threshold Energy model



No standard or commercial flow is available that characterizes designs at $V_{\text{DD}} \leq 400\text{ mV}$.

### High-level Energy Model

- Conventional EDA tools.
- SPICE-accurate in a fraction of SPICE simulation time.
- Any RTL design.
- Standard- and full-custom based designs.
- Custom developed scripts.
- Asynchronous and synchronous designs.

### Energy Model Application

Simulation and measurement of a cardiac event detector in 65 nm LL-HVT.



Simulated and measured data. The energy minimum is at $V_{\text{DD}} = 320\text{ mV}$ (20 kHz).

## Technology Selection



A reference design is synthesized over several technology nodes.

- Energy in the sub-V<sub>T</sub> domain is simulated over a scaled V<sub>DD</sub>.
- Low-leakage options were used.
- 65 nm LL-HVT is most energy efficient.

Energy dissipation for varying supply voltages across technology nodes.

With special low-power process options, it is beneficial to migrate to smaller technologies.

## Throughput Improvement by Parallelizing

A half-band filter is characterized in the sub-V<sub>T</sub> domain.



- Throughput is increased by parallel processing ⇒ V<sub>DD</sub> can be reduced.
- Area (and leakage) increases accordingly.
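
The trade-off can be sketched with a first-order delay model. This is an illustration under stated assumptions, not the characterization flow used for the half-band filter. It assumes that the maximum clock frequency of one unit follows the sub-V<sub>T</sub> on-current, that `units` parallel copies each process every `units`-th sample, and that unfolding adds no overhead beyond replicating area and leakage. The constants `N_UT`, `V_T`, `I_0` and `C_PATH` are hypothetical.

```python
import math

# Illustrative assumptions only; none of these values come from the talk.
N_UT = 1.5 * 0.026   # assumed slope factor times thermal voltage (V)
V_T = 0.45           # assumed threshold voltage (V)
I_0 = 1e-6           # assumed on-current at vdd = V_T (A)
C_PATH = 50e-15      # assumed effective critical-path capacitance (F)


def max_frequency(vdd):
    """First-order maximum clock frequency (Hz) of one unit in the sub-V_T region."""
    on_current = I_0 * math.exp((vdd - V_T) / N_UT)
    return on_current / (C_PATH * vdd)


def required_vdd(throughput_hz, units, v_min=0.15, v_max=1.0, step=0.001):
    """Lowest grid voltage at which `units` parallel units meet the throughput."""
    per_unit_rate = throughput_hz / units  # each unit handles every units-th sample
    steps = int(round((v_max - v_min) / step))
    for i in range(steps + 1):
        vdd = v_min + i * step
        if max_frequency(vdd) >= per_unit_rate:
            return vdd
    raise ValueError("throughput not reachable within the swept voltage range")


for units in (1, 2, 4):
    vdd = required_vdd(1e6, units)
    print(f"{units} unit(s): V_DD >= {vdd * 1e3:.0f} mV, relative area/leakage ~ {units}x")
```

Each doubling of the unit count lowers the required supply voltage at the same throughput, while the replicated hardware leaks in proportion to the number of units. Whether the net energy per sample improves depends on how those two effects balance for a given design.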

Energy vs. throughput for original and unfolded architecture.

## Fabricated ASICs

A synchronous and an asynchronous sub-V<sub>T</sub> ASIC were fabricated in 65 nm LL-HVT.

- Measurements confirm energy simulations.
- ASICs are functional down to 250 mV (as expected).
- Naked (no protection).
- In-house RTL-GDSII flow.

## Analog Completion Detection - Current Sensing



Computation signature is detected as the temporary drop of the supply voltage at the drain node of the transistor.

We are interested in the computation phase signal.

## Throughput Gain



- Idle time in asynchronous circuits is constant (communication protocol).
- Idle time in synchronous circuits is constrained by the worst-case corner.



### Speed Improvement

- 58.7 % speed-up including overhead (PrimeTime).
- 52.6 % speed-up with real data (HSPICE).
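
How a self-timed design gains over a worst-case clock can be sketched as follows. The talk does not state how its speed-up figures were computed, so the definition used here (the fractional reduction in mean time per operation relative to a clock period set by the slowest operation) is an assumption. The function name `async_speedup`, the sample values and the handshake overhead are hypothetical.

```python
def async_speedup(completion_times_us, handshake_overhead_us=0.0):
    """Fractional reduction in mean time per operation versus a worst-case clock.

    completion_times_us: per-operation completion times of the self-timed design.
    handshake_overhead_us: constant idle time added by the communication protocol.
    """
    if not completion_times_us:
        raise ValueError("need at least one completion time")
    # A synchronous design must clock at the slowest observed (worst-case) time.
    sync_period = max(completion_times_us)
    # A self-timed design pays the actual time of each operation plus the handshake.
    async_mean = sum(completion_times_us) / len(completion_times_us) + handshake_overhead_us
    return 1.0 - async_mean / sync_period


# Hypothetical completion times in microseconds, not measured data.
samples_us = [6.0, 10.0, 20.0, 40.0]
speedup = async_speedup(samples_us, handshake_overhead_us=1.0)
print(f"speed-up: {speedup:.1%}")
```

In practice a synchronous clock period also carries margin for the worst-case process, voltage and temperature corner, which this sketch leaves out.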

### Power Domain Separation



Current drawn by the registers is dominant. To retrieve *processing time*, independent power domains for combinational and sequential gates are introduced (Synopsys DC).

## Some Measurements...



Oscilloscope measurements of the ASIC input and output signals.



## Measurement of the completion detection signal

The (inverted) completion detection signal is measured repeatedly and the traces are overplotted; it varies between 6 and $40\,\mu\text{s}$.

## The Device



### ASIC 65 nm LL-HVT

- Implemented as part of a multi-project die in a 65 nm digital process.
- ASICs are fully operational down to 250 mV.
- Asynchronous design: Core area increases by 8.2% (power domain separation), completion detection circuit adds another 4.6%.

## The Movie

Propagation delay of the critical path is measured.

## Summary

- A high-level energy estimation flow for sub-V<sub>T</sub> domain characterization was presented.
- The flow is applicable to standard-cell and full-custom designs.
- A self-timed (asynchronous) and a synchronous ASIC were fabricated in 65 nm LL-HVT CMOS.
- The ASICs are functional down to 250 mV.
- ASASAP (as soon and simple as possible).