

# **3D Stackable Circuits and Memory**

#### KARL-MAGNUS PERSSON AND LARS-ERIK WERNERSSON







### Introduction

- Energy-efficient computing with big data: background and motivation
- A promising candidate: ReRAM

### **Considerations for ReRAM Integration**

- 3D: methods and possibilities
- Scaling: materials, trends, and challenges

### **Ongoing Research Efforts**

- Stanford: monolithic 3D integration of circuit and memory
- Lund: vertically integrated nanowire selectors and RRAM





### **Machine Learning Hardware Implications**

- Iterative re-programming of memory
- Performance limited by read/write of non-volatile memory (NVM)

### **Hardware Challenges**

- Component improvement has stagnated: Moore's law has halted
- NVM technologies are 10,000x slower than computing
- Separate compute and memory circuitry incurs large inefficiencies

#### **Possible solutions**

- Creative systems → co-integrated circuits and memory in 3D, introducing new materials
- True neuromorphic hardware → synaptic networks using computational units



NANO

**ELECTRONICS** 

# **NAND Flash and Inherent Limitations**



### The upside

- 3D integrated, 128 layers in the near future
- Minimum feature size down to 5 nm

### The downside

- Read in ns but write in ms
- Further scaling not possible due to tunneling



# BiCS

Bit Cost Scalable NAND Flash



Schematic structure of planar NAND Flash

https://www.storagenewsletter.com





### **Resistive switching**

- Mobile ions form a filament
- Low voltage programming (1-2 V)

### Speed

- 10 ns read/write

### Scaling

- 6F<sup>2</sup>/layer
- F < 5 nm

### **3D ReRAM**

- 128 layers
- 64 Tb per chip
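The scaling and 3D figures above can be cross-checked with a short back-of-the-envelope calculation. The sketch below assumes one bit per cell, takes F = 5 nm as the upper bound of the quoted range, and ignores all peripheral circuitry; the function names are illustrative only.

```python
# Rough bit density for a 3D ReRAM stack, using the slide's figures:
# 6F^2 per cell per layer, F = 5 nm, 128 layers, 64 Tb per chip.
# Assumptions (not from the slides): one bit per cell, no peripheral overhead.

NM2_PER_MM2 = 1e12  # 1 mm^2 = (1e6 nm)^2 = 1e12 nm^2


def bits_per_mm2(feature_nm: float, cell_factor: float, layers: int) -> float:
    """Bits per mm^2 for cells of area cell_factor * F^2, stacked in `layers`."""
    cell_area_nm2 = cell_factor * feature_nm ** 2
    return layers * NM2_PER_MM2 / cell_area_nm2


def array_area_mm2(capacity_bits: float, density_bits_per_mm2: float) -> float:
    """Array footprint needed to hold `capacity_bits` at the given density."""
    return capacity_bits / density_bits_per_mm2


density = bits_per_mm2(feature_nm=5.0, cell_factor=6.0, layers=128)
area = array_area_mm2(capacity_bits=64e12, density_bits_per_mm2=density)

print(f"density: {density / 1e12:.2f} Tb/mm^2")
print(f"array footprint for 64 Tb: {area:.0f} mm^2")
```

Under these assumptions the 64 Tb figure corresponds to an array footprint of roughly 75 mm², which is a plausible die size.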





Stanford is a leading institution in the field of ReRAM, with research led by Prof. H.-S. Philip Wong.

### **RRAM Mechanics**





H.-S. Philip Wong et al - Proceedings of the IEEE 2012










# **Overview of different ReRAM Technologies**

### RRAM (oxRRAM)

- Anode filament, oxygen vacancies form conductive path
- High endurance, $10^{12}$ cycles at device level
- 3D compatible

### CBRAM

- Cathode filament, bridging with metal ion movement
- Similar structure to RRAM
- Endurance questionable (finite number of switches)

### PCRAM

- Phase change memory: flash heating switches a dielectric film between amorphous and crystalline states
- Endurance questionable (finite number of switches)

### STT-MRAM

- Spin-transfer-torque magnetic RAM
- Changing the orientation of spin changes the conductivity
- Advanced material stack, 3D compatibility unlikely







Karl-Magnus Persson Nanoelectronics



Stanford Memory Trends, H.-S. P. Wong et al <u>https://nano.stanford.edu/stanford-memory-trends</u>



# **Research Trend I – Stacking Circuits and Memory**

- Performance predictions of ReRAM with large scale circuit simulations using calibrated models
- Study benchmarks a contemporary Intel Xeon Phi system vs. a system with CNT cores and STT-MRAM + 3D RRAM
- The major part of the improved performance in memory-intensive applications like PageRank comes from the new memory technology
- Proposed system shows up to 1000x gains in combined power and speed

### Conventional CPU is idle 97% of the time!!

![](_page_10_Figure_7.jpeg)

![](_page_10_Figure_8.jpeg)

![](_page_10_Figure_9.jpeg)
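The 97% idle figure can be put into perspective with an Amdahl-style estimate. The sketch below assumes that all idle time is spent waiting on memory and that only this share of the runtime benefits from a faster memory; both are simplifications, and the memory speedup factors are hypothetical.

```python
def overall_speedup(stall_fraction: float, memory_speedup: float) -> float:
    """Amdahl-style speedup when only the memory-stall share of runtime is accelerated."""
    if not 0.0 <= stall_fraction <= 1.0:
        raise ValueError("stall_fraction must be in [0, 1]")
    if memory_speedup <= 0.0:
        raise ValueError("memory_speedup must be positive")
    busy_fraction = 1.0 - stall_fraction
    return 1.0 / (busy_fraction + stall_fraction / memory_speedup)


# Hypothetical memory speedup factors, chosen only for illustration.
for factor in (10, 100, 1000):
    print(f"memory {factor:>4}x faster -> system {overall_speedup(0.97, factor):.1f}x faster")
```

In this simple model the runtime gain saturates at 1/0.03, about 33x, however fast the memory becomes. The sketch covers execution time only; the 1000x figure quoted above is a combined power and speed metric, which it does not model.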


# **Implications for Neuromorphic Computing**

### **IBM TrueNorth**

- SyNAPSE: DARPA-funded initiative to simulate the brain
- Dedicated neuromorphic hardware
- 4096 computational units (1 unit pictured)
- Memory occupies about 40% of the chip area
- Conventional memory (SRAM)

### RRAM

- 3D RRAM with 128 layers
- 64 Tb per chip

![](_page_11_Picture_10.jpeg)

Figure: chip layout of 1 unit (2014), with synapses and memory (256 × 410) labeled.

### SRAM ~ 120-140 F<sup>2</sup> → 1T1R RRAM 20x smaller!!
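The 20x figure follows directly from the cell footprints. The sketch below assumes that the 1T1R RRAM cell occupies the 6F² quoted earlier for ReRAM and that both cells are built at the same feature size F; neither assumption is stated explicitly on the slide.

```python
# Cell footprints in units of F^2.
RRAM_CELL_F2 = 6.0                   # assumed: the 6F^2 per layer quoted for ReRAM
SRAM_CELL_F2_RANGE = (120.0, 140.0)  # from the slide


def area_ratio(sram_f2: float, rram_f2: float = RRAM_CELL_F2) -> float:
    """How many RRAM cells fit in the footprint of one SRAM cell."""
    return sram_f2 / rram_f2


low, high = (area_ratio(a) for a in SRAM_CELL_F2_RANGE)
print(f"RRAM cell is {low:.0f}x to {high:.0f}x smaller per layer")
```

The ratio is per layer; stacking multiple RRAM layers would multiply the density advantage by the layer count.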

![](_page_11_Picture_14.jpeg)


# **3D RRAM - Architectural Concepts**

Figure labels: word line, memory cell, bit line.

![](_page_12_Picture_2.jpeg)

### HRRAM (horizontal)

- Simplest architecture
- Superior performance due to low-RC interconnects
- Diode selector limits feasible number of layers
- Least cost efficient
- Example: Intel Xpoint

![](_page_12_Figure_9.jpeg)


### VRRAM type II

- Interconnect capacitance limited performance
- Litho-free stacking
- Bit performance improves with the number of layers
- Most scalable/cost efficient

### VRRAM type I

- Interconnect resistance limited performance
- More energy efficient than VRRAM type II

# Vertical RRAM – Lithography Free

![](_page_13_Picture_1.jpeg)


![](_page_13_Figure_2.jpeg)

#### Litho-free formation of a stair-case structure

![](_page_13_Figure_4.jpeg)

Tanaka et al – VLSI symposium 2007

# **Research Trend II – In Memory Computations**

- 3D RRAM allows for hyper-dimensional vectors with computations directly in the memory
- NOR and NAND operations can be accomplished with propagation of specific pulse trains
- It has not yet been determined whether in-memory computation is a viable way forward; its usefulness could be task specific
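Why NAND-type operations matter for hyper-dimensional vectors can be illustrated in software. The sketch below is a purely logical model and does not represent pulse trains or device behavior; using XOR as the binding operation is an assumption taken from common binary hyper-dimensional computing schemes, and all names are illustrative.

```python
import random


def nand(a: int, b: int) -> int:
    """NAND of two bits (each 0 or 1)."""
    return 1 - (a & b)


def xor_from_nand(a: int, b: int) -> int:
    """XOR built from four NAND gates, showing that NAND alone is sufficient."""
    n = nand(a, b)
    return nand(nand(a, n), nand(b, n))


def bind(x: list[int], y: list[int]) -> list[int]:
    """Bind two binary hypervectors with element-wise XOR."""
    return [xor_from_nand(a, b) for a, b in zip(x, y)]


def hamming(x: list[int], y: list[int]) -> int:
    """Number of positions where two hypervectors differ."""
    return sum(xor_from_nand(a, b) for a, b in zip(x, y))


def random_hypervector(dim: int, rng: random.Random) -> list[int]:
    """Random binary vector of length `dim`."""
    return [rng.randint(0, 1) for _ in range(dim)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
DIM = 10_000            # hypothetical dimensionality

key = random_hypervector(DIM, rng)
value = random_hypervector(DIM, rng)

bound = bind(key, value)      # store the value under the key
recovered = bind(bound, key)  # XOR is its own inverse, so this unbinds

print("value recovered:", recovered == value)
print("normalized distance, bound vs value:", hamming(bound, value) / DIM)
```

Because every operation reduces to bitwise NAND over long vectors, the workload maps naturally onto wide memory rows, which is the appeal of performing it inside the array.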

![](_page_14_Picture_6.jpeg)

Figure labels: TiN/Ti top electrode (TE); Layer 4 (L4), Layer 3 (L3), Layer 2 (L2), Layer 1 (L1).

![](_page_14_Figure_7.jpeg)

Figure labels: 32b, 3D synapses, 1000 training images, neurons, input.

![](_page_14_Picture_8.jpeg)

# **RRAM – Area Scaling**


- RRAM switching is ideally independent of device area as only one filament forms
- Area dependence instead stems partly from self-capacitance: a smaller device stores less charge, which reduces parasitic current discharge
- However, the probability of forming a filament increases with area
- Surface roughness and material quality at the interfaces are crucial for limiting the increase in forming voltage and the spread in its distribution
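One way to picture why forming probability grows with area is a weakest-link model, sketched below. It assumes that the device consists of independent candidate sites and that a reference area forms with a known probability under fixed voltage and time; this model and all numbers are illustrative assumptions, not taken from the cited work.

```python
def forming_probability(area_nm2: float, ref_area_nm2: float, ref_probability: float) -> float:
    """Weakest-link estimate: the device forms if any of its independent sites forms."""
    if area_nm2 <= 0.0 or ref_area_nm2 <= 0.0:
        raise ValueError("areas must be positive")
    if not 0.0 <= ref_probability <= 1.0:
        raise ValueError("ref_probability must be in [0, 1]")
    sites = area_nm2 / ref_area_nm2  # number of reference-sized regions
    return 1.0 - (1.0 - ref_probability) ** sites


# Hypothetical reference: a 100 nm^2 region forms with probability 0.1.
REF_AREA_NM2 = 100.0
REF_PROBABILITY = 0.1

for scale in (1, 10, 100):
    p = forming_probability(scale * REF_AREA_NM2, REF_AREA_NM2, REF_PROBABILITY)
    print(f"{scale:>3}x reference area -> forming probability {p:.3f}")
```

Read in reverse, the same model suggests that a scaled-down device needs a higher forming voltage to reach the same yield, consistent with the emphasis on interface quality above.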

Y. Y. Chen – ME 2013

Ann Chen – GlobalFoundries 2013

![](_page_15_Figure_9.jpeg)

# **RRAM - Oxide Thickness Scaling**

- Scaling the dielectric is necessary to reduce the minimum feature size
- Surface roughness affects the spread of the performance distribution
- Etched-out vertical pillars have a smoother surface; 3D thus reduces the feature size

![](_page_16_Figure_5.jpeg)

Zhao et al – IEDM 2014

Figure: surface roughness of planar vs. 3D VRRAM stacks. Planar: peak-to-peak distance ~40 nm, peak height ~15 nm; large surface roughness compared to the 17 nm t<sub>ox</sub>, with multiple sites for filament growth. 3D VRRAM: RMS roughness ~2.08 nm; VRRAM is less vulnerable to surface roughness.

Joon Sohn – IEDM 2014

- Large arrays require MOSFET selectors to reduce leakage
- Vertical geometry allows for more aggressive thickness scaling as it reduces roughness
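The leakage problem in large selector-free arrays can be estimated with a lumped sneak-path model. The sketch below assumes the worst case for an N × N passive crossbar: every unselected cell is in the low-resistance state, unselected lines float, and wire resistance is ignored. The resistance values are hypothetical.

```python
def sneak_resistance(n: int, r_lrs: float) -> float:
    """Lumped worst-case sneak-path resistance of an n x n passive crossbar.

    Three groups of low-resistance cells in series:
    (n-1) on the selected word line, (n-1)^2 in the unselected block,
    and (n-1) on the selected bit line.
    """
    if n < 2:
        raise ValueError("need at least a 2 x 2 array")
    m = n - 1
    return 2.0 * r_lrs / m + r_lrs / (m * m)


def read_resistance(r_cell: float, n: int, r_lrs: float) -> float:
    """Resistance seen at the terminals: selected cell in parallel with the sneak network."""
    r_sneak = sneak_resistance(n, r_lrs)
    return r_cell * r_sneak / (r_cell + r_sneak)


def read_ratio(n: int, r_lrs: float, r_hrs: float) -> float:
    """Apparent high/low resistance ratio when reading one cell."""
    return read_resistance(r_hrs, n, r_lrs) / read_resistance(r_lrs, n, r_lrs)


# Hypothetical device values: 10 kOhm low state, 1 MOhm high state.
R_LRS, R_HRS = 1e4, 1e6

for n in (2, 8, 64, 512):
    print(f"{n:>3} x {n:<3} array: apparent HRS/LRS ratio {read_ratio(n, R_LRS, R_HRS):.3f}")
```

The intrinsic 100x ratio collapses toward 1 as the array grows, which is why each cell needs a selector that blocks current through unselected paths.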

# **Considerations on Scaled 3D Arrays**

- Simulations show that metal plane thickness will limit array size due to the resistance of the vias; sub-6 nm metal is highly resistive
- Graphene and other 2D materials may become a viable way forward for large scaled arrays

![](_page_17_Picture_6.jpeg)


![](_page_17_Figure_7.jpeg)


![](_page_18_Picture_1.jpeg)

![](_page_18_Figure_2.jpeg)

### **Stanford Research: CNTs with 3D RRAM**

![](_page_19_Picture_1.jpeg)

![](_page_19_Picture_2.jpeg)

![](_page_19_Figure_3.jpeg)

![](_page_19_Picture_4.jpeg)

16 CNT MOSFETs and 4 layers of RRAM

~18 lithography steps plus numerous other fabrication steps

# TiN RRAM Layer 1 and 2 – Form and Reset

![](_page_20_Figure_1.jpeg)


### Lund Research: RRAM with NWFETs

![](_page_21_Picture_1.jpeg)

![](_page_21_Picture_2.jpeg)

![](_page_21_Picture_3.jpeg)


# **Initial 2D RRAM Tests**

![](_page_22_Picture_2.jpeg)


![](_page_22_Figure_3.jpeg)

Figure x-axis: Voltage (V)

- most beneficial
- Currently investigating different materials and combinations
- The voltage envelope (<1 V) is a major concern for low-power MOSFETs

ITO/Al<sub>2</sub>O<sub>3</sub>/HfO<sub>2</sub>/Ti stack (figure x-axis: Voltage (V))

# Lund Research: RRAM with NWFETs

- Integration on vertical MOSFETs has only been demonstrated on individual Si pillars
- III-Vs offer low-power operation
- Lund has demonstrated record-high T-FET performance using nanowires
- A unique approach would be to combine T-FETs and low-power RRAM technology for ultra-low-power operation

![](_page_23_Picture_6.jpeg)

![](_page_23_Picture_7.jpeg)

![](_page_24_Picture_1.jpeg)

### Conclusions

- Contemporary NVM technologies severely limit memory-intensive applications
- 3D stacking of circuits and ReRAM could potentially improve efficiency for data-intensive computing by 1000x
- MOSFET selectors are needed for large arrays and 3D integration
- New approach with 3D integration of RRAM directly on top of vertical MOSFETs; no array demonstration to date
- With TFET selectors, RRAM would benefit in the same way as CMOS logic, enabling larger, faster, and more energy-efficient circuits

![](_page_24_Picture_8.jpeg)

![](_page_25_Picture_0.jpeg)

![](_page_25_Picture_1.jpeg)

This work has been funded by the Wallenberg Foundation