

## Digital Integrated Circuits: A Design Perspective

## Semiconductor Memories

## **Chapter Overview**

- Memory Classification
- Memory Architectures
- The Memory Core
- Periphery
- Reliability
- Case Studies

## **Semiconductor Memory Classification**

| Read-Write Memory:<br>Random Access | Read-Write Memory:<br>Non-Random Access | Non-Volatile<br>Read-Write Memory | Read-Only Memory |
|--------------------------|---------------------------------------|--------------------------------------|----------------------------------------|
| SRAM<br>DRAM | FIFO<br>LIFO<br>Shift Register<br>CAM | EPROM<br>E<sup>2</sup>PROM<br>FLASH | Mask-Programmed<br>Programmable (PROM) |

## **Memory Architecture: Decoders**



Intuitive architecture for an N x M memory: too many select signals, since N words require N select signals.

A decoder reduces the number of select signals to $K = \log_2 N$ address lines.
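As a rough illustration of the decoding idea, the sketch below computes the number of address lines K for N words and models the decoder as a one-hot function. The names `address_lines` and `decode` are made up for this example, and rounding K up when N is not a power of two is an assumption.

```python
def address_lines(n_words: int) -> int:
    """Smallest K with 2**K >= n_words (K = log2 N when N is a power of two)."""
    if n_words < 1:
        raise ValueError("n_words must be positive")
    return (n_words - 1).bit_length()


def decode(address: int, n_words: int) -> list[int]:
    """Behavioral decoder: drive exactly one of the n_words select lines high."""
    if not 0 <= address < n_words:
        raise ValueError("address out of range")
    return [1 if row == address else 0 for row in range(n_words)]


if __name__ == "__main__":
    for n in (4, 1024, 2**20):
        print(f"N = {n:>7} words -> K = {address_lines(n)} address lines")
    print(decode(2, 4))
```

The memory still has N word lines internally; the saving is in the number of signals that must be routed to the array from outside.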

## **Array-Structured Memory Architecture**

#### Problem: ASPECT RATIO or HEIGHT >> WIDTH



## **Hierarchical Memory Architecture**



#### **Advantages:**

1. Shorter wires within blocks
2. Block address activates only one block, which saves power

## **Read-Only Memory Cells**













**Diode ROM** 

MOS ROM 1

MOS ROM 2

## **MOS OR ROM**



## **MOS NOR ROM**



## **MOS NOR ROM Layout**





## **MOS NAND ROM**



All word lines high by default with exception of selected row

## **Precharged MOS NOR ROM**



PMOS precharge device can be made as large as necessary, but clock driver becomes harder to design.

## **Non-Volatile Memories: The Floating-Gate Transistor (FAMOS)**





**Device cross-section** 

**Schematic symbol** 

## **Floating-Gate Transistor Programming**



Avalanche injection

Removing programming voltage leaves charge trapped

Programming results in higher  $V_T$ .

## **FLOTOX EEPROM**



**FLOTOX** transistor

Fowler-Nordheim *I-V* characteristic

## **EEPROM Cell**



Absolute threshold control is hard.

Unprogrammed transistor might be depletion ⇒ 2-transistor cell



## Flash EEPROM



#### Many other options ...

## **Characteristics of State-of-the-art NVM**

**Table 12-1** Comparison between nonvolatile memories ([Itoh01]). $V_{DD} = 3.3 \text{ or } 5 \text{ V}; V_{PP} = 12 \text{ or } 12.5 \text{ V}.$

|              | Cell: nr. of transistors | Cell area (ratio wrt EPROM) | Erase mechanism | Write mechanism | External power supply (write) | External power supply (read) | Program/erase cycles |
|--------------|--------------------------|-----------------------------|-----------------|-----------------|-------------------------------|------------------------------|----------------------|
| MASK ROM     | 1 T (NAND)               | 0.35–5                      | -               | -               | -                             | $V_{DD}$                     | 0                    |
| EPROM        | 1 T                      | 1                           | UV exposure     | Hot electrons   | $V_{PP}$                      | $V_{DD}$                     | ~100                 |
| EEPROM       | 2 T                      | 3–5                         | FN tunneling    | FN tunneling    | $V_{PP}$ (int)                | $V_{DD}$                     | $10^4 - 10^5$        |
| Flash Memory | 1 T                      | 1–2                         | FN tunneling    | Hot electrons   | $V_{PP}$                      | $V_{DD}$                     | $10^4 - 10^5$        |
|              |                          |                             | FN tunneling    | FN tunneling    | $V_{PP}$ (int)                | $V_{DD}$                     | $10^4 - 10^5$        |

## **Read-Write Memories (RAM)**

#### **STATIC (SRAM)**

- Data stored as long as supply is applied
- Large (6 transistors/cell)
- Fast
- Differential

#### **DYNAMIC (DRAM)**

- Periodic refresh required
- Small (1-3 transistors/cell)
- Slower
- Single ended



## 6-transistor CMOS SRAM Cell



## **CMOS SRAM Analysis (Read)**



$$k_{n,M5} \left( (V_{DD} - \Delta V - V_{Tn}) V_{DSATn} - \frac{V_{DSATn}^2}{2} \right) = k_{n,M1} \left( (V_{DD} - V_{Tn}) \Delta V - \frac{\Delta V^2}{2} \right)$$
$$\Delta V = \frac{V_{DSATn} + CR(V_{DD} - V_{Tn}) - \sqrt{V_{DSATn}^2 (1 + CR) + CR^2 (V_{DD} - V_{Tn})^2}}{CR}$$
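The closed-form expression can be checked numerically. The sketch below evaluates $\Delta V$ and substitutes it back into the current balance between M5 and M1. It assumes CR stands for the ratio $k_{n,M1}/k_{n,M5}$, which is what the two equations imply. The voltage values are illustrative placeholders, not figures from the slides.

```python
import math


def read_disturb(cr: float, vdd: float, vtn: float, vdsat: float) -> float:
    """Closed-form Delta V from the read analysis; cr = k_{n,M1} / k_{n,M5}."""
    a = vdd - vtn
    return (vdsat + cr * a - math.sqrt(vdsat**2 * (1 + cr) + (cr * a) ** 2)) / cr


def current_mismatch(dv: float, cr: float, vdd: float, vtn: float, vdsat: float) -> float:
    """M5 current minus M1 current, both divided by k_{n,M5}; zero at the solution."""
    i_m5 = (vdd - dv - vtn) * vdsat - vdsat**2 / 2
    i_m1 = cr * ((vdd - vtn) * dv - dv**2 / 2)
    return i_m5 - i_m1


if __name__ == "__main__":
    VDD, VTN, VDSAT = 2.5, 0.4, 0.6  # illustrative values only, not from the text
    for cr in (1.0, 1.5, 2.0, 3.0):
        dv = read_disturb(cr, VDD, VTN, VDSAT)
        err = current_mismatch(dv, cr, VDD, VTN, VDSAT)
        print(f"CR = {cr:.1f}: dV = {dv:.3f} V, mismatch = {err:.1e}")
```

With these placeholder values, a larger CR gives a smaller voltage rise on the internal node during a read.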

## **CMOS SRAM Analysis (Read)**





## **CMOS SRAM Analysis (Write)**



## 6T-SRAM — Layout



## **Resistance-load SRAM Cell**



Static power dissipation: want $R_L$ large.

Bit lines precharged to $V_{DD}$ to address the $t_p$ problem.

## **SRAM Characteristics**

 
**Table 12-2** Comparison of CMOS SRAM cells used in 1-Mbit memory (from [Takada91]).

|                               | Complementary<br>CMOS                 | Resistive Load                        | TFT Cell                              |
|-------------------------------|---------------------------------------|---------------------------------------|---------------------------------------|
| Number of transistors         | 6                                     | 4                                     | 4 (+2 TFT)                            |
| Cell size                     | 58.2 μm <sup>2</sup><br>(0.7-μm rule) | 40.8 μm <sup>2</sup><br>(0.7-μm rule) | 41.1 μm <sup>2</sup><br>(0.8-μm rule) |
| Standby current<br>(per cell) | 10 <sup>-15</sup> A                   | 10 <sup>-12</sup> A                   | 10 <sup>-13</sup> A                   |
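To put the per-cell standby currents in perspective, the short sketch below scales them to a full array. It assumes that "1-Mbit" means $2^{20}$ cells and that every cell draws the tabulated current. The dictionary and function names are made up for this example.

```python
CELLS = 2**20  # assumption: "1-Mbit" taken as 2**20 cells

# Standby current per cell in amperes, copied from Table 12-2.
STANDBY_CURRENT_PER_CELL = {
    "Complementary CMOS": 1e-15,
    "Resistive load": 1e-12,
    "TFT cell": 1e-13,
}


def array_standby_current(per_cell_amps: float, n_cells: int = CELLS) -> float:
    """Total standby current if every cell draws the same per-cell current."""
    return per_cell_amps * n_cells


if __name__ == "__main__":
    for name, amps in STANDBY_CURRENT_PER_CELL.items():
        total = array_standby_current(amps)
        print(f"{name:>20}: {total * 1e9:10.3f} nA for {CELLS} cells")
```

Under these assumptions the resistive-load array draws about a microampere in standby, three orders of magnitude more than the complementary CMOS array.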

## **3-Transistor DRAM Cell**



- No constraints on device ratios
- Reads are non-destructive
- Value stored at node X when writing a "1" = V<sub>WWL</sub> - V<sub>Tn</sub>

## **3T-DRAM** — Layout



## **1-Transistor DRAM Cell**



Write: C<sub>S</sub> is charged or discharged by asserting WL and BL.

Read: charge redistribution takes place between the bit line and the storage capacitance.

$$\Delta V = V_{BL} - V_{PRE} = (V_{BIT} - V_{PRE}) \frac{C_S}{C_S + C_{BL}}$$

Voltage swing is small; typically around 250 mV.
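A quick numerical check of the charge-redistribution read, taking the bit-line swing as the difference $(V_{BIT} - V_{PRE})$ scaled by the capacitance ratio $C_S / (C_S + C_{BL})$. The capacitance and voltage values are illustrative assumptions chosen to land near the quoted 250 mV; they are not taken from the slides.

```python
def bitline_swing(v_bit: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Bit-line voltage change after the cell shares charge with the bit line."""
    return (v_bit - v_pre) * c_s / (c_s + c_bl)


if __name__ == "__main__":
    V_PRE = 1.25     # assumed precharge level (V)
    C_S = 50e-15     # assumed storage capacitance (F)
    C_BL = 200e-15   # assumed bit-line capacitance (F)
    for label, v_bit in (("stored 1", 2.5), ("stored 0", 0.0)):
        dv = bitline_swing(v_bit, V_PRE, C_S, C_BL)
        print(f"{label}: swing = {dv * 1e3:+.0f} mV")
```

Because $C_{BL}$ is several times larger than $C_S$, the swing is only a fraction of the stored voltage difference.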

## **DRAM Cell Observations**

- 1T DRAM requires a sense amplifier for each bit line, due to charge redistribution read-out.
- DRAM memory cells are single ended in contrast to SRAM cells.
- The read-out of the 1T DRAM cell is destructive; read and refresh operations are necessary for correct operation.
- Unlike 3T cell, 1T cell requires presence of an extra capacitance that must be explicitly included in the design.
- When writing a "1" into a DRAM cell, a threshold voltage is lost. This charge loss can be circumvented by bootstrapping the word lines to a higher value than $V_{DD}$.
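The threshold loss in the last observation can be illustrated with a first-order model: an NMOS access transistor passes at most its gate voltage minus one threshold. The sketch ignores body effect and leakage, and the supply and threshold values are assumptions for illustration only.

```python
def stored_one_level(v_wl: float, v_bl: float, v_tn: float) -> float:
    """Highest voltage an NMOS access transistor can pass onto the storage node."""
    return min(v_bl, v_wl - v_tn)


if __name__ == "__main__":
    VDD, VTN = 2.5, 0.4  # illustrative values, not from the text
    plain = stored_one_level(v_wl=VDD, v_bl=VDD, v_tn=VTN)
    boosted = stored_one_level(v_wl=VDD + VTN, v_bl=VDD, v_tn=VTN)
    print(f"word line at VDD:       stored level = {plain:.2f} V")
    print(f"word line bootstrapped: stored level = {boosted:.2f} V")
```

Raising the word line by at least one threshold above $V_{DD}$ lets the full bit-line level reach the storage node in this model.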

## **Sense Amp Operation**



## **1-T DRAM Cell**



#### **Cross-section**

Layout

#### Uses Polysilicon-Diffusion Capacitance; Expensive in Area

### SEM of poly-diffusion capacitor 1T-DRAM



## **Advanced 1T DRAM Cells**



#### **Stacked-capacitor Cell**

**Trench Cell** 



## **Periphery**

- Decoders
- Sense Amplifiers
- Input/Output Buffers
- Control / Timing Circuitry



## **Row Decoders**

#### Collection of 2<sup>M</sup> complex logic gates, organized in a regular and dense fashion

(N)AND Decoder

$$WL_{0} = \bar{A}_{0}\bar{A}_{1}\bar{A}_{2}\bar{A}_{3}\bar{A}_{4}\bar{A}_{5}\bar{A}_{6}\bar{A}_{7}\bar{A}_{8}\bar{A}_{9}$$
$$WL_{511} = \bar{A}_{0}A_{1}A_{2}A_{3}A_{4}A_{5}A_{6}A_{7}A_{8}A_{9}$$

#### **NOR Decoder**

$$WL_{0} = \overline{A_{0} + A_{1} + A_{2} + A_{3} + A_{4} + A_{5} + A_{6} + A_{7} + A_{8} + A_{9}}$$
$$WL_{511} = \overline{A_{0} + \bar{A}_{1} + \bar{A}_{2} + \bar{A}_{3} + \bar{A}_{4} + \bar{A}_{5} + \bar{A}_{6} + \bar{A}_{7} + \bar{A}_{8} + \bar{A}_{9}}$$
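The AND and NOR forms of a word-line function describe the same logic by De Morgan's theorem. The sketch below checks this exhaustively for word lines 0 and 511. It assumes that a word line is selected by a product of true or complemented address bits, and that $A_0$ is the most significant bit, so that word line 511 corresponds to the address 0111111111. Both conventions are assumptions made for this illustration.

```python
from itertools import product

N_BITS = 10


def row_pattern(row: int) -> list[int]:
    """Address bits A_0..A_9 that select `row`, with A_0 taken as the MSB."""
    return [(row >> (N_BITS - 1 - i)) & 1 for i in range(N_BITS)]


def wl_and(bits: tuple[int, ...], row: int) -> int:
    """AND form: every address bit must match the row's pattern."""
    return int(all(a == p for a, p in zip(bits, row_pattern(row))))


def wl_nor(bits: tuple[int, ...], row: int) -> int:
    """NOR form: NOR of the literals that must all be 0 for this row."""
    literals = [a if p == 0 else 1 - a for a, p in zip(bits, row_pattern(row))]
    return int(not any(literals))


if __name__ == "__main__":
    for row in (0, 511):
        same = all(wl_and(b, row) == wl_nor(b, row)
                   for b in product((0, 1), repeat=N_BITS))
        print(f"WL_{row}: AND and NOR forms agree on all inputs: {same}")
```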

## **Hierarchical Decoders**

#### Multi-stage implementation improves performance
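One common way to build a multi-stage decoder is predecoding: small groups of address bits are decoded first, and each word line then combines one predecoded line per group. The sketch below is a behavioral illustration of that idea under assumed parameters (4 address bits, groups of 2). It is not necessarily the circuit shown in the slide figure.

```python
def two_stage_decode(address: int, n_bits: int = 4, group: int = 2) -> list[int]:
    """Return the one-hot word-line vector produced by a predecoded decoder."""
    if n_bits % group != 0:
        raise ValueError("n_bits must be a multiple of group")
    n_groups = n_bits // group
    mask = (1 << group) - 1

    # Stage 1: one small 1-of-2**group predecoder per group of address bits.
    predecoded = []
    for g in range(n_groups):
        field = (address >> (g * group)) & mask
        predecoded.append([int(code == field) for code in range(1 << group)])

    # Stage 2: each word line ANDs one predecoded line from every group.
    word_lines = []
    for row in range(1 << n_bits):
        select = 1
        for g in range(n_groups):
            select &= predecoded[g][(row >> (g * group)) & mask]
        word_lines.append(select)
    return word_lines


if __name__ == "__main__":
    print(two_stage_decode(0b1001))
```

In this arrangement the final gates need only `n_bits / group` inputs each, instead of one input per address bit.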







Figure: 2-input NOR decoder and 2-input NAND decoder, each generating word lines $WL_0$ to $WL_3$ from the address bits $A_0$, $A_1$ and their complements.

## 4-input pass-transistor based column decoder



Advantages:

- Speed ($t_{pd}$ does not add to overall memory access time)
- Only one extra transistor in signal path

Disadvantage: large transistor count

## 4-to-1 tree based column decoder



Number of devices drastically reduced.

Delay increases quadratically with the number of sections; prohibitive for large decoders.

Solutions:

- buffers
- progressive sizing
- combination of tree and pass-transistor approaches
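Both claims about the tree decoder can be made concrete with a small model. The sketch assumes a binary tree with one pass transistor per branch, and uses the Elmore delay of a uniform RC chain as a stand-in for the series pass transistors. The unit resistance and capacitance are arbitrary illustrative values.

```python
def tree_transistor_count(k: int) -> int:
    """Pass transistors in a binary tree selecting 1 of 2**k bit lines."""
    # Level i of the tree (i = 1..k) has 2**i branches, one transistor each.
    return sum(2**i for i in range(1, k + 1))


def elmore_chain_delay(n_sections: int, r: float, c: float) -> float:
    """Elmore delay to the end of n identical series RC sections."""
    # The capacitor at node i is charged through i series resistors.
    return sum(i * r * c for i in range(1, n_sections + 1))


if __name__ == "__main__":
    print("4-to-1 tree decoder:", tree_transistor_count(2), "pass transistors")
    for n in (2, 4, 8, 16):
        print(f"{n:2d} sections -> relative delay {elmore_chain_delay(n, 1.0, 1.0):.0f}")
```

The chain delay grows as n(n + 1)/2, which is why inserting buffers to break long chains helps.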

## **Decoder for circular shift-register**



## **Sense Amplifiers**



#### Idea: Use Sense Amplifier



## **Differential Sense Amplifier**



## **Differential Sensing – SRAM**



## Latch-Based Sense Amplifier (DRAM)



Initialized in its meta-stable point with EQ.

Once an adequate voltage gap is created, the sense amp is enabled with SE. Positive feedback quickly forces the output to a stable operating point.

## **Reliability and Yield**

• Semiconductor memories trade off noise-margin for density and performance



Highly Sensitive to Noise (Crosstalk, Supply Noise)

High Density and Large Die size cause Yield Problems

$$Y = 100 \times \frac{\text{Number of good chips on wafer}}{\text{Number of chips on wafer}}$$

$$\boldsymbol{Y} = \left[\frac{1-e^{-AD}}{AD}\right]^2$$
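The sketch below evaluates both yield expressions. It assumes the usual reading of the symbols, with A the die area and D the defect density per unit area, since the slide does not define them. The first function returns a percentage and the second a fraction. The sample numbers are illustrative only.

```python
import math


def measured_yield_percent(good_chips: int, total_chips: int) -> float:
    """Yield in percent from a count of good chips on a wafer."""
    return 100.0 * good_chips / total_chips


def model_yield(area: float, defect_density: float) -> float:
    """Yield fraction Y = ((1 - exp(-AD)) / (AD))**2."""
    ad = area * defect_density
    if ad == 0:
        return 1.0  # limit of the expression as AD -> 0
    return (-math.expm1(-ad) / ad) ** 2


if __name__ == "__main__":
    print(f"measured: {measured_yield_percent(180, 240):.1f} %")
    for area in (0.5, 1.0, 2.0, 4.0):  # cm^2, with an assumed D = 1 defect/cm^2
        print(f"A = {area:.1f} cm^2 -> Y = {model_yield(area, 1.0):.3f}")
```

Yield drops quickly as the product AD grows, which is why large, dense memory dies rely on redundancy and error correction.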

**Increase Yield using Error Correction and Redundancy** 

## **Noise Sources in 1T DRAM**



## **Open Bit-line Architecture — Cross Coupling**



## **Folded-Bitline Architecture**



## **Transposed-Bitline Architecture**



(a) Straightforward bit-line routing



(b) Transposed bit-line architecture



#### 1 Particle ~ 1 Million Carriers

## Redundancy



#### Memories

## **Error-Correcting Codes**

#### **Example: Hamming Codes**

Code word: $P_1 P_2 B_3 P_4 B_5 B_6 B_7$, with the parity checks below.

| Parity check                               | Result, e.g. $B_3$ wrong |
|--------------------------------------------|--------------------------|
| $P_1 \oplus B_3 \oplus B_5 \oplus B_7 = 0$ | 1                        |
| $P_2 \oplus B_3 \oplus B_6 \oplus B_7 = 0$ | 1                        |
| $P_4 \oplus B_5 \oplus B_6 \oplus B_7 = 0$ | 0                        |

Read from the bottom row up, the check results form 011 = 3, the position of the wrong bit.
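The three parity equations are enough to build a small single-error-correcting code. The sketch below encodes four data bits into the seven-bit word $P_1 P_2 B_3 P_4 B_5 B_6 B_7$, recomputes the checks, and uses the resulting number as the position of the bit to flip. The function names are made up for this example.

```python
def encode(b3: int, b5: int, b6: int, b7: int) -> list[int]:
    """Build the 7-bit word [P1, P2, B3, P4, B5, B6, B7] (positions 1..7)."""
    p1 = b3 ^ b5 ^ b7
    p2 = b3 ^ b6 ^ b7
    p4 = b5 ^ b6 ^ b7
    return [p1, p2, b3, p4, b5, b6, b7]


def syndrome(word: list[int]) -> int:
    """Evaluate the three checks; the result is the error position (0 = none)."""
    p1, p2, b3, p4, b5, b6, b7 = word
    s1 = p1 ^ b3 ^ b5 ^ b7
    s2 = p2 ^ b3 ^ b6 ^ b7
    s4 = p4 ^ b5 ^ b6 ^ b7
    return s1 + 2 * s2 + 4 * s4


def correct(word: list[int]) -> list[int]:
    """Flip the bit the syndrome points at, assuming at most one bit is wrong."""
    fixed = list(word)
    position = syndrome(fixed)
    if position:
        fixed[position - 1] ^= 1
    return fixed


if __name__ == "__main__":
    sent = encode(1, 0, 1, 1)
    received = list(sent)
    received[2] ^= 1  # corrupt B3, as in the example
    print("syndrome:", syndrome(received))
    print("corrected equals sent:", correct(received) == sent)
```

This code corrects any single-bit error in the word; two simultaneous errors would be miscorrected, which this sketch does not try to detect.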

## **Redundancy and Error Correction**



## Sources of Power Dissipation in Memories



#### From [Itoh00]

## **Programmable Logic Array: Pseudo-NMOS PLA**







AND-plane

**OR-plane** 

## **Semiconductor Memory Trends (updated)**



## **Trends in Memory Cell Area**



From [Itoh01]

## **Semiconductor Memory Trends**



Technology feature size for different SRAM generations