A High Performance Co-design of 26 nm 64 Gb MLC NAND Flash Memory using the Dedicated NAND Flash Controller

ByoungSung You, JinSu Park, SangDon Lee, Gwangho Baek, JaeHo Lee, Minsu Kim, Jongwoo Kim, Hyun Chung, Eunseong Jang, and TaeYoon Kim

Abstract—New mobile devices appear and develop remarkably every year. For this reason, demand for large-density NAND Flash memory that can be embedded in portable devices has increased dramatically. Accordingly, the cell size of NAND Flash memory has been scaled down by nearly 50% and its density has doubled every year [1]. However, shrinking to around 20 nm technology has produced side effects on the cell distribution and reliability characteristics, related to coupling interference, channel disturbance, floating-gate electron retention, and write-erase cycling. In addition, the Flash controller that manages these shrink effects raises speed and current issues. This paper introduces techniques that solve the cycling, retention, and fail-bit problems of deep-submicron scaling: virtual negative read used in moving read, and randomization. The cycling and retention characteristics are 3 K cycles and 1 year, and the program performance is 12.7 MB/s. The die size is 179.32 mm² (16.79 mm × 10.68 mm) in a 3-metal 26 nm CMOS process.

Index Terms—NAND FLASH memory, controller, Moving read, Virtual negative read, randomization, cycling, retention, interference, disturbance, SoC, SiP, ONFI.

I. INTRODUCTION

As the use of NAND Flash memory in mobile and portable devices, such as smart phones, tablet personal computers, digital cameras, digital audio players, USB memories, and hand-held game consoles, has increased drastically, large density and high performance have become necessary. However, there exist two kinds of drawbacks: scaling issues, and communication speed and current issues. With sub-20 nm scaling, as shown in Fig. 1 [3], we have to be concerned about abnormally sensed cells and inherent coupling issues. Around the sub-20 nm node there are many issues: interference caused by adjacent programmed cells and fringing fields, disturbance caused by excessive channel boosting, reduced tolerance to floating-node charge loss, charge trapping caused by the shrinking floating gate, and write performance degradation as the number of program pulses grows [3]. When both a NAND Flash memory and a dedicated NAND controller are packaged into one chip to overcome the scaling issues and achieve high performance, there remain a large operating current consumption and a bottleneck in the data path between the interfaces, which use the embedded algorithm applied in an SoC architecture.

Fig. 1. A technology trend of a NAND Flash memory.

II. CHIP ARCHITECTURE

As shown in the block diagram of Fig. 2, both the NAND Flash memory and the dedicated NAND Flash controller are packaged into one chip that can achieve high reliability and performance. In brief, the host and the dedicated NAND Flash controller exchange commands, addresses, and input/output data through an external interface. Because of the dual channel, data on the data path can be stored into cache buffer bank0 and bank1 through the external interface. The control path, which carries commands and addresses, communicates directly with the control unit through the FSM and glue logic in the control unit. There are power-up and initial boot sequences. At power-up, a synchronous clock is supplied to the µ-controller of the dedicated NAND controller. The control unit consists of the µ-controller, FSM, glue logic, and data encoder/decoder. When a command is executed, the function blocks work simultaneously to complete the sequential ECC algorithm driven by the µ-controller. Initially, the subordinate firmware for the algorithms is stored in an EEPROM. The NAND Flash memory communicates over the dual channel through the internal interface of the dedicated controller. Performance can be enhanced by interleaved or concurrent operation in the multi-chip package. The NAND Flash memory also provides the ONFI-compatible Set/Get parameter function, which can modify internal parameters such as the sensing level, ISPP, program start bias, and unselected word line bias level. Through the dedicated controller, this improves reliability using moving read, interference-cancelling read, and the randomization process.
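
The Set/Get parameter access described above can be sketched as follows. This is a minimal sketch, assuming the function corresponds to the ONFI SET FEATURES (EFh) and GET FEATURES (EEh) commands, each followed by one feature-address cycle and four parameter bytes. The feature address `FEATURE_READ_LEVEL` and the `MockNand` class are hypothetical stand-ins; the actual addresses for the sensing level, ISPP, or bias settings are vendor-specific and are not given in this paper.

```python
"""Sketch of ONFI-style Set/Get parameter access (hypothetical device model)."""

CMD_SET_FEATURES = 0xEF  # ONFI SET FEATURES opcode
CMD_GET_FEATURES = 0xEE  # ONFI GET FEATURES opcode

# Hypothetical, vendor-specific feature address (not taken from the paper).
FEATURE_READ_LEVEL = 0x89


class MockNand:
    """Stand-in for a NAND die: stores four parameter bytes per feature address."""

    def __init__(self):
        self._features = {}
        self.bus_log = []  # every command, address and data byte, for inspection

    def set_features(self, address, params):
        if len(params) != 4:
            raise ValueError("SET FEATURES carries exactly four parameter bytes")
        self.bus_log += [CMD_SET_FEATURES, address, *params]
        self._features[address] = bytes(params)

    def get_features(self, address):
        self.bus_log += [CMD_GET_FEATURES, address]
        # Unwritten features read back as four zero bytes in this toy model.
        return self._features.get(address, bytes(4))


nand = MockNand()
nand.set_features(FEATURE_READ_LEVEL, [0x05, 0x00, 0x00, 0x00])
readback = nand.get_features(FEATURE_READ_LEVEL)
```

In this model the controller writes a parameter and reads it back to confirm the setting; a real device would also require waiting for the busy time after each command.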

III. OVERCOMING SCALING PROBLEMS

1. Moving Read using Virtual Negative Read

As erase and program sequences are repeated, Fowler–Nordheim tunneling dramatically degrades the interface oxide and the floating gate. As the number of cycles grows, the retention characteristics of the cell degrade sharply, because electron trap sites in the oxide and in the floating gate create leakage paths for de-trapped electrons as well as physical and electrical damage, as shown in Fig. 3(a) and (b).

Fig. 2. One chip block diagram of NAND Flash memory and dedicated controller.

Fig. 3. (a) Cycling degradation effect, (b) Retention degradation effect.

A further problem is that the initial programmed cell distribution shifts to the left, because charge leaks out of the floating gate in accordance with the data retention characteristics. The well-known moving read scheme is widely used to compensate for this: it adjusts the read bias according to how far the cell distribution has shifted left from its initial programmed state [2]. The retention loss can be compensated within a small Vth window by adjusting the read 2 or read 3 sensing level. For read 1, however, the word line sensing level cannot be shifted below 0 V to a negative voltage, unlike read 2 and read 3. As can be seen in Fig. 4, if data retention loss occurs from the initial programmed cell state, moving read using virtual negative read should be applied at read 1 to compensate the cell distribution with a negative read 1 sensing bias.

To implement this method, as shown in Fig. 5, the VCORE bias is applied to all core signals except the selected word line, which has the same effect as applying a negative sensing level to the selected word line. That is, the VCORE bias is applied to the bit line, PWELL, and source line (SL) while the read 1 bias is kept on the selected word line, and the VCORE bias level is set equal to the amount by which the cell distribution has shifted to the left.

As illustrated in Fig. 4 and Fig. 5, when the sensing level of READ1 (RD1) is 0 V, the virtual negative read solves this data retention problem by lowering the moving read sensing level of READ1 to a negative level.
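
The equivalence can be written as a short worked relation. This is a sketch under the assumption that a cell turns on according to its gate voltage relative to the source and body; the symbols $V_{\text{WL}}$ and $V_{\text{eff}}$ are introduced here for illustration, and the numerical values are hypothetical.

$$V_{\text{eff}} = V_{\text{WL}} - V_{\text{CORE}}, \qquad V_{\text{WL}} = 0\ \text{V},\ V_{\text{CORE}} = 0.5\ \text{V} \;\Rightarrow\; V_{\text{eff}} = -0.5\ \text{V}$$

Raising the bit line, PWELL, and source line together by $V_{\text{CORE}}$ therefore senses the cell as if the selected word line were at $-V_{\text{CORE}}$, while every applied voltage stays at or above 0 V.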

In the moving read method using the virtual negative read shown in Fig. 6, the randomized data of the main and spare areas of the page buffer are programmed into the NAND cell array in units of chunks. If data retention loss occurs, the cell distribution shifts to the left. The compensation sequence in Fig. 7 is as follows. After a page read of the NAND cell array, the data are transferred from the NAND Flash memory page buffer to the cache buffer of the dedicated controller; meanwhile, the dedicated controller has encoded the ECC data of cache buffer bank0 and bank1. The controller then checks whether the page buffer data are consistent with the ECC code. If they are consistent, or if the maximum number of read retries has been reached, the sequence of Fig. 7 exits. Otherwise, the data are read again with compensation, shifting the read levels by $\Delta R_1$, $\Delta R_2$, and $\Delta R_3$. The VCORE bias for the virtual negative read is determined by the data retention loss level, and the dedicated NAND Flash controller calculates the shifting level and the number of reads. This is implemented with the Set/Get parameter function that the NAND Flash memory supports internally. With the moving read algorithm using virtual negative read, the user correctable error bit ratio improves considerably, by more than about 70%, as shown in Fig. 8. However, the random access time increases by about 5 µs and the read operation current (ICC1) by about 2 mA, because the bit line, word line, and PWELL are charged and discharged during the page read. This performance degradation and power consumption remain within the range of the NAND Flash memory specification.
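
The retry sequence described above can be sketched as a loop. This is a minimal illustration, not the authors' firmware: `read_page`, `ecc_ok`, `MAX_RETRY`, and the 100 mV step are hypothetical, and the default read 1 level is assumed to be 0 V, so that any negative read 1 shift is realized as a VCORE bias.

```python
"""Sketch of a moving-read retry loop with a virtual negative read step."""

MAX_RETRY = 4  # hypothetical retry limit


def moving_read(read_page, ecc_ok, shift_steps_mv, max_retry=MAX_RETRY):
    """Re-read a page with shifted read levels until ECC passes or retries run out.

    read_page(shifts_mv, vcore) -> data : hypothetical device read callback
    ecc_ok(data) -> bool                : hypothetical controller ECC check
    shift_steps_mv                      : per-retry (dR1, dR2, dR3) shifts in mV
    """
    shifts = (0, 0, 0)
    data = read_page(shifts, vcore=0)
    retries = 0
    while not ecc_ok(data) and retries < max_retry:
        d1, d2, d3 = shift_steps_mv
        shifts = (shifts[0] - d1, shifts[1] - d2, shifts[2] - d3)
        # The word line cannot go below 0 V, so a negative read 1 level is
        # emulated by raising VCORE on the bit line, PWELL and source line.
        vcore = max(0, -shifts[0])
        data = read_page(shifts, vcore=vcore)
        retries += 1
    return data, retries, ecc_ok(data)


vcore_log = []


def toy_read(shifts_mv, vcore):
    """Toy cell model: the page decodes once read 1 is at least 200 mV lower."""
    vcore_log.append(vcore)
    return "good" if shifts_mv[0] <= -200 else "corrupt"


data, retries, ok = moving_read(toy_read, lambda d: d == "good", (100, 100, 100))
```

In the toy run the first read and the first retry fail, and the second retry, with read 1 lowered by 200 mV through a 200 mV VCORE bias, passes the ECC check.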

2. Randomized Data Input and Output based on Controller Interface

The randomized data encoding and decoding method in Fig. 9 addresses the interference gap between adjacent bit lines and between adjacent word lines, the erase 1 cell margin, and pattern dependency. Both the NAND Flash memory and the dedicated controller are shown in Fig. 9.

First, the external host controls the dedicated NAND Flash controller through the two channels of the external I/F. For data input, a random vector is generated by a seed generator: after Seed<7:0> is input, the seed data, synchronized with WE#, generate RV<7:0>. The data are encoded by XOR with this vector and pass through the cache buffer and then the internal NAND I/F. Conversely, data output is synchronized with the RE# toggle. Each page has N seed cases, obtained by dividing the page buffer size.

$$\left\lfloor \frac{\text{page buffer size}}{N} \right\rfloor = N \times K + C (\text{Seed mapping data})$$

Fig. 7. Moving read concept and algorithm based on controller interface with virtual negative read.

Fig. 8. Reliability margin comparison according to erase-write 3 K cycles and Bake 2h at 150 °C.

Fig. 9. Randomized behavior block.

Assuming an LFSR (linear feedback shift register) with \( N \) registers, the exclusive-OR of the feedback polynomial taps is stored into the first register bit, and this sequence circulates until the end of the page address, as shown in Fig. 10.

These data are mapped by the seed coder in Fig. 11. For both data input and output, this method allocates the data according to the identical column and page, so the same vector is used for encoding and decoding.
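
The encode and decode paths can be illustrated with a small LFSR-based XOR randomizer. This is a sketch only: the paper does not state its feedback polynomial, so the 8-bit taps below (x^8 + x^6 + x^5 + x^4 + 1), the seed value, and the choice of eight shifts per data byte are assumptions made for illustration.

```python
"""Sketch of an LFSR-based XOR data randomizer (hypothetical polynomial)."""

# Assumed feedback taps for an 8-bit LFSR: x^8 + x^6 + x^5 + x^4 + 1.
# The paper does not give its polynomial; this one is for illustration only.
TAPS = (8, 6, 5, 4)


def lfsr_stream(seed, length):
    """Yield `length` bytes RV<7:0> from an 8-bit Fibonacci LFSR."""
    state = seed & 0xFF
    if state == 0:
        raise ValueError("seed must be non-zero, or the LFSR stays at zero")
    for _ in range(length):
        yield state
        for _shift in range(8):  # advance eight shifts per data byte
            feedback = 0
            for tap in TAPS:
                feedback ^= (state >> (tap - 1)) & 1
            state = ((state << 1) | feedback) & 0xFF


def randomize(data, seed):
    """XOR each data byte with the LFSR output; the same call also decodes."""
    return bytes(d ^ rv for d, rv in zip(data, lfsr_stream(seed, len(data))))


page = bytes(16)                             # a highly regular all-zero pattern
scrambled = randomize(page, seed=0x5A)       # data input path (encode)
restored = randomize(scrambled, seed=0x5A)   # data output path (decode)
```

Because XOR is its own inverse, applying the same seed on data output restores the original data, which is why the seed has to be tied to the same column and page in both directions.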

As mentioned above, there is a pattern dependency. The all-inhibit pattern has zero fail bits. The worst pattern is shown in Fig. 12(d) and the best pattern in Fig. 12(b). The randomized data pattern avoids the worst pattern, and the fail bit count improves from 700 bits to 6 bits. Apart from the inhibit pattern, the random pattern does not differ from the best case of PV1-ERASE.

IV. PERFORMANCE ENHANCEMENT

1. Peak Current Compensation

Designing a large-density, high-performance NAND Flash memory entails large peak current consumption because of the large bit line loading, large page buffer size, multi-stack, and multi-channel. This peak current can lead to a serious power drop or abnormal operation. The peak current limit must be satisfied by the sum of the dedicated controller and the multi-stack NAND Flash memory. When many cells are still unprogrammed, around the first program pulses, the bit line rising slope falls short of the target. As the number of program pulses increases, the cells are programmed gradually and the bit line reaches the targeted rising slope. The peak current is improved by about 43%, from 236 mA to 135 mA, as depicted in Fig. 13(a–c).
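
As a check of the quoted figure, the relative reduction follows directly from the two measured peak currents:

$$\frac{236\ \text{mA} - 135\ \text{mA}}{236\ \text{mA}} = \frac{101}{236} \approx 0.428 \approx 43\%$$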

2. Performance Enhancement

Both the NAND Flash memory and the dedicated NAND controller are packaged into one chip, so some performance degradation is inherent. The architectural solution achieves tWC and tRC of 14 ns. Because of the interference drawback of technology shrink [2, 3], the average tPROG of this NAND Flash memory is measured at 1361 µs. The performance is calculated as 12.7 MB/s using multi-plane and cache program operation, as shown in Fig. 14(a–c).

$$\text{Throughput (with cache)} = \frac{\text{Operated plane number} \times \text{Page buffer size}}{t_{\text{PROG}}} \qquad (2)$$
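
Equation (2) can be evaluated with the values reported in this paper. The sketch below assumes two operated planes, a page buffer of exactly 8 × 1024 bytes (main area only), and 1 MB = 10^6 bytes; the function name is hypothetical.

```python
"""Sketch: evaluate the cached-program throughput formula, Eq. (2)."""


def program_throughput_mb_s(planes, page_bytes, tprog_us):
    """Return the throughput in MB/s, taking 1 MB as 10**6 bytes."""
    # Bytes per microsecond is numerically equal to 10**6 bytes per second.
    return planes * page_bytes / tprog_us


main_only = program_throughput_mb_s(planes=2, page_bytes=8 * 1024, tprog_us=1361)
```

With the 8 KB main area alone the result is about 12.0 MB/s, slightly below the reported 12.7 MB/s. One possible explanation, not confirmed by the text, is that the page buffer size in Eq. (2) also includes the spare area.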

V. CONCLUSIONS

In this paper, the NAND Flash memory and the dedicated NAND controller have been co-designed for high reliability and high-speed I/O. Moving read using the virtual negative read algorithm and randomization are proposed to achieve high reliability. As a result, the fail bit ratio is improved by 70%, and the randomized data pattern avoids the worst pattern, improving the fail bit count from 700 bits to 6 bits. Furthermore, the NAND chip size is reduced because these methods are implemented in the dedicated controller. However, there exist drawbacks of a large current and a limited channel communication speed. To overcome them, the design adopts the shortest path structure and the smallest current consumption, while providing the dual channel and multi-stack. The tWC and tRC data input/output parameters are measured at 14 ns. The peak current is improved by about 43% by adjusting the slope control. The NAND Flash memory has a program performance of 12.7 MB/s. The device is configured as two planes, with row decoders on both sides of each cell string. Each plane has 2048 blocks of 256 pages each, and each page is 8 KB. The key features of the 64 Gb MLC NAND Flash memory in 26 nm technology are listed in Table 1, and the chip architecture is depicted in Fig. 15.
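
The stated organization can be checked against the 64 Gb density, assuming 8 KB means 8192 bytes and counting only the main area of each page:

$$2\ \text{planes} \times 2048\ \text{blocks} \times 256\ \text{pages} \times 8192\ \text{bytes} = 2^{1} \cdot 2^{11} \cdot 2^{8} \cdot 2^{13}\ \text{bytes} = 2^{33}\ \text{bytes} = 2^{36}\ \text{bits} = 64\ \text{Gb}$$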

ACKNOWLEDGMENTS

The authors would like to thank the NAND Flash design team, DV Team, Device Team, Product Team, and Process Team for their great support and development.

REFERENCES


Table 1. Key features of the 64 Gb NAND Flash memory in 26 nm technology

<table>
<thead>
<tr>
<th>Technology</th>
<th>26 nm CMOS with 3 metals</th>
</tr>
</thead>
<tbody>
<tr>
<td>Density</td>
<td>64 Gbits</td>
</tr>
<tr>
<td>Organization</td>
<td></td>
</tr>
<tr>
<td>1 Page size</td>
<td>8K Bytes</td>
</tr>
<tr>
<td>1 Block size</td>
<td>256 pages</td>
</tr>
<tr>
<td>1 Plane size</td>
<td>2048 blocks</td>
</tr>
<tr>
<td>1 Chip size</td>
<td>2 planes</td>
</tr>
<tr>
<td>Power supply</td>
<td>2.7 V ~ 3.6 V</td>
</tr>
<tr>
<td>Program throughput</td>
<td>12.7 MB/s</td>
</tr>
<tr>
<td>Die size</td>
<td>179.32 mm²</td>
</tr>
</tbody>
</table>

Fig. 14. (a) Program speed and tWC measurement at MSB page, (b) Program speed and tWC measurement at LSB page and (c) Read speed and tRC measurement at MSB page.

Fig. 15. The microscope photo of the NAND Flash memory chip architecture.

ByoungSung You was born in Jecheon, Chungbuk, Korea, in 1976. He received the B.S. and M.S. degrees from the Department of Electronic Engineering, Sogang University, Korea, in 2002 and 2004, respectively. He joined Hynix in 2004 and has worked for eight years as a senior engineer on the NAND Flash memory design team. He has designed high-density NAND Flash memory products such as 70 nm SLC 4 Gb/2 Gb/1 Gb, 41 nm MLC 32 Gb/16 Gb, and 26 nm MLC 64 Gb NAND Flash memories. His research interests include NAND core algorithms, high-speed interfaces, embedded NAND controller systems, and ASICs. He holds many patents worldwide.

JinSu Park received the B.S. and M.S. degrees from the Department of Electrical Engineering, Kyungbook University, Korea, in 1997 and 2001, respectively. He joined Hynix Semiconductor, Icheon, Korea, in 2001, and has been working in the Flash Development Division on NAND Flash memory design.

SangDon Lee received the B.S. and M.S. degrees from the Department of Electronic, Electrical, Control & Instrumentation Engineering, Hanyang University, Korea, in 2003 and 2005, respectively. He has been a Senior Engineer at Hynix Semiconductor Inc. since 2005, working in the Flash Development Division on NAND Flash memory design. His interests are CMOS analog circuits, including high-voltage pumps, regulators, PLLs, and low-power analog circuits.

Gwangho Baek received the B.S. and M.S. degrees in Electronic and Electrical Engineering from Kyungpook National University, Korea, in 2003 and 2005, respectively. Since 2006, he has been working for Hynix Semiconductor Inc., Icheon, Korea. His current research interest is flash memory circuit design.

JaeHo Lee received the B.S. and M.S. degrees in Electronic and Electrical Engineering from Sogang University, Korea, in 2005 and 2007, respectively. He joined Hynix Semiconductor, Icheon, Korea, in 2007, and has been working in the Flash Development Division on NAND Flash memory design. His interests are CMOS analog circuits, including high-voltage pumps, regulators, PLLs, and low-power analog circuits.

Minsu Kim received the B.S. degree in Electronic, Electrical, and Computer Engineering from Hanyang University, Seoul, Korea, in 2007. He joined Hynix in 2007 and has worked for four years as an engineer on the NAND Flash memory design team. He has designed high-density NAND Flash memory products such as 48 nm MLC 16 Gb, 41 nm MLC 32 Gb/16 Gb, and 26 nm MLC 64 Gb NAND Flash memories. His research interests include high-speed interfaces, memory circuits, and data transfer system circuits.

Jongwoo Kim received the B.S. degree from the Department of Electronic Engineering, Kookmin University, Seoul, Korea, in 2008. He joined the Flash Memory Design, Hynix Semiconductor Company, Icheon, Korea, in 2008, where he has been working on the circuit design of NAND Flash memories.

Hyun Chung received the B.S. degree in Electronic and Electrical Engineering from Sogang University, Korea, in 2008. She is an Engineer at Hynix Semiconductor Inc., Korea.

Eunseong Jang received the B.S. degree from the Department of Electronic Engineering, Chungbuk National University, Chungbuk, Korea, in 2009. She is currently an Engineer at Hynix, Icheon, Korea.

TaeYoon Kim received the M.S. degree from the Department of Electronic Engineering, Chungbuk National University, Korea, in 1992. He joined Hynix in 1992 and now works as team leader and Head of the Advanced NAND Flash memory design team. He has designed high-speed DRAM products such as 100 nm 256 M DDR1, 80 nm 512 M DDR2, and 66 nm 512 M DDR2, and high-density NAND Flash memory products such as 26 nm MLC 64 Gb/32 Gb NAND Flash memories.