没有合适的资源?快使用搜索试试~ 我知道了~
Log Buffer Based Flash Translation Layer
5星 · 超过95%的资源 需积分: 9 28 下载量 31 浏览量
2009-06-05
18:40:07
上传
评论
收藏 643KB PDF 举报
温馨提示
试读
27页
A Log Buffer Based Flash Translation Layer Using Fully Associative Sector Translation
资源推荐
资源详情
资源评论
A Log Buffer-Based Flash Translation Layer
Using Fully-Associative Sector Translation
SANG-WON LEE
Sungkyunkwan University
DONG-JOO PARK
Soongsil University
TAE-SUN CHUNG
Ajou University
DONG-HO LEE
Hanyang University
SANGWON PARK
Hankook University of Foreign Studies
and
HA-JOO SONG
Pukyong National University
Flash memory is being rapidly deployed as data storage for mobile devices such as PDAs, MP3
players, mobile phones, and digital cameras, mainly because of its low electronic power, nonvolatile
storage, high performance, physical stability, and portability. One disadvantage of flash memory
is that prewritten data cannot be dynamically overwritten. Before overwriting prewritten data,
a time-consuming erase operation on the used blocks must precede, which significantly degrades
This work was supported in part by MIC & IITA through IT Leading R&D Support Project, in part
by MIC & IITA through Oversea Post-Doctoral Support Program 2005, in part by the Ministry
of Information and Communication, Korea under the ITRC support program supervised by the
Institute of Information Technology Assessment, IITA-2005-(C1090-0501-0019), and also supported
in part by Seoul R&D Program(10660).
Authors’ addresses: Sang-Won Lee, School of Information and Communications Engineering,
Sungkyunkwan University, Suwon 440-746, Korea; email: swlee@acm.org; Dong-Joo Park, School
of Computing, Soongsil University, Seoul 156-743, Korea; email: djpark@ssu.ac.kr; Tae-Sun
Chung, College of Information Technology, Ajou University, Suwon 443-749, Korea; email:
tschung@ajou.ac.kr; Dong-Ho Lee, Department of Computer Science and Engineering, Hanyang
University, Ansan 426-791, Korea; email: dhlee72@cse.hanyang.ac.kr; Sangwon Park, Informa-
tion Communication Engineering, Hankook University of Foreign Studies, Yongin 449-791, Korea;
email: swpark@hufs.ac.kr; Ha-Joo Song, Division of Electronic, Computer, and Telecommunication,
Pukyong National University, Busan 608-737, Korea; email: hajusong@pknu.ac.kr.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is
granted without fee provided that copies are not made or distributed for profit or direct commercial
advantage and that copies show this notice on the first page or initial screen of a display along
with the full citation. Copyrights for components of this work owned by others than ACM must be
honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers,
to redistribute to lists, or to use any component of this work in other works requires prior specific
permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn
Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or permissions@acm.org.
C
2007 ACM 1539-9087/2007/07-ART18 $5.00 DOI 10.1145/1275986.1275990 http://doi.acm.org/
10.1145/1275986.1275990
ACM Transactions on Embedded Computing Systems, Vol. 6, No. 3, Article 18, Publication date: July 2007.
2
•
S.-W. Lee et al.
the overall write performance of flash memory. In order to solve this “erase-before-write” problem,
the flash memory controller can be integrated with a software module, called “flash translation
layer (FTL).” Among many FTL schemes available, the log block buffer scheme is considered to be
optimum. With this scheme, a small number of log blocks, a kind of write buffer, can improve the
performance of write operations by reducing the number of erase operations. However, this scheme
can suffer from low space utilization of log blocks. In this paper, we show that there is much room
for performance improvement in the log buffer block scheme, and propose an enhanced log block
buffer scheme, called FAST (full associative sector translation). Our FAST scheme improves the
space utilization of log blocks using fully-associative sector translations for the log block sectors.
We also show empirically that our FAST scheme outperforms the pure log block buffer scheme.
Categories and Subject Descriptors: B.3.2 [Design Styles]: Mass Storage; B.4.2 [Input/Output
Devices]: Channels and Controllers; D.4.2 [Storage Management]: Secondary Storage
General Terms: Algorithms, Performance
Additional Key Words and Phrases: Flash memory, FTL, address translation, log blocks, associative
mapping
ACM Reference Format:
Lee, S.-W., Park, D.-J., Chung, T.-S., Lee, D.-H., Park, S., and Song, H.-J. 2007. A log buffer-
based flash translation layer using fully-associative sector translation. ACM Trans. Embedd.
Comput. Syst. 6, 3, Article 18 (July 2007), 27 pages. DOI = 10.1145/1275986.1275990
http://doi.acm.org/ 10.1145/1275986.1275990
1. INTRODUCTION
Flash memory is being rapidly deployed as data storage for mobile devices, such
as PDAs, MP3 players, mobile phones, and digital cameras, mainly because of its
small size, low power consumption, shock resistance, and nonvolatile memory
[Douglis et al. 1994; Kim et al. 2002; Gal and Toledo 2005]. Compared to a hard
disk with the inevitable mechanical delay in accessing data (that is, seek time
and rotational latency), flash memory provides fast uniform random access. For
example, the read and write time per sector (typically, 512 bytes) for NAND-
flash memory is 15 and 200 μs [Samsung Electronics 2005], respectively, while
each operation in contemporary hard disks takes around 10 ms [Hennessy and
Patterson 2003].
However, flash memory suffers from the write bandwidth problem. A write
operation is slower than a read operation by an order of magnitude. In addi-
tion, a write operation may have to be preceded by a costly erase operation
because flash memory does not allow overwrites. Unfortunately, write opera-
tions are performed in a sector unit, while erase operations are executed in
a block unit: usually, one block consists of 32 sectors. An erase operation is
very time-consuming, compared to a write operation; usually, a per-block erase
time is 2 ms [Samsung Electronics 2005]. These inherent characteristics of
flash memory reduce write bandwidth, which is the performance bottleneck in
flash-based mobile devices.
To relieve this performance bottleneck, it is very important to reduce the
number of erase operations resulting from write operations. For this, flash
memory vendors have adopted an intermediate software module, called flash
translation layer (FTL), between the host applications (in general, file systems)
and flash memory [Estakhri and Iman 1999; Kim and Lee 2002; Kim et al. 2002;
ACM Transactions on Embedded Computing Systems, Vol. 6, No. 3, Article 18, Publication date: July 2007.
A Log Buffer-Based Flash Translation Layer
•
3
Shinohara 1999]. The key role of FTL is to redirect each write request from the
host to an empty area that has been already erased in advance, thus softening
the limitation of “erase-before-write.” In fact, various FTL algorithms have thus
far been proposed [Chung et al. 2006] and each FTL performance varies con-
siderably, depending on the characteristics of the applications. Among them,
the log block scheme is well known for excellent performance [Kim et al. 2002].
The key idea of this scheme is to maintain a small number of log blocks in
flash memory as temporary storage for overwrites. If a collision (an overwrite)
occurs at a sector of flash memory, this scheme forwards the new data to an
empty sector in the log blocks, instead of erasing the original data block. Since
these log blocks act as cushions against overwrites, the log block scheme can
significantly reduce the number of total erase operations.
However, when an overwrite occurs in a data block, the write can be redi-
rected only to one log block, which is dedicated for the data block; it cannot
be redirected to other log blocks. For a given overwrite, if there is no log block
dedicated to its data block, a log block should be selected as a victim and re-
placed. Thus, if there are a limited number of log blocks, they might have to be
frequently replaced for overwrites. Unfortunately, with the log block scheme,
the log blocks being replaced usually have many unused sectors. That is, the
space utilization, which can be formally defined as “the percentage of the writ-
ten sectors in a log block when it is replaced,” of each log block is low. In our
view, low space utilization degrades the performance of the log block scheme.
We need to find out the root cause of low space utilization. For this, we view the
role of log blocks from a different perspective. The log blocks in the log block
scheme could be viewed as “a cache for overwrites,” in which a logical sector
to be overwritten can be mapped only to a certain log block—its dedicated log
block. From this viewpoint, the address associativity between logical sectors
and log blocks is of block level. Thus, we call the log block scheme presented
by Kim et al. [2002], the block-associative sector translation (BAST) scheme.
We argue that this inflexible block-level associativity in BAST is the primary
cause of low space utilization of log blocks.
In this paper, we propose a novel FTL scheme that overcomes the low space
utilization of BAST. The basic idea is to make the degree of associativity between
logical sectors and log blocks higher, thus achieving better write performance.
In our scheme, the sector to be overwritten can be placed in any log block and
we, therefore, call our scheme the fully-associative sector translation (FAST)
scheme. Hereafter, we denote the FAST scheme and the BAST scheme shortly
as FAST and BAST, respectively. As in the computer architecture’s CPU cache
realm [Hennessy and Patterson 2003], if we view the log blocks as a kind of cache
for write operations and enlarge the associativity, we can reduce the write miss
ratio and, therefore, achieve better FTL performance. The main contributions
of this paper can be divided into three achievements: (1) to identify the major
problem of BAST and its root cause, (2) to provide the motivation of FAST and
describe its principal idea and algorithms, and (3) to compare the performance
of FAST and BAST over various configurations. An interesting result is that
FAST can, in a best case, result in more than 50% reduction both in total elapsed
time and in total number of erases.
ACM Transactions on Embedded Computing Systems, Vol. 6, No. 3, Article 18, Publication date: July 2007.
4
•
S.-W. Lee et al.
Fig. 1. An anatomy of the NAND flash memory system.
The remainder of this paper is organized as follows. Section 2 gives a brief
overview of flash memory and FTL and explains BAST and its disadvantages.
Section 3 describes the key idea of FAST and its algorithms. Section 4 compares
the performance of our scheme with that of BAST. Finally, Section 5 concludes
this paper and outlines future work.
2. BACKGROUND
In this section, we provide a brief overview of flash memory and FTL, and
explain in detail the principles behind BAST and its disadvantages. The latter
part is somewhat long, but we need to explain BAST in detail because the
scheme is not well known in the computer science field.
2.1 Flash Memory and the Flash Translation Layer
Figure 1 shows the general organization of a NAND-type flash memory system.
It consists of one or more flash memory chips, a controller mainly executing
FTL codes in ROM, an SRAM maintaining address-mapping information, and a
PCMCIA host interface. The host system views flash memory as a hard disklike
block device and thus issues read or write commands along with “logical” sector
addresses and data. FTL translates the commands into low-level operations,
such as read, write, and erase, using “physical” sector addresses. During the
address translation, FTL looks up the address-mapping table in SRAM. When
issuing overwrites, FTL redirects the physical address to an empty location (free
space), thus avoiding erase operations. After finishing an overwrite operation,
FTL changes the address-mapping information in SRAM. The outdated block
can be erased later.
ACM Transactions on Embedded Computing Systems, Vol. 6, No. 3, Article 18, Publication date: July 2007.
A Log Buffer-Based Flash Translation Layer
•
5
Besides the address translation from logical sectors to physical sectors,
FTL carries out several important functionalities, such as guaranteeing data
consistency and uniform wear leveling (or so called block recycling). FTL should
be able to maintain a consistent state for data and metadata, even when flash
memory encounters an unexpected power outage. For uniform wear leveling,
FTL should make every physical block erased as evenly as possible without
performance degradation. This functionality of uniform block usage is essen-
tial, because there is a physical upper limit on the maximum number of erases
allowed for each block (usually, 100,000 times). If a block is erased above the
threshold, the block may not function correctly, and thus it is marked as invalid.
Even though the block may still work, FTL marks it as invalid for the safety
of flash memory. Therefore, if only some physical blocks have been intensively
erased, they become unusable very quickly, therefore reducing the durability of
flash memory. Therefore, FTL need to use all the blocks as uniformly as possi-
ble. Even though these kinds of FTL functionalities are important issues, they
are beyond the scope of this paper. In this paper, we focus on the performance
issue of the address-mapping technique.
Let us revisit the address-mapping issue. The mapping between the logi-
cal address (which the file system uses to interface with flash memory) and
the physical address (which the flash controller actually uses to store and re-
trieve data) can be managed at the sector, block, or hybrid level. In sector-
level address mapping, a logical sector can be mapped to any physical sector
in flash memory. In this respect, this mapping approach is very flexible. How-
ever, its main disadvantage is that the size of mapping table is too large to
be viable in the current flash memory-packaging technology. For example, let
us consider a flash memory of 1 GB size and a sector size of 512 bytes, which
has 2 million sectors. In this case, the mapping information is too large to
maintain in SRAM. In block-level address mapping, a logical sector address
consists of a logical block number and an offset. The mapping table main-
tains only the mapping information between logical and physical blocks, while
the offsets of the logical and physical blocks are identical. Therefore, the size
of block-mapping information is very small, compared to the size of sector-
level mapping information. However, this approach also has a serious pitfall:
when an overwrite for a logical sector is necessary, the corresponding block
is remapped to a free physical block. The overwrite is done at the same off-
set in the new physical block, the other sectors of the original data block
are copied to the new physical block, and ‘finally’ the original physical block
should be erased. With regard to block-level mapping, this “erase-before-write”
problem is an inevitable performance bottleneck, mainly because of the in-
flexibility in address mapping. In order to overcome the disadvantages of the
sector-level and block-level mapping approaches, several hybrid approaches,
including BAST, have been proposed. In the hybrid scheme, in addition to a
block-level mapping table, a sector-level mapping table for a limited number
of blocks is maintained. Thus, it satisfies the size limitations of mapping infor-
mation and also mitigates the “erase-before-write” problem drastically. For a
detailed discussion on this issue, please refer to Chung et al. [2006] and Gal and
Toledo [2005].
ACM Transactions on Embedded Computing Systems, Vol. 6, No. 3, Article 18, Publication date: July 2007.
剩余26页未读,继续阅读
资源评论
- lowinrain2012-12-11很不错的论文,有深度
angelojiang
- 粉丝: 0
- 资源: 2
上传资源 快速赚钱
- 我的内容管理 展开
- 我的资源 快来上传第一个资源
- 我的收益 登录查看自己的收益
- 我的积分 登录查看自己的积分
- 我的C币 登录后查看C币余额
- 我的收藏
- 我的下载
- 下载帮助
安全验证
文档复制为VIP权益,开通VIP直接复制
信息提交成功