An introduction to LevelDB's principles and core algorithms, focusing on how the LSM-tree works and on LevelDB's compaction process. It helps build a deep understanding of LevelDB's internal mechanics, and is very useful for using and tuning LevelDB.
ben stopford
Gently Flexing the Grid
Log Structured Merge Trees
It’s nearly a decade since Google released its ‘Big Table’ paper. One of the many cool aspects of that paper was the file organisation it uses. The approach is more generally known as the Log Structured Merge Tree, after this 1996 paper, although the algorithm described there differs quite significantly from most real-world implementations.
LSM is now used in a number of products as the main file organisation strategy: HBase, Cassandra, LevelDB, SQLite, and even MongoDB 3.0 ships with an optional LSM engine, following its acquisition of WiredTiger.
What makes LSM trees interesting is their departure from the binary-tree-style file organisations that have dominated the space for decades. LSM seems almost counterintuitive when you first look at it, only making sense when you closely consider how files work in modern, memory-heavy systems.
Some Background
In a nutshell, LSM trees are designed to provide better write throughput than traditional B+ tree or ISAM approaches. They do this by removing the need to perform dispersed, update-in-place operations.
So why is this a good idea? At its core it’s the old problem of disks being slow for random operations, but fast when accessed sequentially. A gulf exists between these two types of access, regardless of whether the disk is magnetic or solid state, or even, although to a lesser extent, main memory.
The figures in this ACM report make the point well. They show that, somewhat counterintuitively, sequential disk access is faster than randomly accessing main memory. More relevantly, they also show sequential access to disk, be it magnetic or SSD, to be at least three orders of magnitude faster than random IO. This means random operations are to be avoided, and sequential access is well worth designing for.
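A quick back-of-envelope calculation shows the scale of the gap. The figures below are illustrative assumptions, not taken from the report: roughly 200 MB/s sequential throughput and roughly 100 random 4 KB reads per second for a magnetic disk, reading 1 GB of data each way.

```python
# Back-of-envelope arithmetic with assumed, illustrative figures:
# ~200 MB/s sequential throughput vs ~100 random 4 KB reads/second.
seq_mb_per_s = 200            # assumed sequential throughput
random_iops = 100             # assumed random reads per second
total_mb = 1024               # 1 GB of data

seq_seconds = total_mb / seq_mb_per_s      # 1024 / 200 ≈ 5 s
pages = total_mb * 1024 // 4               # 262,144 pages of 4 KB each
random_seconds = pages / random_iops       # ≈ 2,621 s, about 44 minutes

print(f"sequential: {seq_seconds:.0f}s, random: {random_seconds:.0f}s, "
      f"ratio: {random_seconds / seq_seconds:.0f}x")
# → sequential: 5s, random: 2621s, ratio: 512x
```

Even with these rough numbers, random IO loses by a factor of several hundred, which is why a file organisation built around sequential access is so attractive.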
So with this in mind, let’s consider a little thought experiment: if we are interested in write throughput, what is the best method to use? A good starting point is simply to append data to a file. This approach, often termed logging, journaling or a heap file, is fully sequential, so it provides very fast write performance, equivalent to theoretical disk speeds (typically 200-300 MB/s per drive).
Benefiting from both simplicity and performance, log/journal-based approaches have rightfully become popular in many big-data tools. Yet they have an obvious downside: reading arbitrary data from a log is far more time-consuming than writing to it, involving a reverse-chronological scan until the required key is found.
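The write/read asymmetry can be sketched in a few lines. This is a minimal illustration of the idea, not LevelDB's actual log format: each record is a length-prefixed key and value, writes always append, and a read must scan the whole file, with the last matching record winning (which gives the reverse-chronological semantics described above).

```python
import struct

class AppendLog:
    """A toy append-only log: O(1) sequential writes, O(n) reads."""

    def __init__(self, path):
        self.path = path
        open(path, "ab").close()  # create the file if it is missing

    def put(self, key: bytes, value: bytes):
        # Writes are purely sequential: always append at the end of the file.
        with open(self.path, "ab") as f:
            f.write(struct.pack("I", len(key)) + key)
            f.write(struct.pack("I", len(value)) + value)

    def get(self, key: bytes):
        # Reads scan every record; the *last* match wins, so the newest
        # value for a key shadows older ones.
        result = None
        with open(self.path, "rb") as f:
            while True:
                header = f.read(4)
                if not header:
                    break
                k = f.read(struct.unpack("I", header)[0])
                vlen = struct.unpack("I", f.read(4))[0]
                v = f.read(vlen)
                if k == key:
                    result = v
        return result
```

Writing `put(b"k", b"v1")` then `put(b"k", b"v2")` leaves both records on disk, and `get(b"k")` returns `b"v2"` only after scanning the entire log, which is exactly the read cost the text describes.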
This means logs are only really applicable to simple workloads, where data is either accessed in its entirety, as in the write-ahead log
of most databases, or by a known offset, as in simple messaging products like Kafka.
So we need more than just a journal to efficiently perform more complex read workloads like key based access or a range search.
Broadly speaking there are four approaches that can help us here: binary search, hash, B+ or external.
1. Search sorted file: save data to a file, sorted by key. If records have fixed widths, use binary search; if not, use a page index plus scan.
2. Hash: split the data into buckets using a hash function, which can later be used to direct reads.
3. B+: use navigable file organisation such as a B+ tree, ISAM etc.
4. External file: leave the data as a log/heap and create a separate hash or tree index into it.
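Approach 1 is worth sketching, since sorted files are the building block LSM trees end up using. The record format below is a hypothetical one for illustration (fixed-width 8-byte integer key and value, sorted by key); a lookup then costs O(log n) seeks instead of a full scan.

```python
import struct

# Hypothetical fixed-width record: 8-byte int key, 8-byte int value,
# records stored in ascending key order.
RECORD = struct.Struct("qq")

def lookup(f, key):
    """Binary-search a sorted fixed-width record file for `key`."""
    f.seek(0, 2)                     # seek to end to learn the file size
    n = f.tell() // RECORD.size      # number of records in the file
    lo, hi = 0, n - 1
    while lo <= hi:                  # classic binary search over records
        mid = (lo + hi) // 2
        f.seek(mid * RECORD.size)    # one seek per probe: O(log n) total
        k, v = RECORD.unpack(f.read(RECORD.size))
        if k == key:
            return v
        if k < key:
            lo = mid + 1
        else:
            hi = mid - 1
    return None                      # key not present
```

The trade-off is visible immediately: reads drop from O(n) to O(log n), but inserting a new key in the middle means shifting every record after it, which is precisely the dispersed, order-maintaining write cost the next paragraph laments.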
All these approaches improve read performance significantly (from O(n) to O(log n) in most cases). Alas, these structures add order, and that order impedes write performance, so our high-speed journal file is lost along the way. You can’t have your cake and eat it, I guess.