【免费】leveldb的lsm-tree核心原理资源-CSDN文库

需积分: 0 122 浏览量更新于2024-08-08 收藏 507KB PDF 举报

leveldb原理与核心算法介绍，重点介绍lsm-tree运行原理与leveldb的compaction过程，有助于深入理解leveldb的运行机制和算法过程，对leveldb的使用和调优有非常大的帮助

ben stopford

Gently Flexing the Grid

Log Structured Merge Trees

It’s nearly a decade since Google released its ‘Big Table’ paper. One of the many cool aspects of that paper was the file

organisation it uses. The approach is more generally known as the Log Structured Merge Tree, after this 1996 paper, although the

algorithm described there differs quite significantly from most real-world implementations.

LSM is now used in a number of products as the main file organisation strategy. HBase, Cassandra, LevelDB, SQLite, even MongoDB

3.0 comes with an optional LSM engine, after it’s acquisition of Wired Tiger.

What makes LSM trees interesting is their departure from binary tree style file organisations that have dominated the space for

decades. LSM seems almost counter intuitive when you first look at it, only making sense when you closely consider how files work

in modern, memory heavy systems.

Some Background

In a nutshell LSM trees are designed to provide better write throughput than traditional B+ tree or ISAM approaches. They do this

by removing the need to perform dispersed, update-in-place operations.

So why is this a good idea? At its core it’s the old problem of disks being slow for random

operations, but fast when accessed sequentially. A gulf exists between these two types of access, regardless of whether the disk is

magnetic or solid state or even, although to a lesser extent, main memory.

The figures in this ACM report here/here make the point well. They show that, somewhat counter intuitively, sequential disk access is

faster than randomly accessing main memory. More relevantly they also show sequential access to disk, be it magnetic or SSD, to be

at least three orders of magnitude faster than random IO. This means random operations are to be avoided. Sequential access is well

worth designing for.

So with this in mind lets consider a little thought experiment: if we are interested in write throughput, what is the best method to

use? A good starting point is to simply append data to a file. This approach, often termed logging, journaling or a heap file, is fully

sequential so provides very fast write performance equivalent to theoretical disk speeds (typically 200-300MB/s per drive).

Benefiting from both simplicity and performance log/journal based approaches have rightfully become popular in many big data

tools. Yet they have an obvious downside. Reading arbitrary data from a log will be far more time consuming than writing to it,

involving a reverse chronological scan, until the required key is found.

This means logs are only really applicable to simple workloads, where data is either accessed in its entirety, as in the write-ahead log

of most databases, or by a known offset, as in simple messaging products like Kafka.

So we need more than just a journal to efficiently perform more complex read workloads like key based access or a range search.

Broadly speaking there are four approaches that can help us here: binary search, hash, B+ or external.

1. Search Sorted File: save data to a file, sorted by key. If data has defined widths use Binary search. If not use a page index +

scan.

2. Hash: split the data into buckets using a hash function, which can later be used to direct reads.

3. B+: use navigable file organisation such as a B+ tree, ISAM etc.

4. External file: leave the data as a log/heap and create a separate hash or tree index into it.

All these approaches improve read performance significantly ( n->O(log(n)) in most). Alas these structures add order and that order

impedes write performance, so our high speed journal file is lost along the way. You can’t have your cake and eat it I guess.

下载后可阅读完整内容，剩余9页未读，立即下载

资源推荐

资源评论

奋斗的松鼠

粉丝: 343
资源: 1

leveldb的lsm-tree核心原理

开源项目1-leveldb-master.rar

leveldb-1.18 源码及 leveldb实现解析.PDF

leveldb-master.rar

leveldb 源码

DB leveldb实现解析

leveldb源码

65520809LSM_LSM算法_优化_优化算法_

LevelDB数据库 v1.23.zip

leveldb,leveldb到java的端口.zip

wickdb：基于Pure Rust LSM树的嵌入式存储引擎

levelDB源码

Leveldb_高效的 kv 数据库_c语言

leveldb_1_20_win32源码

leveldb.reimpl

开源项目-dgraph-io-badger.zip

lsm：日志结构合并数据库

LSM：코스모LSM프로젝트

lsmdb:向leveldb学习并重写

高级数据库管理系统_549212_2203010.docx

SelectiveLSM_matlab_源码.zip

lsmt：LSM树

level db实现解析

RocketMQ 可视化工具 Dashboard下载

Keil5 MDK5.40版本0积分免费下载

免费插件-AI插件-illustrator插件集合-尺寸标注-智能填充-颜色自动处理-自动批处理-Windows安装包.zip

嵌入式入门-ADS-安装包

pycdc、pycdas工具(最新2024.06.04编译)，Python3.9-3.12可用的反编译工具(exe转py)

yolov8源码+yolov8n、s、m.pt文件整合8.2.0版本

VMware Workstation Pro17安装包

MIX ramdisk iPhone激活锁绕过软件

最新资源