TEA: A Traffic-efficient Erasure-coded Archival Scheme for In-memory Stores
Bin Xu
binxu@hust.edu.cn
Huazhong University of Sci.& Tech.
Wuhan, Hubei, China
Jianzhong Huang∗
Qiang Cao∗
Huazhong University of Sci.& Tech.
Wuhan, Hubei, China
Xiao Qin
xqin@auburn.edu
Auburn University
Auburn, AL 36849, USA
ABSTRACT
To achieve a good trade-off between access performance and memory efficiency, it is appropriate to adopt replication and erasure coding to keep popular and unpopular in-memory datasets, respectively. An issue of redundancy transition from replication to erasure coding (a.k.a. erasure-coded archival) should be addressed for unpopular in-memory datasets, since caching workloads exhibit long-tail distributions and most in-memory data are unpopular. In this paper, we propose an encoding-oriented replica placement policy (ERP) by incorporating an interleaved declustering mechanism, and design a traffic-efficient erasure-coded archival scheme (TEA) for ERP-powered in-memory stores. With ERP in place, TEA embraces three salient features: (i) it alleviates cross-rack traffic raised by retrieving data-block replicas, (ii) it improves rack-level load balancing by distributing replicas via a load-aware primary-rack-selection approach, and (iii) it mitigates block-relocation operations launched to sustain rack-level fault tolerance. The empirical results show that TEA not only brings forth lower cross-rack traffic than four candidate encoding schemes, but also exhibits superb archival-throughput and rack-level-balancing performance. In particular, TEA accelerates archival throughput by at least 70.8% and improves rack-level load balancing by a factor of more than 1.58x relative to the four competitors.
CCS CONCEPTS
• Information systems → Distributed storage; • Computer systems organization → Redundancy.
KEYWORDS
In-memory store, Erasure encoding, Replication, Archival
ACM Reference Format:
Bin Xu, Jianzhong Huang, Qiang Cao, and Xiao Qin. 2019. TEA: A Traffic-efficient Erasure-coded Archival Scheme for In-memory Stores. In 48th International Conference on Parallel Processing (ICPP 2019), August 5–8, 2019, Kyoto, Japan. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3337821.3337826
∗Jianzhong Huang (hjzh@hust.edu.cn) and Qiang Cao (caoqiang@hust.edu.cn) are the joint corresponding authors.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ICPP 2019, August 5–8, 2019, Kyoto, Japan
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6295-5/19/08... $15.00
https://doi.org/10.1145/3337821.3337826
1 INTRODUCTION
1.1 Motivation
The following three aspects motivate us to delve into the development of an erasure-coded archival scheme for in-memory stores.
Aspect #1 – low access latency in in-memory stores. We are in an era of data-driven business. For example, data-intensive analytics has become indispensable since enterprises want to gain insights into products, services, and marketing strategies from increasing volumes of data. Commonly, a data-intensive application is supported by a cluster consisting of hundreds of nodes and petabytes of data. It is a technology trend to build an in-memory store on top of such a cluster to achieve low-latency performance. A well-known case is that Facebook leverages Memcached as a building block to construct a distributed key-value store facilitating the world's largest social network [19].
Aspect #2 – demands for redundancy strategies. Since volatile DRAM only maintains data while it is powered, existing in-memory stores accomplish memory-level fault tolerance by applying replication and/or erasure coding. Replication is a simple yet effective redundancy scheme. For instance, Repcached [15] and Bigmemory [28] keep two replicas in memory among nodes. Compared to replication, erasure coding achieves higher space efficiency, defined as the ratio of user data to the combination of user data and redundancy data [12]. Unsurprisingly, space-efficient erasure codes are also adopted by in-memory stores, e.g., EC-Cache [21], Cocytus [31], MemEC [30], and Ring [24].
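For intuition on the space-efficiency gap, a minimal calculation is shown below; the 3-way replication and (14,10) RS parameters are illustrative assumptions, not configurations taken from the paper.

```python
# Space efficiency = user data / (user data + redundancy data).
# The concrete parameters below are illustrative assumptions, not values
# reported in the paper.
def replication_efficiency(copies: int) -> float:
    return 1.0 / copies            # each block is stored `copies` times

def rs_efficiency(k: int, r: int) -> float:
    return k / (k + r)             # k data blocks plus r parity blocks per stripe

print(replication_efficiency(3))   # 0.333... for 3-way replication
print(rs_efficiency(10, 4))        # 0.714... for a (14,10) RS code
```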
Aspect #3 – necessity of erasure-coded archival. An analysis of traces collected from Facebook's Memcached deployment shows that caching workloads exhibit long-tail distributions, in which a small percentage of keys appear in most of the requests whereas most keys are repeated only a handful of times [3]. Therefore, it is not economical to employ a single redundancy scheme for the entire in-memory data (i.e., metadata, keys, and values in in-memory key-value stores) over the data lifetime. Nowadays, some key-value stores (e.g., Memcached in Facebook [19], Ring [24]) adopt a hybrid redundancy strategy, where replication is applied to popular data, while erasure codes are employed for unpopular data.
Generally, newly-loaded data is kept in replicated form to support high access parallelism. Since most of the new data are infrequently accessed, it is wise to encode unpopular data replicas using erasure codes to achieve high space efficiency. We refer to such an encoding process as 'erasure-coded archival'.
1.2 Challenges and Strategies
We face two challenges during the course of designing an erasure-
coded archival scheme for in-memory stores.
Challenge 1: How to reduce cross-rack traffic while performing erasure-encoding operations? Cross-rack bandwidth is usually scarce compared to intra-rack bandwidth; ideally, cross-rack traffic should be eliminated. If data replicas are randomly distributed among multiple racks, then subsequent sequential encoding will retrieve the required blocks via cross-rack transfers. Therefore, it makes sense to elaborately distribute data-block replicas.
Challenge 2: How to holistically consider locality and flexibility in stripe construction? Non-sequential striping potentially enables flexible stripe construction for unpopular blocks. To exploit locality, we have to restrict the candidate members of a stripe to a small subset of unpopular data blocks and, therefore, the flexibility of stripe construction is reduced. In this paper, locality is taken into account while employing non-sequential striping.
It is arguably true that replica placement is critical to reliability, access locality, load balancing, and the like. A typical example is the rack-aware data placement in HDFS [4]. To accomplish highly-efficient erasure-encoding operations for unpopular replicas, we introduce an encoding-oriented placement for replicas, which satisfies the following three requirements. (i) Rack-level fault tolerance should be guaranteed for data-block replicas prior to encoding. (ii) The primary copies of a group of sequential data blocks should be placed in one rack; thus, blocks in that rack can construct a complete stripe during encoding. (iii) Cross-rack traffic should be minimized after encoding, because encoded blocks may be re-distributed to sustain rack-level fault tolerance.
To address the above requirements and challenges, we propose an Encoding-oriented Replica Placement policy (ERP), in which an interleaved declustering mechanism [10] is introduced to distribute in-memory data-block replicas among racks. Furthermore, we elaborately design a Traffic-efficient Erasure-coded Archival scheme (TEA) for ERP-powered in-memory stores. The TEA scheme has the following three salient features. (i) It alleviates cross-rack traffic raised by retrieving the required data-block replicas. (ii) It improves rack-level load balancing by distributing replicas via a load-aware primary-rack-selection approach. (iii) It mitigates block-relocation operations launched to sustain rack-level and node-level fault tolerance.
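To illustrate the flavor of an interleaved-declustering-style replica layout, here is a minimal sketch; the grouping rule, parameters, and function names are assumptions made for exposition, not the ERP algorithm the paper defines in Section 3.

```python
# Illustrative sketch of an interleaved-declustering-style replica layout
# across racks. All layout choices here are assumptions for exposition;
# they do not reproduce the ERP policy described later in the paper.
from typing import List

def place_replicas(block_id: int, num_racks: int, group_size: int,
                   replicas: int = 3) -> List[int]:
    """Return the rack indices hosting the replicas of one data block.

    Blocks are grouped into runs of `group_size` sequential blocks. Every
    block in a group keeps its primary copy in the same 'primary' rack, so a
    complete stripe can later be encoded without cross-rack reads. The
    remaining replicas are interleaved over the other racks, rotated by the
    block's offset within the group, which spreads load while keeping the
    copies on distinct racks for rack-level fault tolerance.
    """
    assert num_racks >= replicas >= 2
    group = block_id // group_size
    offset = block_id % group_size
    primary_rack = group % num_racks
    racks = [primary_rack]
    for i in range(1, replicas):
        # pick a non-primary rack, shifting by the in-group offset
        racks.append((primary_rack + 1 + (offset + i - 1) % (num_racks - 1)) % num_racks)
    return racks

# Example: 6 racks, groups of 4 sequential blocks, 3 replicas per block.
for b in range(8):
    print(b, place_replicas(b, num_racks=6, group_size=4))
```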
1.3 Contributions
This work makes three main contributions:
• We devise an encoding-oriented replica placement policy (i.e., ERP) for data-block replicas in in-memory stores. ERP enables subsequent erasure-encoding operations to alleviate cross-rack traffic while sustaining rack-level fault tolerance.
• We design a traffic-efficient erasure-coded archival scheme (i.e., TEA) for ERP-powered in-memory stores. TEA addresses both cross-rack-traffic and rack-level-load-balancing issues in the encoding-rack-designation and encoded-block-relocation stages.
• We implement a proof-of-concept prototype, where TEA and four candidate encoding schemes are quantitatively evaluated. The experiments illustrate that the TEA scheme not only brings forth lower cross-rack traffic than the other four competitors, but also exhibits superb archival-throughput and rack-level load-balancing performance.
1.4 Roadmap
The rest of the paper is organized as follows. Section 2 outlines both erasure-coded archival techniques and schemes. The ERP policy and the TEA scheme are detailed in Sections 3 and 4, respectively. We present a quantitative performance evaluation in Section 5. Section 6 concludes this paper.
2 RELATED WORK
2.1 Erasure-coded Archival Techniques
Replication and erasure coding are two main fault-tolerance techniques; they perform well in the aspects of access parallelism and space efficiency, respectively [26]. Apart from on-disk storage (e.g., GFS II [11], HDFS [5]), in-memory stores (e.g., Memcached in Facebook [19], Ring [24]) also adopt a hybrid redundancy strategy, in which replication is applied to newly created data whereas erasure coding is used to archive the same data once it becomes unpopular. Such a transition from replication to erasure coding is referred to as erasure-coded archival.
Among various erasure codes, Reed-Solomon (a.k.a. RS) codes have become the most popular choice owing to their maximum distance separable (MDS) property as well as their high level of fault tolerance [26]. RS codes accomplish parity generation by adopting simple linear combinations. Specifically, for (k+r,k) RS codes, r parity blocks {P1, P2, ..., Pr} are generated by multiplying k data blocks {D1, D2, ..., Dk} with a k × r redundancy matrix [18]. The k data blocks and the r associated parity blocks constitute a stripe, and (k+r,k) RS codes tolerate the loss of any r concurrent blocks.
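To make the linear-combination view of parity generation concrete, the following is a minimal sketch that multiplies k data blocks by a k × r redundancy matrix over GF(2^8). It assumes a Cauchy-style matrix construction and toy block sizes for illustration; it is not the encoder used by TEA, and real systems rely on optimized RS libraries.

```python
# Minimal GF(2^8) arithmetic plus (k+r,k) parity generation, shown only to
# illustrate the "parities are linear combinations of data blocks" idea.
# The Cauchy-style redundancy matrix and toy block sizes are assumptions.

GF_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, a standard RS field polynomial

def gf_mul(a: int, b: int) -> int:
    """Carry-less ('Russian peasant') multiplication in GF(2^8)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return p

def gf_inv(a: int) -> int:
    """Multiplicative inverse via a^254, since a^255 = 1 in GF(2^8)."""
    result, exp = 1, 254
    while exp:
        if exp & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        exp >>= 1
    return result

def redundancy_matrix(k: int, r: int):
    """k x r Cauchy matrix with entries 1/(x_i XOR y_j); every square
    submatrix is invertible, which yields the MDS property."""
    xs = list(range(k))             # field elements assigned to data blocks
    ys = list(range(k, k + r))      # disjoint elements assigned to parities
    return [[gf_inv(x ^ y) for y in ys] for x in xs]

def encode(data_blocks, r: int):
    """Compute r parity blocks: P_j = sum_i M[i][j] * D_i over GF(2^8)."""
    block_len = len(data_blocks[0])
    matrix = redundancy_matrix(len(data_blocks), r)
    parities = [bytearray(block_len) for _ in range(r)]
    for i, block in enumerate(data_blocks):
        for j in range(r):
            coeff = matrix[i][j]
            for pos, byte in enumerate(block):
                parities[j][pos] ^= gf_mul(coeff, byte)
    return [bytes(p) for p in parities]

# Example: a (6,4) stripe, i.e., k = 4 data blocks and r = 2 parity blocks.
data = [bytes([i] * 8) for i in range(1, 5)]
print([p.hex() for p in encode(data, r=2)])
```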
Typically, the following four steps are involved in a (k+r,k) RS-coded archival operation. (i) An encoding node retrieves one replica of each of the k data blocks from local and/or remote nodes; (ii) it computes r parity blocks from the k data blocks using the (k+r,k) RS code; (iii) it delivers r or r-1 parity blocks to other nodes (note: r-1 blocks when one parity block is kept by the encoding node); and (iv) only one replica of each data block is kept (i.e., not deleted) while the other replicas are deleted.
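Putting the four steps together, here is a minimal end-to-end sketch of one archival operation against a toy store stub; the class, its method names, and the single-XOR-parity placeholder encoder are assumptions for illustration only, not APIs or designs from the paper.

```python
# End-to-end sketch of an erasure-coded archival operation following the
# four steps above. The store interface below is a hypothetical in-memory
# stub used only to make the flow concrete.

class StubStore:
    """Toy stand-in for an in-memory store: block_id -> list of replicas."""
    def __init__(self, replicas):
        self.replicas = replicas          # {block_id: [bytes, bytes, ...]}
        self.parities = []                # [(parity_bytes, node_id), ...]

    def get(self, block_id):
        return self.replicas[block_id][0]        # step (i): read one replica

    def put_parity(self, parity, node_id):
        self.parities.append((parity, node_id))  # step (iii): ship a parity

    def trim_replicas(self, block_id, keep=1):
        self.replicas[block_id] = self.replicas[block_id][:keep]  # step (iv)

def xor_parity(blocks):
    """Single-parity placeholder for the RS encoder (r = 1), for brevity."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for pos, byte in enumerate(blk):
            out[pos] ^= byte
    return bytes(out)

def archive_stripe(store, block_ids, parity_node):
    data = [store.get(bid) for bid in block_ids]        # (i) gather k blocks
    parity = xor_parity(data)                           # (ii) encode
    store.put_parity(parity, parity_node)               # (iii) distribute
    for bid in block_ids:                               # (iv) drop extra copies
        store.trim_replicas(bid, keep=1)

store = StubStore({b: [bytes([b] * 4)] * 3 for b in range(4)})
archive_stripe(store, block_ids=[0, 1, 2, 3], parity_node="node-7")
print(store.parities, {b: len(v) for b, v in store.replicas.items()})
```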
To accomplish an r-node fault-tolerant distribution for in-memory stores, the k data blocks and r parity blocks in a stripe are exclusively placed in the main memories of k+r nodes during archival steps (iii) and (iv). In practice, a node may cache a data or parity block of one stripe together with a data or parity block of another stripe. Usually, a data block is sealed from multiple small-sized objects in in-memory stores [31][30].
2.2 Erasure-coded Archival Schemes
Most erasure-coded archival schemes focus on distributed storage systems, for example, DP [14], RapidRAID [20], aHDFS [7], EAR [17], DSC [27], and Sice [29], to name just a few. Differently, our TEA scheme aims at in-memory stores.
DP is an erasure-coded data archival scheme for storage clusters, in which a chained-declustering layout is applied to organize mirrored RAID-5 redundancy groups [14]. DP boosts data archival performance by leveraging the existence of replicas and employing a pipelined encoding process.
RapidRAID is a family of erasure codes that incorporate pipelined erasure coding to speed up archival processes [20]. Unlike RS codes,