TEA: A Traffic-efficient Erasure-coded Archival Scheme for In-memory Stores
Bin Xu
binxu@hust.edu.cn
Huazhong University of Sci.& Tech.
Wuhan, Hubei, China
Jianzhong Huang∗
Qiang Cao∗
Huazhong University of Sci.& Tech.
Wuhan, Hubei, China
Xiao Qin
xqin@auburn.edu
Auburn University
Auburn, AL 36849, USA
ABSTRACT
To achieve a good trade-off between access performance and memory efficiency, it is appropriate to adopt replication and erasure coding to keep popular and unpopular in-memory datasets, respectively. An issue of redundancy transition from replication to erasure coding (a.k.a. erasure-coded archival) should be addressed for unpopular in-memory datasets, since caching workloads exhibit long-tail distributions and most in-memory data are unpopular. In this paper, we propose an encoding-oriented replica placement policy (ERP) by incorporating an interleaved declustering mechanism, and design a traffic-efficient erasure-coded archival scheme (TEA) for ERP-powered in-memory stores. With ERP in place, TEA embraces three salient features: (i) it alleviates cross-rack traffic raised by retrieving data-block replicas, (ii) it improves rack-level load balancing by distributing replicas via a load-aware primary-rack-selection approach, and (iii) it mitigates block-relocation operations launched to sustain rack-level fault tolerance. The empirical results show that TEA not only brings forth lower cross-rack traffic than four candidate encoding schemes, but also exhibits superb archival-throughput and rack-level-balancing performance. In particular, TEA accelerates archival throughput by at least 70.8% and improves rack-level load balancing by a factor of more than 1.58x relative to the four competitors.
CCS CONCEPTS
• Information systems → Distributed storage; • Computer systems organization → Redundancy.
KEYWORDS
In-memory store, Erasure encoding, Replication, Archival
ACM Reference Format:
Bin Xu, Jianzhong Huang, Qiang Cao, and Xiao Qin. 2019. TEA: A Traffic-efficient Erasure-coded Archival Scheme for In-memory Stores. In 48th International Conference on Parallel Processing (ICPP 2019), August 5–8, 2019, Kyoto, Japan. ACM, New York, NY, USA, 10 pages. https://doi.org/10.1145/3337821.3337826
∗Jianzhong Huang (hjzh@hust.edu.cn) and Qiang Cao (caoqiang@hust.edu.cn) are the joint corresponding authors.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
ICPP 2019, August 5–8, 2019, Kyoto, Japan
© 2019 Association for Computing Machinery.
ACM ISBN 978-1-4503-6295-5/19/08... $15.00
https://doi.org/10.1145/3337821.3337826
1 INTRODUCTION
1.1 Motivation
The following three aspects motivate us to delve into the development of an erasure-coded archival scheme for in-memory stores.
Aspect #1 – low access latency in in-memory stores. We are in an era of data-driven business. For example, data-intensive analytics has become indispensable since enterprises want to gain insights into products, services, and marketing strategies from increasing volumes of data. Commonly, a data-intensive application is supported by a cluster consisting of hundreds of nodes and petabytes of data. It is a technology trend to build an in-memory store on top of such a cluster to achieve low-latency performance. A well-known case is that Facebook leverages Memcached as a building block to construct a distributed key-value store facilitating the world's largest social network [19].
Aspect #2 – demands for redundancy strategies. Since volatile DRAM only maintains data while it is powered, existing in-memory stores accomplish memory-level fault tolerance by applying replication and/or erasure coding. Replication is a simple yet effective redundancy scheme. For instance, Repcached [15] and Bigmemory [28] keep two replicas in memory among nodes. Compared to replication, erasure coding achieves higher space efficiency, defined as the ratio of user data to the combination of user data and redundancy data [12]. Unsurprisingly, space-efficient erasure codes are also adopted by in-memory stores, e.g., EC-Cache [21], Cocytus [31], MemEC [30], and Ring [24].
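For intuition on the space-efficiency gap, a minimal calculation is shown below; the 3-way replication and (14,10) RS parameters are illustrative assumptions, not configurations taken from the paper.

```python
# Space efficiency = user data / (user data + redundancy data).
# The concrete parameters below are illustrative assumptions, not values
# reported in the paper.
def replication_efficiency(copies: int) -> float:
    return 1.0 / copies            # each block is stored `copies` times

def rs_efficiency(k: int, r: int) -> float:
    return k / (k + r)             # k data blocks plus r parity blocks per stripe

print(replication_efficiency(3))   # 0.333... for 3-way replication
print(rs_efficiency(10, 4))        # 0.714... for a (14,10) RS code
```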
Aspect #3 – necessity of erasure-coded archival. An analysis of traces collected from Facebook's Memcached deployment shows that caching workloads exhibit long-tail distributions, in which a small percentage of keys appear in most of the requests whereas most keys are repeated only a handful of times [3]. Therefore, it is not economical to employ a single redundancy scheme for the entire in-memory data (i.e., metadata, keys, and values in in-memory key-value stores) over the data lifetime. Nowadays, some key-value stores (e.g., Memcached in Facebook [19], Ring [24]) adopt a hybrid redundancy strategy, where replication is applied to popular data, while erasure codes are employed for unpopular data.
Generally, newly-loaded data is kept in replicated form to support high access parallelism. Since most of the new data are infrequently accessed, it is wise to encode unpopular data replicas using erasure codes to achieve high space efficiency. We refer to such an encoding process as 'erasure-coded archival'.
1.2 Challenges and Strategies
We face two challenges during the course of designing an erasure-
coded archival scheme for in-memory stores.
Challenge 1: How to reduce cross-rack traffic while performing erasure-encoding operations? Cross-rack bandwidth is usually scarce compared to intra-rack bandwidth; ideally, cross-rack traffic should be eliminated. If data replicas are randomly distributed among multiple racks, then subsequent sequential encoding will retrieve the required blocks via cross-rack transfers. Therefore, it makes sense to elaborately distribute data-block replicas.
Challenge 2: How to holistically consider locality and flexibility in stripe construction? Non-sequential striping potentially enables flexible stripe construction for unpopular blocks. To exploit locality, we have to restrict the candidate members of a stripe to a small subset of unpopular data blocks and, therefore, the flexibility of stripe construction is reduced. In this paper, locality is taken into account while employing non-sequential striping.
It is arguably true that replica placement is critical to reliability, access locality, load balancing, and the like. A typical example is the rack-aware data placement in HDFS [4]. To accomplish highly-efficient erasure-encoding operations for unpopular replicas, we introduce an encoding-oriented placement for replicas, which satisfies the following three requirements. (i) Rack-level fault tolerance should be guaranteed for data-block replicas prior to encoding. (ii) The primary copies of a group of sequential data blocks should be placed in one rack; thus, blocks in that rack can construct a complete stripe during encoding. (iii) Cross-rack traffic should be minimized after encoding, because encoded blocks may be re-distributed to sustain rack-level fault tolerance.
To address the above requirements and challenges, we propose an Encoding-oriented Replica Placement policy (ERP), in which an interleaved declustering mechanism [10] is introduced to distribute in-memory data-block replicas among racks. Furthermore, we elaborately design a Traffic-efficient Erasure-coded Archival scheme (TEA) for ERP-powered in-memory stores. The TEA scheme has the following three salient features. (i) It alleviates cross-rack traffic raised by retrieving the required data-block replicas. (ii) It improves rack-level load balancing by distributing replicas via a load-aware primary-rack-selection approach. (iii) It mitigates block-relocation operations launched to sustain rack-level and node-level fault tolerance.
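To illustrate the flavor of an interleaved-declustering-style replica layout, here is a minimal sketch; the grouping rule, parameters, and function names are assumptions made for exposition, not the ERP algorithm the paper defines in Section 3.

```python
# Illustrative sketch of an interleaved-declustering-style replica layout
# across racks. All layout choices here are assumptions for exposition;
# they do not reproduce the ERP policy described later in the paper.
from typing import List

def place_replicas(block_id: int, num_racks: int, group_size: int,
                   replicas: int = 3) -> List[int]:
    """Return the rack indices hosting the replicas of one data block.

    Blocks are grouped into runs of `group_size` sequential blocks. Every
    block in a group keeps its primary copy in the same 'primary' rack, so a
    complete stripe can later be encoded without cross-rack reads. The
    remaining replicas are interleaved over the other racks, rotated by the
    block's offset within the group, which spreads load while keeping the
    copies on distinct racks for rack-level fault tolerance.
    """
    assert num_racks >= replicas >= 2
    group = block_id // group_size
    offset = block_id % group_size
    primary_rack = group % num_racks
    racks = [primary_rack]
    for i in range(1, replicas):
        # pick a non-primary rack, shifting by the in-group offset
        racks.append((primary_rack + 1 + (offset + i - 1) % (num_racks - 1)) % num_racks)
    return racks

# Example: 6 racks, groups of 4 sequential blocks, 3 replicas per block.
for b in range(8):
    print(b, place_replicas(b, num_racks=6, group_size=4))
```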
1.3 Contributions
This work makes three main contributions:
• We devise an encoding-oriented replica placement policy (i.e., ERP) for data-block replicas in in-memory stores. ERP enables subsequent erasure-encoding operations to alleviate cross-rack traffic while sustaining rack-level fault tolerance.
• We design a traffic-efficient erasure-coded archival scheme (i.e., TEA) for ERP-powered in-memory stores. TEA addresses both cross-rack-traffic and rack-level-load-balancing issues in the encoding-rack-designation and encoded-block-relocation stages.
• We implement a proof-of-concept prototype, where TEA and four candidate encoding schemes are quantitatively evaluated. The experiments illustrate that the TEA scheme not only brings forth lower cross-rack traffic than the other four competitors, but also exhibits superb archival-throughput and rack-level load-balancing performance.
1.4 Roadmap
The rest of the paper is organized as follows. Section 2 outlines both erasure-coded archival techniques and schemes. The ERP policy and the TEA scheme are detailed in Sections 3 and 4, respectively. We present a quantitative performance evaluation in Section 5. Section 6 concludes this paper.
2 RELATED WORK
2.1 Erasure-coded Archival Techniques
Replication and erasure coding are two main fault-tolerance techniques; they perform well in the aspects of access parallelism and space efficiency, respectively [26]. Apart from on-disk storage (e.g., GFS II [11], HDFS [5]), in-memory stores (e.g., Memcached in Facebook [19], Ring [24]) also adopt a hybrid redundancy strategy, in which replication is applied to newly created data whereas erasure coding is used to archive the same data once it becomes unpopular. Such a transition from replication to erasure coding is referred to as erasure-coded archival.
Among various erasure codes, Reed-Solomon (a.k.a. RS) codes have become the most popular choice owing to their maximum distance separable (MDS) property as well as their high level of fault tolerance [26]. RS codes accomplish parity generation by adopting simple linear combinations. Specifically, for (k+r,k) RS codes, r parity blocks {P1, P2, ..., Pr} are generated by multiplying k data blocks {D1, D2, ..., Dk} with a k × r redundancy matrix [18]. The k data blocks and the r associated parity blocks constitute a stripe, and (k+r,k) RS codes tolerate the loss of any r concurrent blocks.
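To make the linear-combination view of parity generation concrete, the following is a minimal sketch that multiplies k data blocks by a k × r redundancy matrix over GF(2^8). It assumes a Cauchy-style matrix construction and toy block sizes for illustration; it is not the encoder used by TEA, and real systems rely on optimized RS libraries.

```python
# Minimal GF(2^8) arithmetic plus (k+r,k) parity generation, shown only to
# illustrate the "parities are linear combinations of data blocks" idea.
# The Cauchy-style redundancy matrix and toy block sizes are assumptions.

GF_POLY = 0x11D  # x^8 + x^4 + x^3 + x^2 + 1, a standard RS field polynomial

def gf_mul(a: int, b: int) -> int:
    """Carry-less ('Russian peasant') multiplication in GF(2^8)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        a <<= 1
        if a & 0x100:
            a ^= GF_POLY
        b >>= 1
    return p

def gf_inv(a: int) -> int:
    """Multiplicative inverse via a^254, since a^255 = 1 in GF(2^8)."""
    result, exp = 1, 254
    while exp:
        if exp & 1:
            result = gf_mul(result, a)
        a = gf_mul(a, a)
        exp >>= 1
    return result

def redundancy_matrix(k: int, r: int):
    """k x r Cauchy matrix with entries 1/(x_i XOR y_j); every square
    submatrix is invertible, which yields the MDS property."""
    xs = list(range(k))             # field elements assigned to data blocks
    ys = list(range(k, k + r))      # disjoint elements assigned to parities
    return [[gf_inv(x ^ y) for y in ys] for x in xs]

def encode(data_blocks, r: int):
    """Compute r parity blocks: P_j = sum_i M[i][j] * D_i over GF(2^8)."""
    block_len = len(data_blocks[0])
    matrix = redundancy_matrix(len(data_blocks), r)
    parities = [bytearray(block_len) for _ in range(r)]
    for i, block in enumerate(data_blocks):
        for j in range(r):
            coeff = matrix[i][j]
            for pos, byte in enumerate(block):
                parities[j][pos] ^= gf_mul(coeff, byte)
    return [bytes(p) for p in parities]

# Example: a (6,4) stripe, i.e., k = 4 data blocks and r = 2 parity blocks.
data = [bytes([i] * 8) for i in range(1, 5)]
print([p.hex() for p in encode(data, r=2)])
```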
Typically, the following four steps are involved in a (k+r,k) RS-coded archival operation. (i) An encoding node retrieves one replica of each of the k data blocks from local and/or remote nodes; (ii) it computes r parity blocks from the k data blocks using the (k+r,k) RS code; (iii) it delivers r or r-1 parity blocks to other nodes (note: r-1 blocks when one parity block is kept by the encoding node); and (iv) only one replica of each data block is kept (i.e., not deleted) while the other replicas are deleted.
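Putting the four steps together, here is a minimal end-to-end sketch of one archival operation against a toy store stub; the class, its method names, and the single-XOR-parity placeholder encoder are assumptions for illustration only, not APIs or designs from the paper.

```python
# End-to-end sketch of an erasure-coded archival operation following the
# four steps above. The store interface below is a hypothetical in-memory
# stub used only to make the flow concrete.

class StubStore:
    """Toy stand-in for an in-memory store: block_id -> list of replicas."""
    def __init__(self, replicas):
        self.replicas = replicas          # {block_id: [bytes, bytes, ...]}
        self.parities = []                # [(parity_bytes, node_id), ...]

    def get(self, block_id):
        return self.replicas[block_id][0]        # step (i): read one replica

    def put_parity(self, parity, node_id):
        self.parities.append((parity, node_id))  # step (iii): ship a parity

    def trim_replicas(self, block_id, keep=1):
        self.replicas[block_id] = self.replicas[block_id][:keep]  # step (iv)

def xor_parity(blocks):
    """Single-parity placeholder for the RS encoder (r = 1), for brevity."""
    out = bytearray(len(blocks[0]))
    for blk in blocks:
        for pos, byte in enumerate(blk):
            out[pos] ^= byte
    return bytes(out)

def archive_stripe(store, block_ids, parity_node):
    data = [store.get(bid) for bid in block_ids]        # (i) gather k blocks
    parity = xor_parity(data)                           # (ii) encode
    store.put_parity(parity, parity_node)               # (iii) distribute
    for bid in block_ids:                               # (iv) drop extra copies
        store.trim_replicas(bid, keep=1)

store = StubStore({b: [bytes([b] * 4)] * 3 for b in range(4)})
archive_stripe(store, block_ids=[0, 1, 2, 3], parity_node="node-7")
print(store.parities, {b: len(v) for b, v in store.replicas.items()})
```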
To accomplish an r-node fault-tolerant distribution for in-memory stores, the k data blocks and r parity blocks in a stripe are exclusively placed in the main memories of k+r nodes during archival steps (iii) and (iv). In practice, a node may cache a data or parity block of one stripe together with a data or parity block of another stripe. Usually, a data block is sealed from multiple small-sized objects in in-memory stores [31][30].
2.2 Erasure-coded Archival Schemes
Most erasure-coded archival schemes focus on distributed storage systems, for example, DP [14], RapidRAID [20], aHDFS [7], EAR [17], DSC [27], and Sice [29], to name just a few. Differently, our TEA scheme aims at in-memory stores.
DP is an erasure-coded data archival scheme for storage clusters, in which a chained-declustering layout is applied to organize mirrored RAID-5 redundancy groups [14]. DP boosts data archival performance by leveraging the existence of replicas and employing a pipelined encoding process.
RapidRAID is a family of erasure codes that incorporate pipelined erasure coding to speed up archival processes [20]. Unlike RS codes,