【免费】建立一个加密的，分布式的和可搜索的键值存储资源-CSDN文库

研究论文

需积分: 0 83 浏览量 2021-03-08 04:32:44 上传评论收藏 1.12MB PDF 举报

资源详情

资源评论

资源推荐

Building an Encrypted, Distributed, and Searchable

Key-value Store

Xingliang Yuan

†‡

, Xinyu Wang

†‡

, Cong Wang

†‡

, Chen Qian

, Jianxiong Lin

†

Department of Computer Science, City University of Hong Kong, China

‡

City University of Hong Kong Shenzhen Research Institute, China

Department of Computer Science, University of Kentucky, Unite States

{xl.y, j.lin}@my.cityu.edu.hk, {xinywang, congwang}@cityu.edu.hk, qian@cs.uky.edu

ABSTRACT

Modern distributed key-value stores are oﬀering superior

performance, incremental scalability, and ﬁne availability

for data-intensive computing and cloud-based applications.

Among those distributed data stores, the designs that en-

sure the conﬁdentiality of sensitive data, however, have not

been fully explored yet. In this paper, we focus on designing

and implementing an encrypted, distributed, and searchable

key-value store. It achieves strong protection on data pri-

vacy while preserving all the above prominent features of

plaintext systems. We ﬁrst design a secure data partition

algorithm that distributes encrypted data evenly across a

cluster of nodes. Based on this algorithm, we propose a se-

cure transformation layer that supports multiple data mod-

els in a privacy-preserving way, and implement two basic

APIs for the proposed encrypted key-value store. To en-

able secure search queries for secondary attributes of data,

we leverage searchable symmetric encryption to design the

encrypted secondary indexes which consider security, eﬃ-

ciency, and data locality simultaneously, and further enable

secure query processing in parallel. For completeness, we

present formal security analysis to demonstrate the strong

security strength of the proposed designs. We implement

the system prototype and deploy it to a cluster at Microsoft

Azure. Comprehensive performance evaluation is conducted

in terms of Put/Get throughput, Put/Get latency under

diﬀerent workloads, system scaling cost, and secure query

performance. The comparison with Redis shows that our

prototype can function in a practical manner.

Keywords

Key-value Store; Searchable Encryption

1. INTRODUCTION

In order to manage the persistently growing amount of

data, distributed key-value (KV) stores have become the

backbone of many public cloud services [11, 16, 33]. Their

Permission to make digital or hard copies of all or part of this work for personal or

classroom use is granted without fee provided that copies are not made or distributed

for proﬁt or commercial advantage and that copies bear this notice and the full cita-

tion on the ﬁrst page. Copyrights for components of this work owned by others than

ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or re-

publish, to post on servers or to redistribute to lists, requires prior speciﬁc permission

and/or a fee. Request permissions from Permissions@acm.org.

ASIA CCS ’16, May 30-June 03, 2016, Xi’an, China

 2016 ACM. ISBN 978-1-4503-4233-9/16/05. . . $15.00

DOI: http://dx.doi.org/10.1145/2897845.2897852

well-understood advantages include high performance, linear

scalability, continuous availability, and even great potentials

of high-level support on rich queries and multiple data mod-

els, as seen in a number of proposals and implementations of

recent KV stores [12, 14,17, 27]. However, with the growing

data breaches, privacy concerns in data outsourcing become

even more pressing than before. Recent works from both

cryptographic perspective, e.g., [5, 10, 15, 19] and database

perspective, e.g., [29, 32, 37], provided solutions with trade-

oﬀs among security, eﬃciency, and query functionality. Yet,

most of them focus on the setting of a centralized or log-

ically centralized server. They do not speciﬁcally consider

the features and the requirements in modern KV stores.

As known, KV stores are probably the simplest data stor-

age system, which stores pairs of primary keys and data

values, and allows to access data values when a primary

key is given. To beneﬁt a variety of data-driven applica-

tions, modern KV stores provide higher-level features. As

one promising feature, multiple richer data models such as

column-oriented, document and graph data are supported

on top of one united KV store. This feature eases the oper-

ational complexity for applications that require more than

one format of data [13, 14, 27]. For the other popular fea-

ture, many KV stores allow for the data access not just from

primary keys, but also from other attributes of data via sec-

ondary indexes, so as to enable more eﬃcient data access and

rich queries [12,16,33]. Although promising, building an en-

crypted, distributed KV stores still face gaps and encounter

challenges, especially for preserving the above salient fea-

tures in a privacy-preserving manner.

Limitations of prior work. One straightforward ap-

proach is to directly store encrypted data along with the

(possibly randomized) data identiﬁer/label [30]. However,

it only allows limited encrypted data retrieval by the iden-

tiﬁer/label, preventing from all possible queries via other

secondary attributes of data. Besides, this approach does

not consider the support of multiple data models.

Another seemingly plausible approach is to treat KV

stores as a blackbox dictionary, and to build an encrypted

secondary index, like the one proposed by Cash et al. in [5].

But this direct combination, though slightly enhancing the

ﬁrst approach by limiting the query support to the encrypted

index design, would inevitably suﬀer from secure queries

with long latency. Because they treat the distributed KV

store as a blackbox dictionary, the data locality in the en-

crypted query processing can hardly be preserved. In other

words, the node where the index is accessed could be dif-

ferent from the node where the matched data are stored.

547

Data$Applica*ons$

Secure$Trans.$Layer$

Encrypted$K9V$Store$

Secure$Na*ve$API$

Figure 1: The proposed system framework

This approach will introduce extra inter-node communi-

cation overhead, undesirably increasing the query latency.

And the issue will be further exacerbated in more complex

query processing such as aggregation and range queries.

Design goals. In this work, we aim to build an encrypted

and searchable KV store. For data conﬁdentiality, all the

data should remain in the encrypted form as long as they

leave from the client of the data application. Meanwhile, it

should also embrace a bunch of prominent features of non-

encrypted distributed KV stores, such as low latency, load

balancing, horizontal scalability, and ﬁne availability. In ad-

dition, multiple data models should be supported without

loss of data conﬁdentiality. And developers can not only

commit simple retrieval and update requests on a single en-

crypted data value but also execute secure queries via sec-

ondary attributes of data.

Design challenges. Achieving all the design goals simul-

taneously is a non-trivial task. As discussed, straightfor-

wardly using existing KV implementations for the security

counterpart is unsatisfying. Therefore, we propose to build

and evaluate our system by carefully designing the security

sub-components of our system whenever necessary, and cope

with the following challenges. The ﬁrst is to securely realize

the data partition algorithm across distributed nodes for en-

crypted data, while achieving high performance lookup, load

balancing, and linear scalability. Each encrypted data value

should only be located by a speciﬁed token generated from

an authorized client. Designing a customised secure data

partition algorithm allows us to pave the way for securely

preserving data locality and improving query eﬃciency.

The second challenge is to design an overlay that supports

multiple data models on top of the encrypted KV store. It

should be compatible with the secure data partition algo-

rithm while hiding auxiliary information of ciphertext, e.g.,

the structural information in column-oriented stores, which

is shown to be potentially vulnerable in terms of inference

attacks [26]. The third challenge is to design a framework for

encrypted and distributed indexes that enable secure queries

on given secondary attributes over designated nodes. The

index framework should address the data locality to avoid

inter-node interactions, and provide us a platform to easily

incorporate all the latest secure rich query designs into th

encrypted and distributed KV store.

Contributions. Our system design considerations are illus-

trated below to tackle the above challenges. We will ﬁrstly

use secure pseudo-random functions and standard symmet-

ric encryption to build an encrypted KV store, shown in Fig-

ure 1. Essentially, each KV pair contains a pseudo-random

label and an encrypted value: 1) it is simple yet secure, and

inherently compatible with the oﬀ-the-shelf data partition

algorithm (i.e., consistent hashing), achieving load balanc-

ing and incremental scalability; 2) it is a stepping stone such

that many other data models can be ﬂexibly built on top.

We then introduce a secure transformation layer between

the encrypted KV store and data applications, depicted in

Figure 1. It formulates an extensible abstraction, which

maps the data formated from diﬀerent data models to simple

KV pairs in a privacy-preserving fashion, i.e., both data val-

ues and their inherent relationships are strongly protected.

In this paper, we choose the column-oriented data model

as our ﬁrst instantiation, supported by wide column stores,

e.g., Cassandra [23] and HBase [16]. Afterwards, we also

detail the possible adaptation on graph data and document

data. As a result, it separates data management and data

manipulation from the storage back end, and accrues the

beneﬁts of the encrypted KV store.

In order to enable secure queries based on secondary at-

tributes of data, we propose a framework for encrypted local

indexes. This framework takes into consideration of dis-

tributed processing, KV store beneﬁts, and the ﬂexibility

of instantiating various encrypted secondary indexes in the

very beginning. By integrating our customised secure data

partition algorithm, we can always ensure that the encrypted

secondary index co-locates along with its own data on the

same node. Namely, secure queries are conducted over dis-

tributed nodes in parallel without extra connections, inter-

actions, or data movement. In this paper, we make our

ﬁrst attempt to support secure search queries that retrieve

encrypted data with matched attributes. The proposed in-

dex construction leverages searchable symmetric encryption

(SSE), a security framework for private keyword search, and

follows one of the latest SSE constructions with sublinear

time, asymptotically optimal space complexity, and prov-

able security [5].

In brief, our contributions are listed as follows:

• We present an encrypted, distributed, and searchable

key-value store that ensures strong data protection.

We propose a secure data partition algorithm and de-

velop two basic APIs for retrieval and update on a

single encrypted data value, i.e., Put/Get. When the

system scales out, the aﬀected encrypted data can be

relocated without loss of conﬁdentiality.

• We introduce a secure transformation layer that sup-

ports secure data modeling. It maps the encrypted

data formated from diﬀerent data models to encrypted

KV pairs while hiding the structural information of

data. This design is fully compatible with the pro-

posed secure data partition algorithm.

• We propose a framework of encrypted local indexes

that considers both data locality and incremental scal-

ability, where each local index resides the data on the

same node. Our design enables secure search queries

over encrypted and distributed data in parallel. In ad-

dition, we conduct formal security analysis to demon-

strate the strong security strength of this design.

• We develop the system prototype, and deploy it to

Microsoft Azure. The experiments show comprehen-

sive evaluation on Put/Get latency, throughput, in-

dex query performance, and system scaling costs. The

comparisons with Redis on plaintext data show that

548

our system functions in a practical manner with little

security overhead introduced.

Organization. Section 2 introduces the system architec-

ture and the threat models. Section 3 elaborates on the

system design. The security analysis is presented in Sec-

tion 4. The implementation and the performance evalua-

tion are given in Section 5. We discuss the related works in

Section 6, and conclude in Section 7.

2. SYSTEM MODEL

2.1 System Architecture

Figure 2 illustrates the architecture of our proposed pri-

vate KY store. It consists of three entities: the client of a

data application, the dispatcher, and a number of clustered

storage nodes, where the latter two entities are deployed

at the public cloud. We envision that the data application

would like to outsource a huge amount of data to a cloud-

powered data store while ensuring the data conﬁdentiality.In

general, the dispatcher and the nodes are oﬀ-premises com-

modity servers or virtual machines. They are programmed

to execute speciﬁc algorithms and operations.

Our system operates in a distributed framework that guar-

antees data privacy while preserving high performance and

incremental scalability. It distributes the encrypted data to

all the nodes evenly, which inherently handles heavy work-

loads without revealing the underlying values. The dis-

patcher deals with the secure Put/Get requests generated

from the client for update and retrieval on a single encrypted

data value. It routes the requests via a standard yet se-

cure data partition protocol, i.e., following the algorithm

of consistent hashing but over the encrypted domain. It is

like a logically centralized hub, forwarding all the requests

to the target nodes. To increase the throughput and avoid

single-point failure, it can be physically distributed and syn-

chronized between multiple nodes just like in HBase [16], or

cached to the client and updated periodically just like in Dy-

namo [11]. After the request routing, the nodes respond the

requests and send the encrypted values back to the client.

For the data in the simple KV model, the client will directly

generate secure Put/Get requests. While for the data in

other models, the secure transformation layer at the client

will transform the requests on encrypted structured/semi-

structured data into the secure KV Put/Get requests.

In order to query the encrypted data through secondary

attributes, the encrypted secondary index integrates the se-

cure data partition algorithm so that each node can index

its local data and process a given secure query in parallel.

The proposed secure search query generates tokens for each

of the nodes to provision them the search ability over dis-

tributed encrypted data. In addition, the general framework

of encrypted local indexes can readily support known secure

queries like counting, range, and aggregation. To handle

heavy workloads, new nodes are incrementally added with

moderate impact in our system. The aﬀected encrypted data

can be directly relocated via the secure data partition algo-

rithm by the corresponding nodes, while the aﬀected indexes

are moved in a client-assisted manner, i.e., being rebuilt at

the client for security considerations.

2.2 Threat Assumption

In this paper, we are targeting the threats from semi-

honest adversaries. They are interested in the data, the

Client'

Dispatcher'

Node'

Secure Token

Figure 2: System architecture

query attributes and the query results. In Figure 2, the

nodes are allocated in the oﬀ-premises public cloud. To

meet the requirements of performance and availability, they

have to be always online, and thus are vulnerable to security

breaches and unauthorized disclosures [21, 34].

On the one hand, outside unauthorized adversaries, who

compromise some or all of the nodes, gain the privilege to

access the hard disks and the physical memory. They at-

tempt to learn the sensitive information about the data. On

the other hand, cloud service providers may also be a threat.

They could faithfully maintain the nodes, and execute the

operations as required, but intentionally learn the data, or

unintentionally mishandle the data to third parties. We as-

sume that the client is in the trusted domain, and does not

collude with those adversaries. Besides, the communication

channels are assumed to be authenticated and encrypted

against eavesdropping. Our security goal is to protect data

conﬁdentiality throughout their life-cycles. User authenti-

cation and data integrity are orthogonal to our focus.

2.3 Cryptographic Primitives

A symmetric encryption scheme SE(KGen, Enc, Dec)

contains three algorithms: The key generation algorithm

KGen takes a security parameter k to return a secret key

K. The encryption algorithm Enc takes a key K and a

value V ∈ {0, 1}

∗

to return a ciphertext V

∗

∈ {0, 1}

∗

; The

decryption algorithm Dec takes K and V

∗

to return V . De-

ﬁne a family of pseudo-random functions F : K × X → R,

if for all probabilistic polynomial-time distinguishers Y ,

|P r[Y

F (k,·)

= 1|k ← K] − P r[Y

= 1|g ← {Func : X →

R}]| < negl(k), where negl(k) is a negligible function in k.

3. THE PROPOSED SYSTEM

This section elaborates on our system design. We ﬁrst in-

troduce the secure data partition algorithm which can dis-

tribute the encrypted data evenly across the nodes without

knowing the underlying value. Then we use the column-

oriented data model as the instantiation to present how our

system maps the data to encrypted KV pairs. Other data

models like graph and document data are also supported.

Subsequently, we implement two APIs for secure and fast ac-

cess of the encrypted KV store. To enable the secure query

on secondary attributes of data, we propose the construction

of encrypted local indexes. We then show how the system

scales out when new nodes are added.

3.1 Encrypted Key-value Store

We ﬁrst introduce the construction of the proposed en-

crypted and distributed KV store, where each KV pair con-

549

剩余11页未读，继续阅读

评论收藏

内容反馈

weixin_38635323

粉丝: 9
资源: 955

建立一个加密的，分布式的和可搜索的键值存储

评论0

最新资源

建立一个加密的，分布式的和可搜索的键值存储

评论0

Android-Kissme:一个提供加密键值存储的开源库

cpp-JustinDB一个高度可用的全球分布式键值数据存储

scalaris:Scalaris，一个分布式的交易键值存储

一个适合分布式网络的属性基加密方案.pdf

C#加密解密源码,有不可逆加密和可逆加密，可逆加密，MD5不可逆加密等几种方法

正版秒开云挂机宝环境下支持压缩加密的键值存储技术研究.pdf

基于国密算法的分布式加密存储研究.pdf

react-native-mmkv-storage：一个用于React Native的高效，小型和加密的移动键值存储框架

java1.6源码-FreshWorks_2021:建立一个支持基本CRD（创建，读取和删除）操作的基于文件的键值数据存储。该数据存储旨在用作

GhostDB是一个分布式的，内存中的通用键值数据存储，可在任何规模上提供微秒级的性能。-Golang开发

java1.6源码-backend:Freshworks–后端分配建立一个基于文件的键值数据存储，该存储支持基本CRD（创建，读取和删除）操作

基于国密算法的分布式加密存储研究

react-native-async-storage,用于react native的异步、持久、键值存储系统。.zip

trousseau：基于文件的加密键值存储

安全高效的分布式加密存储系统设计(英文).pdf

Morph:fire:加密的内存键值存储-C/C++开发

Python-minikeyvalue是一个用不到200行代码实现的非代理分布式键值存储

一种基于局域网的分布式加密存储和鉴权方法.pdf

hydria：一个简单的加密分布式状态容器

分布式接口加密安全方案.pdf

数据分块并行优化的分布式AES加密算法.pdf

云环境下支持可更新加密的分布式数据编码存储方案.pdf

MD5加密 和可逆加密解密

论文研究-基于敏感数据加密的分布式安全数据库服务研究.pdf

基于.NET的分布式系统用户密码加密方案设计

最新资源

MD5加密和可逆加密解密