Research on Data Security and Performance of Distributed Storage Systems (分布式存储系统数据安全及性能研究.pdf), CSDN Library

Uploaded 2022-06-21 14:12:38 · 1.31 MB · PDF

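A distributed storage system pools the storage resources of multiple computers over a network into a single virtual storage device and provides storage services through a storage interface. Compared with traditional centralized storage, it offers high availability and scalability. CEPH is a distributed storage system based on object storage: it can scale to the PB level simply by adding commodity servers to the cluster, and it has no single point of failure. However, CEPH's CRUSH algorithm does not consider storage efficiency under different network conditions, the Straw bucket type it uses by default is ill-suited to workloads with many query operations, and the Tree bucket type migrates too much data during data migration. The thesis therefore studies how to optimize CEPH's storage performance.

The thesis first analyzes CEPH's shortcomings. By default CEPH uses the Primary-Replica model for reads and writes, which creates a bottleneck on writes, and node weights depend only on capacity, ignoring the effect of network latency on storage performance. To address this, the thesis proposes a CEPH storage performance optimization method based on network latency. The method adjusts node weights dynamically: when a node's network condition is poor, it limits the amount of data flowing into that node, so that data is more likely to be placed on nodes with low network latency.

The thesis then discusses the effect of data migration on the performance of a distributed storage system. Data migration is an important operation that enables load balancing and disaster recovery, but it also raises problems such as large migration volumes and data consistency. To address these, the thesis proposes a data migration optimization method based on the CRUSH algorithm that keeps data consistent while minimizing the migration volume, which improves migration efficiency.

Finally, the thesis discusses the security of distributed storage systems, which determines how safe and reliable the stored data is. It proposes a security solution based on access control and encryption that is intended to protect the data while also improving the scalability and availability of the system.

In summary, the thesis studies the data security and performance of distributed storage systems and proposes a CRUSH-based data migration optimization method, a network-latency-based storage performance optimization method, and a security solution based on access control and encryption.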
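The latency-aware weighting described above can be illustrated with a small sketch. It is not the thesis's implementation: the `effective_weight` rule (capacity weight divided by `1 + latency / ref_latency_ms`), the `Node` fields, and the helper names are all hypothetical, and the selection step is a straw2-style draw (largest `ln(u) / weight` wins) used here only because it makes selection probability proportional to weight.

```python
from __future__ import annotations

import hashlib
import math
from collections import Counter
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    capacity_weight: float  # assumed proportional to disk capacity
    latency_ms: float       # assumed most recent measured network latency


def effective_weight(node: Node, ref_latency_ms: float = 1.0) -> float:
    """Hypothetical rule: shrink the capacity weight as latency grows."""
    return node.capacity_weight / (1.0 + node.latency_ms / ref_latency_ms)


def _unit_hash(key: str, name: str) -> float:
    """Deterministic pseudo-random value in (0, 1] for a (key, node) pair."""
    digest = hashlib.sha256(f"{key}:{name}".encode()).digest()
    value = int.from_bytes(digest[:8], "big")
    return (value + 1) / 2**64


def choose_node(key: str, nodes: list[Node]) -> Node:
    """Straw2-style draw: the node with the largest ln(u) / weight wins."""
    return max(
        nodes,
        key=lambda n: math.log(_unit_hash(key, n.name)) / effective_weight(n),
    )


def placement_counts(nodes: list[Node], n_keys: int) -> Counter:
    """Count how many of n_keys synthetic object names land on each node."""
    return Counter(choose_node(f"obj-{i}", nodes).name for i in range(n_keys))
```

Because `ln(u)` is never positive, dividing by a larger weight moves the draw closer to zero, so a node whose latency rises wins fewer draws without any change to the other nodes' draws.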

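[...] the volume of data migrated is then minimal and the cluster's storage topology is left unchanged, which improves migration efficiency.

Keywords: CEPH; CRUSH; distributed storage; data migration; load balancing; disaster recovery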

万方数据

Abstract

With the exponential growth of data volumes, traditional centralized storage can no longer meet business needs. Unlike centralized storage, distributed storage pools the storage resources of many commodity computers over the network into a single virtual storage device that provides storage services through a storage interface. Distributed storage therefore offers high availability and scalability. Ceph is a distributed storage system based on object storage. Because it uses object storage, data processing is highly parallel, and Ceph can easily scale to the PB level by adding commodity servers to the cluster. One of its core algorithms, CRUSH (Controlled Replication Under Scalable Hashing), computes data storage locations dynamically, which makes Ceph a system without a single point of failure. However, CRUSH does not account for storage efficiency under different network conditions, the Straw bucket type that CRUSH uses by default is ill-suited to workloads with a large number of query operations, and the Tree bucket type migrates too much data during data migration. To address these shortcomings, this thesis makes the following contributions:

1. By default, Ceph uses the Primary-Replica model for read and write operations, which makes writes a bottleneck, and node weights depend only on capacity, without considering the impact of network latency on storage performance. To address this, the thesis proposes a Ceph storage performance optimization method based on network latency. The method adjusts node weights dynamically: when a node's network is in poor condition, it limits the amount of data flowing into that node, so that data is more likely to be placed on nodes with low network latency, which improves system performance.

2. Nodes in a Ceph cluster must be on the same network segment, so when one server is attacked, all servers are likely to be compromised and the data is not effectively protected. Motivated by the need for secure data storage and the protection of user privacy, the thesis designs and preliminarily implements a distributed storage system spanning multiple cloud centers. The system cuts each data file into several blocks and stores them in multiple cloud centers, so that if one cloud center is compromised, the data it holds cannot be restored into a complete file. To improve data migration efficiency when a node fails, two schemes are studied, one for storage topology construction and one for disaster recovery, which improve migration speed and reduce migration volume, respectively:

(1) A two-tier topology construction scheme based on cloud centers.
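The block-splitting idea in item 2 can be sketched as follows. This excerpt does not describe how the thesis assigns blocks to cloud centers, so the round-robin placement, the function names, and the center names used in the usage note are assumptions made purely for illustration. The sketch also omits encryption and redundancy.

```python
from __future__ import annotations


def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut a file's bytes into fixed-size blocks (the last may be shorter)."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def distribute(blocks: list[bytes], centers: list[str]) -> dict[str, dict[int, bytes]]:
    """Assumed round-robin placement: block i goes to center i mod len(centers)."""
    if not centers:
        raise ValueError("at least one cloud center is required")
    placement: dict[str, dict[int, bytes]] = {center: {} for center in centers}
    for index, block in enumerate(blocks):
        placement[centers[index % len(centers)]][index] = block
    return placement


def reassemble(placement: dict[str, dict[int, bytes]]) -> bytes:
    """Rebuild the byte stream from whichever centers' blocks are supplied."""
    merged: dict[int, bytes] = {}
    for held in placement.values():
        merged.update(held)
    return b"".join(merged[i] for i in sorted(merged))
```

With three hypothetical centers such as `cloud-a`, `cloud-b`, and `cloud-c`, each center holds only every third block, so the blocks from any single center do not reassemble into the original file, while the blocks from all centers together do.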

Abstract (Chinese)
Abstract
Chapter 1 Introduction
  §1.1 Research Background
  §1.2 Research Significance and Current State of Research in China and Abroad
  §1.3 Main Content of the Thesis
  §1.4 Organization of the Thesis
Chapter 2 Related Technologies
  §2.1 Classification of Storage Methods
  §2.2 The Ceph Distributed Storage System
    §2.2.1 Basic Component Architecture
    §2.2.2 Data Addressing Process
    §2.2.3 Introduction to CRUSH (Controlled Replication Under Scalable Hashing)
    §2.2.4 FileStore in Ceph
  §2.3 Key Technologies for Large-Scale Storage
    §2.3.1 Disk Array Technology
    §2.3.2 Data Redundancy Technology
    §2.3.3 Databases
    §2.3.4 Load Balancing Technology
    §2.3.5 Data Security
  §2.4 Chapter Summary
Chapter 3 A CEPH Storage Performance Optimization Method Based on Network Latency
  §3.1 CEPH's Default Data Read/Write Process
  §3.2 Shortcomings of CEPH's Data Read/Write Process
  §3.3 An Optimization Scheme Based on Node Latency
  §3.4 Experiments and Analysis
    §3.4.1 Experimental Design
    §3.4.2 Experimental Results and Analysis
  §3.5 Chapter Summary
Chapter 4 Design and Implementation of a Data Storage System for Multiple Cloud Centers
  §4.1 Overall System Design
  §4.2 Cluster Monitor Node Design
    §4.2.1 The Paxos Algorithm
    §4.2.2 Monitor Node Design
    §4.2.3 A Two-Tier Topology Construction Scheme Based on Cloud Centers
  §4.3 Cluster Data Node Design
    §4.3.1 Consistent Hashing Algorithm
    §4.3.2 Data Node Design
    §4.3.3 A Disaster Recovery Scheme Based on RAID Hot-Spare Technology
  §4.4 Cluster Client Node Design
    §4.4.1 Cluster Client Node Design
    §4.4.2 System Workflow
    §4.4.3 Dynamic OSD Selection Algorithm
  §4.5 Experiments and Analysis
    §4.5.1 Testing the Multi-Cloud Data Storage System
    §4.5.2 Experimental Design for the Disaster Recovery Scheme
    §4.5.3 Experimental Results and Analysis
    §4.5.4 Summary of the Disaster Recovery Scheme
  §4.6 Chapter Summary
Chapter 5 Conclusions and Future Work
  §5.1 Summary of Work
  §5.2 Future Work
References
Acknowledgements
Main Research Achievements During the Author's Master's Studies
