

A Comparative Study of MongoDB and Document-Based MySQL for Big Data Application Data Management

By: Győrödi Cornelia A., Dumşe-Burescu Diana V., Zmaranda Doina R., Győrödi Robert Ş.

Source: [J] Big Data and Cognitive Computing, Volume 6, Issue 2. 2022. PP 49-49

1. Introduction

The volume of data to be stored has grown explosively, driven by social media, cloud computing services, and the Internet of Things (IoT). The term “Internet of Things” refers to the combination of three distinct ideas: a large number of “smart” objects, all connected to the Internet, with applications and services that use the data from these objects to create interactions. IoT applications can now become very complex, drawing on interdisciplinary approaches and integrating several emerging technologies such as human–computer interaction, machine learning, pattern recognition, and ubiquitous computing. In addition, several approaches and environments for carrying out analytics on clouds for Big Data applications have appeared in recent years.

The widespread deployment of IoT drives rapid growth of data, both in quantity and in category, and thus creates a need for Big Data applications. The large volume of data from IoT has three characteristics that conform to the Big Data paradigm: (i) abundant terminals generate a large volume of data; (ii) the data generated by IoT is usually semi-structured or unstructured; and (iii) IoT data is only useful once it has been analyzed.

As the volume of data has increased exponentially, applications must handle millions of users simultaneously and process huge volumes of unstructured and complex data sets, and the relational database model has serious limitations at that scale. These limitations have led to the development of non-relational databases, commonly known as NoSQL (Not Only SQL). Such large, unstructured, and complex data sets, typically referred to as Big Data, are characterized by high volume, velocity, and variety, and cannot be managed efficiently with relational databases because of their static structure. For this reason, software developers have begun to consider NoSQL data storage solutions. In today's Big Data context, developments in NoSQL databases have produced an infrastructure that is well adapted to the heavy demands of Big Data.

NoSQL databases are particularly useful for accessing and analyzing huge amounts of unstructured data, or data stored remotely on multiple virtual servers. A NoSQL database does not store information in the traditional relational format. NoSQL databases are not built on tables and, in some cases, do not fully satisfy the properties of atomicity, consistency, isolation, and durability (ACID). A feature common to almost all NoSQL databases is that they handle individual items, which are identified by unique keys. Their structures are also flexible, in the sense that schemas are often relaxed or absent. A classification based on data models has been proposed in [6,8]; it groups NoSQL databases into four major families, each based on a different data model: key–value stores (Redis, Riak, Amazon's DynamoDB, and Project Voldemort), column-oriented databases (HBase and Cassandra), document-based databases (MongoDB, CouchDB, and document-based MySQL), and graph databases (Neo4j, OrientDB, and AllegroGraph). Of the NoSQL databases available today, this paper focuses on the document-based model, choosing two well-known databases, MongoDB and document-based MySQL, and analyzing their behavior in terms of the performance of CRUD operations.
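To make the document model concrete, the following is a minimal Python sketch of a schema-free collection in which each item is stored under a unique key. It is a toy in-memory illustration, not the API of MongoDB or of document-based MySQL; the class `DocumentCollection`, its methods, and the sample fields are hypothetical, and the `_id` field name is used here only as an assumed convention.

```python
import uuid


class DocumentCollection:
    """Toy in-memory collection: each document is a dict stored under a unique key."""

    def __init__(self):
        self._docs = {}

    def insert(self, document):
        # Use the supplied key if present; otherwise generate a unique one.
        doc_id = document.get("_id") or uuid.uuid4().hex
        self._docs[doc_id] = {**document, "_id": doc_id}
        return doc_id

    def find(self, doc_id):
        # Look up a single document by its unique key (None if absent).
        return self._docs.get(doc_id)


services = DocumentCollection()
# Two documents in the same collection with different fields:
# no fixed schema is enforced (hypothetical sample data).
first_id = services.insert({"name": "haircut", "price": 20})
second_id = services.insert({"name": "massage", "duration_min": 45, "tags": ["spa"]})

print(services.find(first_id))
print(services.find(second_id))
```

The two inserted documents share a collection but not a set of fields, which is the flexibility the paragraph above attributes to relaxed or absent schemas.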

To carry out the performance analysis, a server application was developed and is presented in this paper. The application serves as a backend for streamlining the activity of small service providers and uses the two document-based databases, MongoDB and MySQL. The emphasis is on the query operations through which the CRUD operations are performed and tested; the analysis covers their response times for data volumes of up to 100,000 items.
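The kind of measurement described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' Java application: a plain dictionary stands in for a document collection, the helper `timed` and the four operation functions are hypothetical, and the intermediate data volumes (1,000 and 10,000) are chosen only for illustration, since the text specifies only the upper bound of 100,000 items.

```python
import time


def timed(operation, *args):
    """Run one operation and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = operation(*args)
    return result, time.perf_counter() - start


store = {}  # hypothetical stand-in for a document collection


def create(n):
    for i in range(n):
        store[i] = {"_id": i, "status": "new"}


def read(n):
    # Count documents matching a simple filter.
    return sum(1 for i in range(n) if store[i]["status"] == "new")


def update(n):
    for i in range(n):
        store[i]["status"] = "done"


def delete(n):
    for i in range(n):
        del store[i]


timings = {}
for n in (1_000, 10_000, 100_000):  # assumed volumes; only the maximum is from the text
    for op in (create, read, update, delete):
        _, elapsed = timed(op, n)
        timings[(op.__name__, n)] = elapsed
        print(f"{op.__name__:<6} n={n:>7}: {elapsed * 1000:.2f} ms")
```

Against a real database, each function body would be replaced by the corresponding driver call, and the measured times would include network and server overhead that this in-memory stand-in does not capture.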

The paper is organized as follows. The first section is a short introduction that sets out the motivation of the paper. Section 2 gives a short overview of the features of the two databases, and Section 3 reviews the related work. The structure of the databases, the methods, and the testing architecture used in this work are described in Section 4. The experimental results and their analysis for the two databases, in an application that uses large amounts of data, are presented in Section 5. The results are discussed in Section 6, followed by conclusions in Section 7.

2. Overview of MongoDB and the Document-Based MySQL

MongoDB is the most popular NoSQL database, with a continuous and steady rise in popularity since its launch. It is a cross-platform, open-source, document-based NoSQL database written in C++ that is completely schema-free and manages JSON-style documents. With improvements in each version and a flexible structure that can change quite often during development, it provides automatic scaling with high performance and availability. Document-based MySQL is not yet as popular: MySQL has offered a solution for non-relational databases only since 2018, starting with version 8.0. Its model has several similarities to, but also several differences from, that of MongoDB, as shown in Table 1.

The structure of both databases is especially suitable for flexible applications whose structure is not fixed from the beginning and is expected to change many times along the way. For large volumes of data, on the order of millions of items, any type of database may allow thousands of queries per second, but efficiency is determined by how each database manages operations and by the optimizations it ships with; both are optimized to operate on large volumes of data. They differ in access control: in MongoDB, access is based on the roles defined for each user, whereas in document-based MySQL, access is granted by defining a username and password, benefiting from all of the security features available in MySQL. Both databases are available free of charge and can be used to develop individual or small projects at no extra cost. For large applications, MongoDB offers monthly or annual subscriptions that cost several thousand dollars; for document-based MySQL, no such pricing is specified.

Both databases provide security mechanisms. Document-based MySQL is a relatively new database, but it benefits from all the security features offered by MySQL: encryption, auditing, authentication, and firewalls. MongoDB, for its part, provides role-based authentication, encryption, and TLS/SSL configuration for clients.

3. Related Work

Many studies have compared different relational databases with NoSQL databases in terms of implementation language, replication, transactions, and scalability. One provides an overview of the different NoSQL databases in terms of data model, query model, replication model, and consistency model, without testing the CRUD operations performed on them. Another outlined the differences between the relational MySQL database and MongoDB, a NoSQL database, by integrating them into an online platform and having many users perform various operations in parallel. The tests highlighted the advantage of MongoDB over relational MySQL, concluding that the query times of MongoDB were much lower than those of relational MySQL.

Another work presents a comparative analysis of NoSQL databases, such as HBase, MongoDB, BigTable, and SimpleDB, and relational databases such as MySQL, highlighting their limits in implementing a real application by running tests on the databases with both simple and more complex queries. Elsewhere, an Open Archival Information System (OAIS) that exploits the column-oriented NoSQL database Cassandra was presented. From the tests performed, its authors noticed that in an undistributed infrastructure, Cassandra does not perform very well compared to MySQL. Additionally, a framework has been proposed for analyzing semi-structured data applications using the MongoDB database. The framework focuses on the key aspects of semi-structured data analytics: data collection, data parsing, and data prediction. A further paper focused mainly on comparing the execution speed of write/insert and update/read operations under different benchmark workloads for seven NoSQL database engines: Redis, Memcached, Voldemort, OrientDB, MongoDB, HBase, and Cassandra.

The Cassandra and MongoDB database systems have also been described in a comparative study that tested both systems on various workloads. The study tested read and write operations while progressively increasing the number of clients performing them, in order to compare the performance of the two solutions.
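The idea of progressively increasing the number of clients can be sketched as follows. This Python illustration is not taken from the cited study: the lock-protected dictionary is a hypothetical stand-in for a database, and the client counts (1, 2, 4, 8) and the number of operations per client are assumptions chosen only to keep the example small.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from threading import Lock

store = {}            # hypothetical stand-in for a shared database
store_lock = Lock()   # serializes access, as a single-node store might


def client_session(client_id, ops_per_client=200):
    """One simulated client: write each document, then read it back."""
    for i in range(ops_per_client):
        key = (client_id, i)
        with store_lock:
            store[key] = {"value": i}   # write
        with store_lock:
            _ = store[key]["value"]     # read
    return ops_per_client * 2           # operations performed by this client


def run_with_clients(num_clients):
    """Run num_clients concurrent sessions; return (total operations, elapsed seconds)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_clients) as pool:
        total_ops = sum(pool.map(client_session, range(num_clients)))
    return total_ops, time.perf_counter() - start


results = {}
for num_clients in (1, 2, 4, 8):  # assumed progression of client counts
    store.clear()
    total_ops, elapsed = run_with_clients(num_clients)
    results[num_clients] = total_ops
    print(f"{num_clients} client(s): {total_ops} operations in {elapsed * 1000:.2f} ms")
```

Plotting elapsed time (or operations per second) against the number of clients gives the kind of curve such studies use to compare how two systems behave under growing concurrent load.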

Another study compared the performance of three non-relational databases, Redis, MongoDB, and Cassandra, using the YCSB (Yahoo! Cloud Serving Benchmark) tool. Its purpose was to evaluate the three databases when primarily performing inserts, updates, scans, and reads, by creating and running six workloads. YCSB is available under an open-source license and allows multiple systems to be benchmarked and compared by creating “workloads”.
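A benchmark “workload” in this sense is essentially a mix of operation types applied in given proportions. The Python sketch below illustrates that idea only; it does not use YCSB itself, the function `generate_workload` is hypothetical, and the two mixes shown are illustrative assumptions rather than the six workloads run in the cited study.

```python
import random
from collections import Counter


def generate_workload(mix, num_ops, seed=0):
    """Return a reproducible sequence of operation names drawn according to `mix`.

    `mix` maps an operation name to its relative proportion.
    """
    rng = random.Random(seed)  # fixed seed so that runs are comparable
    operations = list(mix)
    weights = [mix[op] for op in operations]
    return rng.choices(operations, weights=weights, k=num_ops)


# Illustrative mixes (assumed proportions)
READ_HEAVY = {"read": 0.95, "update": 0.05}
BALANCED = {"read": 0.5, "update": 0.5}

ops = generate_workload(READ_HEAVY, 10_000)
counts = Counter(ops)
print(counts)
```

Replaying the same generated sequence against each system under test keeps the comparison fair, since every database then sees an identical series of operations.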

An analysis of the security of the most popular open-source databases, both relational and NoSQL, has also been published; it includes MongoDB and MySQL. From a security point of view, both of these databases need to be properly configured to significantly reduce the risks of data exposure and intrusion.

Several comparisons between MongoDB and MySQL exist in the literature, but most compare MongoDB with relational MySQL rather than with document-based MySQL. For example, a login system developed in the Python programming language was used to analyze the performance of MongoDB and relational MySQL based on the speed of fetching data from the databases; that paper analyzed the two databases to decide which type was more suitable for a login-based system. Another paper presents the advantages of NoSQL databases over relational databases in the investigation of Big Data, comparing the performance of various queries and commands in MongoDB and relational MySQL. The concepts of NoSQL and relational databases, together with their limitations, have also been presented. Consequently, although MongoDB has been addressed in many scientific papers, to our knowledge no paper at the time of writing has focused directly on comparing it with document-based MySQL.

4. Method and Testing Architecture

For each database considered, an application was created in Java using IntelliJ IDEA Community Edition (4 February 2020); it implements a server for processing and storing the data of the frontend. When designing the testing architecture, we considered it very important to test the databases under exactly the conditions imposed by an application similar to the one to be developed, rather than only through their own tools (for MongoDB, the MongoDB web interface or the Mongo shell). There are differences both in how they are used and in response times: operations that seem easy and fast when tested directly can turn out to be slower or more difficult to achieve in practice.

The two applications are identical in structure: both contain the objects that we need and a service class for each object, annotated with @Service. In addition to these classes, each application contains a class with a cron (a process by which a method can be called automatically and repeatedly at an interval set by us, taking as a parameter a string that is composed of six digits separated by a
