MonetDB/X100: Hyper-Pipelining Query Execution
Article · January 2005
Source: DBLP
Peter Boncz, Marcin Zukowski, Niels Nes
CWI
Kruislaan 413
Amsterdam, The Netherlands
{P.Boncz,M.Zukowski,N.Nes}@cwi.nl
Abstract
Database systems tend to achieve only low IPC (instructions-per-cycle) efficiency on modern CPUs in compute-intensive application areas like decision support, OLAP and multimedia retrieval. This paper starts with an in-depth investigation into the reasons why this happens, focusing on the TPC-H benchmark. Our analysis of various relational systems and MonetDB leads us to a new set of guidelines for designing a query processor.
The second part of the paper describes the architecture of our new X100 query engine for the MonetDB system that follows these guidelines. On the surface, it resembles a classical Volcano-style engine, but the crucial difference, basing all execution on the concept of vector processing, makes it highly CPU efficient. We evaluate the power of MonetDB/X100 on the 100GB version of TPC-H, showing its raw execution power to be between one and two orders of magnitude higher than previous technology.
1 Introduction
Modern CPUs can perform enormous amounts of calculations per second, but only if they can find enough independent work to exploit their parallel execution capabilities. Hardware developments during the past decade have significantly increased the speed difference between a CPU running at full throughput and one running at minimal throughput; this difference can now easily be an order of magnitude.
Proceedings of the 2005 CIDR Conference
One would expect that query-intensive database workloads such as decision support, OLAP, data mining, and multimedia retrieval, all of which require many independent calculations, should give modern CPUs the opportunity to reach near-optimal IPC (instructions-per-cycle) efficiency.
However, research has shown that database systems tend to achieve low IPC efficiency on modern CPUs in these application areas [6, 3]. We question whether it should really be that way. Going beyond the (important) topic of cache-conscious query processing, we investigate in detail how relational database systems interact with modern super-scalar CPUs in query-intensive workloads, in particular the TPC-H decision support benchmark.
The main conclusion we draw from this investigation is that the architecture employed by most DBMSs inhibits compilers from using their most performance-critical optimization techniques, resulting in low CPU efficiency. In particular, the common way of implementing the popular Volcano [10] iterator model for pipelined processing leads to tuple-at-a-time execution, which both causes high interpretation overhead and hides opportunities for CPU parallelism from the compiler.
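To make the tuple-at-a-time pattern concrete, here is a minimal sketch of a Volcano-style iterator pipeline. It assumes a two-operator plan over in-memory tuples; the class names `Scan` and `Select`, the `calls` counters, and the sample data are hypothetical and are not taken from any particular DBMS. Python is used only to show the control-flow shape, not to measure performance.

```python
class Scan:
    """Leaf operator: hands out one tuple per next() call, None when exhausted."""

    def __init__(self, rows):
        self._rows = iter(rows)
        self.calls = 0  # counts next() invocations, i.e. interpretation steps

    def next(self):
        self.calls += 1
        return next(self._rows, None)


class Select:
    """Passes on only the child's tuples that satisfy the predicate."""

    def __init__(self, child, predicate):
        self.child = child
        self.predicate = predicate
        self.calls = 0

    def next(self):
        self.calls += 1
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row


def run(root):
    """Pull tuples from the root operator until it returns None."""
    out = []
    while True:
        row = root.next()
        if row is None:
            return out
        out.append(row)


# Hypothetical input: 1000 (id, quantity) tuples.
rows = [(i, i % 10) for i in range(1000)]
scan = Scan(rows)
plan = Select(scan, lambda row: row[1] < 3)
result = run(plan)

print(len(result), scan.calls, plan.calls)
```

In this sketch every input tuple costs at least one `next()` call on the scan plus one predicate call, so the per-tuple work is dominated by call overhead rather than by the comparison itself.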
We also analyze the performance of the main-memory database system MonetDB¹, developed in our group, and its MIL query language [4]. MonetDB/MIL uses a column-at-a-time execution model, and therefore does not suffer from the problems generated by tuple-at-a-time interpretation. However, its policy of full column materialization causes it to generate large data streams during query execution. On our decision support workload, we found MonetDB/MIL to be heavily constrained by memory bandwidth, causing its CPU efficiency to drop sharply.
Therefore, we argue for combining the column-wise execution of MonetDB with the incremental materialization offered by Volcano-style pipelining.
We designed and implemented from scratch a new query engine for the MonetDB system, called X100, that employs a vectorized query processing model. Apart from achieving high CPU efficiency, MonetDB/X100 is intended to scale up towards non-main-memory (disk-based) datasets. The second part of this paper is dedicated to describing the architecture of MonetDB/X100 and evaluating its performance on the full TPC-H benchmark of size 100GB.
¹ MonetDB is now open source; see monetdb.cwi.nl
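As a rough illustration of what a vectorized processing model could look like, the sketch below keeps the pull-based `next()` interface but passes a small vector of column values per call instead of a single tuple. Everything here is an assumption made for illustration: the class names `VectorScan` and `VectorSelect`, the dict-of-lists vector layout, and the vector size of 100 are hypothetical and do not describe the actual X100 implementation.

```python
class VectorScan:
    """Leaf operator: returns one vector (slice) of every column per next() call."""

    def __init__(self, columns, vector_size):
        self.columns = columns
        self.vector_size = vector_size
        self.length = len(next(iter(columns.values())))
        self.pos = 0
        self.calls = 0

    def next(self):
        self.calls += 1
        if self.pos >= self.length:
            return None
        lo, hi = self.pos, self.pos + self.vector_size
        self.pos = hi
        return {name: col[lo:hi] for name, col in self.columns.items()}


class VectorSelect:
    """Keeps the positions of a vector where column < threshold."""

    def __init__(self, child, column, threshold):
        self.child = child
        self.column = column
        self.threshold = threshold
        self.calls = 0

    def next(self):
        self.calls += 1
        while True:
            batch = self.child.next()
            if batch is None:
                return None
            # One tight loop over a whole vector, instead of one call per tuple.
            keep = [i for i, v in enumerate(batch[self.column]) if v < self.threshold]
            if keep:
                return {name: [col[i] for i in keep] for name, col in batch.items()}


# Hypothetical input: 1000 rows stored column-wise.
columns = {
    "id": list(range(1000)),
    "quantity": [i % 10 for i in range(1000)],
}
scan = VectorScan(columns, vector_size=100)
plan = VectorSelect(scan, "quantity", 3)

total = 0
while True:
    batch = plan.next()
    if batch is None:
        break
    total += len(batch["id"])

print(total, scan.calls, plan.calls)
```

With a vector size of 100, the 1000 input rows are consumed in 11 `next()` calls on the scan (ten vectors plus the end-of-stream call), so the per-call overhead is shared by all values in a vector while only one vector per column needs to be materialized at a time.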
1.1 Outline
This paper is organized as follows. Section 2 provides an introduction to modern super-scalar (or hyper-pipelined) CPUs, covering the issues most relevant for query evaluation performance. In Section 3, we study TPC-H Query 1 as a micro-benchmark of CPU efficiency, first for standard relational database systems, then in MonetDB, and finally we descend into a standalone hand-coded implementation of this query to get a baseline of maximum achievable raw performance. Section 4 describes the architecture of our new X100 query processor for MonetDB, focusing on query execution, but also sketching topics like data layout, indexing and updates.
In Section 5, we present a performance comparison of MIL and X100 inside the Monet system on the TPC-H benchmark. We discuss related work in Section 6, before concluding in Section 7.
2 How CPUs Work
Figure 1 displays, for each year in the past decade, the fastest CPU available in terms of MHz, the CPU with the highest performance (one does not necessarily equate to the other), and the most advanced chip manufacturing technology in production that year.
The root cause of CPU MHz improvements is progress in chip manufacturing process scales, which typically shrink by a factor of 1.4 every 18 months (a.k.a. Moore’s law [13]). Each smaller manufacturing scale means twice as many transistors (the square of 1.4), each half the size, as well as 1.4 times smaller wire distances and signal latencies. Thus one would expect CPU MHz to increase in proportion to the inverse of signal latency, but Figure 1 shows that clock speed has increased even further. This is mainly achieved by pipelining: dividing the work of a CPU instruction into ever more stages. Less work per stage means that the CPU frequency can be increased. While the 1988 Intel 80386 CPU executed one instruction in one (or more) cycles, the 1993 Pentium already had a 5-stage pipeline, which increased to 14 stages in the 1999 PentiumIII, while the 2004 Pentium4 has 31 pipeline stages.
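The scaling arithmetic in this paragraph can be checked with a few lines. This is only a sketch of the stated rule of thumb (a linear shrink factor of 1.4 per process generation); the function name `scaling_after` and the dictionary keys are hypothetical.

```python
SHRINK = 1.4  # linear shrink factor per process generation, as stated in the text


def scaling_after(generations):
    """Return relative transistor count per area and relative wire distance."""
    linear = SHRINK ** generations
    return {
        "transistors_per_area": linear ** 2,  # area shrinks with the square
        "wire_distance": 1.0 / linear,        # distances shrink linearly
    }


one_step = scaling_after(1)
print(round(one_step["transistors_per_area"], 2))  # 1.96, i.e. roughly twice
print(round(one_step["wire_distance"], 2))         # 0.71, i.e. 1.4 times shorter
```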
[Figure 1: A Decade of CPU Performance. Plotted for 1994 to 2002 on a logarithmic scale (1000 to 10000): CPU MHz, CPU Performance (SPECcpu int+fp), and inverted gate distance. Process scales marked: 500nm, 350nm, 250nm, 130nm. CPUs labeled: Alpha21064, Alpha21064A, Alpha21164, Alpha21164A, Alpha21164B, Athlon, Pentium4, POWER4, Itanium2. Annotations: pipelining, hyper-pipelining.]
Pipelines introduce two dangers: (i) if one instruction needs the result of a previous instruction, it cannot be pushed into the pipeline right after it, but must wait until the first instruction has passed through the pipeline (or a significant fraction thereof), and (ii) in case of IF-a-THEN-b-ELSE-c branches, the CPU must predict whether a will evaluate to true or false. It might guess the latter and put c into the pipeline, just after a. Many stages further, when the evaluation of a finishes, it may determine that it guessed wrongly (i.e. mispredicted the branch), and then must flush the pipeline (discard all instructions in it) and start over with b. Obviously, the longer the pipeline, the more instructions are flushed away and the higher the performance penalty. Translated to database systems, branches that are data-dependent, such as those found in a selection operator on data with a selectivity that is neither very high nor very low, are impossible to predict and can significantly slow down query execution [17].
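The data-dependent branch in a selection operator can be contrasted with a branch-free (predicated) formulation. The sketch below shows only the logical shape of the two variants; the function names are hypothetical, and Python will not exhibit the hardware effect, which appears only in compiled code. The `mispredict_rate` helper rests on a deliberately simple assumption, a predictor that always guesses the more frequent outcome, and is not a model of any real CPU.

```python
def select_branching(values, threshold):
    """Selection with a data-dependent branch per value."""
    out = []
    for i, v in enumerate(values):
        if v < threshold:  # outcome depends on the data, hard to predict
            out.append(i)
    return out


def select_predicated(values, threshold):
    """Selection without a data-dependent branch in the loop body."""
    out = [0] * len(values)
    n = 0
    for i, v in enumerate(values):
        out[n] = i            # always write the candidate position
        n += v < threshold    # advance only if the predicate holds (bool is 0 or 1)
    return out[:n]


def mispredict_rate(selectivity):
    """Assumed majority-guess predictor: wrong whenever the rarer outcome occurs."""
    return min(selectivity, 1.0 - selectivity)


values = [5, 1, 9, 3, 7, 2]
print(select_branching(values, 4))   # [1, 3, 5]
print(select_predicated(values, 4))  # [1, 3, 5]
print(mispredict_rate(0.5), mispredict_rate(0.01))
```

Under this assumed predictor the misprediction rate peaks at a selectivity of 50% and vanishes at the extremes, which matches the remark that selectivities that are neither very high nor very low are the problematic case.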
In addition, super-scalar CPUs² offer the possibility to take multiple instructions into execution in parallel if they are independent. That is, the CPU has not one, but multiple pipelines. Each cycle, a new instruction can be pushed into each pipeline, provided again that it is independent of all instructions already in execution. A super-scalar CPU can reach an IPC (Instructions Per Cycle) of > 1. Figure 1 shows that this has allowed real-world CPU performance to increase faster than CPU frequency.
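The effect of instruction dependencies on IPC can be illustrated with a toy issue model. This is a sketch under strong assumptions, not a description of any real CPU: each cycle up to `width` ready instructions are issued, every instruction takes `latency` cycles to produce its result, and an instruction is ready only when all instructions it depends on have finished. The function name `simulate` and the example workloads are hypothetical.

```python
def simulate(deps, width, latency):
    """Return the number of cycles until all instructions have completed.

    deps[i] lists the indices of earlier instructions that i depends on.
    """
    n = len(deps)
    if n == 0:
        return 0
    done_at = [None] * n  # cycle at which each issued instruction finishes
    remaining = n
    cycle = 0
    while remaining:
        slots = width
        for i in range(n):
            if slots == 0:
                break
            if done_at[i] is not None:
                continue  # already issued
            ready = all(
                done_at[d] is not None and done_at[d] <= cycle for d in deps[i]
            )
            if ready:
                done_at[i] = cycle + latency
                slots -= 1
                remaining -= 1
        cycle += 1
    return max(done_at)


# Eight instructions forming one dependency chain versus eight independent ones.
chain = [[]] + [[i - 1] for i in range(1, 8)]
independent = [[] for _ in range(8)]

for name, program in (("chain", chain), ("independent", independent)):
    cycles = simulate(program, width=3, latency=4)
    print(name, cycles, round(len(program) / cycles, 2))
```

In this model the dependency chain needs 32 cycles (an IPC of 0.25), while the independent instructions finish in 6 cycles (an IPC above 1), even though both programs contain the same number of instructions.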
Modern CPUs are balanced in different ways. The Intel Itanium2 processor is a VLIW (Very Long Instruction Word) processor with many parallel pipelines (it can execute up to 6 instructions per cycle) but only a few (7) stages, and therefore a relatively low clock speed of 1.5GHz. In contrast, the Pentium4 has a very long 31-stage pipeline allowing for a 3.6GHz clock speed, but can only execute 3 instructions per cycle. Either way, to reach its theoretical maximum throughput, an Itanium2 needs 7 × 6 = 42 independent instructions at any time, while the Pentium4 needs 31 × 3 = 93. Such parallelism cannot always be found, and therefore many programs use the resources of the Itanium2 much better than those of the Pentium4, which explains why the performance of both CPUs is similar in benchmarks, despite the big clock speed difference.
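The arithmetic behind these figures is simply pipeline depth times issue width. The short sketch below reproduces it from the numbers quoted in the text; the dictionary layout and function names are hypothetical, and the peak-rate figure is a derived upper bound (issue width times clock speed), not a measured result.

```python
# Figures as quoted in the text.
CPUS = {
    "Itanium2": {"stages": 7, "issue_width": 6, "ghz": 1.5},
    "Pentium4": {"stages": 31, "issue_width": 3, "ghz": 3.6},
}


def instructions_in_flight(cpu):
    """Independent instructions needed to keep every pipeline stage busy."""
    return cpu["stages"] * cpu["issue_width"]


def peak_giga_instructions_per_second(cpu):
    """Theoretical upper bound: issue width times clock frequency."""
    return cpu["issue_width"] * cpu["ghz"]


for name, cpu in CPUS.items():
    print(name, instructions_in_flight(cpu),
          round(peak_giga_instructions_per_second(cpu), 1))
```

The theoretical peaks are of similar magnitude (about 9 versus about 10.8 billion instructions per second), but the Pentium4 needs more than twice as many independent instructions in flight to approach its peak.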
² Intel introduced the term hyper-pipelined as a synonym for “super-scalar”, to market its Pentium4 CPU.