RESEARCH ARTICLE
DeepComp: towards a balanced system design for high
performance computer systems
Mingfa ZHU (✉)¹,², Limin XIAO¹,², Li RUAN², Qinfen HAO²
1 State Key Laboratory of Software Development Environment, Beijing 100191, China
2 School of Computer Science and Engineering, Beihang University, Beijing 100191, China
© Higher Education Press and Springer-Verlag Berlin Heidelberg 2010
Abstract Today, cluster-based computing is the mainstream
architecture for high end computer systems.
Balanced system design is critical for large scale cluster
systems to achieve high efficiency. This paper describes
the practice of balanced system design in DeepComp high
end computer systems. Methodologies for
designing balanced large scale cluster systems are given.
A method for balancing central processing unit (CPU) and
memory hierarchy is addressed. For balancing computing
nodes and I/O systems, two approaches are given: a
maximum bandwidth criterion and a criterion based on the maximum number of
computing nodes that can concurrently access I/O
systems. Experience with Lenovo high end cluster systems
shows that the above methods are effective. Lenovo strategies
toward a balanced system design for both peta and 10 peta
scale high productivity computing systems (HPCSs) are also presented.
Keywords high performance computer systems (HPCs),
high productivity computing systems (HPCSs), cluster,
balanced system design
1 Introduction
Since the middle of this decade, the cluster has been the
mainstream architecture for high performance computer
systems (HPCs), or high end computer systems, and has held a
share greater than 80% in recent world TOP500 lists. A
number of factors contributed to this, including technical
progress in central processing unit (CPU) chips,
operating systems, and interconnection networks, as well as high
Linpack and application efficiencies, which are achieved
by a balanced system design, application algorithm
optimization, and runtime optimization.
From an architecture point of view, a system is said to
be balanced if there is no bottleneck in any key
data channel and all devices are able to obtain data from
their data suppliers in time for their own purposes. In a cluster
system, the main data channels are between the CPUs and
the main memory banks, within computing nodes, among
computing nodes, and between computing nodes and
I/O systems or storage (RAID disks). In other words, a
cluster system is balanced if the following all hold:
the data bandwidth of the memory hierarchy meets
the needs of all CPUs in every computing node,
the power of the communication system matches the
computing power of all computing nodes, and the
bandwidth of the I/O system meets the needs of all
computing nodes.
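
As a rough illustration of this definition, the balance conditions can be read as a supply-versus-demand check on each key data channel. The following Python sketch is only one possible simplification: the `Channel` type, the function names, and all bandwidth figures are hypothetical and do not describe any DeepComp system or the criteria developed later in the paper.

```python
from dataclasses import dataclass


@dataclass
class Channel:
    """One data channel: what the consumers need vs. what the supplier delivers."""
    name: str
    demand_gb_s: float  # aggregate bandwidth the consumers require (assumed figure)
    supply_gb_s: float  # aggregate bandwidth the supplier sustains (assumed figure)


def find_bottlenecks(channels):
    """Return the names of channels whose supply falls short of demand."""
    return [c.name for c in channels if c.supply_gb_s < c.demand_gb_s]


def is_balanced(channels):
    """In this simplified sense, a system is balanced if no channel is a bottleneck."""
    return not find_bottlenecks(channels)


# Illustrative numbers only, chosen so that the I/O channel is undersized.
example = [
    Channel("cpu-memory", demand_gb_s=20.0, supply_gb_s=25.0),
    Channel("node-node", demand_gb_s=4.0, supply_gb_s=4.0),
    Channel("node-io", demand_gb_s=12.0, supply_gb_s=8.0),
]

print(find_bottlenecks(example))
print(is_balanced(example))
```

With these assumed figures, only the channel between computing nodes and the I/O system fails the check, so the example system would not count as balanced. A real assessment would also have to account for latency and access patterns, which this sketch ignores.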
Today, Moore’s Law still holds for CPU chips with respect to
speed. Although Moore’s Law also holds with respect to
main memory capacity, memory access speed
increases much more slowly (about 10% each year), and
the gap between CPU speed and memory speed keeps
widening. In large scale cluster systems, there are
serious bottlenecks between the large number of
computing nodes and I/O systems, and bottlenecks also
exist in communication systems, which include interconnection
network hardware and message passing software
packages. Therefore, a serious issue in large scale cluster design
is system balance, which includes
balance between CPU computing power and memory data
supply power, balance between node computing power
Received August 17, 2010; accepted September 18, 2010
E-mail: zhumf@buaa.edu.cn
Front. Comput. Sci. China 2010, 4(4): 475–479
DOI 10.1007/s11704-010-0150-z