Design of the Java HotSpot™ Client Compiler for Java 6
THOMAS KOTZMANN, CHRISTIAN WIMMER, and HANSPETER MÖSSENBÖCK
Johannes Kepler University Linz
and
THOMAS RODRIGUEZ, KENNETH RUSSELL, and DAVID COX
Sun Microsystems, Inc.
Version 6 of Sun Microsystems’ Java HotSpot™ VM ships with a redesigned version of the client
just-in-time compiler that includes several research results from recent years. The client compiler
is at the heart of the VM configuration used by default for interactive desktop applications. For
such applications, low startup and pause times are more important than peak performance. This
paper outlines the new architecture of the client compiler and shows how it interacts with the
VM. It presents the intermediate representation that now uses static single-assignment (SSA)
form and the linear scan algorithm for global register allocation. Efficient support for exception
handling and deoptimization fulfills the demands that are imposed by the dynamic features of
the Java programming language. The evaluation shows that the new client compiler generates
better code in less time. The popular SPECjvm98 benchmark suite is executed 45% faster, while
the compilation speed is also up to 40% better. This indicates that a carefully selected set of global
optimizations can also be integrated in just-in-time compilers that focus on compilation speed and
not on peak performance. In addition, the paper presents the impact of several optimizations on
execution and compilation speed. As the source code is freely available, the Java HotSpot™ VM and
the client compiler are the ideal basis for experiments with new feedback-directed optimizations
in a production-level Java just-in-time compiler. The paper outlines research projects that add fast
algorithms for escape analysis, automatic object inlining, and array bounds check elimination.
Categories and Subject Descriptors: D.3.4 [Programming Languages]: Processors—Compilers,
Optimization, Code generation
General Terms: Algorithms, Languages, Performance
Additional Key Words and Phrases: Java, compiler, just-in-time compilation, optimization,
intermediate representation, register allocation, deoptimization
Authors’ addresses: Thomas Kotzmann, Christian Wimmer, and Hanspeter Mössenböck, Institute
for System Software, Christian Doppler Laboratory for Automated Software Engineering, Johannes
Kepler University Linz, Austria; email: {kotzmann, wimmer, moessenboeck}@ssw.jku.at.
Thomas Rodriguez, Kenneth Russell, and David Cox, Sun Microsystems, Inc., 4140 Network Circle,
Santa Clara, CA 95054; email: {thomas.rodriguez, kenneth.russell, david.cox}@sun.com.
Permission to make digital or hard copies of part or all of this work for personal or classroom use is
granted without fee provided that copies are not made or distributed for profit or direct commercial
advantage and that copies show this notice on the first page or initial screen of a display along
with the full citation. Copyrights for components of this work owned by others than ACM must be
honored. Abstracting with credit is permitted. To copy otherwise, to republish, to post on servers,
to redistribute to lists, or to use any component of this work in other works requires prior specific
permission and/or a fee. Permissions may be requested from Publications Dept., ACM, Inc., 2 Penn
Plaza, Suite 701, New York, NY 10121-0701 USA, fax +1 (212) 869-0481, or permissions@acm.org.
© 2008 ACM 1544-3566/2008/05-ART7 $5.00 DOI 10.1145/1369396.1370017 http://doi.acm.org/10.1145/1369396.1370017
ACM Transactions on Architecture and Code Optimization, Vol. 5, No. 1, Article 7, Publication date: May 2008.
ACM Reference Format:
Kotzmann, T., Wimmer, C., Mössenböck, H., Rodriguez, T., Russell, K., and Cox, D. 2008.
Design of the Java HotSpot™ client compiler for Java 6. ACM Trans. Architec. Code
Optim. 5, 1, Article 7 (May 2008), 32 pages. DOI = 10.1145/1369396.1370017
http://doi.acm.org/10.1145/1369396.1370017.
1. INTRODUCTION
About 2 years after the release of Java SE 5.0, Sun Microsystems completed
version 6 of the Java platform. In contrast to the previous version, it did not in-
troduce any changes in the language, but provided a lot of improvements behind
the scenes [Coward 2006]. Especially the Java HotSpot™ client compiler, one
of the two just-in-time compilers in Sun Microsystems’ virtual machine, was
subject to several modifications and promises a notable gain in performance.
Even Sun’s development process differs from previous releases. For the first
time in the history of Java, weekly source snapshots of the project have been
published on the Internet [Sun Microsystems, Inc. 2006b]. This approach is part
of Sun’s new ambitions in the open-source area [OpenJDK 2007] and encourages
developers throughout the world to submit their contributions and bugfixes.
The collaboration on this project requires a thorough understanding of the
virtual machine, so it is more important than ever to describe and explain its
parts and their functionality. At the current point in time, however, the available
documentation of internals is rather sparse and incomplete. This paper gives
an insight into those parts of the virtual machine that are responsible for or af-
fected by the just-in-time compilation of bytecodes. It contributes the following:
—It starts with an overview of how just-in-time compilation is embedded in the
virtual machine and when it is invoked.
—It describes the structure of the client compiler and its compilation phases.
—It explains the different intermediate representations and the operations
that are performed on them.
—It discusses ongoing research on advanced compiler optimizations and details
how these optimizations are affected by just-in-time compilation.
—It evaluates the performance gains both compared to the previous version of
Sun’s JDK and to competitive products.
Despite the fact that this paper focuses on characteristics of Sun’s VM and
presupposes some knowledge about compilers, it not only addresses people who
are actually confronted with the source code but everybody who is interested
in the internals of the JVM or in compiler optimization research. Therefore, it
will give both a general insight into involved algorithms and describe how they
are implemented in Sun’s VM.
1.1 Architecture of the Java HotSpot™ VM
Java achieves portability by translating source code [Gosling et al. 2005] into
platform-independent bytecodes. To run Java programs on a particular plat-
form, a Java virtual machine [Lindholm and Yellin 1999] must exist for that
platform. It executes bytecodes after checking that they do not compromise
the security or reliability of the underlying machine. Sun Microsystems’
implementation of such a virtual machine is called Java HotSpot™ VM [Sun
Microsystems, Inc. 2006a].
[Figure: block diagram. Labels recovered from the diagram: JIT compiler (client compiler, server compiler), interpreter, garbage collector (stop & copy, mark & compact, ...), bytecodes, native method (machine code, debugging info, object maps), method, stacks (thread 1 ... thread n), heap (young generation, old generation, permanent generation). Edge labels: compiles, interprets, generates, uses, collects, accesses.]
Fig. 1. Architecture of the Java HotSpot™ VM.
The overall architecture is shown in Figure 1. The execution of a Java pro-
gram starts in the interpreter, which steps through the bytecodes of a method
and executes a code template for each instruction. Only the most frequently
called methods, referred to as hot spots, are scheduled for just-in-time (JIT)
compilation. As most classes used in a method are loaded during interpreta-
tion, information about them is already available at the time of JIT compilation.
This information allows the compiler to inline more methods and to generate
better optimized machine code.
If a method contains a long-running loop, it may be compiled regardless
of its invocation frequency. The VM counts the number of backward branches
taken and, when a threshold is reached, it suspends interpretation and compiles
the running method. A new stack frame for the native method is set up and
initialized to match the interpreter’s stack frame. Execution of the method
then continues using the machine code of the native method. Switching from
interpreted to compiled code in the middle of a running method is called
on-stack-replacement (OSR) [Hölzle and Ungar 1994; Fink and Qian 2003].
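
The two compilation triggers described above can be sketched as a pair of counters per method. This is a minimal illustration, not the VM's implementation: the class name, the `Action` values, and the thresholds passed to the constructor are all hypothetical, and the stack-frame conversion that OSR requires is not modeled.

```java
// Illustrative sketch only: names and thresholds are hypothetical,
// not the identifiers or values used by the HotSpot VM.
final class HotnessCounters {
    enum Action { INTERPRET, COMPILE_METHOD, COMPILE_OSR }

    private final int invocationThreshold;
    private final int backEdgeThreshold;
    private int invocations;
    private int backEdges;

    HotnessCounters(int invocationThreshold, int backEdgeThreshold) {
        this.invocationThreshold = invocationThreshold;
        this.backEdgeThreshold = backEdgeThreshold;
    }

    /** Called on every method entry; frequently called methods get compiled. */
    Action onInvocation() {
        invocations++;
        return invocations >= invocationThreshold
                ? Action.COMPILE_METHOD
                : Action.INTERPRET;
    }

    /** Called on every taken backward branch; long-running loops trigger OSR. */
    Action onBackwardBranch() {
        backEdges++;
        return backEdges >= backEdgeThreshold
                ? Action.COMPILE_OSR
                : Action.INTERPRET;
    }
}
```

Keeping the two counters separate reflects the text: a method that is called only once can still be compiled if its loop runs long enough.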
The Java HotSpot™ VM has two alternative just-in-time compilers: the
server and the client compiler. The server compiler [Paleczny et al. 2001] is
a highly optimizing compiler tuned for peak performance at the cost of com-
pilation speed. Low compilation speed is acceptable for long-running server
applications, because compilation impairs performance only during the warm-
up phase and can usually be done in the background if multiple processors are
available.
For interactive client programs with graphical user interfaces, however, re-
sponse time is more important than peak performance. For this purpose, the
client compiler was designed to achieve a trade-off between the performance
of the generated machine code and compilation speed [Griesemer and Mitrovic
2000]. This paper presents the architecture of the revised client compiler in the
JDK 6.
All modern Java virtual machines implement synchronization with a thin
lock scheme [Agesen et al. 1999; Bacon et al. 1998]. Sun’s JDK 6 extends
this concept by biased locking [Russell and Detlefs 2006], which uses concepts
similar to Kawachiya et al. [2002]. Previously, an object was always locked
atomically, just in case two threads synchronized on it at the same time. In
the context of biased locking, a pointer to the current thread is stored in the
header of an object when it is locked for the first time. The object is then said
to be biased toward the thread. As long as the object is locked and unlocked by
the same thread, synchronizations need not be atomic.
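
The idea can be illustrated with a small sketch. It assumes a simplified model in which the bias is held in a field instead of the object header, and it omits bias revocation, so a second thread simply fails to acquire the lock. The class and its members are hypothetical names chosen for illustration; `casCount` exists only to make the saved atomic operations visible.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Illustrative sketch only: a real VM keeps the bias in the object header and
// can revoke it; neither is modeled here. All names are hypothetical.
final class BiasedLockSketch {
    private final AtomicReference<Thread> biasOwner = new AtomicReference<>();
    private final AtomicInteger casCount = new AtomicInteger(); // instrumentation only
    private int holdCount; // accessed only by the thread that owns the bias

    /** Returns true if the current thread now holds the lock. */
    boolean lock() {
        Thread current = Thread.currentThread();
        if (biasOwner.get() == current) {
            // Fast path: already biased toward this thread, no compare-and-swap.
            holdCount++;
            return true;
        }
        // First lock of the object: one atomic operation installs the bias.
        casCount.incrementAndGet();
        if (biasOwner.compareAndSet(null, current)) {
            holdCount = 1;
            return true;
        }
        return false; // biased toward another thread; revocation is not modeled
    }

    void unlock() {
        if (biasOwner.get() != Thread.currentThread() || holdCount == 0) {
            throw new IllegalMonitorStateException("not locked by the current thread");
        }
        holdCount--; // the bias stays in place for the next lock()
    }

    int casCount() {
        return casCount.get();
    }
}
```

When one thread locks and unlocks the same object repeatedly, only the first `lock()` call performs a compare-and-swap.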
The generational garbage collector [Ungar 1984] of the Java HotSpot™ VM
manages dynamically allocated memory. It uses exact garbage collection tech-
niques, so every object and every pointer to an object must be precisely known at
GC time. This is essential for supporting compacting collection algorithms. The
memory is split into three generations: a young generation for newly allocated
objects, an old generation for long-lived objects, and a permanent generation
for internal data structures.
New objects are allocated sequentially in the young generation. Since each
thread has a separate thread-local allocation buffer (TLAB), allocation opera-
tions are multithread-safe without any locking. When the young generation fills
up, a stop-and-copy garbage collection is initiated. When objects have survived
a certain number of collection cycles, they are promoted to the old generation,
which is collected by a mark-and-compact algorithm [Jones and Lins 1996].
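
Sequential allocation in a thread-local buffer amounts to bumping a pointer. The following sketch models addresses as plain `long` values; the class name and the convention of returning -1 when the buffer is exhausted are assumptions made for illustration. A real VM would then request a new buffer or start a collection.

```java
// Illustrative sketch only: addresses are plain numbers and the names are
// hypothetical. Each thread would own one such buffer, so no locking is needed.
final class TlabSketch {
    private final long end; // first address past the buffer
    private long top;       // next free address

    TlabSketch(long start, long sizeInBytes) {
        this.top = start;
        this.end = start + sizeInBytes;
    }

    /** Returns the address of the new object, or -1 if the buffer is exhausted. */
    long allocate(long sizeInBytes) {
        if (sizeInBytes <= 0) {
            throw new IllegalArgumentException("size must be positive");
        }
        if (end - top < sizeInBytes) {
            return -1; // caller must obtain a new buffer or trigger a collection
        }
        long result = top;
        top += sizeInBytes; // bump the pointer; objects are laid out sequentially
        return result;
    }
}
```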
The Java HotSpot™ VM also provides various other garbage collectors [Sun
Microsystems, Inc. 2006c]. Parallel garbage collectors for server machines with
large physical memories and multiple CPUs distribute the work among mul-
tiple threads, thus decreasing the garbage collection overhead and increasing
the application throughput. A concurrent mark-and-sweep algorithm [Boehm
et al. 1991; Printezis and Detlefs 2000] allows the user program to continue its
execution while dead objects are reclaimed.
Exact garbage collection requires information about pointers to heap objects.
For machine code, this information is contained in object maps (also called oop
maps) created by the JIT compiler. Besides, the compiler creates debugging
information that maps the state of a compiled method back to the state of the
interpreter. This enables aggressive compiler optimizations, because the VM
can deoptimize [Hölzle et al. 1992] back to a safe state when the assumptions
under which an optimization was performed are invalidated (see Section 2.6).
The machine code, the object maps, and the debugging information are stored
together in a so-called native method object. Garbage collection and deoptimiza-
tion are allowed to occur only at some discrete points in the program, called
safepoints, such as backward branches, method calls, return instructions, and
operations that may throw an exception.
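
As a rough picture of what a native method object bundles together, the sketch below stores machine code alongside per-safepoint metadata. It is a strong simplification under stated assumptions: the debugging information is reduced to a single bytecode index per safepoint, pointer locations are reduced to slot numbers, and all names are hypothetical.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: real debugging information describes the full
// interpreter state, not just a bytecode index. All names are hypothetical.
final class NativeMethodSketch {
    private final byte[] machineCode;
    // Object map per safepoint: which slots hold pointers to heap objects.
    private final Map<Integer, BitSet> oopMaps = new HashMap<>();
    // Debugging info per safepoint: the bytecode index to resume at.
    private final Map<Integer, Integer> bytecodeIndexAt = new HashMap<>();

    NativeMethodSketch(byte[] machineCode) {
        this.machineCode = machineCode.clone();
    }

    void addSafepoint(int pcOffset, int bytecodeIndex, int... pointerSlots) {
        if (pcOffset < 0 || pcOffset >= machineCode.length) {
            throw new IllegalArgumentException("pc offset outside the machine code");
        }
        BitSet slots = new BitSet();
        for (int slot : pointerSlots) {
            slots.set(slot);
        }
        oopMaps.put(pcOffset, slots);
        bytecodeIndexAt.put(pcOffset, bytecodeIndex);
    }

    /** Garbage collection and deoptimization may only happen at these offsets. */
    boolean isSafepoint(int pcOffset) {
        return oopMaps.containsKey(pcOffset);
    }

    /** Used by an exact collector to find pointers in a compiled frame. */
    boolean holdsPointer(int pcOffset, int slot) {
        BitSet slots = oopMaps.get(pcOffset);
        if (slots == null) {
            throw new IllegalStateException("not a safepoint: " + pcOffset);
        }
        return slots.get(slot);
    }

    /** Used by deoptimization to map compiled state back to the interpreter. */
    int bytecodeIndex(int pcOffset) {
        Integer index = bytecodeIndexAt.get(pcOffset);
        if (index == null) {
            throw new IllegalStateException("not a safepoint: " + pcOffset);
        }
        return index;
    }
}
```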
Apart from advanced JIT compilers, sophisticated mechanisms for synchronization,
and state-of-the-art garbage collectors, the new Java HotSpot™ VM
also features object packing functionality to minimize the wasted space be-
tween data types of different sizes, on-the-fly class redefinition, and full-speed
debugging. It is available in 32-bit and 64-bit editions for the Solaris operating
system on SPARC and Intel platforms, for Linux, and for Microsoft Windows.
[Figure: pipeline diagram. Labels recovered from the diagram: front end (HIR generation, optimization), back end (LIR generation, register allocation, code generation); data flow: bytecodes, HIR, LIR, machine code.]
Fig. 2. Structure of the Java HotSpot™ client compiler.
1.2 Design Changes of the Client Compiler
The design changes of the client compiler for version 6 focus on a more aggres-
sive optimization of the machine code. In its original design, the client compiler
implemented only a few high-impact optimizations, whereas the server com-
piler performed global optimizations across basic block boundaries, such as
global value numbering or loop unrolling. The goal for Java 6 was to adopt
some of these optimizations for the client compiler.
Before Java 6, the high-level intermediate representation (HIR) of the client
compiler was not suitable for global optimizations. It was not in static single-
assignment (SSA) form [Cytron et al. 1991] and required local variables to
be explicitly loaded and stored. The new HIR is in SSA form. Load and store
instructions for local variables are eliminated by keeping track of the virtual
registers that contain the variables’ current values [Mössenböck 2000].
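
How loads and stores of local variables disappear can be shown with a toy builder for a single basic block. The sketch is an assumption-laden illustration: it handles only four integer bytecodes, represents HIR instructions as strings, and ignores control flow, so no phi functions are created. The class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Illustrative sketch only: one basic block, four bytecodes, no phi functions.
final class HirBuilderSketch {
    private final List<String> hir = new ArrayList<>();
    private final Deque<Integer> stack = new ArrayDeque<>(); // ids of values on the operand stack
    private final int[] locals; // locals[i] = id of the value currently in local i, -1 if undefined

    HirBuilderSketch(int maxLocals) {
        locals = new int[maxLocals];
        Arrays.fill(locals, -1);
    }

    void iconst(int constant) {
        stack.push(emit("const " + constant));
    }

    /** A load emits no instruction; it reuses the value the local currently holds. */
    void iload(int index) {
        if (locals[index] < 0) {
            throw new IllegalStateException("local " + index + " is undefined");
        }
        stack.push(locals[index]);
    }

    /** A store emits no instruction; it only updates the local-to-value mapping. */
    void istore(int index) {
        locals[index] = stack.pop();
    }

    void iadd() {
        int right = stack.pop();
        int left = stack.pop();
        stack.push(emit("add v" + left + ", v" + right));
    }

    private int emit(String operation) {
        int id = hir.size(); // every instruction defines exactly one new value
        hir.add("v" + id + " = " + operation);
        return id;
    }

    List<String> hir() {
        return new ArrayList<>(hir);
    }
}
```

For the statement sequence `x = 1; y = 2; x = x + y`, the eight bytecodes produce only three instructions, and the second assignment to `x` refers to a new value instead of overwriting the first.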
Another major design change was the implementation of a linear scan register
allocator [Mössenböck and Pfeiffer 2002; Wimmer and Mössenböck 2005].
The previous approach was to allocate a register immediately before an instruction
and free it after the instruction had been processed. Only if a register
remained unassigned throughout a method was it used to cache the value of
a frequently accessed local variable. This algorithm was simple and fast, but
resulted in a large number of memory loads and stores. The linear scan register
allocation produces more efficient machine code and is still faster than the
graph coloring algorithm used by the server compiler.
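
The basic idea of linear scan can be sketched in a few lines. The version below is a deliberately simplified textbook variant, not the allocator of the client compiler: each value has one contiguous lifetime interval, a spilled value lives in memory for its whole lifetime, and the interval that ends last is the one chosen for spilling. All names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch only: a simplified linear scan over lifetime intervals.
final class LinearScanSketch {
    static final int SPILLED = -1;

    static final class Interval {
        final String name;
        final int start;
        final int end;
        int register = SPILLED;

        Interval(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    /** Assigns a register number or SPILLED to every interval. */
    static void allocate(List<Interval> intervals, int registerCount) {
        List<Interval> sorted = new ArrayList<>(intervals);
        sorted.sort(Comparator.comparingInt((Interval i) -> i.start));

        Deque<Integer> free = new ArrayDeque<>();
        for (int r = 0; r < registerCount; r++) {
            free.add(r);
        }
        List<Interval> active = new ArrayList<>(); // intervals that occupy a register

        for (Interval current : sorted) {
            // Release the registers of intervals that ended before this one starts.
            for (Iterator<Interval> it = active.iterator(); it.hasNext(); ) {
                Interval a = it.next();
                if (a.end < current.start) {
                    free.add(a.register);
                    it.remove();
                }
            }
            if (!free.isEmpty()) {
                current.register = free.poll();
                active.add(current);
                continue;
            }
            // No register left: spill whichever interval ends last.
            Interval victim = current;
            for (Interval a : active) {
                if (a.end > victim.end) {
                    victim = a;
                }
            }
            if (victim != current) {
                current.register = victim.register;
                victim.register = SPILLED;
                active.remove(victim);
                active.add(current);
            }
        }
    }
}
```

A single pass over the sorted intervals is what keeps this approach cheaper than building and coloring an interference graph.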
The design changes and the SSA form, in particular, facilitate a new family
of global optimizations. Ongoing research projects deal with escape analysis
and automatic object inlining to reduce the costs associated with memory man-
agement. These projects are described in Section 3.
2. STRUCTURE OF THE CLIENT COMPILER
The client compiler is a just-in-time compiler that aims at a low startup time and
a small memory footprint. The compilation of a method is split into three phases,
allowing more optimizations to be done than in a single pass over the bytecodes.
All information communicated between the phases is stored in intermediate
representations of the program.
Figure 2 shows the structure of the client compiler. First, a high-level inter-
mediate representation (HIR) of the compiled method is built via an abstract
interpretation of the bytecodes. It consists of a control-flow graph (CFG), whose
basic blocks are singly linked lists of instructions. The HIR is in static single-
assignment (SSA) form, which means that for every variable there is just a